vocab_size not found in data/openwebtext/meta.pkl, using GPT-2 default of 50257
Initializing a new model from scratch
number of parameters: 50.96M
step 0: train loss 10.9687, val loss 10.9745
iter 0: loss 12.4264, time 5918.26ms
iter 10: loss 12.4834, time 125.37ms
iter 20: loss 12.4455, time 125.16ms
iter 30: loss 12.5036, time 124.95ms
iter 40: loss 12.4247, time 124.89ms
iter 50: loss 12.4632, time 124.98ms
iter 60: loss 12.2482, time 125.02ms
iter 70: loss 12.4012, time 125.19ms
iter 80: loss 12.3714, time 127.92ms
iter 90: loss 12.4468, time 125.51ms
iter 100: loss 12.4471, time 126.45ms
iter 110: loss 12.3934, time 126.22ms
iter 120: loss 12.4024, time 124.81ms
iter 130: loss 12.3185, time 125.54ms
iter 140: loss 12.3583, time 125.65ms
iter 150: loss 12.5399, time 125.68ms
iter 160: loss 12.4251, time 124.73ms
iter 170: loss 12.4769, time 125.34ms
iter 180: loss 12.3567, time 125.75ms
iter 190: loss 12.3578, time 128.29ms
iter 200: loss 12.3845, time 125.31ms
iter 210: loss 12.4546, time 125.89ms
iter 220: loss 12.3898, time 126.20ms
iter 230: loss 12.3402, time 125.85ms
iter 240: loss 12.4136, time 125.87ms
step 250: train loss 10.9394, val loss 10.9391
saving checkpoint to out-shakespeare-char
iter 250: loss 12.4442, time 2920.84ms
iter 260: loss 12.4056, time 125.19ms
iter 270: loss 12.3959, time 125.59ms
iter 280: loss 12.2712, time 125.98ms
iter 290: loss 12.3636, time 125.41ms
iter 300: loss 12.4626, time 128.26ms
iter 310: loss 12.3655, time 125.04ms
iter 320: loss 12.3617, time 125.45ms
iter 330: loss 12.4248, time 125.48ms
iter 340: loss 12.3736, time 125.15ms
iter 350: loss 12.3409, time 125.81ms
iter 360: loss 12.3949, time 125.49ms
iter 370: loss 12.4899, time 125.34ms
iter 380: loss 12.3253, time 125.32ms
iter 390: loss 12.3107, time 125.68ms
iter 400: loss 12.4006, time 126.00ms
iter 410: loss 12.4104, time 128.73ms
iter 420: loss 12.2700, time 125.85ms
iter 430: loss 12.3569, time 125.76ms
iter 440: loss 12.3952, time 126.30ms
iter 450: loss 12.4085, time 125.79ms
iter 460: loss 12.4200, time 125.66ms
iter 470: loss 12.4086, time 125.58ms
iter 480: loss 12.2611, time 129.11ms
iter 490: loss 12.4260, time 126.23ms
step 500: train loss 10.8967, val loss 10.8967
saving checkpoint to out-shakespeare-char
iter 500: loss 12.3557, time 2918.89ms
iter 510: loss 12.3097, time 121.99ms
iter 520: loss 12.3478, time 122.50ms
iter 530: loss 12.2533, time 122.60ms
iter 540: loss 12.3010, time 123.17ms
iter 550: loss 12.3365, time 121.99ms
iter 560: loss 12.3572, time 124.89ms
iter 570: loss 12.3141, time 122.07ms
iter 580: loss 12.4365, time 122.30ms
iter 590: loss 12.2522, time 122.11ms
iter 600: loss 12.3074, time 123.11ms
iter 610: loss 12.4228, time 121.66ms
iter 620: loss 12.4247, time 123.21ms
iter 630: loss 12.3460, time 122.02ms
iter 640: loss 12.4190, time 122.97ms
iter 650: loss 12.3002, time 122.19ms
iter 660: loss 12.4012, time 122.15ms
iter 670: loss 12.2861, time 122.10ms
iter 680: loss 12.3095, time 123.03ms
iter 690: loss 12.3342, time 120.89ms
iter 700: loss 12.3138, time 123.32ms
iter 710: loss 12.3231, time 121.96ms
iter 720: loss 12.3056, time 123.30ms
iter 730: loss 12.3258, time 122.00ms
iter 740: loss 12.2809, time 123.25ms
step 750: train loss 10.8616, val loss 10.8587
saving checkpoint to out-shakespeare-char
iter 750: loss 12.2034, time 2939.88ms
iter 760: loss 12.3029, time 121.27ms
iter 770: loss 12.4756, time 124.80ms
iter 780: loss 12.3359, time 122.07ms
iter 790: loss 12.2638, time 125.07ms
iter 800: loss 12.3231, time 122.13ms
iter 810: loss 12.2850, time 125.16ms
iter 820: loss 12.2979, time 121.91ms
iter 830: loss 12.3287, time 125.12ms
iter 840: loss 12.2465, time 122.45ms
iter 850: loss 12.2645, time 124.88ms
iter 860: loss 12.3570, time 122.07ms
iter 870: loss 12.2353, time 123.86ms
iter 880: loss 12.2748, time 122.09ms
iter 890: loss 12.3186, time 125.02ms
iter 900: loss 12.4063, time 122.82ms
iter 910: loss 12.3293, time 127.15ms
iter 920: loss 12.3545, time 125.97ms
iter 930: loss 12.3700, time 126.26ms
iter 940: loss 12.2637, time 125.91ms
iter 950: loss 12.3674, time 125.98ms
iter 960: loss 12.2366, time 125.99ms
iter 970: loss 12.1297, time 125.92ms
iter 980: loss 12.3093, time 126.09ms
iter 990: loss 12.3963, time 126.18ms
step 1000: train loss 10.8139, val loss 10.8181
saving checkpoint to out-shakespeare-char
iter 1000: loss 12.3309, time 2916.97ms
iter 1010: loss 12.1936, time 126.15ms
iter 1020: loss 12.2429, time 126.03ms
iter 1030: loss 12.1798, time 126.40ms
iter 1040: loss 12.2936, time 126.25ms
iter 1050: loss 12.2300, time 126.09ms
iter 1060: loss 12.2765, time 126.06ms
iter 1070: loss 12.2824, time 126.35ms
iter 1080: loss 12.1608, time 126.44ms
iter 1090: loss 12.3530, time 126.29ms
iter 1100: loss 12.2431, time 125.04ms
iter 1110: loss 12.1676, time 126.75ms
iter 1120: loss 12.2801, time 126.58ms
iter 1130: loss 12.3120, time 129.46ms
iter 1140: loss 12.1968, time 126.09ms
iter 1150: loss 12.2273, time 126.19ms
iter 1160: loss 12.3283, time 125.98ms
iter 1170: loss 12.2385, time 128.03ms
iter 1180: loss 12.2628, time 125.65ms
iter 1190: loss 12.2164, time 125.64ms
iter 1200: loss 12.2649, time 125.53ms
iter 1210: loss 12.2448, time 124.70ms
iter 1220: loss 12.2508, time 126.01ms
iter 1230: loss 12.1882, time 125.16ms
iter 1240: loss 12.3457, time 128.93ms
step 1250: train loss 10.7635, val loss 10.7733
saving checkpoint to out-shakespeare-char
iter 1250: loss 12.1614, time 2907.72ms
iter 1260: loss 12.1060, time 126.15ms
iter 1270: loss 12.1594, time 125.59ms
iter 1280: loss 12.2744, time 126.07ms
iter 1290: loss 12.2268, time 126.38ms
iter 1300: loss 12.1694, time 128.27ms
iter 1310: loss 12.2119, time 125.94ms
iter 1320: loss 12.2207, time 124.82ms
iter 1330: loss 12.2468, time 125.72ms
iter 1340: loss 12.1886, time 125.79ms
iter 1350: loss 12.2366, time 125.82ms
iter 1360: loss 12.2221, time 125.72ms
iter 1370: loss 12.2697, time 128.10ms
iter 1380: loss 12.0438, time 125.54ms
iter 1390: loss 12.1994, time 125.39ms
iter 1400: loss 12.3019, time 126.24ms
iter 1410: loss 12.1690, time 126.09ms
iter 1420: loss 12.2184, time 125.95ms
iter 1430: loss 12.1405, time 126.11ms
iter 1440: loss 12.1524, time 126.14ms
iter 1450: loss 12.1187, time 126.03ms
iter 1460: loss 12.1428, time 125.76ms
iter 1470: loss 12.1500, time 126.27ms
iter 1480: loss 12.0892, time 129.04ms
iter 1490: loss 12.1786, time 125.41ms
step 1500: train loss 10.7282, val loss 10.7254
saving checkpoint to out-shakespeare-char
iter 1500: loss 12.2166, time 2907.61ms
iter 1510: loss 12.2862, time 126.12ms
iter 1520: loss 12.1679, time 125.72ms
iter 1530: loss 12.0429, time 125.71ms
iter 1540: loss 12.1208, time 127.90ms
iter 1550: loss 12.2081, time 125.51ms
iter 1560: loss 12.2104, time 125.63ms
iter 1570: loss 12.2260, time 125.72ms
iter 1580: loss 12.1619, time 125.47ms
iter 1590: loss 12.1314, time 125.48ms
iter 1600: loss 12.2140, time 125.65ms
iter 1610: loss 12.1599, time 126.41ms
iter 1620: loss 12.1675, time 125.82ms
iter 1630: loss 12.0817, time 125.82ms
iter 1640: loss 12.1381, time 125.69ms
iter 1650: loss 12.0591, time 124.53ms
iter 1660: loss 12.1298, time 125.63ms
iter 1670: loss 12.1652, time 125.08ms
iter 1680: loss 12.2089, time 125.94ms
iter 1690: loss 12.1630, time 125.66ms
iter 1700: loss 12.2421, time 122.12ms
iter 1710: loss 12.0789, time 123.35ms
iter 1720: loss 12.1468, time 122.29ms
iter 1730: loss 12.1763, time 122.28ms
iter 1740: loss 12.1365, time 121.98ms
step 1750: train loss 10.6744, val loss 10.6678
saving checkpoint to out-shakespeare-char
iter 1750: loss 12.0940, time 2920.87ms
iter 1760: loss 12.2561, time 121.67ms
iter 1770: loss 12.1375, time 121.89ms
iter 1780: loss 12.3102, time 121.72ms
iter 1790: loss 11.9598, time 121.96ms
iter 1800: loss 12.0765, time 121.59ms
iter 1810: loss 12.1584, time 122.01ms
iter 1820: loss 12.1512, time 121.11ms
iter 1830: loss 11.9891, time 122.04ms
iter 1840: loss 12.0240, time 121.90ms
iter 1850: loss 12.1777, time 122.19ms
iter 1860: loss 12.1032, time 121.89ms
iter 1870: loss 12.0644, time 123.23ms
iter 1880: loss 12.1923, time 121.79ms
iter 1890: loss 12.0990, time 121.92ms
iter 1900: loss 12.1549, time 121.78ms
iter 1910: loss 12.1838, time 124.84ms
iter 1920: loss 12.0924, time 120.73ms
iter 1930: loss 12.0364, time 124.84ms
iter 1940: loss 12.2728, time 122.04ms
iter 1950: loss 12.0022, time 124.77ms
iter 1960: loss 12.0241, time 122.04ms
iter 1970: loss 12.1571, time 124.80ms
iter 1980: loss 12.1440, time 121.96ms
iter 1990: loss 12.0677, time 124.97ms
step 2000: train loss 10.6187, val loss 10.6216
saving checkpoint to out-shakespeare-char
iter 2000: loss 11.9698, time 2907.84ms
iter 2010: loss 12.2146, time 122.01ms
iter 2020: loss 12.1570, time 121.50ms
iter 2030: loss 12.1795, time 121.55ms
iter 2040: loss 11.9636, time 121.50ms
iter 2050: loss 12.0854, time 121.63ms
iter 2060: loss 12.0294, time 121.67ms
iter 2070: loss 12.1122, time 121.29ms
iter 2080: loss 11.9810, time 121.42ms
iter 2090: loss 12.0503, time 122.06ms
iter 2100: loss 11.9943, time 121.94ms
iter 2110: loss 11.8194, time 121.78ms
iter 2120: loss 12.1974, time 121.86ms
iter 2130: loss 11.9960, time 121.67ms
iter 2140: loss 11.9919, time 121.74ms
iter 2150: loss 12.0474, time 121.86ms
iter 2160: loss 12.1414, time 121.61ms
iter 2170: loss 12.0377, time 121.64ms
iter 2180: loss 11.9881, time 121.22ms
iter 2190: loss 12.0222, time 121.78ms
iter 2200: loss 12.0814, time 121.62ms
iter 2210: loss 12.0870, time 121.48ms
iter 2220: loss 12.0532, time 121.68ms
iter 2230: loss 12.0961, time 122.21ms
iter 2240: loss 12.0361, time 121.73ms
step 2250: train loss 10.5740, val loss 10.5664
saving checkpoint to out-shakespeare-char
iter 2250: loss 12.1585, time 2901.72ms
iter 2260: loss 11.8944, time 124.84ms
iter 2270: loss 11.9278, time 122.11ms
iter 2280: loss 12.0740, time 124.58ms
iter 2290: loss 12.0490, time 121.64ms
iter 2300: loss 11.8662, time 124.55ms
iter 2310: loss 11.9350, time 121.52ms
iter 2320: loss 11.8963, time 124.28ms
iter 2330: loss 11.9787, time 122.01ms
iter 2340: loss 11.9192, time 124.47ms
iter 2350: loss 12.0691, time 121.19ms
iter 2360: loss 12.0349, time 124.72ms
iter 2370: loss 12.0694, time 121.72ms
iter 2380: loss 12.0389, time 124.64ms
iter 2390: loss 12.0398, time 121.71ms
iter 2400: loss 12.1365, time 124.64ms
iter 2410: loss 12.0510, time 122.01ms
iter 2420: loss 11.9951, time 124.48ms
iter 2430: loss 11.9756, time 121.56ms
iter 2440: loss 11.6994, time 124.62ms
iter 2450: loss 11.9113, time 121.54ms
iter 2460: loss 11.7598, time 124.53ms
iter 2470: loss 12.1410, time 122.14ms
iter 2480: loss 11.9744, time 124.56ms
iter 2490: loss 12.1819, time 121.88ms
step 2500: train loss 10.5012, val loss 10.5125
saving checkpoint to out-shakespeare-char
iter 2500: loss 11.9529, time 2907.33ms
iter 2510: loss 12.1162, time 122.23ms
iter 2520: loss 12.0790, time 124.62ms
iter 2530: loss 11.9612, time 122.00ms
iter 2540: loss 11.8070, time 125.01ms
iter 2550: loss 11.8428, time 121.90ms
iter 2560: loss 12.0267, time 124.91ms
iter 2570: loss 11.9329, time 121.98ms
iter 2580: loss 11.9304, time 124.82ms
iter 2590: loss 11.8893, time 121.89ms
iter 2600: loss 12.2371, time 124.74ms
iter 2610: loss 12.1044, time 122.01ms
iter 2620: loss 11.8408, time 124.72ms
iter 2630: loss 11.7391, time 121.76ms
iter 2640: loss 11.8718, time 124.30ms
iter 2650: loss 11.9002, time 121.88ms
iter 2660: loss 12.0475, time 124.75ms
iter 2670: loss 12.0176, time 121.05ms
iter 2680: loss 11.9337, time 124.67ms
iter 2690: loss 12.0587, time 121.40ms
iter 2700: loss 11.9544, time 125.04ms
iter 2710: loss 11.6951, time 122.31ms
iter 2720: loss 11.9463, time 124.78ms
iter 2730: loss 11.9420, time 121.56ms
iter 2740: loss 11.7488, time 124.48ms
step 2750: train loss 10.4339, val loss 10.4469
saving checkpoint to out-shakespeare-char
iter 2750: loss 11.9933, time 2905.83ms
iter 2760: loss 11.8463, time 121.42ms
iter 2770: loss 11.7095, time 122.59ms
iter 2780: loss 11.8921, time 121.47ms
iter 2790: loss 11.7309, time 123.16ms
iter 2800: loss 11.9568, time 121.89ms
iter 2810: loss 11.9890, time 122.65ms
iter 2820: loss 11.8545, time 121.95ms
iter 2830: loss 11.9926, time 123.07ms
iter 2840: loss 11.5521, time 121.96ms
iter 2850: loss 11.8663, time 122.65ms
iter 2860: loss 11.7764, time 121.79ms
iter 2870: loss 11.6937, time 122.60ms
iter 2880: loss 11.7183, time 121.78ms
iter 2890: loss 11.8968, time 122.59ms
iter 2900: loss 11.8182, time 121.52ms
iter 2910: loss 11.9931, time 122.64ms
iter 2920: loss 11.6290, time 121.48ms
iter 2930: loss 11.8047, time 122.61ms
iter 2940: loss 11.8217, time 121.56ms
iter 2950: loss 11.7719, time 122.50ms
iter 2960: loss 11.5089, time 121.37ms
iter 2970: loss 11.9529, time 123.04ms
iter 2980: loss 11.6088, time 121.88ms
iter 2990: loss 11.5635, time 123.09ms
step 3000: train loss 10.3710, val loss 10.3665
saving checkpoint to out-shakespeare-char
iter 3000: loss 12.0825, time 2898.81ms
iter 3010: loss 11.6270, time 124.71ms
iter 3020: loss 11.7634, time 121.05ms
iter 3030: loss 11.7036, time 124.77ms
iter 3040: loss 11.8196, time 121.90ms
iter 3050: loss 11.9974, time 124.67ms
iter 3060: loss 11.6445, time 122.08ms
iter 3070: loss 12.0701, time 124.77ms
iter 3080: loss 11.9567, time 121.94ms
iter 3090: loss 11.7048, time 125.36ms
iter 3100: loss 11.8589, time 121.92ms
iter 3110: loss 11.6724, time 124.61ms
iter 3120: loss 11.7524, time 122.35ms
iter 3130: loss 11.9331, time 124.84ms
iter 3140: loss 11.9501, time 121.91ms
iter 3150: loss 11.9392, time 124.82ms
iter 3160: loss 11.6840, time 121.92ms
iter 3170: loss 11.7184, time 124.80ms
iter 3180: loss 11.6655, time 122.42ms
iter 3190: loss 11.7296, time 125.14ms
iter 3200: loss 11.9106, time 121.78ms
iter 3210: loss 11.8843, time 124.85ms
iter 3220: loss 11.8155, time 121.90ms
iter 3230: loss 11.8379, time 124.44ms
iter 3240: loss 12.0128, time 121.93ms
step 3250: train loss 10.2869, val loss 10.2971
saving checkpoint to out-shakespeare-char
iter 3250: loss 11.9300, time 2916.48ms
iter 3260: loss 11.8677, time 121.62ms
iter 3270: loss 11.9401, time 122.09ms
iter 3280: loss 11.7480, time 121.94ms
iter 3290: loss 11.6431, time 122.92ms
iter 3300: loss 11.9193, time 121.96ms
iter 3310: loss 12.0849, time 123.21ms
iter 3320: loss 11.9781, time 121.61ms
iter 3330: loss 11.8415, time 123.59ms
iter 3340: loss 11.6938, time 121.52ms
iter 3350: loss 11.7051, time 123.17ms
iter 3360: loss 11.8692, time 121.53ms
iter 3370: loss 11.6239, time 122.94ms
iter 3380: loss 11.7621, time 121.85ms
iter 3390: loss 11.8027, time 123.23ms
iter 3400: loss 11.9088, time 121.65ms
iter 3410: loss 11.5998, time 122.92ms
iter 3420: loss 11.8205, time 121.55ms
iter 3430: loss 11.7734, time 123.00ms
iter 3440: loss 11.8712, time 122.27ms
iter 3450: loss 11.6658, time 123.18ms
iter 3460: loss 11.4748, time 121.65ms
iter 3470: loss 11.9932, time 122.97ms
iter 3480: loss 11.9218, time 121.62ms
iter 3490: loss 11.7887, time 123.81ms
step 3500: train loss 10.2313, val loss 10.2155
saving checkpoint to out-shakespeare-char
iter 3500: loss 11.5323, time 2908.53ms
iter 3510: loss 11.5609, time 124.49ms
iter 3520: loss 11.4239, time 128.03ms
iter 3530: loss 11.6658, time 125.49ms
iter 3540: loss 11.5370, time 125.30ms
iter 3550: loss 11.8921, time 125.47ms
iter 3560: loss 11.7794, time 125.35ms
iter 3570: loss 11.7506, time 126.38ms
iter 3580: loss 11.6189, time 125.23ms
iter 3590: loss 11.7555, time 125.41ms
iter 3600: loss 11.8566, time 124.79ms
iter 3610: loss 11.5964, time 122.33ms
iter 3620: loss 11.2216, time 121.73ms
iter 3630: loss 11.5169, time 121.40ms
iter 3640: loss 11.4578, time 121.63ms
iter 3650: loss 11.5772, time 121.59ms
iter 3660: loss 11.5159, time 121.55ms
iter 3670: loss 11.7793, time 121.53ms
iter 3680: loss 11.5878, time 122.26ms
iter 3690: loss 11.6511, time 121.13ms
iter 3700: loss 11.5741, time 121.68ms
iter 3710: loss 11.5858, time 121.61ms
iter 3720: loss 11.6654, time 121.81ms
iter 3730: loss 11.6658, time 121.56ms
iter 3740: loss 11.2275, time 121.82ms
step 3750: train loss 10.1344, val loss 10.1488
saving checkpoint to out-shakespeare-char
iter 3750: loss 11.6297, time 2914.65ms
iter 3760: loss 11.3832, time 121.98ms
iter 3770: loss 11.6072, time 121.96ms
iter 3780: loss 11.7167, time 122.07ms
iter 3790: loss 11.7008, time 121.81ms
iter 3800: loss 11.5010, time 122.01ms
iter 3810: loss 11.5097, time 121.83ms
iter 3820: loss 11.7074, time 121.98ms
iter 3830: loss 11.6268, time 121.99ms
iter 3840: loss 11.8273, time 121.89ms
iter 3850: loss 11.4600, time 121.92ms
iter 3860: loss 11.5309, time 121.93ms
iter 3870: loss 11.8267, time 121.75ms
iter 3880: loss 11.3999, time 122.03ms
iter 3890: loss 11.5194, time 121.80ms
iter 3900: loss 11.4931, time 121.90ms
iter 3910: loss 11.2422, time 120.64ms
iter 3920: loss 11.7606, time 121.83ms
iter 3930: loss 11.4712, time 122.29ms
iter 3940: loss 11.6171, time 122.10ms
iter 3950: loss 11.5151, time 121.81ms
iter 3960: loss 11.7111, time 121.98ms
iter 3970: loss 11.5383, time 122.01ms
iter 3980: loss 11.4907, time 121.92ms
iter 3990: loss 11.2962, time 121.98ms
step 4000: train loss 10.0453, val loss 10.0351
saving checkpoint to out-shakespeare-char
iter 4000: loss 11.5885, time 2898.85ms
iter 4010: loss 11.2704, time 123.19ms
iter 4020: loss 11.6388, time 121.77ms
iter 4030: loss 11.6703, time 123.27ms
iter 4040: loss 11.4598, time 121.74ms
iter 4050: loss 11.6195, time 122.08ms
iter 4060: loss 11.1861, time 121.78ms
iter 4070: loss 11.5345, time 122.26ms
iter 4080: loss 11.6606, time 121.78ms
iter 4090: loss 11.5907, time 122.86ms
iter 4100: loss 11.6229, time 121.82ms
iter 4110: loss 11.3531, time 122.78ms
iter 4120: loss 11.3464, time 121.92ms
iter 4130: loss 11.5649, time 122.98ms
iter 4140: loss 11.3236, time 121.64ms
iter 4150: loss 11.7438, time 123.17ms
iter 4160: loss 11.3215, time 121.75ms
iter 4170: loss 11.5748, time 122.85ms
iter 4180: loss 11.5649, time 121.70ms
iter 4190: loss 11.6199, time 122.69ms
iter 4200: loss 11.6113, time 122.14ms
iter 4210: loss 11.3580, time 122.83ms
iter 4220: loss 11.2947, time 121.92ms
iter 4230: loss 11.1867, time 122.84ms
iter 4240: loss 11.6711, time 121.83ms
step 4250: train loss 9.9574, val loss 9.9460
saving checkpoint to out-shakespeare-char
iter 4250: loss 11.3733, time 2898.51ms
iter 4260: loss 11.2417, time 121.95ms
iter 4270: loss 11.3291, time 121.71ms
iter 4280: loss 11.7148, time 121.72ms
iter 4290: loss 11.4368, time 121.71ms
iter 4300: loss 10.9453, time 121.47ms
iter 4310: loss 11.4731, time 121.63ms
iter 4320: loss 11.4240, time 121.37ms
iter 4330: loss 11.2314, time 121.58ms
iter 4340: loss 11.2316, time 121.46ms
iter 4350: loss 11.4927, time 121.55ms
iter 4360: loss 11.6339, time 121.46ms
iter 4370: loss 11.3752, time 122.34ms
iter 4380: loss 11.2277, time 121.43ms
iter 4390: loss 11.2569, time 121.59ms
iter 4400: loss 10.6192, time 121.45ms
iter 4410: loss 11.4802, time 121.67ms
iter 4420: loss 11.4017, time 121.86ms
iter 4430: loss 11.1790, time 121.77ms
iter 4440: loss 11.8498, time 121.81ms
iter 4450: loss 11.5331, time 121.54ms
iter 4460: loss 11.4680, time 121.88ms
iter 4470: loss 11.7787, time 121.94ms
iter 4480: loss 11.7684, time 121.67ms
iter 4490: loss 11.1888, time 121.89ms
step 4500: train loss 9.8653, val loss 9.8720
saving checkpoint to out-shakespeare-char
iter 4500: loss 11.3710, time 2901.89ms
iter 4510: loss 11.2892, time 122.17ms
iter 4520: loss 11.4652, time 124.79ms
iter 4530: loss 11.3140, time 121.88ms
iter 4540: loss 11.2570, time 125.07ms
iter 4550: loss 11.4171, time 121.68ms
iter 4560: loss 11.0960, time 125.01ms
iter 4570: loss 11.6571, time 121.98ms
iter 4580: loss 11.3571, time 124.99ms
iter 4590: loss 11.2143, time 122.10ms
iter 4600: loss 11.3115, time 125.06ms
iter 4610: loss 11.2755, time 122.42ms
iter 4620: loss 11.0373, time 124.60ms
iter 4630: loss 10.8068, time 122.16ms
iter 4640: loss 11.1268, time 124.96ms
iter 4650: loss 11.4338, time 121.66ms
iter 4660: loss 11.4617, time 124.10ms
iter 4670: loss 11.3811, time 122.13ms
iter 4680: loss 11.2672, time 124.59ms
iter 4690: loss 11.2258, time 121.95ms
iter 4700: loss 11.2259, time 125.00ms
iter 4710: loss 11.4811, time 121.51ms
iter 4720: loss 11.4401, time 124.36ms
iter 4730: loss 11.4650, time 120.88ms
iter 4740: loss 11.1958, time 124.54ms
step 4750: train loss 9.7672, val loss 9.7870
saving checkpoint to out-shakespeare-char
iter 4750: loss 11.3004, time 2890.06ms
iter 4760: loss 11.5312, time 121.53ms
iter 4770: loss 11.1183, time 124.69ms
iter 4780: loss 10.9940, time 121.30ms
iter 4790: loss 10.9756, time 124.71ms
iter 4800: loss 11.1375, time 121.42ms
iter 4810: loss 11.0653, time 124.34ms
iter 4820: loss 11.5315, time 121.07ms
iter 4830: loss 11.4717, time 124.39ms
iter 4840: loss 11.2244, time 122.48ms
iter 4850: loss 11.2747, time 124.69ms
iter 4860: loss 11.2322, time 121.97ms
iter 4870: loss 10.9947, time 124.31ms
iter 4880: loss 11.4661, time 122.45ms
iter 4890: loss 11.0973, time 124.89ms
iter 4900: loss 11.1656, time 122.08ms
iter 4910: loss 11.1399, time 125.19ms
iter 4920: loss 11.2298, time 122.33ms
iter 4930: loss 11.2473, time 124.91ms
iter 4940: loss 10.9879, time 122.04ms
iter 4950: loss 11.3741, time 124.69ms
iter 4960: loss 11.2159, time 122.37ms
iter 4970: loss 10.5853, time 124.50ms
iter 4980: loss 11.2549, time 121.67ms
iter 4990: loss 11.3212, time 125.07ms
step 5000: train loss 9.6842, val loss 9.6651
saving checkpoint to out-shakespeare-char
iter 5000: loss 11.3223, time 2910.67ms
iter 5010: loss 11.5989, time 121.79ms
iter 5020: loss 10.9854, time 125.51ms
iter 5030: loss 11.1943, time 121.97ms
iter 5040: loss 11.0537, time 124.70ms
iter 5050: loss 11.6495, time 122.00ms
iter 5060: loss 10.8724, time 124.61ms
iter 5070: loss 11.5061, time 122.04ms
iter 5080: loss 11.2826, time 125.35ms
iter 5090: loss 11.0323, time 122.18ms
iter 5100: loss 10.5957, time 125.42ms
iter 5110: loss 11.3095, time 122.05ms
iter 5120: loss 11.0562, time 124.71ms
iter 5130: loss 11.4320, time 121.82ms
iter 5140: loss 10.8733, time 125.07ms
iter 5150: loss 10.9657, time 122.11ms
iter 5160: loss 11.1356, time 125.09ms
iter 5170: loss 11.4683, time 121.97ms
iter 5180: loss 10.4597, time 125.15ms
iter 5190: loss 10.9871, time 121.96ms
iter 5200: loss 11.0129, time 124.66ms
iter 5210: loss 11.1047, time 122.52ms
iter 5220: loss 11.2083, time 125.09ms
iter 5230: loss 10.8558, time 122.16ms
iter 5240: loss 11.0666, time 124.89ms
step 5250: train loss 9.6123, val loss 9.5905
saving checkpoint to out-shakespeare-char
iter 5250: loss 11.1890, time 2900.70ms
iter 5260: loss 10.9800, time 121.95ms
iter 5270: loss 11.7224, time 122.03ms
iter 5280: loss 10.9962, time 122.00ms
iter 5290: loss 11.4983, time 121.80ms
iter 5300: loss 11.0400, time 121.97ms
iter 5310: loss 11.0009, time 122.19ms
iter 5320: loss 11.1060, time 121.90ms
iter 5330: loss 11.0286, time 122.00ms
iter 5340: loss 10.8171, time 121.89ms
iter 5350: loss 11.1386, time 122.08ms
iter 5360: loss 10.9525, time 121.98ms
iter 5370: loss 11.1009, time 121.92ms
iter 5380: loss 10.8830, time 121.87ms
iter 5390: loss 10.7788, time 122.02ms
iter 5400: loss 10.7447, time 122.11ms
iter 5410: loss 10.8071, time 121.93ms
iter 5420: loss 11.0601, time 121.85ms
iter 5430: loss 11.3779, time 121.99ms
iter 5440: loss 11.2467, time 122.00ms
iter 5450: loss 11.2382, time 122.11ms
iter 5460: loss 11.0128, time 122.53ms
iter 5470: loss 11.4327, time 122.21ms
iter 5480: loss 11.0931, time 122.34ms
iter 5490: loss 11.0571, time 122.05ms
step 5500: train loss 9.5257, val loss 9.5520
saving checkpoint to out-shakespeare-char
iter 5500: loss 11.1589, time 2896.90ms
iter 5510: loss 11.1295, time 121.47ms
iter 5520: loss 10.5436, time 122.87ms
iter 5530: loss 11.0505, time 121.83ms
iter 5540: loss 10.8149, time 121.88ms
iter 5550: loss 10.7997, time 121.64ms
iter 5560: loss 10.9008, time 121.41ms
iter 5570: loss 10.7166, time 121.59ms
iter 5580: loss 10.5790, time 121.59ms
iter 5590: loss 10.6824, time 121.61ms
iter 5600: loss 10.8278, time 121.32ms
iter 5610: loss 11.0490, time 121.57ms
iter 5620: loss 11.2321, time 121.43ms
iter 5630: loss 10.6312, time 122.03ms
iter 5640: loss 10.9272, time 121.24ms
iter 5650: loss 10.4456, time 121.78ms
iter 5660: loss 11.2607, time 121.85ms
iter 5670: loss 10.9651, time 121.63ms
iter 5680: loss 10.9335, time 121.48ms
iter 5690: loss 10.5079, time 121.36ms
iter 5700: loss 10.5492, time 121.84ms
iter 5710: loss 10.7522, time 121.89ms
iter 5720: loss 11.2465, time 121.77ms
iter 5730: loss 11.0202, time 121.49ms
iter 5740: loss 10.9660, time 121.43ms
step 5750: train loss 9.4648, val loss 9.4311
saving checkpoint to out-shakespeare-char
iter 5750: loss 11.1056, time 2907.23ms
iter 5760: loss 10.9076, time 122.71ms
iter 5770: loss 11.4481, time 122.04ms
iter 5780: loss 10.7241, time 121.69ms
iter 5790: loss 11.0634, time 122.18ms
iter 5800: loss 10.5386, time 121.68ms
iter 5810: loss 11.1681, time 122.24ms
iter 5820: loss 10.4260, time 122.54ms
iter 5830: loss 10.6304, time 121.65ms
iter 5840: loss 11.0905, time 122.09ms
iter 5850: loss 11.0735, time 121.20ms
iter 5860: loss 11.0646, time 120.92ms
iter 5870: loss 11.2098, time 119.89ms
iter 5880: loss 10.5983, time 122.07ms
iter 5890: loss 10.4222, time 120.03ms
iter 5900: loss 11.3596, time 121.27ms
iter 5910: loss 10.9338, time 122.04ms
iter 5920: loss 11.2885, time 122.21ms
iter 5930: loss 10.6326, time 121.79ms
iter 5940: loss 11.4481, time 123.10ms
iter 5950: loss 11.2481, time 122.04ms
iter 5960: loss 10.5743, time 123.17ms
iter 5970: loss 10.5598, time 121.96ms
iter 5980: loss 11.0196, time 122.39ms
iter 5990: loss 11.1491, time 122.03ms
step 6000: train loss 9.3782, val loss 9.4482
saving checkpoint to out-shakespeare-char
iter 6000: loss 11.0357, time 2917.88ms
iter 6010: loss 10.3521, time 125.41ms
iter 6020: loss 10.6611, time 125.63ms
iter 6030: loss 11.2016, time 125.27ms
iter 6040: loss 10.7035, time 125.41ms
iter 6050: loss 10.9222, time 125.30ms
iter 6060: loss 11.2707, time 125.69ms
iter 6070: loss 10.6402, time 128.49ms
iter 6080: loss 10.6899, time 125.61ms
iter 6090: loss 11.3525, time 124.25ms
iter 6100: loss 10.8227, time 125.66ms
iter 6110: loss 10.7159, time 128.50ms
iter 6120: loss 10.6500, time 125.47ms
iter 6130: loss 11.0161, time 125.40ms
iter 6140: loss 10.7855, time 124.38ms
iter 6150: loss 11.3455, time 125.40ms
iter 6160: loss 10.2165, time 125.22ms
iter 6170: loss 10.8496, time 125.55ms
iter 6180: loss 11.4047, time 124.86ms
iter 6190: loss 10.9551, time 125.46ms
iter 6200: loss 10.4764, time 124.61ms
iter 6210: loss 10.9879, time 125.46ms
iter 6220: loss 11.0821, time 128.49ms
iter 6230: loss 10.7616, time 125.56ms
iter 6240: loss 11.2144, time 125.44ms
step 6250: train loss 9.3609, val loss 9.3619
saving checkpoint to out-shakespeare-char
iter 6250: loss 10.9282, time 2915.17ms
iter 6260: loss 10.6153, time 125.93ms
iter 6270: loss 10.2909, time 125.59ms
iter 6280: loss 10.6831, time 128.95ms
iter 6290: loss 10.6405, time 125.37ms
iter 6300: loss 10.5273, time 125.28ms
iter 6310: loss 11.0658, time 125.22ms
iter 6320: loss 10.8332, time 125.76ms
iter 6330: loss 10.7011, time 125.46ms
iter 6340: loss 10.8982, time 125.49ms
iter 6350: loss 10.2838, time 124.67ms
iter 6360: loss 10.6112, time 125.27ms
iter 6370: loss 10.5524, time 125.31ms
iter 6380: loss 10.9297, time 125.42ms
iter 6390: loss 10.7878, time 127.21ms
iter 6400: loss 11.0179, time 125.91ms
iter 6410: loss 10.8829, time 125.46ms
iter 6420: loss 10.7108, time 125.52ms
iter 6430: loss 10.7989, time 125.20ms
iter 6440: loss 10.7638, time 125.47ms
iter 6450: loss 10.6668, time 125.57ms
iter 6460: loss 10.8050, time 125.55ms
iter 6470: loss 10.9088, time 125.78ms
iter 6480: loss 10.5719, time 125.67ms
iter 6490: loss 10.7699, time 125.92ms
step 6500: train loss 9.3239, val loss 9.3509
saving checkpoint to out-shakespeare-char
iter 6500: loss 10.9127, time 2904.60ms
iter 6510: loss 10.5569, time 126.29ms
iter 6520: loss 10.6215, time 124.90ms
iter 6530: loss 10.8833, time 125.54ms
iter 6540: loss 10.6086, time 125.14ms
iter 6550: loss 10.9829, time 125.45ms
iter 6560: loss 11.1334, time 125.41ms
iter 6570: loss 10.3621, time 124.91ms
iter 6580: loss 10.6620, time 125.50ms
iter 6590: loss 10.7067, time 125.44ms
iter 6600: loss 10.3122, time 128.72ms
iter 6610: loss 10.4558, time 125.20ms
iter 6620: loss 10.1882, time 125.42ms
iter 6630: loss 11.1909, time 125.55ms
iter 6640: loss 11.1896, time 125.26ms
iter 6650: loss 10.5699, time 124.45ms
iter 6660: loss 10.4308, time 125.52ms
iter 6670: loss 10.6105, time 125.00ms
iter 6680: loss 11.1672, time 125.65ms
iter 6690: loss 10.8886, time 125.86ms
iter 6700: loss 10.6603, time 125.05ms
iter 6710: loss 10.8651, time 128.57ms
iter 6720: loss 11.0031, time 125.56ms
iter 6730: loss 11.0701, time 125.76ms
iter 6740: loss 10.8439, time 126.32ms
step 6750: train loss 9.2578, val loss 9.2561
saving checkpoint to out-shakespeare-char
iter 6750: loss 10.6121, time 2897.98ms
iter 6760: loss 10.8065, time 125.65ms
iter 6770: loss 10.3896, time 128.92ms
iter 6780: loss 10.4132, time 125.96ms
iter 6790: loss 11.0537, time 125.99ms
iter 6800: loss 10.7553, time 125.97ms
iter 6810: loss 10.7305, time 125.93ms
iter 6820: loss 10.8231, time 127.20ms
iter 6830: loss 11.0128, time 125.97ms
iter 6840: loss 10.9247, time 126.07ms
iter 6850: loss 10.3340, time 125.66ms
iter 6860: loss 10.3846, time 125.41ms
iter 6870: loss 10.8085, time 125.85ms
iter 6880: loss 10.1494, time 128.70ms
iter 6890: loss 10.8587, time 125.85ms
iter 6900: loss 9.7503, time 125.56ms
iter 6910: loss 10.3017, time 125.98ms
iter 6920: loss 11.1065, time 125.82ms
iter 6930: loss 10.4904, time 124.99ms
iter 6940: loss 10.4652, time 125.75ms
iter 6950: loss 10.6406, time 125.16ms
iter 6960: loss 10.9700, time 125.77ms
iter 6970: loss 10.1311, time 125.70ms
iter 6980: loss 10.9461, time 125.96ms
iter 6990: loss 10.8175, time 128.70ms
step 7000: train loss 9.2402, val loss 9.2785
saving checkpoint to out-shakespeare-char
iter 7000: loss 10.1586, time 2899.51ms
iter 7010: loss 10.2654, time 128.51ms
iter 7020: loss 10.7378, time 125.96ms
iter 7030: loss 10.9419, time 124.60ms
iter 7040: loss 11.2587, time 125.64ms
iter 7050: loss 10.2896, time 125.58ms
iter 7060: loss 10.7629, time 125.65ms
iter 7070: loss 10.5180, time 125.75ms
iter 7080: loss 10.9122, time 125.76ms
iter 7090: loss 10.7072, time 125.78ms
iter 7100: loss 10.2361, time 125.12ms
iter 7110: loss 10.7954, time 125.54ms
iter 7120: loss 10.6905, time 128.75ms
iter 7130: loss 9.9090, time 125.49ms
iter 7140: loss 10.6457, time 125.27ms
iter 7150: loss 10.2326, time 124.83ms
iter 7160: loss 10.7031, time 125.80ms
iter 7170: loss 10.5167, time 125.68ms
iter 7180: loss 10.4102, time 125.42ms
iter 7190: loss 10.1075, time 125.26ms
iter 7200: loss 11.0354, time 125.89ms
iter 7210: loss 10.8108, time 125.87ms
iter 7220: loss 10.4281, time 125.51ms
iter 7230: loss 11.0163, time 128.79ms
iter 7240: loss 10.6633, time 125.54ms
step 7250: train loss 9.2318, val loss 9.2185
saving checkpoint to out-shakespeare-char
iter 7250: loss 10.4321, time 2899.91ms
iter 7260: loss 11.0022, time 129.29ms
iter 7270: loss 10.3210, time 126.04ms
iter 7280: loss 10.8027, time 126.07ms
iter 7290: loss 9.8870, time 126.03ms
iter 7300: loss 10.4233, time 125.87ms
iter 7310: loss 10.9986, time 125.75ms
iter 7320: loss 10.6561, time 126.42ms
iter 7330: loss 10.8522, time 125.64ms
iter 7340: loss 10.7525, time 124.86ms
iter 7350: loss 10.9598, time 125.84ms
iter 7360: loss 11.1037, time 126.51ms
iter 7370: loss 10.4622, time 124.95ms
iter 7380: loss 10.9258, time 125.70ms
iter 7390: loss 10.6702, time 125.32ms
iter 7400: loss 10.7614, time 125.63ms
iter 7410: loss 11.3529, time 128.52ms
iter 7420: loss 10.6029, time 125.94ms
iter 7430: loss 10.4715, time 125.38ms
iter 7440: loss 11.2313, time 126.07ms
iter 7450: loss 10.8526, time 125.95ms
iter 7460: loss 10.9541, time 125.74ms
iter 7470: loss 10.7708, time 124.70ms
iter 7480: loss 11.0696, time 125.82ms
iter 7490: loss 10.7450, time 125.79ms
step 7500: train loss 9.1519, val loss 9.1532
saving checkpoint to out-shakespeare-char
iter 7500: loss 10.3203, time 2894.20ms
iter 7510: loss 10.3944, time 125.70ms
iter 7520: loss 10.7328, time 125.86ms
iter 7530: loss 10.6498, time 125.70ms
iter 7540: loss 10.9229, time 125.52ms
iter 7550: loss 10.4734, time 125.90ms
iter 7560: loss 10.7464, time 125.63ms
iter 7570: loss 10.3704, time 125.45ms
iter 7580: loss 10.8283, time 128.73ms
iter 7590: loss 10.1584, time 125.48ms
iter 7600: loss 10.6568, time 125.65ms
iter 7610: loss 10.6441, time 125.13ms
iter 7620: loss 10.8117, time 125.35ms
iter 7630: loss 10.2611, time 125.35ms
iter 7640: loss 10.6446, time 125.55ms
iter 7650: loss 10.2351, time 124.78ms
iter 7660: loss 10.5886, time 125.10ms
iter 7670: loss 10.4903, time 125.22ms
iter 7680: loss 10.7380, time 125.58ms
iter 7690: loss 10.6472, time 128.36ms
iter 7700: loss 10.2389, time 125.45ms
iter 7710: loss 10.1966, time 125.67ms
iter 7720: loss 10.5561, time 125.58ms
iter 7730: loss 10.5462, time 125.21ms
iter 7740: loss 10.6717, time 125.75ms
step 7750: train loss 9.1322, val loss 9.1555
saving checkpoint to out-shakespeare-char
iter 7750: loss 11.1110, time 2894.10ms
iter 7760: loss 11.1475, time 125.68ms
iter 7770: loss 11.1229, time 125.72ms
iter 7780: loss 10.5533, time 125.72ms
iter 7790: loss 10.7781, time 122.45ms
iter 7800: loss 10.6992, time 121.72ms
iter 7810: loss 10.4744, time 122.70ms
iter 7820: loss 10.8141, time 121.52ms
iter 7830: loss 10.1986, time 122.97ms
iter 7840: loss 10.6948, time 121.73ms
iter 7850: loss 10.5802, time 123.08ms
iter 7860: loss 9.9953, time 121.25ms
iter 7870: loss 10.2110, time 123.11ms
iter 7880: loss 10.6656, time 121.67ms
iter 7890: loss 11.0271, time 123.09ms
iter 7900: loss 10.1738, time 121.91ms
iter 7910: loss 10.8965, time 123.29ms
iter 7920: loss 10.6651, time 122.16ms
iter 7930: loss 10.5694, time 123.25ms
iter 7940: loss 10.9308, time 121.08ms
iter 7950: loss 10.7251, time 123.12ms
iter 7960: loss 11.1141, time 121.74ms
iter 7970: loss 10.9620, time 123.03ms
iter 7980: loss 10.8435, time 121.70ms
iter 7990: loss 10.5178, time 123.11ms
step 8000: train loss 9.1263, val loss 9.1650
saving checkpoint to out-shakespeare-char
iter 8000: loss 10.8295, time 2913.00ms
iter 8010: loss 10.6235, time 121.01ms
iter 8020: loss 10.5849, time 121.38ms
iter 8030: loss 10.4137, time 121.33ms
iter 8040: loss 10.3825, time 122.23ms
iter 8050: loss 10.5982, time 122.00ms
iter 8060: loss 10.8221, time 121.85ms
iter 8070: loss 10.3491, time 121.68ms
iter 8080: loss 10.4738, time 121.53ms
iter 8090: loss 11.1456, time 121.63ms
iter 8100: loss 10.8773, time 121.46ms
iter 8110: loss 10.0175, time 122.13ms
iter 8120: loss 10.8064, time 121.38ms
iter 8130: loss 10.7137, time 121.80ms
iter 8140: loss 9.8674, time 121.49ms
iter 8150: loss 10.5224, time 133.27ms
iter 8160: loss 10.7111, time 121.72ms
iter 8170: loss 10.7241, time 125.01ms
iter 8180: loss 10.4862, time 121.94ms
iter 8190: loss 10.9514, time 123.23ms
iter 8200: loss 10.5689, time 121.77ms
iter 8210: loss 10.6547, time 124.67ms
iter 8220: loss 10.5813, time 121.26ms
iter 8230: loss 10.0014, time 124.51ms
iter 8240: loss 10.4933, time 122.20ms
step 8250: train loss 9.0897, val loss 9.1111
saving checkpoint to out-shakespeare-char
iter 8250: loss 10.9540, time 2909.70ms
iter 8260: loss 10.9305, time 120.99ms
iter 8270: loss 10.3346, time 121.65ms
iter 8280: loss 10.4769, time 121.50ms
iter 8290: loss 10.1821, time 120.39ms
iter 8300: loss 11.0118, time 121.39ms
iter 8310: loss 10.6460, time 121.48ms
iter 8320: loss 10.6569, time 120.74ms
iter 8330: loss 10.3978, time 121.48ms
iter 8340: loss 10.4792, time 121.09ms
iter 8350: loss 10.3921, time 121.61ms
iter 8360: loss 10.8516, time 121.37ms
iter 8370: loss 10.4448, time 120.74ms
iter 8380: loss 10.9908, time 121.69ms
iter 8390: loss 10.2554, time 121.51ms
iter 8400: loss 10.0863, time 121.54ms
iter 8410: loss 10.5776, time 121.68ms
iter 8420: loss 10.3936, time 121.51ms
iter 8430: loss 10.8245, time 121.17ms
iter 8440: loss 10.2068, time 121.66ms
iter 8450: loss 10.4507, time 125.10ms
iter 8460: loss 10.6791, time 121.79ms
iter 8470: loss 10.6080, time 123.86ms
iter 8480: loss 10.7716, time 121.77ms
iter 8490: loss 10.1115, time 124.90ms
step 8500: train loss 9.0708, val loss 9.0592
saving checkpoint to out-shakespeare-char
iter 8500: loss 10.3433, time 2910.97ms
iter 8510: loss 10.6241, time 121.44ms
iter 8520: loss 10.7067, time 121.48ms
iter 8530: loss 10.5849, time 120.93ms
iter 8540: loss 10.8597, time 121.93ms
iter 8550: loss 10.7666, time 120.93ms
iter 8560: loss 10.1634, time 121.34ms
iter 8570: loss 10.6418, time 120.65ms
iter 8580: loss 10.1259, time 120.86ms
iter 8590: loss 10.5571, time 120.98ms
iter 8600: loss 10.3637, time 121.10ms
iter 8610: loss 10.6467, time 121.35ms
iter 8620: loss 10.1618, time 121.05ms
iter 8630: loss 10.5348, time 121.14ms
iter 8640: loss 10.8767, time 120.36ms
iter 8650: loss 10.1770, time 120.89ms
iter 8660: loss 10.5547, time 121.83ms
iter 8670: loss 10.6123, time 120.87ms
iter 8680: loss 10.7302, time 120.73ms
iter 8690: loss 10.5829, time 121.34ms
iter 8700: loss 10.9172, time 121.49ms
iter 8710: loss 10.2647, time 121.19ms
iter 8720: loss 10.6413, time 120.57ms
iter 8730: loss 9.9856, time 121.08ms
iter 8740: loss 10.2588, time 120.66ms
step 8750: train loss 9.0546, val loss 9.0437
saving checkpoint to out-shakespeare-char
iter 8750: loss 10.8725, time 2902.89ms
iter 8760: loss 10.2658, time 125.98ms
iter 8770: loss 10.7441, time 125.32ms
iter 8780: loss 10.8187, time 125.57ms
iter 8790: loss 10.2858, time 125.49ms
iter 8800: loss 10.3145, time 125.20ms
iter 8810: loss 10.3242, time 125.63ms
iter 8820: loss 10.8940, time 125.43ms
iter 8830: loss 10.1078, time 125.81ms
iter 8840: loss 10.4672, time 125.95ms
iter 8850: loss 9.9888, time 125.70ms
iter 8860: loss 10.6472, time 126.85ms
iter 8870: loss 10.9156, time 124.99ms
iter 8880: loss 10.1934, time 128.15ms
iter 8890: loss 10.8813, time 125.66ms
iter 8900: loss 10.6264, time 125.44ms
iter 8910: loss 10.7033, time 126.14ms
iter 8920: loss 11.0707, time 125.58ms
iter 8930: loss 10.1554, time 125.80ms
iter 8940: loss 10.4029, time 125.21ms
iter 8950: loss 10.8079, time 124.68ms
iter 8960: loss 11.2280, time 125.68ms
iter 8970: loss 10.8215, time 128.54ms
iter 8980: loss 9.9277, time 125.28ms
iter 8990: loss 10.6223, time 125.25ms
step 9000: train loss 9.0382, val loss 8.9854
saving checkpoint to out-shakespeare-char
iter 9000: loss 10.4879, time 2897.52ms
iter 9010: loss 11.0203, time 123.97ms
iter 9020: loss 10.9069, time 121.72ms
iter 9030: loss 10.0686, time 124.80ms
iter 9040: loss 10.6439, time 121.55ms
iter 9050: loss 10.5882, time 124.72ms
iter 9060: loss 10.2173, time 121.76ms
iter 9070: loss 10.8086, time 124.85ms
iter 9080: loss 10.4591, time 122.05ms
iter 9090: loss 10.4893, time 125.00ms
iter 9100: loss 10.5643, time 121.75ms
iter 9110: loss 10.0964, time 124.72ms
iter 9120: loss 10.5259, time 121.72ms
iter 9130: loss 10.4937, time 125.37ms
iter 9140: loss 10.5600, time 121.61ms
iter 9150: loss 10.1321, time 124.61ms
iter 9160: loss 10.7188, time 121.52ms
iter 9170: loss 9.9015, time 124.63ms
iter 9180: loss 10.9908, time 121.70ms
iter 9190: loss 10.1781, time 124.61ms
iter 9200: loss 10.1730, time 121.69ms
iter 9210: loss 9.9490, time 124.31ms
iter 9220: loss 10.9071, time 121.80ms
iter 9230: loss 10.4775, time 125.30ms
iter 9240: loss 10.2895, time 121.63ms
step 9250: train loss 8.9657, val loss 9.0293
saving checkpoint to out-shakespeare-char
iter 9250: loss 10.2622, time 2906.67ms
iter 9260: loss 10.8051, time 126.12ms
iter 9270: loss 10.1513, time 125.35ms
iter 9280: loss 10.3907, time 125.86ms
iter 9290: loss 11.0177, time 125.75ms
iter 9300: loss 10.3192, time 125.82ms
iter 9310: loss 10.9334, time 128.71ms
iter 9320: loss 10.6219, time 125.77ms
iter 9330: loss 10.1883, time 125.60ms
iter 9340: loss 10.8125, time 125.73ms
iter 9350: loss 10.1765, time 125.64ms
iter 9360: loss 10.5436, time 125.73ms
iter 9370: loss 10.1743, time 126.23ms
iter 9380: loss 9.7038, time 125.12ms
iter 9390: loss 10.3788, time 125.06ms
iter 9400: loss 10.5820, time 126.19ms
iter 9410: loss 10.5762, time 125.90ms
iter 9420: loss 10.2346, time 125.99ms
iter 9430: loss 10.4134, time 125.80ms
iter 9440: loss 10.4330, time 125.76ms
iter 9450: loss 10.0116, time 125.63ms
iter 9460: loss 10.4264, time 126.10ms
iter 9470: loss 10.4586, time 125.75ms
iter 9480: loss 10.8252, time 126.10ms
iter 9490: loss 10.4115, time 128.66ms
step 9500: train loss 9.0386, val loss 9.0098
saving checkpoint to out-shakespeare-char
iter 9500: loss 10.5893, time 2878.87ms
iter 9510: loss 10.4536, time 125.90ms
iter 9520: loss 10.3206, time 126.00ms
iter 9530: loss 10.1991, time 125.74ms
iter 9540: loss 10.5199, time 126.24ms
iter 9550: loss 10.6336, time 125.46ms
iter 9560: loss 10.4302, time 125.73ms
iter 9570: loss 10.4892, time 125.61ms
iter 9580: loss 10.2038, time 126.05ms
iter 9590: loss 10.7111, time 128.42ms
iter 9600: loss 10.4457, time 125.85ms
iter 9610: loss 10.2152, time 125.88ms
iter 9620: loss 10.4787, time 125.79ms
iter 9630: loss 10.4031, time 127.70ms
iter 9640: loss 10.1346, time 125.96ms
iter 9650: loss 10.4368, time 126.11ms
iter 9660: loss 9.9960, time 125.67ms
iter 9670: loss 10.7028, time 125.28ms
iter 9680: loss 10.5628, time 125.61ms
iter 9690: loss 10.2775, time 125.85ms
iter 9700: loss 10.5401, time 125.96ms
iter 9710: loss 10.9145, time 125.91ms
iter 9720: loss 10.8103, time 125.75ms
iter 9730: loss 10.5522, time 126.09ms
iter 9740: loss 10.6919, time 128.71ms
step 9750: train loss 8.9446, val loss 8.9278
saving checkpoint to out-shakespeare-char
iter 9750: loss 10.7352, time 2868.26ms
iter 9760: loss 10.9527, time 129.36ms
iter 9770: loss 10.4372, time 125.66ms
iter 9780: loss 10.2469, time 125.24ms
iter 9790: loss 10.1726, time 125.90ms
iter 9800: loss 10.4102, time 128.52ms
iter 9810: loss 10.3504, time 126.04ms
iter 9820: loss 10.2025, time 125.96ms
iter 9830: loss 10.0591, time 125.93ms
iter 9840: loss 10.2783, time 125.51ms
iter 9850: loss 10.5520, time 126.11ms
iter 9860: loss 10.2244, time 125.46ms
iter 9870: loss 10.5325, time 125.96ms
iter 9880: loss 10.0912, time 125.80ms
iter 9890: loss 10.6659, time 125.95ms
iter 9900: loss 9.9608, time 125.41ms
iter 9910: loss 10.0951, time 125.98ms
iter 9920: loss 10.7607, time 125.82ms
iter 9930: loss 10.5536, time 126.34ms
iter 9940: loss 10.4294, time 125.80ms
iter 9950: loss 10.0391, time 129.13ms
iter 9960: loss 10.2505, time 125.69ms
iter 9970: loss 10.5144, time 126.04ms
iter 9980: loss 10.2076, time 125.29ms
iter 9990: loss 10.9116, time 125.68ms
step 10000: train loss 8.9363, val loss 8.9376
saving checkpoint to out-shakespeare-char
iter 10000: loss 9.8786, time 2890.30ms
iter 10010: loss 9.8603, time 129.07ms
iter 10020: loss 10.4067, time 126.49ms
iter 10030: loss 10.2183, time 125.89ms
iter 10040: loss 10.6320, time 126.04ms
iter 10050: loss 10.7159, time 126.23ms
iter 10060: loss 10.6978, time 125.82ms
iter 10070: loss 10.3307, time 126.20ms
iter 10080: loss 10.6984, time 125.72ms
iter 10090: loss 10.4839, time 125.87ms
iter 10100: loss 10.7023, time 125.99ms
iter 10110: loss 10.5044, time 126.08ms
iter 10120: loss 10.3783, time 128.61ms
iter 10130: loss 10.4145, time 126.04ms
iter 10140: loss 10.5833, time 125.95ms
iter 10150: loss 10.6535, time 125.69ms
iter 10160: loss 9.8563, time 125.81ms
iter 10170: loss 10.0629, time 125.83ms
iter 10180: loss 10.3958, time 125.77ms
iter 10190: loss 10.1114, time 125.70ms
iter 10200: loss 10.9215, time 125.63ms
iter 10210: loss 9.7614, time 125.74ms
iter 10220: loss 10.8443, time 126.23ms
iter 10230: loss 10.0898, time 125.56ms
iter 10240: loss 10.9162, time 124.72ms
step 10250: train loss 8.9186, val loss 8.9354
saving checkpoint to out-shakespeare-char
iter 10250: loss 10.4977, time 2896.69ms
iter 10260: loss 10.6657, time 125.97ms
iter 10270: loss 10.4588, time 125.75ms
iter 10280: loss 10.3908, time 126.32ms
iter 10290: loss 10.6396, time 128.62ms
iter 10300: loss 10.3428, time 125.66ms
iter 10310: loss 10.2338, time 125.82ms
iter 10320: loss 10.0478, time 126.16ms
iter 10330: loss 10.5535, time 127.10ms
iter 10340: loss 10.0409, time 125.90ms
iter 10350: loss 10.6739, time 126.00ms
iter 10360: loss 10.3026, time 125.75ms
iter 10370: loss 10.2690, time 125.99ms
iter 10380: loss 10.5025, time 126.04ms
iter 10390: loss 10.2629, time 125.89ms
iter 10400: loss 10.2359, time 128.52ms
iter 10410: loss 10.5737, time 125.80ms
iter 10420: loss 10.2456, time 125.68ms
iter 10430: loss 10.0815, time 126.14ms
iter 10440: loss 10.9917, time 127.80ms
iter 10450: loss 10.4506, time 126.02ms
iter 10460: loss 10.6842, time 125.78ms
iter 10470: loss 10.1029, time 126.50ms
iter 10480: loss 10.0799, time 126.05ms
iter 10490: loss 10.0923, time 125.82ms
step 10500: train loss 8.9405, val loss 8.9029
saving checkpoint to out-shakespeare-char
iter 10500: loss 10.5281, time 2896.75ms
iter 10510: loss 10.4739, time 126.18ms
iter 10520: loss 11.0713, time 125.77ms
iter 10530: loss 10.5375, time 125.80ms
iter 10540: loss 10.0850, time 128.75ms
iter 10550: loss 10.0830, time 125.79ms
iter 10560: loss 10.0080, time 125.88ms
iter 10570: loss 10.6665, time 126.00ms
iter 10580: loss 10.6947, time 126.38ms
iter 10590: loss 10.1580, time 126.05ms
iter 10600: loss 9.9383, time 125.80ms
iter 10610: loss 9.9113, time 125.63ms
iter 10620: loss 10.5432, time 125.79ms
iter 10630: loss 9.6956, time 125.26ms
iter 10640: loss 10.3191, time 126.09ms
iter 10650: loss 9.8704, time 128.60ms
iter 10660: loss 10.7832, time 125.76ms
iter 10670: loss 9.5819, time 125.84ms
iter 10680: loss 10.3878, time 125.89ms
iter 10690: loss 10.4474, time 126.12ms
iter 10700: loss 10.4079, time 125.80ms
iter 10710: loss 10.1571, time 125.69ms
iter 10720: loss 9.6494, time 125.88ms
iter 10730: loss 10.1393, time 125.70ms
iter 10740: loss 10.2098, time 125.41ms
step 10750: train loss 8.9194, val loss 8.8322
saving checkpoint to out-shakespeare-char
iter 10750: loss 10.6540, time 2885.37ms
iter 10760: loss 10.2843, time 126.01ms
iter 10770: loss 10.3767, time 126.37ms
iter 10780: loss 10.5304, time 125.37ms
iter 10790: loss 10.5683, time 125.88ms
iter 10800: loss 10.4299, time 125.88ms
iter 10810: loss 10.8021, time 125.62ms
iter 10820: loss 10.8292, time 125.71ms
iter 10830: loss 10.2270, time 125.70ms
iter 10840: loss 9.8171, time 125.45ms
iter 10850: loss 10.4759, time 128.58ms
iter 10860: loss 9.8427, time 125.72ms
iter 10870: loss 10.5208, time 125.73ms
iter 10880: loss 10.3682, time 125.78ms
iter 10890: loss 10.3582, time 125.51ms
iter 10900: loss 10.1050, time 125.84ms
iter 10910: loss 9.6672, time 125.96ms
iter 10920: loss 10.1540, time 125.07ms
iter 10930: loss 10.4391, time 125.75ms
iter 10940: loss 10.2266, time 125.93ms
iter 10950: loss 10.8093, time 126.06ms
iter 10960: loss 10.4565, time 125.61ms
iter 10970: loss 9.9745, time 125.84ms
iter 10980: loss 10.5537, time 125.81ms
iter 10990: loss 10.5975, time 125.85ms
step 11000: train loss 8.8769, val loss 8.8477
saving checkpoint to out-shakespeare-char
iter 11000: loss 10.6368, time 2900.42ms
iter 11010: loss 10.2820, time 126.07ms
iter 11020: loss 10.0902, time 125.72ms
iter 11030: loss 9.7288, time 125.77ms
iter 11040: loss 10.3311, time 124.86ms
iter 11050: loss 10.4743, time 125.69ms
iter 11060: loss 10.8805, time 125.69ms
iter 11070: loss 10.2560, time 126.00ms
iter 11080: loss 10.1430, time 125.64ms
iter 11090: loss 9.8676, time 126.14ms
iter 11100: loss 10.5599, time 128.56ms
iter 11110: loss 10.6203, time 125.78ms
iter 11120: loss 10.3136, time 124.89ms
iter 11130: loss 10.9516, time 125.37ms
iter 11140: loss 10.5409, time 125.32ms
iter 11150: loss 10.1688, time 125.53ms
iter 11160: loss 10.4572, time 125.65ms
iter 11170: loss 9.7941, time 126.04ms
iter 11180: loss 10.9162, time 125.46ms
iter 11190: loss 10.5569, time 125.95ms
iter 11200: loss 10.4245, time 128.90ms
iter 11210: loss 10.8199, time 125.66ms
iter 11220: loss 10.0098, time 125.38ms
iter 11230: loss 10.2937, time 125.39ms
iter 11240: loss 10.1086, time 125.79ms
step 11250: train loss 8.8898, val loss 8.8462
saving checkpoint to out-shakespeare-char
iter 11250: loss 10.8277, time 2893.23ms
iter 11260: loss 10.6102, time 121.71ms
iter 11270: loss 10.9731, time 121.68ms
iter 11280: loss 10.1536, time 121.95ms
iter 11290: loss 10.3429, time 122.94ms
iter 11300: loss 10.2357, time 121.94ms
iter 11310: loss 10.2286, time 121.82ms
iter 11320: loss 11.0345, time 121.99ms
iter 11330: loss 10.3242, time 121.71ms
iter 11340: loss 10.3500, time 122.10ms
iter 11350: loss 10.1609, time 121.69ms
iter 11360: loss 9.9286, time 121.48ms
iter 11370: loss 10.4073, time 121.56ms
iter 11380: loss 10.4626, time 121.57ms
iter 11390: loss 10.5704, time 121.39ms
iter 11400: loss 10.4160, time 121.56ms
iter 11410: loss 10.1170, time 121.40ms
iter 11420: loss 10.2541, time 121.52ms
iter 11430: loss 10.2202, time 121.57ms
iter 11440: loss 9.9982, time 122.08ms
iter 11450: loss 10.5416, time 125.23ms
iter 11460: loss 9.7950, time 125.10ms
iter 11470: loss 10.4248, time 128.09ms
iter 11480: loss 10.9400, time 125.37ms
iter 11490: loss 10.0542, time 125.48ms
step 11500: train loss 8.8523, val loss 8.8279
saving checkpoint to out-shakespeare-char
iter 11500: loss 10.2011, time 2892.25ms
iter 11510: loss 9.8394, time 124.93ms
iter 11520: loss 10.6715, time 124.97ms
iter 11530: loss 9.5136, time 127.81ms
iter 11540: loss 10.1382, time 127.50ms
iter 11550: loss 9.5900, time 125.64ms
iter 11560: loss 10.0130, time 125.87ms
iter 11570: loss 10.1119, time 125.60ms
iter 11580: loss 10.2925, time 125.94ms
iter 11590: loss 10.3177, time 125.52ms
iter 11600: loss 10.2036, time 125.32ms
iter 11610: loss 10.3114, time 124.88ms
iter 11620: loss 10.2248, time 126.04ms
iter 11630: loss 9.7168, time 125.63ms
iter 11640: loss 9.9893, time 127.37ms
iter 11650: loss 9.8369, time 125.90ms
iter 11660: loss 10.8920, time 125.87ms
iter 11670: loss 10.1979, time 125.52ms
iter 11680: loss 9.8647, time 125.82ms
iter 11690: loss 10.9107, time 125.95ms
iter 11700: loss 10.5986, time 125.81ms
iter 11710: loss 10.3426, time 125.50ms
iter 11720: loss 10.3808, time 125.19ms
iter 11730: loss 10.2203, time 125.30ms
iter 11740: loss 10.6143, time 125.82ms
step 11750: train loss 8.8401, val loss 8.8243
saving checkpoint to out-shakespeare-char
iter 11750: loss 10.1575, time 2897.14ms
iter 11760: loss 10.0698, time 125.97ms
iter 11770: loss 9.7070, time 128.37ms
iter 11780: loss 10.8560, time 126.49ms
iter 11790: loss 10.8282, time 125.11ms
iter 11800: loss 10.2723, time 125.36ms
iter 11810: loss 10.3675, time 125.95ms
iter 11820: loss 10.2711, time 126.27ms
iter 11830: loss 10.2728, time 125.95ms
iter 11840: loss 9.6388, time 125.40ms
iter 11850: loss 10.3385, time 128.28ms
iter 11860: loss 10.4612, time 125.43ms
iter 11870: loss 10.4599, time 125.82ms
iter 11880: loss 9.9743, time 125.36ms
iter 11890: loss 10.8871, time 125.74ms
iter 11900: loss 9.8003, time 125.37ms
iter 11910: loss 9.8415, time 125.92ms
iter 11920: loss 10.3942, time 125.62ms
iter 11930: loss 10.1693, time 125.93ms
iter 11940: loss 10.5278, time 124.69ms
iter 11950: loss 10.2271, time 126.02ms
iter 11960: loss 9.8256, time 128.70ms
iter 11970: loss 9.9171, time 125.79ms
iter 11980: loss 10.0110, time 124.82ms
iter 11990: loss 9.9218, time 126.52ms
step 12000: train loss 8.7503, val loss 8.8164
saving checkpoint to out-shakespeare-char
iter 12000: loss 10.4463, time 2879.52ms
iter 12010: loss 10.7832, time 125.88ms
iter 12020: loss 10.1391, time 124.63ms
iter 12030: loss 10.1591, time 125.41ms
iter 12040: loss 10.3407, time 125.53ms
iter 12050: loss 9.8960, time 125.41ms
iter 12060: loss 9.7358, time 125.48ms
iter 12070: loss 10.7667, time 125.66ms
iter 12080: loss 9.8016, time 124.89ms
iter 12090: loss 10.6641, time 128.41ms
iter 12100: loss 10.8098, time 125.65ms
iter 12110: loss 10.3899, time 125.54ms
iter 12120: loss 9.9686, time 125.82ms
iter 12130: loss 10.1794, time 128.58ms
iter 12140: loss 9.9392, time 125.85ms
iter 12150: loss 10.0104, time 124.96ms
iter 12160: loss 9.4710, time 124.03ms
iter 12170: loss 10.3951, time 126.12ms
iter 12180: loss 10.1330, time 125.76ms
iter 12190: loss 10.1657, time 125.69ms
iter 12200: loss 10.2949, time 128.74ms
iter 12210: loss 10.4172, time 125.67ms
iter 12220: loss 9.8562, time 125.87ms
iter 12230: loss 10.1886, time 125.08ms
iter 12240: loss 10.4990, time 125.35ms
step 12250: train loss 8.7692, val loss 8.8214
saving checkpoint to out-shakespeare-char
iter 12250: loss 10.3290, time 2891.03ms
iter 12260: loss 9.9169, time 129.19ms
iter 12270: loss 9.9297, time 125.08ms
iter 12280: loss 10.4112, time 125.64ms
iter 12290: loss 10.3881, time 124.62ms
iter 12300: loss 10.2962, time 125.90ms
iter 12310: loss 9.8838, time 125.76ms
iter 12320: loss 10.0777, time 125.79ms
iter 12330: loss 11.2096, time 124.84ms
iter 12340: loss 10.3069, time 125.80ms
iter 12350: loss 9.7673, time 125.47ms
iter 12360: loss 10.3428, time 125.66ms
iter 12370: loss 10.0334, time 124.39ms
iter 12380: loss 9.7891, time 128.20ms
iter 12390: loss 10.3638, time 125.46ms
iter 12400: loss 10.2023, time 125.37ms
iter 12410: loss 10.4337, time 123.60ms
iter 12420: loss 10.6044, time 125.56ms
iter 12430: loss 9.8252, time 125.81ms
iter 12440: loss 10.4815, time 125.87ms
iter 12450: loss 10.1329, time 124.21ms
iter 12460: loss 9.8343, time 125.72ms
iter 12470: loss 9.8628, time 125.70ms
iter 12480: loss 10.3927, time 126.24ms
iter 12490: loss 10.3924, time 128.57ms
step 12500: train loss 8.7434, val loss 8.7544
saving checkpoint to out-shakespeare-char
iter 12500: loss 10.3711, time 2899.47ms
iter 12510: loss 9.8612, time 124.44ms
iter 12520: loss 10.1962, time 121.79ms
iter 12530: loss 10.6001, time 124.67ms
iter 12540: loss 10.6091, time 121.74ms
iter 12550: loss 9.3718, time 124.80ms
iter 12560: loss 10.2790, time 121.72ms
iter 12570: loss 9.7721, time 124.71ms
iter 12580: loss 10.2875, time 121.79ms
iter 12590: loss 10.2212, time 124.98ms
iter 12600: loss 10.1078, time 121.90ms
iter 12610: loss 10.0495, time 124.89ms
iter 12620: loss 10.5597, time 121.76ms
iter 12630: loss 10.2277, time 124.49ms
iter 12640: loss 10.4074, time 121.90ms
iter 12650: loss 10.4231, time 124.69ms
iter 12660: loss 9.7175, time 121.61ms
iter 12670: loss 10.7628, time 124.66ms
iter 12680: loss 9.9566, time 121.69ms
iter 12690: loss 10.7893, time 123.97ms
iter 12700: loss 10.8117, time 121.68ms
iter 12710: loss 10.6056, time 124.55ms
iter 12720: loss 9.8801, time 121.69ms
iter 12730: loss 10.4289, time 124.62ms
iter 12740: loss 10.3431, time 121.85ms
step 12750: train loss 8.7350, val loss 8.7677
saving checkpoint to out-shakespeare-char
iter 12750: loss 10.2930, time 2893.15ms
iter 12760: loss 10.4628, time 121.77ms
iter 12770: loss 10.1810, time 121.47ms
iter 12780: loss 9.9900, time 122.44ms
iter 12790: loss 10.7958, time 121.44ms
iter 12800: loss 10.8557, time 122.65ms
iter 12810: loss 10.5296, time 121.93ms
iter 12820: loss 10.5593, time 122.76ms
iter 12830: loss 9.9273, time 121.67ms
iter 12840: loss 10.3220, time 122.77ms
iter 12850: loss 9.5467, time 121.65ms
iter 12860: loss 10.0143, time 122.49ms
iter 12870: loss 9.9002, time 121.49ms
iter 12880: loss 10.2992, time 122.70ms
iter 12890: loss 10.4718, time 121.56ms
iter 12900: loss 10.7292, time 122.70ms
iter 12910: loss 10.6638, time 121.54ms
iter 12920: loss 10.4413, time 122.69ms
iter 12930: loss 10.1336, time 121.49ms
iter 12940: loss 10.4518, time 122.58ms
iter 12950: loss 10.3708, time 121.43ms
iter 12960: loss 10.1183, time 122.83ms
iter 12970: loss 9.6926, time 121.62ms
iter 12980: loss 10.5085, time 122.71ms
iter 12990: loss 10.4414, time 122.00ms
step 13000: train loss 8.7252, val loss 8.7225
saving checkpoint to out-shakespeare-char
iter 13000: loss 10.0442, time 2905.95ms
iter 13010: loss 10.2740, time 125.09ms
iter 13020: loss 10.2758, time 125.15ms
iter 13030: loss 10.4068, time 127.81ms
iter 13040: loss 9.9657, time 125.22ms
iter 13050: loss 10.1738, time 125.50ms
iter 13060: loss 9.9800, time 125.44ms
iter 13070: loss 10.8790, time 125.03ms
iter 13080: loss 9.9787, time 125.70ms
iter 13090: loss 10.1095, time 125.22ms
iter 13100: loss 10.1962, time 125.28ms
iter 13110: loss 10.2264, time 125.59ms
iter 13120: loss 9.7962, time 125.27ms
iter 13130: loss 10.2758, time 125.22ms
iter 13140: loss 10.2423, time 124.98ms
iter 13150: loss 10.0998, time 125.34ms
iter 13160: loss 10.4922, time 125.55ms
iter 13170: loss 10.0533, time 124.99ms
iter 13180: loss 10.7271, time 124.84ms
iter 13190: loss 10.2881, time 121.85ms
iter 13200: loss 9.7164, time 124.23ms
iter 13210: loss 10.4153, time 121.44ms
iter 13220: loss 9.7012, time 124.77ms
iter 13230: loss 11.0560, time 121.13ms
iter 13240: loss 9.9575, time 125.43ms
step 13250: train loss 8.7462, val loss 8.7324
saving checkpoint to out-shakespeare-char
iter 13250: loss 10.4125, time 2893.04ms
iter 13260: loss 9.8602, time 121.80ms
iter 13270: loss 9.9361, time 122.69ms
iter 13280: loss 10.0486, time 120.31ms
iter 13290: loss 10.5323, time 121.19ms
iter 13300: loss 10.2504, time 121.70ms
iter 13310: loss 10.3266, time 121.61ms
iter 13320: loss 10.3350, time 121.51ms
iter 13330: loss 10.8140, time 121.16ms
iter 13340: loss 10.0364, time 120.72ms
iter 13350: loss 10.8090, time 121.93ms
iter 13360: loss 10.3657, time 119.57ms
iter 13370: loss 9.8032, time 121.71ms
iter 13380: loss 9.9417, time 121.69ms
iter 13390: loss 10.0293, time 120.53ms
iter 13400: loss 10.3804, time 121.65ms
iter 13410: loss 9.5787, time 122.00ms
iter 13420: loss 9.5376, time 121.62ms
iter 13430: loss 10.8955, time 121.79ms
iter 13440: loss 9.7874, time 122.07ms
iter 13450: loss 10.1935, time 121.95ms
iter 13460: loss 9.8436, time 122.11ms
iter 13470: loss 9.8940, time 121.72ms
iter 13480: loss 10.2882, time 121.72ms
iter 13490: loss 10.2503, time 121.74ms
step 13500: train loss 8.6982, val loss 8.7094
saving checkpoint to out-shakespeare-char
iter 13500: loss 9.7629, time 2906.59ms
iter 13510: loss 9.9665, time 120.87ms
iter 13520: loss 10.5344, time 124.78ms
iter 13530: loss 10.3164, time 121.73ms
iter 13540: loss 9.7545, time 124.56ms
iter 13550: loss 9.4561, time 121.63ms
iter 13560: loss 10.4570, time 123.62ms
iter 13570: loss 9.8200, time 121.65ms
iter 13580: loss 10.4232, time 125.03ms
iter 13590: loss 9.9565, time 120.82ms
iter 13600: loss 9.3835, time 124.93ms
iter 13610: loss 10.4842, time 121.54ms
iter 13620: loss 10.3345, time 124.35ms
iter 13630: loss 10.2682, time 122.12ms
iter 13640: loss 9.9715, time 124.25ms
iter 13650: loss 10.0292, time 121.75ms
iter 13660: loss 10.4675, time 124.44ms
iter 13670: loss 9.3784, time 121.10ms
iter 13680: loss 10.6578, time 125.53ms
iter 13690: loss 9.9321, time 125.69ms
iter 13700: loss 10.3363, time 125.82ms
iter 13710: loss 10.0966, time 125.72ms
iter 13720: loss 10.1961, time 125.55ms
iter 13730: loss 10.1982, time 125.52ms
iter 13740: loss 10.1837, time 125.62ms
step 13750: train loss 8.7076, val loss 8.7111
saving checkpoint to out-shakespeare-char
iter 13750: loss 9.8643, time 2893.89ms
iter 13760: loss 11.0431, time 126.10ms
iter 13770: loss 10.4049, time 125.78ms
iter 13780: loss 9.7122, time 125.43ms
iter 13790: loss 10.4557, time 125.69ms
iter 13800: loss 10.4156, time 126.94ms
iter 13810: loss 9.7013, time 126.28ms
iter 13820: loss 10.3361, time 127.92ms
iter 13830: loss 10.5689, time 125.92ms
iter 13840: loss 9.9196, time 126.32ms
iter 13850: loss 9.8149, time 126.04ms
iter 13860: loss 9.4067, time 125.99ms
iter 13870: loss 10.1853, time 125.64ms
iter 13880: loss 10.2709, time 126.08ms
iter 13890: loss 10.2450, time 126.40ms
iter 13900: loss 9.9940, time 126.35ms
iter 13910: loss 11.0084, time 124.87ms
iter 13920: loss 10.4254, time 126.02ms
iter 13930: loss 9.8098, time 128.45ms
iter 13940: loss 9.9785, time 125.41ms
iter 13950: loss 10.3714, time 125.44ms
iter 13960: loss 9.6166, time 125.72ms
iter 13970: loss 10.0026, time 125.09ms
iter 13980: loss 10.6233, time 125.37ms
iter 13990: loss 10.0386, time 125.71ms
step 14000: train loss 8.6869, val loss 8.7029
saving checkpoint to out-shakespeare-char
iter 14000: loss 10.2960, time 2886.90ms
iter 14010: loss 9.7960, time 125.67ms
iter 14020: loss 10.0138, time 125.57ms
iter 14030: loss 10.6609, time 124.92ms
iter 14040: loss 10.2110, time 127.54ms
iter 14050: loss 10.3617, time 125.12ms
iter 14060: loss 10.0609, time 126.03ms
iter 14070: loss 9.8506, time 129.19ms
iter 14080: loss 10.5016, time 125.72ms
iter 14090: loss 10.2100, time 126.04ms
iter 14100: loss 10.0278, time 126.03ms
iter 14110: loss 10.3804, time 128.66ms
iter 14120: loss 9.6752, time 125.67ms
iter 14130: loss 9.7943, time 125.86ms
iter 14140: loss 10.6247, time 126.17ms
iter 14150: loss 10.2872, time 124.99ms
iter 14160: loss 9.8568, time 125.21ms
iter 14170: loss 9.6058, time 125.30ms
iter 14180: loss 9.8908, time 125.14ms
iter 14190: loss 9.8798, time 121.92ms
iter 14200: loss 10.1859, time 122.68ms
iter 14210: loss 9.5344, time 121.22ms
iter 14220: loss 10.0508, time 120.78ms
iter 14230: loss 10.3182, time 121.27ms
iter 14240: loss 9.9599, time 121.77ms
step 14250: train loss 8.6461, val loss 8.6861
saving checkpoint to out-shakespeare-char
iter 14250: loss 10.0873, time 2893.28ms
iter 14260: loss 9.9259, time 122.70ms
iter 14270: loss 9.8708, time 121.61ms
iter 14280: loss 9.8447, time 122.74ms
iter 14290: loss 10.1347, time 121.38ms
iter 14300: loss 9.9394, time 122.89ms
iter 14310: loss 10.5619, time 121.44ms
iter 14320: loss 9.8522, time 122.25ms
iter 14330: loss 9.7519, time 122.15ms
iter 14340: loss 9.7921, time 122.76ms
iter 14350: loss 10.3200, time 121.51ms
iter 14360: loss 10.2499, time 122.73ms
iter 14370: loss 9.6015, time 121.05ms
iter 14380: loss 10.1173, time 123.75ms
iter 14390: loss 10.1905, time 121.55ms
iter 14400: loss 10.1123, time 121.68ms
iter 14410: loss 9.4825, time 121.42ms
iter 14420: loss 10.8359, time 121.04ms
iter 14430: loss 10.5696, time 121.80ms
iter 14440: loss 10.5393, time 121.81ms
iter 14450: loss 10.2071, time 121.07ms
iter 14460: loss 10.4366, time 121.73ms
iter 14470: loss 10.6044, time 121.58ms
iter 14480: loss 9.9314, time 121.74ms
iter 14490: loss 10.0149, time 121.53ms
step 14500: train loss 8.6430, val loss 8.6624
saving checkpoint to out-shakespeare-char
iter 14500: loss 10.2249, time 2895.98ms
iter 14510: loss 10.3686, time 121.67ms
iter 14520: loss 10.2908, time 121.57ms
iter 14530: loss 10.2678, time 121.43ms
iter 14540: loss 10.2542, time 121.65ms
iter 14550: loss 9.4932, time 122.09ms
iter 14560: loss 9.9316, time 122.38ms
iter 14570: loss 10.1983, time 122.11ms
iter 14580: loss 9.4801, time 122.31ms
iter 14590: loss 10.3443, time 122.02ms
iter 14600: loss 10.4254, time 121.54ms
iter 14610: loss 10.6704, time 122.02ms
iter 14620: loss 9.5809, time 121.50ms
iter 14630: loss 10.7495, time 121.91ms
iter 14640: loss 10.3943, time 121.69ms
iter 14650: loss 9.9814, time 121.62ms
iter 14660: loss 10.0703, time 121.68ms
iter 14670: loss 10.2009, time 121.57ms
iter 14680: loss 10.7096, time 121.98ms
iter 14690: loss 9.9977, time 122.63ms
iter 14700: loss 10.3696, time 121.38ms
iter 14710: loss 9.8170, time 121.70ms
iter 14720: loss 10.1318, time 120.57ms
iter 14730: loss 9.9316, time 121.90ms
iter 14740: loss 10.2865, time 121.84ms
step 14750: train loss 8.6350, val loss 8.6371
saving checkpoint to out-shakespeare-char
iter 14750: loss 9.8574, time 2893.29ms
iter 14760: loss 10.4352, time 125.80ms
iter 14770: loss 10.1146, time 125.86ms
iter 14780: loss 10.2511, time 125.95ms
iter 14790: loss 9.5394, time 125.16ms
iter 14800: loss 9.9667, time 125.52ms
iter 14810: loss 10.3376, time 125.87ms
iter 14820: loss 9.8214, time 125.73ms
iter 14830: loss 9.7372, time 129.26ms
iter 14840: loss 10.1044, time 125.17ms
iter 14850: loss 9.6497, time 126.04ms
iter 14860: loss 9.9494, time 125.13ms
iter 14870: loss 10.1531, time 124.98ms
iter 14880: loss 10.2816, time 124.84ms
iter 14890: loss 10.1048, time 125.29ms
iter 14900: loss 10.0043, time 125.08ms
iter 14910: loss 9.8160, time 126.01ms
iter 14920: loss 10.2488, time 125.77ms
iter 14930: loss 9.4211, time 126.33ms
iter 14940: loss 10.0191, time 125.52ms
iter 14950: loss 10.4679, time 125.57ms
iter 14960: loss 10.2555, time 125.91ms
iter 14970: loss 10.4109, time 125.33ms
iter 14980: loss 10.1036, time 125.79ms
iter 14990: loss 10.5234, time 125.71ms
step 15000: train loss 8.6362, val loss 8.5997
saving checkpoint to out-shakespeare-char
iter 15000: loss 10.3138, time 2890.22ms
iter 15010: loss 9.5152, time 126.14ms
iter 15020: loss 9.7198, time 125.98ms
iter 15030: loss 10.2108, time 124.52ms
iter 15040: loss 9.8390, time 125.76ms
iter 15050: loss 10.3071, time 125.75ms
iter 15060: loss 10.1649, time 125.81ms
iter 15070: loss 10.3836, time 125.69ms
iter 15080: loss 10.2073, time 125.13ms
iter 15090: loss 10.2943, time 125.78ms
iter 15100: loss 10.2276, time 125.34ms
iter 15110: loss 9.3610, time 125.81ms
iter 15120: loss 10.4360, time 125.32ms
iter 15130: loss 10.2224, time 125.99ms
iter 15140: loss 10.5001, time 126.09ms
iter 15150: loss 9.6038, time 128.79ms
iter 15160: loss 10.1562, time 125.66ms
iter 15170: loss 10.3678, time 125.83ms
iter 15180: loss 9.7700, time 126.04ms
iter 15190: loss 9.8112, time 126.13ms
iter 15200: loss 10.0305, time 125.74ms
iter 15210: loss 10.4707, time 124.99ms
iter 15220: loss 9.4120, time 125.69ms
iter 15230: loss 10.5926, time 125.96ms
iter 15240: loss 10.3755, time 125.78ms
step 15250: train loss 8.5790, val loss 8.6183
saving checkpoint to out-shakespeare-char
iter 15250: loss 9.6645, time 2896.77ms
iter 15260: loss 10.9786, time 125.78ms
iter 15270: loss 10.0638, time 124.87ms
iter 15280: loss 9.9179, time 125.57ms
iter 15290: loss 10.1035, time 125.99ms
iter 15300: loss 10.5736, time 126.13ms
iter 15310: loss 9.8454, time 125.74ms
iter 15320: loss 9.7986, time 124.92ms
iter 15330: loss 10.0544, time 126.27ms
iter 15340: loss 10.0368, time 125.90ms
iter 15350: loss 10.1465, time 125.54ms
iter 15360: loss 10.0650, time 128.53ms
iter 15370: loss 9.7927, time 125.57ms
iter 15380: loss 10.3162, time 125.92ms
iter 15390: loss 10.2578, time 125.65ms
iter 15400: loss 10.1214, time 125.85ms
iter 15410: loss 10.0093, time 125.68ms
iter 15420: loss 10.0169, time 125.70ms
iter 15430: loss 9.8181, time 125.41ms
iter 15440: loss 9.8190, time 125.90ms
iter 15450: loss 10.2151, time 125.58ms
iter 15460: loss 9.8843, time 125.58ms
iter 15470: loss 9.3605, time 125.56ms
iter 15480: loss 9.9732, time 125.66ms
iter 15490: loss 10.1506, time 125.82ms
step 15500: train loss 8.6381, val loss 8.5784
saving checkpoint to out-shakespeare-char
iter 15500: loss 10.0420, time 2864.29ms
iter 15510: loss 10.0390, time 125.97ms
iter 15520: loss 9.6560, time 125.86ms
iter 15530: loss 9.8056, time 127.76ms
iter 15540: loss 10.0665, time 124.73ms
iter 15550: loss 9.3565, time 124.53ms
iter 15560: loss 9.9820, time 124.57ms
iter 15570: loss 10.2013, time 124.93ms
iter 15580: loss 10.3367, time 125.80ms
iter 15590: loss 10.0918, time 126.41ms
iter 15600: loss 10.3809, time 125.86ms
iter 15610: loss 9.7929, time 125.57ms
iter 15620: loss 9.7039, time 125.76ms
iter 15630: loss 10.5962, time 126.46ms
iter 15640: loss 9.9772, time 125.94ms
iter 15650: loss 10.2906, time 125.92ms
iter 15660: loss 9.7242, time 125.68ms
iter 15670: loss 10.4584, time 125.30ms
iter 15680: loss 10.0800, time 125.49ms
iter 15690: loss 10.5624, time 125.75ms
iter 15700: loss 10.5604, time 125.71ms
iter 15710: loss 9.6287, time 128.73ms
iter 15720: loss 10.0137, time 125.96ms
iter 15730: loss 9.9253, time 124.91ms
iter 15740: loss 10.1372, time 125.13ms
step 15750: train loss 8.5839, val loss 8.5961
saving checkpoint to out-shakespeare-char
iter 15750: loss 10.5268, time 2874.31ms
iter 15760: loss 10.3849, time 125.81ms
iter 15770: loss 9.8489, time 125.36ms
iter 15780: loss 10.3831, time 125.38ms
iter 15790: loss 10.2436, time 126.15ms
iter 15800: loss 9.8585, time 128.28ms
iter 15810: loss 10.1503, time 125.89ms
iter 15820: loss 9.4552, time 125.68ms
iter 15830: loss 10.0640, time 125.29ms
iter 15840: loss 9.4197, time 125.13ms
iter 15850: loss 10.1483, time 125.41ms
iter 15860: loss 10.0505, time 125.41ms
iter 15870: loss 10.2226, time 124.89ms
iter 15880: loss 9.5908, time 126.31ms
iter 15890: loss 10.0689, time 125.95ms
iter 15900: loss 10.1077, time 125.71ms
iter 15910: loss 9.8466, time 128.52ms
iter 15920: loss 9.9395, time 124.47ms
iter 15930: loss 9.8349, time 125.14ms
iter 15940: loss 10.8248, time 125.21ms
iter 15950: loss 10.3930, time 125.20ms
iter 15960: loss 10.1800, time 127.34ms
iter 15970: loss 9.8039, time 125.50ms
iter 15980: loss 9.8268, time 125.34ms
iter 15990: loss 10.3219, time 125.27ms
step 16000: train loss 8.5709, val loss 8.5392
saving checkpoint to out-shakespeare-char
iter 16000: loss 9.8553, time 2904.50ms
iter 16010: loss 10.3856, time 126.40ms
iter 16020: loss 10.2605, time 127.63ms
iter 16030: loss 10.6722, time 126.01ms
iter 16040: loss 10.5433, time 129.07ms
iter 16050: loss 9.8023, time 126.82ms
iter 16060: loss 10.0151, time 125.21ms
iter 16070: loss 10.4675, time 125.74ms
iter 16080: loss 10.1261, time 125.15ms
iter 16090: loss 10.3141, time 125.51ms
iter 16100: loss 9.7140, time 125.49ms
iter 16110: loss 10.0509, time 125.36ms
iter 16120: loss 9.7075, time 125.24ms
iter 16130: loss 9.8436, time 125.61ms
iter 16140: loss 10.2276, time 125.02ms
iter 16150: loss 10.4274, time 125.64ms
iter 16160: loss 10.4764, time 124.81ms
iter 16170: loss 9.7115, time 125.16ms
iter 16180: loss 10.2100, time 123.41ms
iter 16190: loss 9.9252, time 125.29ms
iter 16200: loss 10.3484, time 125.64ms
iter 16210: loss 10.0522, time 125.91ms
iter 16220: loss 9.9030, time 128.16ms
iter 16230: loss 9.7938, time 125.53ms
iter 16240: loss 10.5007, time 125.54ms
step 16250: train loss 8.5496, val loss 8.5214
saving checkpoint to out-shakespeare-char
iter 16250: loss 10.0955, time 2890.71ms
iter 16260: loss 9.6165, time 125.85ms
iter 16270: loss 9.8567, time 123.36ms
iter 16280: loss 9.3492, time 121.62ms
iter 16290: loss 10.0434, time 123.33ms
iter 16300: loss 9.5019, time 121.67ms
iter 16310: loss 10.1594, time 122.87ms
iter 16320: loss 10.0510, time 122.08ms
iter 16330: loss 9.7109, time 121.72ms
iter 16340: loss 9.7226, time 121.53ms
iter 16350: loss 9.9240, time 121.34ms
iter 16360: loss 10.0243, time 121.69ms
iter 16370: loss 9.3816, time 122.54ms
iter 16380: loss 10.2646, time 121.52ms
iter 16390: loss 9.9717, time 121.99ms
iter 16400: loss 9.8203, time 121.36ms
iter 16410: loss 9.5800, time 121.77ms
iter 16420: loss 9.8652, time 122.57ms
iter 16430: loss 9.9710, time 121.94ms
iter 16440: loss 10.5436, time 121.59ms
iter 16450: loss 10.7837, time 121.90ms
iter 16460: loss 10.1779, time 121.46ms
iter 16470: loss 9.9559, time 122.98ms
iter 16480: loss 10.1378, time 121.42ms
iter 16490: loss 9.9180, time 121.87ms
step 16500: train loss 8.5377, val loss 8.4940
saving checkpoint to out-shakespeare-char
iter 16500: loss 9.9567, time 2905.26ms
iter 16510: loss 10.1545, time 128.24ms
iter 16520: loss 9.9429, time 125.00ms
iter 16530: loss 10.1653, time 124.73ms
iter 16540: loss 9.6926, time 125.57ms
iter 16550: loss 10.0727, time 124.68ms
iter 16560: loss 10.3252, time 125.22ms
iter 16570: loss 10.3880, time 125.21ms
iter 16580: loss 9.7788, time 128.25ms
iter 16590: loss 9.5092, time 123.76ms
iter 16600: loss 10.1798, time 124.92ms
iter 16610: loss 9.9112, time 125.08ms
iter 16620: loss 10.1494, time 125.43ms
iter 16630: loss 10.1958, time 125.17ms
iter 16640: loss 10.2652, time 125.13ms
iter 16650: loss 9.5780, time 124.71ms
iter 16660: loss 9.7106, time 124.33ms
iter 16670: loss 9.9121, time 125.28ms
iter 16680: loss 10.4087, time 125.06ms
iter 16690: loss 9.7919, time 127.68ms
iter 16700: loss 10.0945, time 124.86ms
iter 16710: loss 9.5891, time 120.32ms
iter 16720: loss 9.7676, time 119.61ms
iter 16730: loss 9.9233, time 120.56ms
iter 16740: loss 9.3559, time 120.44ms
step 16750: train loss 8.5106, val loss 8.4939
saving checkpoint to out-shakespeare-char
iter 16750: loss 9.4846, time 2895.15ms
iter 16760: loss 9.8221, time 119.94ms
iter 16770: loss 9.5641, time 120.60ms
iter 16780: loss 9.7533, time 119.78ms
iter 16790: loss 10.2391, time 120.85ms
iter 16800: loss 9.6450, time 119.64ms
iter 16810: loss 9.7982, time 119.93ms
iter 16820: loss 9.9958, time 119.84ms
iter 16830: loss 9.7826, time 119.49ms
iter 16840: loss 9.8650, time 120.63ms
iter 16850: loss 9.8276, time 119.84ms
iter 16860: loss 9.9804, time 119.74ms
iter 16870: loss 9.7110, time 119.52ms
iter 16880: loss 10.5093, time 119.61ms
iter 16890: loss 9.7714, time 120.81ms
iter 16900: loss 9.4044, time 119.62ms
iter 16910: loss 10.2991, time 119.88ms
iter 16920: loss 9.9966, time 119.80ms
iter 16930: loss 9.6139, time 119.67ms
iter 16940: loss 10.4413, time 120.75ms
iter 16950: loss 9.9778, time 119.74ms
iter 16960: loss 9.5584, time 119.64ms
iter 16970: loss 9.7500, time 119.94ms
iter 16980: loss 9.8532, time 120.37ms
iter 16990: loss 9.3178, time 119.91ms
step 17000: train loss 8.5342, val loss 8.4673
saving checkpoint to out-shakespeare-char
iter 17000: loss 10.2083, time 2882.61ms
iter 17010: loss 10.0911, time 121.91ms
iter 17020: loss 10.5441, time 121.80ms
iter 17030: loss 9.8377, time 121.73ms
iter 17040: loss 9.8398, time 120.79ms
iter 17050: loss 9.6525, time 120.93ms
iter 17060: loss 10.0687, time 122.07ms
iter 17070: loss 10.1804, time 120.89ms
iter 17080: loss 10.3994, time 122.04ms
iter 17090: loss 10.0230, time 121.07ms
iter 17100: loss 10.0901, time 122.12ms
iter 17110: loss 9.5263, time 120.95ms
iter 17120: loss 9.7470, time 120.98ms
iter 17130: loss 10.3660, time 121.79ms
iter 17140: loss 10.1994, time 120.50ms
iter 17150: loss 9.8717, time 121.66ms
iter 17160: loss 10.1372, time 121.92ms
iter 17170: loss 10.0177, time 122.83ms
iter 17180: loss 9.9127, time 121.73ms
iter 17190: loss 9.9606, time 121.82ms
iter 17200: loss 10.3876, time 121.79ms
iter 17210: loss 9.7850, time 121.85ms
iter 17220: loss 9.6422, time 121.90ms
iter 17230: loss 10.1127, time 121.88ms
iter 17240: loss 10.2336, time 121.81ms
step 17250: train loss 8.5180, val loss 8.4995
saving checkpoint to out-shakespeare-char
iter 17250: loss 9.8198, time 2892.34ms
iter 17260: loss 10.0412, time 122.07ms
iter 17270: loss 9.9114, time 121.57ms
iter 17280: loss 10.0493, time 121.86ms
iter 17290: loss 9.9977, time 122.14ms
iter 17300: loss 10.1519, time 122.32ms
iter 17310: loss 10.0112, time 122.26ms
iter 17320: loss 10.1546, time 122.55ms
iter 17330: loss 10.0514, time 121.76ms
iter 17340: loss 9.2373, time 121.94ms
iter 17350: loss 9.6759, time 121.61ms
iter 17360: loss 9.8293, time 121.90ms
iter 17370: loss 9.8248, time 121.63ms
iter 17380: loss 9.2225, time 121.77ms
iter 17390: loss 9.7845, time 121.93ms
iter 17400: loss 9.5622, time 121.18ms
iter 17410: loss 9.9136, time 121.59ms
iter 17420: loss 9.7072, time 121.78ms
iter 17430: loss 10.3409, time 122.15ms
iter 17440: loss 9.4158, time 121.99ms
iter 17450: loss 9.7146, time 121.84ms
iter 17460: loss 9.4570, time 121.83ms
iter 17470: loss 9.8772, time 121.69ms
iter 17480: loss 10.3352, time 121.88ms
iter 17490: loss 9.5886, time 121.11ms
step 17500: train loss 8.4595, val loss 8.4440
saving checkpoint to out-shakespeare-char
iter 17500: loss 9.9093, time 2892.75ms
iter 17510: loss 9.9378, time 122.16ms
iter 17520: loss 9.7832, time 122.28ms
iter 17530: loss 10.0330, time 122.29ms
iter 17540: loss 10.0605, time 122.20ms
iter 17550: loss 10.2042, time 121.94ms
iter 17560: loss 9.8415, time 121.79ms
iter 17570: loss 10.1512, time 122.12ms
iter 17580: loss 9.9648, time 121.79ms
iter 17590: loss 9.9350, time 121.88ms
iter 17600: loss 10.0123, time 121.93ms
iter 17610: loss 9.8745, time 122.01ms
iter 17620: loss 10.4746, time 122.16ms
iter 17630: loss 9.7758, time 122.01ms
iter 17640: loss 10.2433, time 121.92ms
iter 17650: loss 9.5709, time 121.90ms
iter 17660: loss 10.2433, time 121.89ms
iter 17670: loss 9.8165, time 122.99ms
iter 17680: loss 9.7598, time 121.86ms
iter 17690: loss 9.8311, time 121.95ms
iter 17700: loss 9.4662, time 121.82ms
iter 17710: loss 9.7455, time 121.81ms
iter 17720: loss 10.1414, time 122.12ms
iter 17730: loss 9.3820, time 122.05ms
iter 17740: loss 10.1693, time 122.00ms
step 17750: train loss 8.4299, val loss 8.4947
saving checkpoint to out-shakespeare-char
iter 17750: loss 10.1985, time 2892.50ms
iter 17760: loss 9.1929, time 123.69ms
iter 17770: loss 9.5844, time 122.93ms
iter 17780: loss 10.0806, time 124.14ms
iter 17790: loss 9.4134, time 121.98ms
iter 17800: loss 10.0162, time 125.63ms
iter 17810: loss 9.3995, time 128.36ms
iter 17820: loss 10.0715, time 125.97ms
iter 17830: loss 10.0507, time 125.72ms
iter 17840: loss 9.8587, time 126.52ms
iter 17850: loss 9.4443, time 128.15ms
iter 17860: loss 10.2870, time 125.30ms
iter 17870: loss 10.1197, time 125.37ms
iter 17880: loss 9.7851, time 125.48ms
iter 17890: loss 9.3515, time 125.28ms
iter 17900: loss 10.1403, time 125.30ms
iter 17910: loss 10.2240, time 125.26ms
iter 17920: loss 9.9536, time 125.15ms
iter 17930: loss 10.3080, time 125.33ms
iter 17940: loss 9.6228, time 124.84ms
iter 17950: loss 9.8782, time 125.44ms
iter 17960: loss 10.0418, time 128.26ms
iter 17970: loss 9.8865, time 126.49ms
iter 17980: loss 9.6338, time 124.76ms
iter 17990: loss 9.9931, time 125.35ms
step 18000: train loss 8.4320, val loss 8.4687
saving checkpoint to out-shakespeare-char
iter 18000: loss 9.6404, time 2894.50ms
iter 18010: loss 10.1735, time 125.93ms
iter 18020: loss 9.9077, time 128.78ms
iter 18030: loss 10.7176, time 125.48ms
iter 18040: loss 10.1474, time 125.27ms
iter 18050: loss 9.7715, time 125.59ms
iter 18060: loss 10.0658, time 125.73ms
iter 18070: loss 10.2936, time 125.52ms
iter 18080: loss 10.0277, time 125.52ms
iter 18090: loss 9.6182, time 125.02ms
iter 18100: loss 9.7960, time 125.28ms
iter 18110: loss 9.9518, time 124.81ms
iter 18120: loss 9.4532, time 125.08ms
iter 18130: loss 9.2452, time 127.92ms
iter 18140: loss 10.4698, time 124.96ms
iter 18150: loss 10.1977, time 125.08ms
iter 18160: loss 9.5376, time 124.80ms
iter 18170: loss 10.0030, time 124.97ms
iter 18180: loss 10.0206, time 124.79ms
iter 18190: loss 9.8115, time 125.09ms
iter 18200: loss 10.1901, time 128.42ms
iter 18210: loss 9.6289, time 125.30ms
iter 18220: loss 9.8744, time 125.12ms
iter 18230: loss 9.8136, time 125.16ms
iter 18240: loss 10.9838, time 124.97ms
step 18250: train loss 8.4637, val loss 8.4667
saving checkpoint to out-shakespeare-char
iter 18250: loss 10.1182, time 2897.01ms
iter 18260: loss 9.8440, time 121.59ms
iter 18270: loss 10.2595, time 121.65ms
iter 18280: loss 9.7876, time 122.37ms
iter 18290: loss 9.8279, time 122.18ms
iter 18300: loss 9.8585, time 121.64ms
iter 18310: loss 9.6345, time 120.82ms
iter 18320: loss 10.1326, time 122.31ms
iter 18330: loss 9.4773, time 121.58ms
iter 18340: loss 9.8338, time 121.79ms
iter 18350: loss 9.8492, time 121.59ms
iter 18360: loss 9.9047, time 121.71ms
iter 18370: loss 10.1149, time 121.46ms
iter 18380: loss 9.8033, time 121.69ms
iter 18390: loss 9.6213, time 121.77ms
iter 18400: loss 10.1610, time 121.89ms
iter 18410: loss 10.0784, time 121.70ms
iter 18420: loss 10.1307, time 121.84ms
iter 18430: loss 9.9000, time 121.82ms
iter 18440: loss 10.3610, time 121.65ms
iter 18450: loss 9.3013, time 121.56ms
iter 18460: loss 10.2355, time 121.82ms
iter 18470: loss 10.1649, time 120.64ms
iter 18480: loss 9.6408, time 121.86ms
iter 18490: loss 9.7745, time 121.67ms
step 18500: train loss 8.4360, val loss 8.4127
saving checkpoint to out-shakespeare-char
iter 18500: loss 9.6865, time 2909.18ms
iter 18510: loss 9.5116, time 121.59ms
iter 18520: loss 9.4589, time 121.69ms
iter 18530: loss 9.8518, time 123.23ms
iter 18540: loss 9.6666, time 121.63ms
iter 18550: loss 10.2431, time 122.86ms
iter 18560: loss 10.0817, time 121.72ms
iter 18570: loss 9.7588, time 125.51ms
iter 18580: loss 9.8860, time 127.84ms
iter 18590: loss 10.1727, time 125.73ms
iter 18600: loss 10.0949, time 125.63ms
iter 18610: loss 9.3099, time 124.78ms
iter 18620: loss 9.1589, time 125.38ms
iter 18630: loss 10.1534, time 125.56ms
iter 18640: loss 9.5378, time 125.63ms
iter 18650: loss 9.6633, time 125.66ms
iter 18660: loss 9.6732, time 125.71ms
iter 18670: loss 9.8631, time 125.63ms
iter 18680: loss 9.7105, time 125.79ms
iter 18690: loss 10.0591, time 129.02ms
iter 18700: loss 10.4759, time 125.54ms
iter 18710: loss 10.3941, time 125.62ms
iter 18720: loss 10.0409, time 125.66ms
iter 18730: loss 9.4819, time 125.67ms
iter 18740: loss 9.9027, time 125.84ms
step 18750: train loss 8.4262, val loss 8.4216
saving checkpoint to out-shakespeare-char
iter 18750: loss 9.5254, time 2901.91ms
iter 18760: loss 9.5710, time 125.30ms
iter 18770: loss 9.8902, time 125.78ms
iter 18780: loss 9.9543, time 125.57ms
iter 18790: loss 9.8128, time 125.35ms
iter 18800: loss 10.2428, time 125.97ms
iter 18810: loss 9.7717, time 126.19ms
iter 18820: loss 10.2133, time 125.62ms
iter 18830: loss 10.2163, time 125.62ms
iter 18840: loss 10.2040, time 125.62ms
iter 18850: loss 9.6469, time 125.77ms
iter 18860: loss 9.7632, time 128.11ms
iter 18870: loss 9.7031, time 125.43ms
iter 18880: loss 9.8247, time 125.50ms
iter 18890: loss 10.4991, time 125.41ms
iter 18900: loss 10.0712, time 125.71ms
iter 18910: loss 9.4645, time 125.38ms
iter 18920: loss 9.6100, time 125.34ms
iter 18930: loss 10.0394, time 125.54ms
iter 18940: loss 9.4835, time 125.60ms
iter 18950: loss 9.9745, time 125.51ms
iter 18960: loss 9.7503, time 125.42ms
iter 18970: loss 9.4897, time 128.67ms
iter 18980: loss 9.7562, time 125.33ms
iter 18990: loss 9.9335, time 125.54ms
step 19000: train loss 8.4366, val loss 8.3869
saving checkpoint to out-shakespeare-char
iter 19000: loss 9.2106, time 2914.26ms
iter 19010: loss 9.9731, time 122.30ms
iter 19020: loss 10.3152, time 122.18ms
iter 19030: loss 9.4674, time 122.37ms
iter 19040: loss 9.4543, time 121.78ms
iter 19050: loss 9.7522, time 121.93ms
iter 19060: loss 9.7307, time 122.22ms
iter 19070: loss 9.9165, time 122.92ms
iter 19080: loss 9.8387, time 121.82ms
iter 19090: loss 9.8590, time 121.80ms
iter 19100: loss 10.2114, time 121.79ms
iter 19110: loss 9.6021, time 122.11ms
iter 19120: loss 10.0538, time 122.02ms
iter 19130: loss 10.2919, time 121.68ms
iter 19140: loss 9.1777, time 121.66ms
iter 19150: loss 9.9473, time 121.83ms
iter 19160: loss 9.6052, time 121.75ms
iter 19170: loss 9.6189, time 122.52ms
iter 19180: loss 10.0129, time 121.67ms
iter 19190: loss 10.2136, time 121.74ms
iter 19200: loss 10.2661, time 121.72ms
iter 19210: loss 9.6139, time 122.05ms
iter 19220: loss 9.5319, time 121.47ms
iter 19230: loss 9.9924, time 121.67ms
iter 19240: loss 10.1545, time 121.97ms
step 19250: train loss 8.3572, val loss 8.4191
saving checkpoint to out-shakespeare-char
iter 19250: loss 9.2819, time 2902.17ms
iter 19260: loss 9.8903, time 125.60ms
iter 19270: loss 9.8081, time 126.18ms
iter 19280: loss 10.3016, time 125.29ms
iter 19290: loss 9.7226, time 125.37ms
iter 19300: loss 10.1964, time 125.14ms
iter 19310: loss 9.3175, time 124.55ms
iter 19320: loss 10.1515, time 125.37ms
iter 19330: loss 9.6591, time 125.14ms
iter 19340: loss 8.7755, time 125.25ms
iter 19350: loss 9.8981, time 125.25ms
iter 19360: loss 10.6854, time 125.64ms
iter 19370: loss 9.9976, time 125.64ms
iter 19380: loss 9.3627, time 125.49ms
iter 19390: loss 10.0637, time 128.14ms
iter 19400: loss 10.3213, time 125.45ms
iter 19410: loss 9.7359, time 125.16ms
iter 19420: loss 10.0012, time 125.73ms
iter 19430: loss 9.9562, time 125.00ms
iter 19440: loss 9.4860, time 125.32ms
iter 19450: loss 9.8657, time 125.04ms
iter 19460: loss 9.8753, time 125.77ms
iter 19470: loss 9.6761, time 125.63ms
iter 19480: loss 10.1287, time 127.08ms
iter 19490: loss 9.3191, time 125.82ms
step 19500: train loss 8.3949, val loss 8.3891
saving checkpoint to out-shakespeare-char
iter 19500: loss 9.7285, time 2897.24ms
iter 19510: loss 10.1208, time 125.68ms
iter 19520: loss 9.5757, time 125.71ms
iter 19530: loss 10.2960, time 125.40ms
iter 19540: loss 9.0878, time 125.13ms
iter 19550: loss 9.3022, time 125.00ms
iter 19560: loss 9.7198, time 124.77ms
iter 19570: loss 9.1255, time 125.57ms
iter 19580: loss 9.8290, time 125.66ms
iter 19590: loss 9.4146, time 125.54ms
iter 19600: loss 10.0593, time 125.53ms
iter 19610: loss 10.1173, time 124.42ms
iter 19620: loss 9.6484, time 124.97ms
iter 19630: loss 9.3867, time 125.00ms
iter 19640: loss 9.9069, time 125.23ms
iter 19650: loss 10.3854, time 124.04ms
iter 19660: loss 10.3532, time 125.14ms
iter 19670: loss 10.4620, time 128.35ms
iter 19680: loss 9.3967, time 125.35ms
iter 19690: loss 9.3398, time 125.89ms
iter 19700: loss 9.8762, time 125.84ms
iter 19710: loss 10.3832, time 125.68ms
iter 19720: loss 10.0561, time 124.99ms
iter 19730: loss 10.0280, time 125.33ms
iter 19740: loss 9.9393, time 125.40ms
step 19750: train loss 8.3488, val loss 8.3810
saving checkpoint to out-shakespeare-char
iter 19750: loss 10.2071, time 2878.07ms
iter 19760: loss 10.5961, time 125.64ms
iter 19770: loss 10.2973, time 125.36ms
iter 19780: loss 9.8024, time 125.72ms
iter 19790: loss 9.6716, time 125.63ms
iter 19800: loss 9.6154, time 126.02ms
iter 19810: loss 9.1526, time 128.91ms
iter 19820: loss 9.8062, time 124.49ms
iter 19830: loss 10.1037, time 125.15ms
iter 19840: loss 9.5677, time 125.30ms
iter 19850: loss 9.7481, time 125.01ms
iter 19860: loss 9.7577, time 125.68ms
iter 19870: loss 9.3897, time 125.63ms
iter 19880: loss 9.9383, time 125.46ms
iter 19890: loss 9.5776, time 125.98ms
iter 19900: loss 10.0950, time 125.33ms
iter 19910: loss 9.5781, time 125.22ms
iter 19920: loss 9.9624, time 124.39ms
iter 19930: loss 10.1173, time 126.05ms
iter 19940: loss 10.1101, time 124.96ms
iter 19950: loss 9.9546, time 125.04ms
iter 19960: loss 10.1281, time 125.21ms
iter 19970: loss 10.3269, time 126.26ms
iter 19980: loss 10.2249, time 125.42ms
iter 19990: loss 9.6231, time 126.19ms
step 20000: train loss 8.3826, val loss 8.3539
saving checkpoint to out-shakespeare-char
iter 20000: loss 9.3344, time 2893.42ms
iter 20010: loss 9.9747, time 125.37ms
iter 20020: loss 10.2043, time 125.28ms
iter 20030: loss 10.0264, time 128.07ms
iter 20040: loss 9.9281, time 125.50ms
iter 20050: loss 9.3345, time 125.48ms
iter 20060: loss 9.7865, time 125.91ms
iter 20070: loss 10.3168, time 125.44ms
iter 20080: loss 10.4439, time 126.03ms
iter 20090: loss 9.4027, time 126.00ms
iter 20100: loss 9.8816, time 126.18ms
iter 20110: loss 9.8218, time 126.08ms
iter 20120: loss 9.5517, time 125.98ms
iter 20130: loss 9.7243, time 126.11ms
iter 20140: loss 10.1978, time 126.00ms
iter 20150: loss 9.6009, time 125.90ms
iter 20160: loss 9.8068, time 126.02ms
iter 20170: loss 9.1246, time 126.13ms
iter 20180: loss 9.9276, time 125.13ms
iter 20190: loss 10.1926, time 125.44ms
iter 20200: loss 9.7274, time 126.23ms
iter 20210: loss 10.7488, time 125.36ms
iter 20220: loss 10.6547, time 127.41ms
iter 20230: loss 9.9875, time 125.02ms
iter 20240: loss 9.2829, time 125.77ms
step 20250: train loss 8.3556, val loss 8.3621
saving checkpoint to out-shakespeare-char
iter 20250: loss 9.9006, time 2905.20ms
iter 20260: loss 9.6723, time 126.14ms
iter 20270: loss 9.1942, time 125.66ms
iter 20280: loss 10.0367, time 125.94ms
iter 20290: loss 9.4279, time 125.73ms
iter 20300: loss 9.8032, time 125.83ms
iter 20310: loss 10.1185, time 125.55ms
iter 20320: loss 9.8600, time 125.92ms
iter 20330: loss 9.6722, time 128.84ms
iter 20340: loss 10.3415, time 125.39ms
iter 20350: loss 9.7470, time 125.33ms
iter 20360: loss 10.1774, time 125.19ms
iter 20370: loss 9.5471, time 125.03ms
iter 20380: loss 9.8996, time 124.47ms
iter 20390: loss 9.5999, time 125.05ms
iter 20400: loss 9.4765, time 124.91ms
iter 20410: loss 9.7568, time 125.03ms
iter 20420: loss 9.4727, time 124.85ms
iter 20430: loss 9.8959, time 125.11ms
iter 20440: loss 9.7557, time 127.17ms
iter 20450: loss 10.3067, time 124.98ms
iter 20460: loss 9.8556, time 124.36ms
iter 20470: loss 9.9584, time 124.74ms
iter 20480: loss 8.4746, time 124.35ms
iter 20490: loss 9.2674, time 125.05ms
step 20500: train loss 8.3759, val loss 8.3781
saving checkpoint to out-shakespeare-char
iter 20500: loss 9.9836, time 2872.28ms
iter 20510: loss 10.2669, time 125.64ms
iter 20520: loss 9.5456, time 125.47ms
iter 20530: loss 9.2268, time 125.18ms
iter 20540: loss 9.9793, time 125.02ms
iter 20550: loss 10.0950, time 125.74ms
iter 20560: loss 9.6282, time 125.15ms
iter 20570: loss 10.1453, time 123.94ms
iter 20580: loss 9.8335, time 124.97ms
iter 20590: loss 10.3979, time 124.00ms
iter 20600: loss 10.1130, time 125.72ms
iter 20610: loss 9.9500, time 127.40ms
iter 20620: loss 9.8513, time 125.55ms
iter 20630: loss 9.8490, time 124.64ms
iter 20640: loss 9.2563, time 125.72ms
iter 20650: loss 9.5965, time 124.09ms
iter 20660: loss 10.0226, time 125.35ms
iter 20670: loss 9.3096, time 124.06ms
iter 20680: loss 10.3232, time 125.47ms
iter 20690: loss 9.6412, time 124.83ms
iter 20700: loss 9.4319, time 126.65ms
iter 20710: loss 9.4250, time 125.62ms
iter 20720: loss 10.2327, time 125.33ms
iter 20730: loss 9.8782, time 124.46ms
iter 20740: loss 10.0029, time 126.94ms
step 20750: train loss 8.2745, val loss 8.3146
saving checkpoint to out-shakespeare-char
iter 20750: loss 9.1570, time 2883.90ms
iter 20760: loss 9.2692, time 125.52ms
iter 20770: loss 9.8150, time 125.04ms
iter 20780: loss 9.9239, time 124.82ms
iter 20790: loss 9.7532, time 124.18ms
iter 20800: loss 9.7230, time 127.09ms
iter 20810: loss 9.6448, time 124.80ms
iter 20820: loss 9.4142, time 125.85ms
iter 20830: loss 9.9573, time 125.73ms
iter 20840: loss 9.5186, time 125.45ms
iter 20850: loss 9.6763, time 125.59ms
iter 20860: loss 9.2394, time 125.68ms
iter 20870: loss 9.7625, time 125.61ms
iter 20880: loss 9.4021, time 125.92ms
iter 20890: loss 10.2686, time 125.61ms
iter 20900: loss 10.2478, time 125.60ms
iter 20910: loss 9.8521, time 128.46ms
iter 20920: loss 10.2930, time 125.37ms
iter 20930: loss 9.5054, time 125.57ms
iter 20940: loss 9.7029, time 126.22ms
iter 20950: loss 9.8342, time 125.86ms
iter 20960: loss 9.2451, time 125.64ms
iter 20970: loss 10.0900, time 124.45ms
iter 20980: loss 10.6005, time 124.15ms
iter 20990: loss 9.5655, time 125.62ms
step 21000: train loss 8.2532, val loss 8.3013
saving checkpoint to out-shakespeare-char
iter 21000: loss 9.7910, time 2886.92ms
iter 21010: loss 9.5176, time 128.93ms
iter 21020: loss 10.0472, time 126.12ms
iter 21030: loss 10.0743, time 126.14ms
iter 21040: loss 9.6832, time 126.25ms
iter 21050: loss 9.7044, time 128.75ms
iter 21060: loss 9.3903, time 125.46ms
iter 21070: loss 10.0062, time 125.42ms
iter 21080: loss 9.8962, time 126.27ms
iter 21090: loss 9.5732, time 125.30ms
iter 21100: loss 9.1530, time 126.08ms
iter 21110: loss 9.9109, time 125.65ms
iter 21120: loss 10.1824, time 126.46ms
iter 21130: loss 9.3622, time 128.01ms
iter 21140: loss 9.4261, time 125.46ms
iter 21150: loss 10.7015, time 125.46ms
iter 21160: loss 9.7990, time 124.48ms
iter 21170: loss 10.3213, time 128.53ms
iter 21180: loss 8.4300, time 124.88ms
iter 21190: loss 9.5729, time 126.29ms
iter 21200: loss 9.8610, time 125.32ms
iter 21210: loss 9.9962, time 127.71ms
iter 21220: loss 10.6914, time 125.38ms
iter 21230: loss 9.8346, time 124.85ms
iter 21240: loss 9.9433, time 125.14ms
step 21250: train loss 8.3055, val loss 8.3382
saving checkpoint to out-shakespeare-char
iter 21250: loss 9.5961, time 2875.14ms
iter 21260: loss 9.7460, time 125.59ms
iter 21270: loss 10.0955, time 125.92ms
iter 21280: loss 10.2644, time 128.22ms
iter 21290: loss 9.5675, time 125.63ms
iter 21300: loss 9.2971, time 125.94ms
iter 21310: loss 9.1538, time 126.71ms
iter 21320: loss 10.0264, time 125.49ms
iter 21330: loss 10.5859, time 125.72ms
iter 21340: loss 9.9137, time 125.65ms
iter 21350: loss 9.5914, time 123.81ms
iter 21360: loss 9.6976, time 125.74ms
iter 21370: loss 9.4932, time 125.74ms
iter 21380: loss 9.7391, time 125.96ms
iter 21390: loss 10.2817, time 125.60ms
iter 21400: loss 10.6276, time 126.00ms
iter 21410: loss 9.6064, time 121.92ms
iter 21420: loss 9.3280, time 124.86ms
iter 21430: loss 10.1449, time 122.00ms
iter 21440: loss 9.9565, time 124.09ms
iter 21450: loss 9.5168, time 121.30ms
iter 21460: loss 9.5775, time 124.69ms
iter 21470: loss 9.6290, time 121.97ms
iter 21480: loss 9.8907, time 124.26ms
iter 21490: loss 9.2701, time 121.50ms
step 21500: train loss 8.2743, val loss 8.3177
saving checkpoint to out-shakespeare-char
iter 21500: loss 9.5318, time 2869.32ms
iter 21510: loss 9.9354, time 124.63ms
iter 21520: loss 9.9734, time 121.59ms
iter 21530: loss 9.5554, time 123.23ms
iter 21540: loss 9.5027, time 120.14ms
iter 21550: loss 9.8542, time 124.14ms
iter 21560: loss 10.0253, time 121.65ms
iter 21570: loss 10.1960, time 125.57ms
iter 21580: loss 9.4515, time 121.62ms
iter 21590: loss 10.3277, time 124.50ms
iter 21600: loss 9.6345, time 121.63ms
iter 21610: loss 9.5331, time 124.90ms
iter 21620: loss 9.8465, time 122.14ms
iter 21630: loss 9.8541, time 123.83ms
iter 21640: loss 10.1030, time 121.80ms
iter 21650: loss 9.6594, time 124.53ms
iter 21660: loss 9.3949, time 121.69ms
iter 21670: loss 9.6104, time 124.77ms
iter 21680: loss 10.1003, time 121.61ms
iter 21690: loss 10.0255, time 124.50ms
iter 21700: loss 9.7061, time 121.46ms
iter 21710: loss 9.7900, time 124.83ms
iter 21720: loss 8.9350, time 121.28ms
iter 21730: loss 9.3800, time 124.65ms
iter 21740: loss 9.3178, time 121.19ms
step 21750: train loss 8.2812, val loss 8.3033
saving checkpoint to out-shakespeare-char
iter 21750: loss 9.9402, time 2907.22ms
iter 21760: loss 10.0253, time 124.65ms
iter 21770: loss 10.2146, time 125.56ms
iter 21780: loss 9.4433, time 127.92ms
iter 21790: loss 9.6193, time 124.56ms
iter 21800: loss 9.5716, time 125.42ms
iter 21810: loss 9.3026, time 125.35ms
iter 21820: loss 9.7792, time 124.63ms
iter 21830: loss 10.1906, time 125.12ms
iter 21840: loss 9.7403, time 124.28ms
iter 21850: loss 10.5473, time 125.92ms
iter 21860: loss 9.6493, time 125.10ms
iter 21870: loss 9.6880, time 125.91ms
iter 21880: loss 9.9813, time 125.89ms
iter 21890: loss 9.4090, time 124.60ms
iter 21900: loss 9.6861, time 125.27ms
iter 21910: loss 9.9757, time 125.12ms
iter 21920: loss 10.2615, time 125.77ms
iter 21930: loss 9.8991, time 125.17ms
iter 21940: loss 10.0501, time 125.25ms
iter 21950: loss 9.7269, time 125.39ms
iter 21960: loss 9.2681, time 128.52ms
iter 21970: loss 10.0181, time 125.37ms
iter 21980: loss 9.4498, time 125.31ms
iter 21990: loss 10.0127, time 125.47ms
step 22000: train loss 8.2839, val loss 8.2576
saving checkpoint to out-shakespeare-char
iter 22000: loss 10.2809, time 2898.72ms
iter 22010: loss 9.5310, time 125.07ms
iter 22020: loss 9.5697, time 125.31ms
iter 22030: loss 9.0186, time 125.02ms
iter 22040: loss 9.4633, time 127.25ms
iter 22050: loss 9.7513, time 125.54ms
iter 22060: loss 10.0266, time 125.46ms
iter 22070: loss 8.9656, time 125.13ms
iter 22080: loss 10.2427, time 125.56ms
iter 22090: loss 9.9861, time 125.11ms
iter 22100: loss 9.3116, time 125.91ms
iter 22110: loss 9.6439, time 125.47ms
iter 22120: loss 9.2605, time 125.22ms
iter 22130: loss 9.7281, time 124.66ms
iter 22140: loss 9.3846, time 125.19ms
iter 22150: loss 9.2715, time 125.25ms
iter 22160: loss 9.9065, time 125.49ms
iter 22170: loss 10.2787, time 126.23ms
iter 22180: loss 9.7098, time 125.51ms
iter 22190: loss 9.5891, time 125.94ms
iter 22200: loss 9.8468, time 126.03ms
iter 22210: loss 9.6137, time 125.52ms
iter 22220: loss 9.4013, time 128.78ms
iter 22230: loss 10.1057, time 125.16ms
iter 22240: loss 9.8606, time 125.11ms
step 22250: train loss 8.2596, val loss 8.2655
saving checkpoint to out-shakespeare-char
iter 22250: loss 10.1817, time 2892.81ms
iter 22260: loss 9.7177, time 124.76ms
iter 22270: loss 9.2842, time 125.38ms
iter 22280: loss 10.3443, time 126.49ms
iter 22290: loss 9.5012, time 125.54ms
iter 22300: loss 10.1569, time 125.19ms
iter 22310: loss 9.1658, time 125.31ms
iter 22320: loss 9.8905, time 128.20ms
iter 22330: loss 10.0641, time 124.56ms
iter 22340: loss 9.5018, time 125.47ms
iter 22350: loss 9.4010, time 128.12ms
iter 22360: loss 10.0891, time 125.27ms
iter 22370: loss 10.1830, time 124.77ms
iter 22380: loss 9.5562, time 125.40ms
iter 22390: loss 9.5363, time 125.18ms
iter 22400: loss 8.8890, time 125.55ms
iter 22410: loss 9.9651, time 125.45ms
iter 22420: loss 9.6902, time 125.21ms
iter 22430: loss 9.9889, time 125.29ms
iter 22440: loss 10.2786, time 125.20ms
iter 22450: loss 10.7014, time 125.05ms
iter 22460: loss 9.8458, time 128.11ms
iter 22470: loss 10.0815, time 124.21ms
iter 22480: loss 9.8486, time 124.98ms
iter 22490: loss 9.5388, time 125.40ms
step 22500: train loss 8.2784, val loss 8.2579
saving checkpoint to out-shakespeare-char
iter 22500: loss 9.7287, time 2891.18ms
iter 22510: loss 9.6739, time 125.49ms
iter 22520: loss 9.8636, time 125.40ms
iter 22530: loss 10.1032, time 125.08ms
iter 22540: loss 10.2606, time 125.05ms
iter 22550: loss 10.8984, time 125.20ms
iter 22560: loss 9.9750, time 125.01ms
iter 22570: loss 9.3777, time 125.35ms
iter 22580: loss 10.5007, time 124.82ms
iter 22590: loss 10.1867, time 125.40ms
iter 22600: loss 9.8214, time 128.13ms
iter 22610: loss 9.8613, time 125.01ms
iter 22620: loss 9.8933, time 124.81ms
iter 22630: loss 9.7634, time 125.27ms
iter 22640: loss 9.3143, time 125.00ms
iter 22650: loss 9.5748, time 124.93ms
iter 22660: loss 9.1636, time 125.24ms
iter 22670: loss 9.7311, time 125.29ms
iter 22680: loss 9.4912, time 124.85ms
iter 22690: loss 9.4354, time 125.49ms
iter 22700: loss 9.8838, time 125.81ms
iter 22710: loss 9.3580, time 127.60ms
iter 22720: loss 10.0074, time 125.75ms
iter 22730: loss 9.6894, time 124.95ms
iter 22740: loss 10.3963, time 125.86ms
step 22750: train loss 8.2481, val loss 8.2622
saving checkpoint to out-shakespeare-char
iter 22750: loss 8.8150, time 2887.12ms
iter 22760: loss 9.8531, time 125.98ms
iter 22770: loss 8.6599, time 124.99ms
iter 22780: loss 9.6241, time 124.28ms
iter 22790: loss 9.5035, time 125.16ms
iter 22800: loss 9.5618, time 125.02ms
iter 22810: loss 9.3062, time 125.95ms
iter 22820: loss 9.8355, time 125.16ms
iter 22830: loss 9.7476, time 125.84ms
iter 22840: loss 9.4788, time 125.84ms
iter 22850: loss 9.9139, time 128.55ms
iter 22860: loss 9.6689, time 125.66ms
iter 22870: loss 9.6511, time 125.59ms
iter 22880: loss 9.6073, time 126.23ms
iter 22890: loss 10.0705, time 125.55ms
iter 22900: loss 9.9454, time 125.66ms
iter 22910: loss 9.7966, time 126.39ms
iter 22920: loss 9.8156, time 126.26ms
iter 22930: loss 9.1397, time 125.59ms
iter 22940: loss 9.8413, time 124.79ms
iter 22950: loss 9.4414, time 124.86ms
iter 22960: loss 9.3684, time 124.62ms
iter 22970: loss 9.3522, time 125.71ms
iter 22980: loss 9.6407, time 125.58ms
iter 22990: loss 9.8221, time 125.71ms
step 23000: train loss 8.3096, val loss 8.2834
saving checkpoint to out-shakespeare-char
iter 23000: loss 9.6402, time 2886.48ms
iter 23010: loss 9.4823, time 125.44ms
iter 23020: loss 8.8761, time 125.14ms
iter 23030: loss 9.9857, time 125.36ms
iter 23040: loss 9.2851, time 125.30ms
iter 23050: loss 9.0775, time 127.89ms
iter 23060: loss 9.6346, time 125.42ms
iter 23070: loss 9.8108, time 125.38ms
iter 23080: loss 9.2242, time 125.71ms
iter 23090: loss 9.9442, time 128.44ms
iter 23100: loss 9.1103, time 125.44ms
iter 23110: loss 9.6283, time 124.97ms
iter 23120: loss 10.0129, time 125.11ms
iter 23130: loss 9.9048, time 127.77ms
iter 23140: loss 9.4822, time 124.80ms
iter 23150: loss 9.5461, time 125.97ms
iter 23160: loss 9.4580, time 125.58ms
iter 23170: loss 9.7808, time 125.83ms
iter 23180: loss 9.9616, time 125.96ms
iter 23190: loss 9.6063, time 125.67ms
iter 23200: loss 9.7501, time 125.39ms
iter 23210: loss 9.3064, time 125.67ms
iter 23220: loss 9.2872, time 125.21ms
iter 23230: loss 9.3190, time 125.63ms
iter 23240: loss 9.6818, time 128.38ms
step 23250: train loss 8.2060, val loss 8.2265
saving checkpoint to out-shakespeare-char
iter 23250: loss 9.4410, time 2894.52ms
iter 23260: loss 9.9804, time 125.81ms
iter 23270: loss 9.4234, time 128.37ms
iter 23280: loss 9.3849, time 127.12ms
iter 23290: loss 9.3807, time 126.03ms
iter 23300: loss 9.0150, time 125.86ms
iter 23310: loss 9.2506, time 125.82ms
iter 23320: loss 10.3023, time 125.87ms
iter 23330: loss 9.8866, time 125.90ms
iter 23340: loss 9.6682, time 125.31ms
iter 23350: loss 9.2576, time 125.77ms
iter 23360: loss 9.4116, time 125.58ms
iter 23370: loss 9.4757, time 125.51ms
iter 23380: loss 9.6666, time 128.33ms
iter 23390: loss 9.7551, time 125.26ms
iter 23400: loss 10.2821, time 125.36ms
iter 23410: loss 9.1188, time 126.35ms
iter 23420: loss 9.5896, time 125.31ms
iter 23430: loss 10.2685, time 125.11ms
iter 23440: loss 9.6848, time 124.92ms
iter 23450: loss 9.7597, time 125.59ms
iter 23460: loss 9.8834, time 125.71ms
iter 23470: loss 9.0484, time 125.42ms
iter 23480: loss 9.7776, time 125.34ms
iter 23490: loss 9.4942, time 128.35ms
step 23500: train loss 8.2330, val loss 8.2520
saving checkpoint to out-shakespeare-char
iter 23500: loss 9.4043, time 2906.01ms
iter 23510: loss 9.8466, time 126.69ms
iter 23520: loss 10.3483, time 124.77ms
iter 23530: loss 8.9710, time 125.25ms
iter 23540: loss 9.3236, time 126.00ms
iter 23550: loss 9.5900, time 125.52ms
iter 23560: loss 10.0805, time 125.36ms
iter 23570: loss 9.5441, time 125.35ms
iter 23580: loss 9.8452, time 126.20ms
iter 23590: loss 9.5443, time 128.99ms
iter 23600: loss 9.6799, time 125.30ms
iter 23610: loss 9.2852, time 125.39ms
iter 23620: loss 9.9265, time 125.62ms
iter 23630: loss 9.1875, time 125.27ms
iter 23640: loss 9.7549, time 125.57ms
iter 23650: loss 9.6146, time 124.95ms
iter 23660: loss 9.6091, time 125.98ms
iter 23670: loss 9.9357, time 124.89ms
iter 23680: loss 9.5581, time 125.78ms
iter 23690: loss 9.7963, time 126.02ms
iter 23700: loss 9.6839, time 128.34ms
iter 23710: loss 9.7753, time 125.71ms
iter 23720: loss 9.5089, time 125.64ms
iter 23730: loss 9.4751, time 125.54ms
iter 23740: loss 9.3464, time 124.85ms
step 23750: train loss 8.2574, val loss 8.2911
saving checkpoint to out-shakespeare-char
iter 23750: loss 9.4888, time 2898.04ms
iter 23760: loss 9.6903, time 125.65ms
iter 23770: loss 9.7232, time 125.71ms
iter 23780: loss 9.9381, time 125.12ms
iter 23790: loss 9.7872, time 125.74ms
iter 23800: loss 9.6890, time 128.54ms
iter 23810: loss 9.7289, time 125.38ms
iter 23820: loss 9.6653, time 125.99ms
iter 23830: loss 10.1141, time 125.55ms
iter 23840: loss 9.1125, time 125.64ms
iter 23850: loss 9.9446, time 125.68ms
iter 23860: loss 9.3887, time 125.80ms
iter 23870: loss 9.8202, time 125.97ms
iter 23880: loss 10.3467, time 125.45ms
iter 23890: loss 9.6171, time 125.80ms
iter 23900: loss 10.3929, time 126.19ms
iter 23910: loss 9.3452, time 128.81ms
iter 23920: loss 9.4274, time 126.28ms
iter 23930: loss 9.9327, time 125.84ms
iter 23940: loss 8.7696, time 125.72ms
iter 23950: loss 9.9635, time 125.97ms
iter 23960: loss 9.8451, time 125.90ms
iter 23970: loss 9.9967, time 125.75ms
iter 23980: loss 9.9947, time 125.74ms
iter 23990: loss 9.6905, time 125.98ms
step 24000: train loss 8.2489, val loss 8.2451
saving checkpoint to out-shakespeare-char
iter 24000: loss 10.0207, time 2899.92ms
iter 24010: loss 9.3118, time 125.88ms
iter 24020: loss 9.6118, time 126.27ms
iter 24030: loss 9.7660, time 125.81ms
iter 24040: loss 10.1787, time 128.06ms
iter 24050: loss 9.7948, time 125.66ms
iter 24060: loss 9.8936, time 126.00ms
iter 24070: loss 9.1981, time 125.90ms
iter 24080: loss 9.5633, time 126.83ms
iter 24090: loss 9.2868, time 125.88ms
iter 24100: loss 10.1467, time 126.08ms
iter 24110: loss 9.7049, time 125.68ms
iter 24120: loss 9.2752, time 125.91ms
iter 24130: loss 9.5963, time 125.55ms
iter 24140: loss 9.6491, time 126.03ms
iter 24150: loss 9.8981, time 127.54ms
iter 24160: loss 9.5144, time 125.60ms
iter 24170: loss 9.2913, time 125.72ms
iter 24180: loss 9.6733, time 125.37ms
iter 24190: loss 10.0677, time 125.28ms
iter 24200: loss 10.2265, time 125.22ms
iter 24210: loss 9.5914, time 125.31ms
iter 24220: loss 9.4266, time 125.11ms
iter 24230: loss 10.0334, time 124.60ms
iter 24240: loss 9.5812, time 125.26ms
step 24250: train loss 8.2424, val loss 8.2064
saving checkpoint to out-shakespeare-char
iter 24250: loss 9.3506, time 2880.39ms
iter 24260: loss 9.6455, time 126.20ms
iter 24270: loss 9.4746, time 120.60ms
iter 24280: loss 9.2716, time 121.42ms
iter 24290: loss 9.8863, time 120.82ms
iter 24300: loss 10.2167, time 121.43ms
iter 24310: loss 9.8705, time 121.33ms
iter 24320: loss 9.2135, time 121.38ms
iter 24330: loss 9.4522, time 121.43ms
iter 24340: loss 9.5430, time 121.37ms
iter 24350: loss 9.2892, time 121.33ms
iter 24360: loss 9.7303, time 121.17ms
iter 24370: loss 10.2888, time 121.28ms
iter 24380: loss 10.1973, time 121.45ms
iter 24390: loss 9.2807, time 121.27ms
iter 24400: loss 9.5782, time 121.25ms
iter 24410: loss 9.5702, time 121.36ms
iter 24420: loss 9.4744, time 121.42ms
iter 24430: loss 9.0450, time 121.39ms
iter 24440: loss 9.6834, time 121.55ms
iter 24450: loss 10.1832, time 121.35ms
iter 24460: loss 9.9922, time 121.36ms
iter 24470: loss 9.5086, time 121.36ms
iter 24480: loss 9.0976, time 121.46ms
iter 24490: loss 9.4191, time 121.33ms
step 24500: train loss 8.1906, val loss 8.2341
saving checkpoint to out-shakespeare-char
iter 24500: loss 9.5385, time 2893.25ms
iter 24510: loss 10.1371, time 121.74ms
iter 24520: loss 9.8156, time 121.30ms
iter 24530: loss 9.7170, time 121.91ms
iter 24540: loss 9.5646, time 121.19ms
iter 24550: loss 10.5105, time 121.38ms
iter 24560: loss 9.8630, time 121.24ms
iter 24570: loss 8.8779, time 121.43ms
iter 24580: loss 10.1371, time 121.11ms
iter 24590: loss 9.5272, time 121.29ms
iter 24600: loss 9.3897, time 121.57ms
iter 24610: loss 10.4943, time 121.40ms
iter 24620: loss 9.7155, time 121.08ms
iter 24630: loss 8.9185, time 121.24ms
iter 24640: loss 10.2887, time 121.13ms
iter 24650: loss 9.5129, time 121.35ms
iter 24660: loss 9.5294, time 121.45ms
iter 24670: loss 9.5225, time 121.25ms
iter 24680: loss 10.1080, time 121.24ms
iter 24690: loss 10.0483, time 121.28ms
iter 24700: loss 8.8468, time 121.66ms
iter 24710: loss 10.3668, time 121.34ms
iter 24720: loss 8.9851, time 121.21ms
iter 24730: loss 9.3994, time 121.41ms
iter 24740: loss 9.1819, time 121.24ms
step 24750: train loss 8.1939, val loss 8.1838
saving checkpoint to out-shakespeare-char
iter 24750: loss 9.6195, time 2894.22ms
iter 24760: loss 9.8652, time 121.77ms
iter 24770: loss 9.3670, time 121.45ms
iter 24780: loss 10.0475, time 122.51ms
iter 24790: loss 10.4538, time 121.48ms
iter 24800: loss 9.1817, time 122.64ms
iter 24810: loss 9.4934, time 121.38ms
iter 24820: loss 9.6862, time 122.89ms
iter 24830: loss 9.1579, time 121.74ms
iter 24840: loss 10.2230, time 122.65ms
iter 24850: loss 9.5318, time 121.53ms
iter 24860: loss 9.9557, time 122.62ms
iter 24870: loss 9.5193, time 121.53ms
iter 24880: loss 9.7437, time 122.65ms
iter 24890: loss 9.6166, time 121.52ms
iter 24900: loss 9.7279, time 122.57ms
iter 24910: loss 9.7250, time 121.49ms
iter 24920: loss 9.0299, time 122.89ms
iter 24930: loss 9.7545, time 121.55ms
iter 24940: loss 9.1080, time 123.10ms
iter 24950: loss 9.7784, time 121.60ms
iter 24960: loss 9.6740, time 122.82ms
iter 24970: loss 9.3937, time 121.81ms
iter 24980: loss 9.8196, time 122.64ms
iter 24990: loss 9.3765, time 121.45ms
step 25000: train loss 8.1753, val loss 8.1408
saving checkpoint to out-shakespeare-char
iter 25000: loss 9.7290, time 2890.03ms
iter 25010: loss 9.9990, time 121.80ms
iter 25020: loss 9.6103, time 121.69ms
iter 25030: loss 9.7454, time 121.56ms
iter 25040: loss 9.7405, time 121.67ms
iter 25050: loss 9.8346, time 121.69ms
iter 25060: loss 9.5953, time 121.48ms
iter 25070: loss 9.2654, time 121.51ms
iter 25080: loss 10.2209, time 121.51ms
iter 25090: loss 9.9961, time 121.96ms
iter 25100: loss 9.9826, time 121.56ms
iter 25110: loss 9.8283, time 121.76ms
iter 25120: loss 9.6878, time 122.84ms
iter 25130: loss 9.4069, time 121.72ms
iter 25140: loss 9.8063, time 121.91ms
iter 25150: loss 9.7642, time 121.78ms
iter 25160: loss 9.8480, time 121.23ms
iter 25170: loss 9.3001, time 121.50ms
iter 25180: loss 10.1692, time 121.54ms
iter 25190: loss 8.9544, time 121.59ms
iter 25200: loss 9.8410, time 121.88ms
iter 25210: loss 9.7555, time 121.62ms
iter 25220: loss 10.1408, time 121.57ms
iter 25230: loss 9.6853, time 121.46ms
iter 25240: loss 10.1866, time 121.60ms
step 25250: train loss 8.2030, val loss 8.2419
saving checkpoint to out-shakespeare-char
iter 25250: loss 9.7005, time 2892.87ms
iter 25260: loss 9.9999, time 121.47ms
iter 25270: loss 9.5080, time 121.47ms
iter 25280: loss 9.7064, time 121.67ms
iter 25290: loss 8.3784, time 121.43ms
iter 25300: loss 9.3199, time 121.48ms
iter 25310: loss 9.7315, time 121.64ms
iter 25320: loss 9.7384, time 121.43ms
iter 25330: loss 9.8704, time 121.41ms
iter 25340: loss 9.5251, time 121.52ms
iter 25350: loss 9.9650, time 121.54ms
iter 25360: loss 9.8936, time 121.56ms
iter 25370: loss 9.1800, time 121.58ms
iter 25380: loss 9.5039, time 121.53ms
iter 25390: loss 10.1188, time 121.59ms
iter 25400: loss 9.9596, time 121.60ms
iter 25410: loss 9.1958, time 121.44ms
iter 25420: loss 9.8145, time 121.40ms
iter 25430: loss 9.7095, time 121.50ms
iter 25440: loss 9.3985, time 121.65ms
iter 25450: loss 9.3133, time 121.52ms
iter 25460: loss 9.5317, time 121.94ms
iter 25470: loss 10.1666, time 121.28ms
iter 25480: loss 9.1031, time 121.58ms
iter 25490: loss 8.9154, time 121.43ms
step 25500: train loss 8.1895, val loss 8.1828
saving checkpoint to out-shakespeare-char
iter 25500: loss 9.6056, time 2889.71ms
iter 25510: loss 9.4154, time 122.78ms
iter 25520: loss 9.9374, time 121.37ms
iter 25530: loss 9.5826, time 122.65ms
iter 25540: loss 9.3659, time 121.45ms
iter 25550: loss 9.4840, time 122.76ms
iter 25560: loss 9.7105, time 121.68ms
iter 25570: loss 9.6253, time 122.71ms
iter 25580: loss 9.0370, time 121.80ms
iter 25590: loss 9.3953, time 122.71ms
iter 25600: loss 9.3800, time 121.41ms
iter 25610: loss 10.0025, time 122.69ms
iter 25620: loss 9.1099, time 121.59ms
iter 25630: loss 9.7133, time 122.94ms
iter 25640: loss 9.7622, time 121.59ms
iter 25650: loss 9.2076, time 123.39ms
iter 25660: loss 9.8670, time 121.46ms
iter 25670: loss 9.8640, time 123.21ms
iter 25680: loss 9.5084, time 121.52ms
iter 25690: loss 9.6348, time 122.85ms
iter 25700: loss 9.1014, time 121.55ms
iter 25710: loss 9.3445, time 123.11ms
iter 25720: loss 9.3611, time 121.57ms
iter 25730: loss 9.6534, time 122.71ms
iter 25740: loss 9.4235, time 121.54ms
step 25750: train loss 8.1266, val loss 8.2234
saving checkpoint to out-shakespeare-char
iter 25750: loss 9.3335, time 2880.73ms
iter 25760: loss 9.0710, time 121.75ms
iter 25770: loss 9.0643, time 122.71ms
iter 25780: loss 9.5541, time 121.53ms
iter 25790: loss 9.3122, time 121.64ms
iter 25800: loss 10.1915, time 121.44ms
iter 25810: loss 9.4788, time 121.71ms
iter 25820: loss 9.6379, time 120.90ms
iter 25830: loss 9.9024, time 121.63ms
iter 25840: loss 9.1931, time 121.26ms
iter 25850: loss 10.2717, time 121.14ms
iter 25860: loss 9.2791, time 122.49ms
iter 25870: loss 9.4193, time 121.37ms
iter 25880: loss 9.7154, time 121.42ms
iter 25890: loss 8.6938, time 121.54ms
iter 25900: loss 10.3013, time 121.50ms
iter 25910: loss 8.9944, time 121.61ms
iter 25920: loss 9.6034, time 121.23ms
iter 25930: loss 9.5442, time 121.46ms
iter 25940: loss 9.7606, time 121.15ms
iter 25950: loss 9.3439, time 121.53ms
iter 25960: loss 9.5688, time 121.47ms
iter 25970: loss 9.8554, time 121.49ms
iter 25980: loss 9.7688, time 121.36ms
iter 25990: loss 9.7102, time 121.83ms
step 26000: train loss 8.1191, val loss 8.1221
saving checkpoint to out-shakespeare-char
iter 26000: loss 9.1988, time 2892.96ms
iter 26010: loss 9.8015, time 121.67ms
iter 26020: loss 9.6774, time 124.40ms
iter 26030: loss 8.7985, time 121.50ms
iter 26040: loss 9.5976, time 124.45ms
iter 26050: loss 9.9931, time 121.15ms
iter 26060: loss 9.4027, time 124.36ms
iter 26070: loss 9.4539, time 121.66ms
iter 26080: loss 9.6944, time 124.35ms
iter 26090: loss 9.2507, time 121.33ms
iter 26100: loss 9.3625, time 124.01ms
iter 26110: loss 9.6934, time 121.55ms
iter 26120: loss 9.2129, time 124.50ms
iter 26130: loss 10.3525, time 121.67ms
iter 26140: loss 9.3899, time 123.92ms
iter 26150: loss 8.9585, time 121.50ms
iter 26160: loss 8.8599, time 124.46ms
iter 26170: loss 9.6902, time 121.07ms
iter 26180: loss 9.4577, time 124.57ms
iter 26190: loss 8.7187, time 121.38ms
iter 26200: loss 10.0311, time 124.22ms
iter 26210: loss 10.1235, time 121.55ms
iter 26220: loss 9.3295, time 124.42ms
iter 26230: loss 9.3458, time 121.60ms
iter 26240: loss 9.2830, time 124.27ms
step 26250: train loss 8.1492, val loss 8.1509
saving checkpoint to out-shakespeare-char
iter 26250: loss 9.7486, time 2882.52ms
iter 26260: loss 9.3403, time 121.31ms
iter 26270: loss 9.9558, time 122.59ms
iter 26280: loss 9.5269, time 121.60ms
iter 26290: loss 9.6217, time 122.56ms
iter 26300: loss 9.8833, time 121.26ms
iter 26310: loss 10.0576, time 122.55ms
iter 26320: loss 9.7932, time 121.35ms
iter 26330: loss 10.1516, time 122.55ms
iter 26340: loss 9.4121, time 121.34ms
iter 26350: loss 9.7694, time 123.90ms
iter 26360: loss 8.7918, time 121.64ms
iter 26370: loss 9.3737, time 122.64ms
iter 26380: loss 9.7094, time 121.58ms
iter 26390: loss 9.8550, time 122.52ms
iter 26400: loss 9.6510, time 121.58ms
iter 26410: loss 9.7146, time 123.49ms
iter 26420: loss 9.1760, time 119.65ms
iter 26430: loss 9.6608, time 120.16ms
iter 26440: loss 9.4309, time 119.03ms
iter 26450: loss 9.4121, time 122.61ms
iter 26460: loss 9.8916, time 121.26ms
iter 26470: loss 10.1034, time 124.90ms
iter 26480: loss 9.5129, time 121.80ms
iter 26490: loss 9.6176, time 122.88ms
step 26500: train loss 8.2046, val loss 8.1356
saving checkpoint to out-shakespeare-char
iter 26500: loss 9.5733, time 2901.96ms
iter 26510: loss 10.3814, time 121.77ms
iter 26520: loss 9.6240, time 121.71ms
iter 26530: loss 9.3090, time 121.83ms
iter 26540: loss 9.1482, time 122.38ms
iter 26550: loss 9.2701, time 121.69ms
iter 26560: loss 9.7460, time 121.79ms
iter 26570: loss 9.3541, time 121.54ms
iter 26580: loss 9.4947, time 123.13ms
iter 26590: loss 9.1800, time 121.91ms
iter 26600: loss 9.8491, time 121.92ms
iter 26610: loss 9.8906, time 121.82ms
iter 26620: loss 9.8380, time 121.35ms
iter 26630: loss 9.9412, time 122.06ms
iter 26640: loss 9.7164, time 121.62ms
iter 26650: loss 9.2097, time 121.64ms
iter 26660: loss 9.9273, time 121.81ms
iter 26670: loss 10.3685, time 121.91ms
iter 26680: loss 9.4059, time 121.55ms
iter 26690: loss 9.3423, time 121.71ms
iter 26700: loss 9.6955, time 121.47ms
iter 26710: loss 9.5705, time 121.56ms
iter 26720: loss 10.0917, time 122.01ms
iter 26730: loss 10.0004, time 121.46ms
iter 26740: loss 9.6882, time 121.33ms
step 26750: train loss 8.1458, val loss 8.1469
saving checkpoint to out-shakespeare-char
iter 26750: loss 10.0013, time 2891.08ms
iter 26760: loss 9.9044, time 121.52ms
iter 26770: loss 9.4752, time 122.14ms
iter 26780: loss 9.8626, time 121.91ms
iter 26790: loss 9.4025, time 121.79ms
iter 26800: loss 9.5330, time 121.09ms
iter 26810: loss 9.3350, time 121.53ms
iter 26820: loss 9.2705, time 121.41ms
iter 26830: loss 9.5308, time 121.59ms
iter 26840: loss 9.4804, time 121.56ms
iter 26850: loss 9.1467, time 121.60ms
iter 26860: loss 10.1531, time 122.07ms
iter 26870: loss 9.2458, time 121.95ms
iter 26880: loss 9.1071, time 121.93ms
iter 26890: loss 9.7247, time 122.15ms
iter 26900: loss 8.9755, time 121.75ms
iter 26910: loss 9.3073, time 121.92ms
iter 26920: loss 9.2370, time 122.30ms
iter 26930: loss 8.9847, time 122.09ms
iter 26940: loss 10.2520, time 122.03ms
iter 26950: loss 9.5867, time 121.82ms
iter 26960: loss 9.3723, time 121.33ms
iter 26970: loss 10.0033, time 122.07ms
iter 26980: loss 9.4641, time 122.42ms
iter 26990: loss 9.5982, time 121.75ms
step 27000: train loss 8.0945, val loss 8.1230
saving checkpoint to out-shakespeare-char
iter 27000: loss 9.8363, time 2894.54ms
iter 27010: loss 9.6010, time 120.63ms
iter 27020: loss 9.2976, time 122.02ms
iter 27030: loss 9.9430, time 121.32ms
iter 27040: loss 9.4896, time 121.46ms
iter 27050: loss 9.4013, time 121.75ms
iter 27060: loss 9.2880, time 121.71ms
iter 27070: loss 10.0137, time 121.36ms
iter 27080: loss 9.0447, time 121.93ms
iter 27090: loss 8.7713, time 122.79ms
iter 27100: loss 9.9547, time 121.38ms
iter 27110: loss 9.3364, time 121.36ms
iter 27120: loss 9.1997, time 121.37ms
iter 27130: loss 9.0692, time 121.52ms
iter 27140: loss 9.6983, time 121.29ms
iter 27150: loss 9.4997, time 121.45ms
iter 27160: loss 9.6576, time 121.54ms
iter 27170: loss 9.9456, time 121.69ms
iter 27180: loss 9.3290, time 121.56ms
iter 27190: loss 8.5909, time 121.30ms
iter 27200: loss 9.6487, time 122.08ms
iter 27210: loss 9.2306, time 121.33ms
iter 27220: loss 9.6796, time 121.55ms
iter 27230: loss 9.3248, time 121.74ms
iter 27240: loss 9.4670, time 121.57ms
step 27250: train loss 8.1174, val loss 8.1628
saving checkpoint to out-shakespeare-char
iter 27250: loss 9.6338, time 2899.50ms
iter 27260: loss 9.4355, time 121.45ms
iter 27270: loss 9.6211, time 122.47ms
iter 27280: loss 9.6137, time 121.38ms
iter 27290: loss 9.5167, time 121.23ms
iter 27300: loss 9.5241, time 121.49ms
iter 27310: loss 9.3564, time 122.52ms
iter 27320: loss 10.1236, time 121.50ms
iter 27330: loss 9.2660, time 122.89ms
iter 27340: loss 9.8677, time 121.56ms
iter 27350: loss 9.6489, time 121.48ms
iter 27360: loss 9.0861, time 121.89ms
iter 27370: loss 9.9623, time 121.69ms
iter 27380: loss 9.5853, time 121.81ms
iter 27390: loss 9.4020, time 121.17ms
iter 27400: loss 9.6586, time 122.06ms
iter 27410: loss 9.3809, time 121.97ms
iter 27420: loss 9.2784, time 121.85ms
iter 27430: loss 9.5047, time 122.01ms
iter 27440: loss 9.6214, time 121.92ms
iter 27450: loss 9.3812, time 121.62ms
iter 27460: loss 9.4231, time 121.05ms
iter 27470: loss 9.6349, time 121.75ms
iter 27480: loss 9.4877, time 121.86ms
iter 27490: loss 9.6810, time 121.75ms
step 27500: train loss 8.1084, val loss 8.1205
saving checkpoint to out-shakespeare-char
iter 27500: loss 9.8768, time 2902.57ms
iter 27510: loss 9.8919, time 124.87ms
iter 27520: loss 10.2385, time 121.86ms
iter 27530: loss 9.7789, time 124.76ms
iter 27540: loss 9.6622, time 121.84ms
iter 27550: loss 8.7281, time 124.15ms
iter 27560: loss 9.5946, time 122.17ms
iter 27570: loss 9.3846, time 124.97ms
iter 27580: loss 9.5229, time 122.10ms
iter 27590: loss 9.3634, time 124.71ms
iter 27600: loss 9.2674, time 121.88ms
iter 27610: loss 9.5308, time 124.56ms
iter 27620: loss 9.6417, time 121.87ms
iter 27630: loss 9.9609, time 124.69ms
iter 27640: loss 9.0513, time 122.83ms
iter 27650: loss 9.6828, time 124.73ms
iter 27660: loss 9.7613, time 122.12ms
iter 27670: loss 9.6013, time 124.48ms
iter 27680: loss 9.2471, time 121.02ms
iter 27690: loss 9.2556, time 124.53ms
iter 27700: loss 10.4431, time 121.76ms
iter 27710: loss 9.9043, time 124.51ms
iter 27720: loss 9.1737, time 121.88ms
iter 27730: loss 9.7999, time 124.71ms
iter 27740: loss 9.3544, time 121.91ms
step 27750: train loss 8.0892, val loss 8.0948
saving checkpoint to out-shakespeare-char
iter 27750: loss 9.3308, time 2901.79ms
iter 27760: loss 9.0987, time 121.28ms
iter 27770: loss 9.8618, time 121.30ms
iter 27780: loss 8.7555, time 122.43ms
iter 27790: loss 9.3304, time 121.37ms
iter 27800: loss 8.8399, time 121.30ms
iter 27810: loss 9.2504, time 121.65ms
iter 27820: loss 9.7338, time 121.45ms
iter 27830: loss 8.8222, time 121.45ms
iter 27840: loss 9.0120, time 121.44ms
iter 27850: loss 9.3134, time 120.50ms
iter 27860: loss 9.4525, time 121.51ms
iter 27870: loss 9.6259, time 121.45ms
iter 27880: loss 10.1473, time 121.08ms
iter 27890: loss 9.8880, time 121.42ms
iter 27900: loss 9.7157, time 120.97ms
iter 27910: loss 9.3765, time 121.45ms
iter 27920: loss 9.7521, time 121.95ms
iter 27930: loss 9.9301, time 121.49ms
iter 27940: loss 8.9401, time 121.39ms
iter 27950: loss 9.9115, time 120.51ms
iter 27960: loss 9.2418, time 121.45ms
iter 27970: loss 9.6803, time 121.29ms
iter 27980: loss 9.4488, time 121.51ms
iter 27990: loss 9.8532, time 121.31ms
step 28000: train loss 8.1065, val loss 8.0926
saving checkpoint to out-shakespeare-char
iter 28000: loss 10.3646, time 2887.91ms
iter 28010: loss 9.7827, time 123.40ms
iter 28020: loss 9.4275, time 121.89ms
iter 28030: loss 9.0411, time 122.45ms
iter 28040: loss 9.3761, time 121.62ms
iter 28050: loss 9.9089, time 122.45ms
iter 28060: loss 9.6850, time 121.91ms
iter 28070: loss 9.8779, time 123.05ms
iter 28080: loss 9.5626, time 120.64ms
iter 28090: loss 9.5600, time 122.55ms
iter 28100: loss 9.4264, time 121.57ms
iter 28110: loss 9.7805, time 122.56ms
iter 28120: loss 9.0633, time 121.68ms
iter 28130: loss 9.5793, time 122.31ms
iter 28140: loss 9.3681, time 122.08ms
iter 28150: loss 9.0725, time 122.57ms
iter 28160: loss 8.9534, time 121.65ms
iter 28170: loss 9.3291, time 122.75ms
iter 28180: loss 9.1187, time 121.97ms
iter 28190: loss 9.3876, time 122.37ms
iter 28200: loss 9.1861, time 120.94ms
iter 28210: loss 10.0614, time 122.66ms
iter 28220: loss 10.0957, time 121.57ms
iter 28230: loss 9.3641, time 122.68ms
iter 28240: loss 9.9713, time 121.60ms
step 28250: train loss 8.1588, val loss 8.0960
saving checkpoint to out-shakespeare-char
iter 28250: loss 10.0661, time 2878.32ms
iter 28260: loss 9.3812, time 126.23ms
iter 28270: loss 9.3129, time 125.87ms
iter 28280: loss 9.7221, time 125.30ms
iter 28290: loss 9.3634, time 125.30ms
iter 28300: loss 10.0109, time 126.04ms
iter 28310: loss 9.3246, time 125.97ms
iter 28320: loss 9.5478, time 128.58ms
iter 28330: loss 9.8817, time 125.36ms
iter 28340: loss 9.7800, time 125.03ms
iter 28350: loss 10.0619, time 125.34ms
iter 28360: loss 9.4415, time 125.05ms
iter 28370: loss 8.8547, time 128.83ms
iter 28380: loss 9.3345, time 124.56ms
iter 28390: loss 9.9667, time 125.72ms
iter 28400: loss 9.1947, time 124.42ms
iter 28410: loss 9.3367, time 125.24ms
iter 28420: loss 9.9020, time 125.25ms
iter 28430: loss 10.0200, time 125.38ms
iter 28440: loss 9.2960, time 127.48ms
iter 28450: loss 9.9344, time 125.26ms
iter 28460: loss 9.9020, time 125.68ms
iter 28470: loss 9.5446, time 125.81ms
iter 28480: loss 9.2170, time 125.57ms
iter 28490: loss 8.7467, time 125.10ms
step 28500: train loss 8.0437, val loss 8.0475
saving checkpoint to out-shakespeare-char
iter 28500: loss 9.7957, time 2874.11ms
iter 28510: loss 8.7229, time 121.78ms
iter 28520: loss 9.3817, time 121.71ms
iter 28530: loss 9.5045, time 121.98ms
iter 28540: loss 9.0617, time 121.75ms
iter 28550: loss 9.8093, time 121.56ms
iter 28560: loss 9.0432, time 122.03ms
iter 28570: loss 9.8971, time 121.48ms
iter 28580: loss 9.9626, time 121.54ms
iter 28590: loss 9.1208, time 121.62ms
iter 28600: loss 9.8136, time 121.67ms
iter 28610: loss 9.4145, time 121.55ms
iter 28620: loss 9.2963, time 121.59ms
iter 28630: loss 9.3443, time 120.74ms
iter 28640: loss 9.7697, time 120.77ms
iter 28650: loss 9.4855, time 121.67ms
iter 28660: loss 9.7088, time 122.21ms
iter 28670: loss 9.5131, time 122.99ms
iter 28680: loss 9.3668, time 121.88ms
iter 28690: loss 9.2857, time 121.68ms
iter 28700: loss 9.3133, time 121.78ms
iter 28710: loss 9.6493, time 121.62ms
iter 28720: loss 9.0367, time 120.37ms
iter 28730: loss 9.6969, time 120.87ms
iter 28740: loss 9.0885, time 122.91ms
step 28750: train loss 8.1083, val loss 8.0894
saving checkpoint to out-shakespeare-char
iter 28750: loss 9.4839, time 2909.64ms
iter 28760: loss 9.7222, time 122.02ms
iter 28770: loss 9.2282, time 124.91ms
iter 28780: loss 9.0158, time 122.02ms
iter 28790: loss 9.6817, time 124.62ms
iter 28800: loss 8.8962, time 122.16ms
iter 28810: loss 9.5199, time 124.66ms
iter 28820: loss 9.4359, time 121.25ms
iter 28830: loss 9.0053, time 124.73ms
iter 28840: loss 9.1231, time 122.00ms
iter 28850: loss 9.2413, time 124.42ms
iter 28860: loss 9.0417, time 121.98ms
iter 28870: loss 10.2224, time 124.93ms
iter 28880: loss 9.6101, time 121.37ms
iter 28890: loss 9.3034, time 124.14ms
iter 28900: loss 9.2517, time 121.13ms
iter 28910: loss 10.1577, time 124.79ms
iter 28920: loss 8.7396, time 122.22ms
iter 28930: loss 9.7019, time 121.99ms
iter 28940: loss 9.5445, time 121.89ms
iter 28950: loss 9.0169, time 121.95ms
iter 28960: loss 9.0965, time 121.82ms
iter 28970: loss 9.3746, time 120.97ms
iter 28980: loss 9.0343, time 121.41ms
iter 28990: loss 9.5616, time 121.53ms
step 29000: train loss 8.0743, val loss 8.1165
saving checkpoint to out-shakespeare-char
iter 29000: loss 8.9850, time 2892.93ms
iter 29010: loss 9.7148, time 121.12ms
iter 29020: loss 9.5258, time 121.96ms
iter 29030: loss 9.3202, time 121.92ms
iter 29040: loss 9.8454, time 122.09ms
iter 29050: loss 9.4547, time 122.20ms
iter 29060: loss 9.8372, time 121.98ms
iter 29070: loss 10.0144, time 121.60ms
iter 29080: loss 8.7275, time 121.92ms
iter 29090: loss 9.5739, time 121.26ms
iter 29100: loss 9.5176, time 122.96ms
iter 29110: loss 9.6165, time 121.79ms
iter 29120: loss 9.0400, time 122.22ms
iter 29130: loss 10.0958, time 121.83ms
iter 29140: loss 9.3392, time 121.97ms
iter 29150: loss 9.4124, time 121.75ms
iter 29160: loss 9.5807, time 121.83ms
iter 29170: loss 8.8915, time 121.80ms
iter 29180: loss 9.4287, time 121.94ms
iter 29190: loss 9.1352, time 121.92ms
iter 29200: loss 9.3438, time 121.94ms
iter 29210: loss 9.8353, time 121.99ms
iter 29220: loss 9.9951, time 121.97ms
iter 29230: loss 8.6865, time 121.93ms
iter 29240: loss 10.1494, time 121.56ms
step 29250: train loss 8.1234, val loss 8.0442
saving checkpoint to out-shakespeare-char
iter 29250: loss 9.2301, time 2903.94ms
iter 29260: loss 9.5109, time 120.98ms
iter 29270: loss 8.7364, time 121.73ms
iter 29280: loss 9.9952, time 122.54ms
iter 29290: loss 9.4991, time 121.27ms
iter 29300: loss 8.8025, time 121.57ms
iter 29310: loss 9.9199, time 126.21ms
iter 29320: loss 9.4851, time 122.78ms
iter 29330: loss 9.6166, time 123.86ms
iter 29340: loss 9.2361, time 122.09ms
iter 29350: loss 9.4173, time 123.32ms
iter 29360: loss 9.3859, time 121.17ms
iter 29370: loss 9.6574, time 122.41ms
iter 29380: loss 9.7161, time 121.69ms
iter 29390: loss 9.7730, time 123.71ms
iter 29400: loss 10.0135, time 122.68ms
iter 29410: loss 9.6764, time 123.18ms
iter 29420: loss 9.4723, time 121.67ms
iter 29430: loss 9.5821, time 123.29ms
iter 29440: loss 9.7883, time 121.73ms
iter 29450: loss 9.4883, time 122.50ms
iter 29460: loss 9.3569, time 121.28ms
iter 29470: loss 10.0544, time 122.89ms
iter 29480: loss 9.3633, time 122.05ms
iter 29490: loss 9.5623, time 122.96ms
step 29500: train loss 8.0233, val loss 8.0700
saving checkpoint to out-shakespeare-char
iter 29500: loss 9.3658, time 2906.56ms
iter 29510: loss 9.2707, time 128.37ms
iter 29520: loss 9.5161, time 125.18ms
iter 29530: loss 9.4852, time 125.10ms
iter 29540: loss 9.6859, time 125.80ms
iter 29550: loss 9.6165, time 126.57ms
iter 29560: loss 9.0564, time 125.23ms
iter 29570: loss 9.6891, time 125.19ms
iter 29580: loss 9.6020, time 125.62ms
iter 29590: loss 9.4683, time 125.44ms
iter 29600: loss 9.1933, time 125.17ms
iter 29610: loss 9.3810, time 125.20ms
iter 29620: loss 9.3630, time 126.50ms
iter 29630: loss 9.4848, time 125.71ms
iter 29640: loss 9.2527, time 126.15ms
iter 29650: loss 9.4404, time 126.05ms
iter 29660: loss 9.8905, time 126.29ms
iter 29670: loss 9.3816, time 125.87ms
iter 29680: loss 9.1163, time 124.26ms
iter 29690: loss 9.9530, time 125.57ms
iter 29700: loss 9.5041, time 128.98ms
iter 29710: loss 9.6235, time 125.57ms
iter 29720: loss 8.9741, time 125.36ms
iter 29730: loss 9.3148, time 125.15ms
iter 29740: loss 9.5040, time 125.97ms
step 29750: train loss 8.0033, val loss 8.0174
saving checkpoint to out-shakespeare-char
iter 29750: loss 9.4190, time 2867.76ms
iter 29760: loss 9.8306, time 125.29ms
iter 29770: loss 9.7649, time 125.46ms
iter 29780: loss 9.8439, time 125.31ms
iter 29790: loss 10.0322, time 124.95ms
iter 29800: loss 9.2094, time 128.35ms
iter 29810: loss 9.6295, time 125.74ms
iter 29820: loss 9.5706, time 127.14ms
iter 29830: loss 9.3499, time 125.87ms
iter 29840: loss 9.1042, time 125.36ms
iter 29850: loss 9.6657, time 126.07ms
iter 29860: loss 8.7356, time 125.73ms
iter 29870: loss 9.4462, time 124.99ms
iter 29880: loss 9.9091, time 124.67ms
iter 29890: loss 8.9774, time 125.39ms
iter 29900: loss 10.1687, time 125.35ms
iter 29910: loss 9.2265, time 125.10ms
iter 29920: loss 10.3391, time 125.34ms
iter 29930: loss 9.3909, time 125.45ms
iter 29940: loss 8.6653, time 125.23ms
iter 29950: loss 9.1658, time 128.31ms
iter 29960: loss 9.2194, time 125.67ms
iter 29970: loss 10.2252, time 126.09ms
iter 29980: loss 9.7391, time 125.44ms
iter 29990: loss 9.3731, time 124.97ms
step 30000: train loss 8.0514, val loss 8.0404
saving checkpoint to out-shakespeare-char
iter 30000: loss 8.8440, time 2867.51ms
iter 30010: loss 9.3173, time 126.10ms
iter 30020: loss 8.8801, time 125.82ms
iter 30030: loss 9.8341, time 125.87ms
iter 30040: loss 9.1793, time 126.02ms
iter 30050: loss 9.2910, time 126.20ms
iter 30060: loss 9.9981, time 126.09ms
iter 30070: loss 9.2800, time 125.90ms
iter 30080: loss 9.4598, time 126.02ms
iter 30090: loss 9.0536, time 129.37ms
iter 30100: loss 8.8126, time 125.77ms
iter 30110: loss 9.1302, time 125.91ms
iter 30120: loss 9.5439, time 126.67ms
iter 30130: loss 9.5639, time 125.77ms
iter 30140: loss 9.6435, time 126.17ms
iter 30150: loss 9.1297, time 125.76ms
iter 30160: loss 9.4998, time 125.83ms
iter 30170: loss 9.4619, time 125.83ms
iter 30180: loss 9.0400, time 125.85ms
iter 30190: loss 8.8257, time 125.84ms
iter 30200: loss 9.2126, time 128.74ms
iter 30210: loss 9.2066, time 125.61ms
iter 30220: loss 9.2560, time 125.90ms
iter 30230: loss 9.8890, time 125.44ms
iter 30240: loss 9.7702, time 124.33ms
step 30250: train loss 8.0289, val loss 8.0555
saving checkpoint to out-shakespeare-char
iter 30250: loss 9.7991, time 2897.72ms
iter 30260: loss 9.1930, time 125.33ms
iter 30270: loss 9.2271, time 125.32ms
iter 30280: loss 9.8793, time 125.50ms
iter 30290: loss 9.3697, time 128.71ms
iter 30300: loss 9.1287, time 125.26ms
iter 30310: loss 9.4103, time 124.74ms
iter 30320: loss 9.6796, time 125.59ms
iter 30330: loss 9.4587, time 125.53ms
iter 30340: loss 9.3543, time 125.29ms
iter 30350: loss 9.1437, time 125.28ms
iter 30360: loss 8.9321, time 124.42ms
iter 30370: loss 9.2635, time 125.75ms
iter 30380: loss 9.7047, time 125.21ms
iter 30390: loss 8.8180, time 125.47ms
iter 30400: loss 9.3487, time 128.17ms
iter 30410: loss 9.2397, time 125.79ms
iter 30420: loss 9.4868, time 125.53ms
iter 30430: loss 9.2664, time 125.54ms
iter 30440: loss 8.5737, time 124.71ms
iter 30450: loss 9.6354, time 125.54ms
iter 30460: loss 9.2584, time 125.64ms
iter 30470: loss 9.6828, time 125.67ms
iter 30480: loss 9.8385, time 125.21ms
iter 30490: loss 9.0320, time 125.41ms
step 30500: train loss 8.0151, val loss 8.0004
saving checkpoint to out-shakespeare-char
iter 30500: loss 9.3533, time 2905.66ms
iter 30510: loss 9.8097, time 125.61ms
iter 30520: loss 9.2989, time 125.61ms
iter 30530: loss 9.3812, time 128.43ms
iter 30540: loss 9.2162, time 125.44ms
iter 30550: loss 9.5934, time 124.84ms
iter 30560: loss 9.9602, time 125.95ms
iter 30570: loss 9.5037, time 125.30ms
iter 30580: loss 9.4686, time 126.44ms
iter 30590: loss 9.8946, time 125.51ms
iter 30600: loss 9.8868, time 125.34ms
iter 30610: loss 9.0815, time 125.67ms
iter 30620: loss 9.2461, time 125.33ms
iter 30630: loss 9.9798, time 125.42ms
iter 30640: loss 8.9625, time 128.23ms
iter 30650: loss 9.8112, time 125.32ms
iter 30660: loss 9.4801, time 125.27ms
iter 30670: loss 9.9290, time 125.51ms
iter 30680: loss 9.6768, time 125.37ms
iter 30690: loss 9.6316, time 125.36ms
iter 30700: loss 9.6188, time 125.36ms
iter 30710: loss 9.9627, time 125.22ms
iter 30720: loss 9.0970, time 125.43ms
iter 30730: loss 9.3759, time 125.46ms
iter 30740: loss 9.4160, time 125.67ms
step 30750: train loss 7.9929, val loss 7.9697
saving checkpoint to out-shakespeare-char
iter 30750: loss 9.2296, time 2911.07ms
iter 30760: loss 8.8020, time 126.11ms
iter 30770: loss 9.3417, time 128.80ms
iter 30780: loss 9.6359, time 125.91ms
iter 30790: loss 9.1263, time 125.95ms
iter 30800: loss 9.1627, time 126.03ms
iter 30810: loss 8.9576, time 125.58ms
iter 30820: loss 9.8898, time 125.75ms
iter 30830: loss 9.7842, time 125.83ms
iter 30840: loss 9.4745, time 125.96ms
iter 30850: loss 9.3606, time 125.91ms
iter 30860: loss 9.4775, time 125.94ms
iter 30870: loss 9.5374, time 125.99ms
iter 30880: loss 9.3015, time 127.51ms
iter 30890: loss 9.3868, time 125.85ms
iter 30900: loss 9.8253, time 125.82ms
iter 30910: loss 8.7964, time 126.06ms
iter 30920: loss 9.1181, time 125.64ms
iter 30930: loss 9.6122, time 126.07ms
iter 30940: loss 9.3532, time 125.05ms
iter 30950: loss 9.2782, time 125.91ms
iter 30960: loss 9.2587, time 125.23ms
iter 30970: loss 9.0616, time 126.12ms
iter 30980: loss 9.9193, time 125.17ms
iter 30990: loss 8.8535, time 129.04ms
step 31000: train loss 8.0102, val loss 8.0091
saving checkpoint to out-shakespeare-char
iter 31000: loss 8.7383, time 2896.12ms
iter 31010: loss 9.3084, time 128.62ms
iter 31020: loss 9.5137, time 125.82ms
iter 31030: loss 9.2036, time 125.08ms
iter 31040: loss 9.7242, time 126.26ms
iter 31050: loss 8.8570, time 128.16ms
iter 31060: loss 9.5928, time 125.99ms
iter 31070: loss 8.8115, time 125.09ms
iter 31080: loss 8.8792, time 126.05ms
iter 31090: loss 9.2630, time 125.50ms
iter 31100: loss 10.0448, time 126.02ms
iter 31110: loss 9.4193, time 125.69ms
iter 31120: loss 10.3334, time 125.00ms
iter 31130: loss 9.7438, time 125.88ms
iter 31140: loss 9.7817, time 125.74ms
iter 31150: loss 9.4276, time 126.16ms
iter 31160: loss 9.4997, time 128.96ms
iter 31170: loss 9.1665, time 125.91ms
iter 31180: loss 9.2971, time 125.92ms
iter 31190: loss 9.2019, time 126.17ms
iter 31200: loss 9.6129, time 128.98ms
iter 31210: loss 9.3989, time 125.89ms
iter 31220: loss 10.0039, time 125.82ms
iter 31230: loss 9.1657, time 125.99ms
iter 31240: loss 9.9038, time 126.06ms
step 31250: train loss 7.9870, val loss 7.9667
saving checkpoint to out-shakespeare-char
iter 31250: loss 9.4174, time 2902.30ms
iter 31260: loss 9.6466, time 126.01ms
iter 31270: loss 9.7056, time 126.67ms
iter 31280: loss 9.5996, time 126.12ms
iter 31290: loss 9.3042, time 126.02ms
iter 31300: loss 8.5126, time 125.72ms
iter 31310: loss 8.8088, time 125.92ms
iter 31320: loss 9.1287, time 126.00ms
iter 31330: loss 9.8861, time 125.46ms
iter 31340: loss 9.9926, time 126.11ms
iter 31350: loss 9.6041, time 129.23ms
iter 31360: loss 9.1232, time 125.92ms
iter 31370: loss 9.9476, time 125.99ms
iter 31380: loss 9.4739, time 125.95ms
iter 31390: loss 8.2623, time 127.80ms
iter 31400: loss 9.0703, time 126.01ms
iter 31410: loss 9.6622, time 126.62ms
iter 31420: loss 9.7602, time 126.25ms
iter 31430: loss 10.0066, time 125.99ms
iter 31440: loss 9.9846, time 125.84ms
iter 31450: loss 9.2950, time 125.37ms
iter 31460: loss 9.6274, time 126.04ms
iter 31470: loss 8.6373, time 124.87ms
iter 31480: loss 9.3056, time 125.96ms
iter 31490: loss 9.1961, time 126.28ms
step 31500: train loss 7.9645, val loss 7.9968
saving checkpoint to out-shakespeare-char
iter 31500: loss 9.0889, time 2878.50ms
iter 31510: loss 9.6873, time 125.95ms
iter 31520: loss 9.2681, time 125.85ms
iter 31530: loss 10.4218, time 125.89ms
iter 31540: loss 9.0555, time 125.44ms
iter 31550: loss 9.3700, time 125.88ms
iter 31560: loss 8.3822, time 128.51ms
iter 31570: loss 9.5669, time 125.19ms
iter 31580: loss 9.7911, time 125.59ms
iter 31590: loss 9.4996, time 125.83ms
iter 31600: loss 9.1789, time 125.57ms
iter 31610: loss 8.9136, time 124.85ms
iter 31620: loss 10.1349, time 125.28ms
iter 31630: loss 8.8144, time 124.75ms
iter 31640: loss 9.4274, time 125.35ms
iter 31650: loss 8.5409, time 125.26ms
iter 31660: loss 9.6025, time 125.51ms
iter 31670: loss 9.0668, time 128.37ms
iter 31680: loss 9.7856, time 125.43ms
iter 31690: loss 9.1413, time 125.39ms
iter 31700: loss 9.8324, time 125.99ms
iter 31710: loss 9.3998, time 125.51ms
iter 31720: loss 9.1793, time 125.67ms
iter 31730: loss 9.0396, time 125.47ms
iter 31740: loss 9.6226, time 125.58ms
step 31750: train loss 7.9934, val loss 7.9883
saving checkpoint to out-shakespeare-char
iter 31750: loss 8.6911, time 2873.24ms
iter 31760: loss 9.0775, time 126.10ms
iter 31770: loss 9.5889, time 125.95ms
iter 31780: loss 8.7772, time 126.04ms
iter 31790: loss 8.9113, time 126.78ms
iter 31800: loss 9.7257, time 126.23ms
iter 31810: loss 9.5197, time 128.56ms
iter 31820: loss 8.8106, time 125.89ms
iter 31830: loss 9.1967, time 125.76ms
iter 31840: loss 9.3281, time 125.71ms
iter 31850: loss 9.4011, time 128.59ms
iter 31860: loss 9.4868, time 125.71ms
iter 31870: loss 9.3655, time 125.77ms
iter 31880: loss 9.1881, time 125.87ms
iter 31890: loss 9.3598, time 127.66ms
iter 31900: loss 9.0910, time 125.51ms
iter 31910: loss 9.0672, time 125.47ms
iter 31920: loss 9.0526, time 125.69ms
iter 31930: loss 9.0466, time 128.65ms
iter 31940: loss 8.4392, time 125.43ms
iter 31950: loss 9.3227, time 125.60ms
iter 31960: loss 9.6686, time 126.78ms
iter 31970: loss 9.5688, time 125.66ms
iter 31980: loss 8.8940, time 125.62ms
iter 31990: loss 9.3356, time 125.55ms
step 32000: train loss 7.9963, val loss 7.9277
saving checkpoint to out-shakespeare-char
iter 32000: loss 8.3809, time 2893.30ms
iter 32010: loss 9.6488, time 125.09ms
iter 32020: loss 9.8823, time 125.68ms
iter 32030: loss 9.6388, time 125.60ms
iter 32040: loss 9.7342, time 125.69ms
iter 32050: loss 9.8910, time 125.19ms
iter 32060: loss 9.3317, time 125.47ms
iter 32070: loss 9.5601, time 125.56ms
iter 32080: loss 9.7012, time 127.07ms
iter 32090: loss 9.5359, time 125.69ms
iter 32100: loss 10.3587, time 125.58ms
iter 32110: loss 9.2308, time 125.69ms
iter 32120: loss 9.4197, time 125.62ms
iter 32130: loss 10.0076, time 125.44ms
iter 32140: loss 9.2594, time 125.50ms
iter 32150: loss 9.6481, time 125.72ms
iter 32160: loss 9.5106, time 124.64ms
iter 32170: loss 10.0701, time 124.62ms
iter 32180: loss 9.8203, time 125.59ms
iter 32190: loss 9.5484, time 125.43ms
iter 32200: loss 9.7384, time 125.92ms
iter 32210: loss 9.7993, time 125.15ms
iter 32220: loss 9.4125, time 127.63ms
iter 32230: loss 9.8817, time 126.72ms
iter 32240: loss 9.4508, time 125.94ms
step 32250: train loss 7.9539, val loss 7.9449
saving checkpoint to out-shakespeare-char
iter 32250: loss 8.8043, time 2880.12ms
iter 32260: loss 9.2827, time 126.05ms
iter 32270: loss 9.7992, time 126.08ms
iter 32280: loss 8.9245, time 125.27ms
iter 32290: loss 9.3305, time 124.11ms
iter 32300: loss 8.9799, time 125.16ms
iter 32310: loss 9.2435, time 125.62ms
iter 32320: loss 9.2486, time 125.17ms
iter 32330: loss 9.9819, time 125.34ms
iter 32340: loss 8.5910, time 125.37ms
iter 32350: loss 9.3193, time 125.01ms
iter 32360: loss 9.1673, time 128.37ms
iter 32370: loss 8.8363, time 125.36ms
iter 32380: loss 9.4185, time 125.33ms
iter 32390: loss 9.5805, time 125.43ms
iter 32400: loss 9.1386, time 125.21ms
iter 32410: loss 8.4516, time 124.68ms
iter 32420: loss 9.2773, time 125.22ms
iter 32430: loss 9.6127, time 125.18ms
iter 32440: loss 9.1571, time 125.27ms
iter 32450: loss 9.7724, time 125.35ms
iter 32460: loss 9.1076, time 125.36ms
iter 32470: loss 8.9925, time 128.23ms
iter 32480: loss 9.2211, time 125.34ms
iter 32490: loss 9.3561, time 125.01ms
step 32500: train loss 8.0118, val loss 8.0220
saving checkpoint to out-shakespeare-char
iter 32500: loss 9.3957, time 2883.39ms
iter 32510: loss 9.5252, time 124.55ms
iter 32520: loss 9.1065, time 125.83ms
iter 32530: loss 9.4498, time 127.50ms
iter 32540: loss 9.5395, time 125.57ms
iter 32550: loss 9.8751, time 121.88ms
iter 32560: loss 8.7939, time 121.60ms
iter 32570: loss 9.8857, time 121.84ms
iter 32580: loss 9.4393, time 121.95ms
iter 32590: loss 8.7606, time 121.51ms
iter 32600: loss 9.2975, time 121.67ms
iter 32610: loss 9.3598, time 121.32ms
iter 32620: loss 9.4802, time 121.70ms
iter 32630: loss 9.5569, time 121.54ms
iter 32640: loss 9.0893, time 122.58ms
iter 32650: loss 8.0621, time 121.79ms
iter 32660: loss 9.9252, time 121.34ms
iter 32670: loss 9.9806, time 121.77ms
iter 32680: loss 9.0561, time 121.27ms
iter 32690: loss 8.8356, time 120.81ms
iter 32700: loss 9.5201, time 121.84ms
iter 32710: loss 9.2093, time 121.69ms
iter 32720: loss 9.5624, time 121.80ms
iter 32730: loss 9.4487, time 121.83ms
iter 32740: loss 9.9615, time 121.79ms
step 32750: train loss 7.9329, val loss 7.9462
saving checkpoint to out-shakespeare-char
iter 32750: loss 8.6811, time 2905.56ms
iter 32760: loss 9.2284, time 121.92ms
iter 32770: loss 9.8755, time 123.21ms
iter 32780: loss 9.3464, time 122.81ms
iter 32790: loss 9.2617, time 122.73ms
iter 32800: loss 9.5104, time 121.59ms
iter 32810: loss 9.3491, time 122.69ms
iter 32820: loss 9.3534, time 121.60ms
iter 32830: loss 9.6534, time 122.56ms
iter 32840: loss 10.4048, time 121.53ms
iter 32850: loss 9.6639, time 122.56ms
iter 32860: loss 9.3306, time 121.55ms
iter 32870: loss 9.9658, time 121.44ms
iter 32880: loss 9.7188, time 121.55ms
iter 32890: loss 9.6966, time 121.57ms
iter 32900: loss 9.9712, time 121.48ms
iter 32910: loss 9.6684, time 121.51ms
iter 32920: loss 9.5065, time 120.41ms
iter 32930: loss 9.6396, time 121.53ms
iter 32940: loss 9.6162, time 120.89ms
iter 32950: loss 9.3569, time 121.54ms
iter 32960: loss 9.4606, time 121.52ms
iter 32970: loss 8.9868, time 121.70ms
iter 32980: loss 9.2753, time 121.36ms
iter 32990: loss 8.7919, time 121.65ms
step 33000: train loss 7.9350, val loss 7.9371
saving checkpoint to out-shakespeare-char
iter 33000: loss 8.8808, time 2893.00ms
iter 33010: loss 9.5230, time 121.42ms
iter 33020: loss 9.2811, time 120.52ms
iter 33030: loss 9.4643, time 121.38ms
iter 33040: loss 10.0242, time 121.28ms
iter 33050: loss 9.5560, time 121.43ms
iter 33060: loss 9.2792, time 121.47ms
iter 33070: loss 9.5846, time 121.70ms
iter 33080: loss 9.4483, time 121.20ms
iter 33090: loss 9.6321, time 121.38ms
iter 33100: loss 8.6455, time 121.55ms
iter 33110: loss 9.4409, time 121.64ms
iter 33120: loss 9.1342, time 121.83ms
iter 33130: loss 8.6572, time 121.51ms
iter 33140: loss 9.8758, time 120.50ms
iter 33150: loss 9.3046, time 121.35ms
iter 33160: loss 9.2355, time 121.25ms
iter 33170: loss 9.2761, time 121.45ms
iter 33180: loss 9.7403, time 121.49ms
iter 33190: loss 9.2914, time 121.44ms
iter 33200: loss 9.4885, time 121.32ms
iter 33210: loss 8.9064, time 121.33ms
iter 33220: loss 9.5620, time 121.31ms
iter 33230: loss 9.3362, time 121.31ms
iter 33240: loss 9.3533, time 121.38ms
step 33250: train loss 7.9799, val loss 7.9635
saving checkpoint to out-shakespeare-char
iter 33250: loss 9.2789, time 2895.61ms
iter 33260: loss 9.2589, time 122.60ms
iter 33270: loss 9.6027, time 121.44ms
iter 33280: loss 9.4524, time 122.31ms
iter 33290: loss 9.5061, time 121.50ms
iter 33300: loss 9.5489, time 122.70ms
iter 33310: loss 9.4623, time 121.65ms
iter 33320: loss 9.5373, time 122.99ms
iter 33330: loss 8.9263, time 121.60ms
iter 33340: loss 9.1481, time 122.54ms
iter 33350: loss 9.5972, time 121.36ms
iter 33360: loss 9.3109, time 121.82ms
iter 33370: loss 9.4909, time 121.47ms
iter 33380: loss 9.4941, time 123.64ms
iter 33390: loss 9.2617, time 121.41ms
iter 33400: loss 9.1801, time 123.44ms
iter 33410: loss 8.9129, time 121.23ms
iter 33420: loss 9.6328, time 122.72ms
iter 33430: loss 9.3270, time 121.56ms
iter 33440: loss 10.6866, time 122.58ms
iter 33450: loss 9.6522, time 121.36ms
iter 33460: loss 9.4650, time 122.71ms
iter 33470: loss 9.6610, time 121.20ms
iter 33480: loss 9.6409, time 122.18ms
iter 33490: loss 9.5170, time 121.45ms
step 33500: train loss 7.9207, val loss 7.9273
saving checkpoint to out-shakespeare-char
iter 33500: loss 8.9300, time 2890.67ms
iter 33510: loss 9.3631, time 121.55ms
iter 33520: loss 9.1105, time 121.18ms
iter 33530: loss 9.0362, time 121.64ms
iter 33540: loss 9.4278, time 121.07ms
iter 33550: loss 9.1512, time 122.20ms
iter 33560: loss 9.5539, time 121.45ms
iter 33570: loss 9.3534, time 121.44ms
iter 33580: loss 9.0138, time 121.57ms
iter 33590: loss 9.2269, time 121.66ms
iter 33600: loss 9.1991, time 121.51ms
iter 33610: loss 9.2550, time 121.96ms
iter 33620: loss 9.2283, time 120.77ms
iter 33630: loss 9.5891, time 121.91ms
iter 33640: loss 9.5099, time 121.33ms
iter 33650: loss 9.0204, time 121.58ms
iter 33660: loss 9.3126, time 121.52ms
iter 33670: loss 9.7094, time 121.37ms
iter 33680: loss 9.6603, time 121.38ms
iter 33690: loss 8.8184, time 121.39ms
iter 33700: loss 9.3352, time 121.78ms
iter 33710: loss 9.0668, time 121.46ms
iter 33720: loss 9.0331, time 121.40ms
iter 33730: loss 9.3064, time 122.72ms
iter 33740: loss 9.5113, time 121.59ms
step 33750: train loss 7.9604, val loss 7.9376
saving checkpoint to out-shakespeare-char
iter 33750: loss 9.8968, time 2899.51ms
iter 33760: loss 9.5963, time 121.74ms
iter 33770: loss 9.0092, time 121.78ms
iter 33780: loss 9.2654, time 121.88ms
iter 33790: loss 10.0715, time 122.00ms
iter 33800: loss 9.8381, time 121.86ms
iter 33810: loss 9.6472, time 121.75ms
iter 33820: loss 9.8030, time 121.78ms
iter 33830: loss 9.8086, time 122.00ms
iter 33840: loss 9.8067, time 122.02ms
iter 33850: loss 10.0496, time 122.00ms
iter 33860: loss 9.5591, time 122.57ms
iter 33870: loss 9.4085, time 122.08ms
iter 33880: loss 9.2393, time 121.93ms
iter 33890: loss 8.9843, time 121.86ms
iter 33900: loss 8.9949, time 121.65ms
iter 33910: loss 9.1657, time 121.66ms
iter 33920: loss 9.5216, time 121.34ms
iter 33930: loss 9.2732, time 121.88ms
iter 33940: loss 9.1923, time 122.03ms
iter 33950: loss 8.6896, time 121.87ms
iter 33960: loss 8.9610, time 121.71ms
iter 33970: loss 9.1428, time 121.91ms
iter 33980: loss 10.2177, time 121.82ms
iter 33990: loss 9.3736, time 122.75ms
step 34000: train loss 7.9298, val loss 7.8948
saving checkpoint to out-shakespeare-char
iter 34000: loss 8.9230, time 2893.92ms
iter 34010: loss 8.9699, time 121.55ms
iter 34020: loss 9.0954, time 126.45ms
iter 34030: loss 9.6837, time 121.94ms
iter 34040: loss 10.2464, time 125.10ms
iter 34050: loss 9.3318, time 121.18ms
iter 34060: loss 9.0778, time 124.49ms
iter 34070: loss 9.0604, time 122.14ms
iter 34080: loss 9.0944, time 125.80ms
iter 34090: loss 9.4965, time 125.42ms
iter 34100: loss 9.5596, time 125.45ms
iter 34110: loss 9.9309, time 125.31ms
iter 34120: loss 9.2493, time 125.40ms
iter 34130: loss 8.7030, time 125.52ms
iter 34140: loss 9.3394, time 124.21ms
iter 34150: loss 9.6430, time 125.67ms
iter 34160: loss 9.4164, time 125.60ms
iter 34170: loss 9.5295, time 125.39ms
iter 34180: loss 9.3558, time 125.52ms
iter 34190: loss 8.9114, time 125.13ms
iter 34200: loss 9.5028, time 125.25ms
iter 34210: loss 9.3848, time 125.20ms
iter 34220: loss 9.9050, time 125.29ms
iter 34230: loss 9.7586, time 128.06ms
iter 34240: loss 9.5820, time 125.19ms
step 34250: train loss 7.9105, val loss 7.9560
saving checkpoint to out-shakespeare-char
iter 34250: loss 9.1027, time 2899.53ms
iter 34260: loss 9.3172, time 125.96ms
iter 34270: loss 9.2229, time 125.64ms
iter 34280: loss 9.3912, time 125.67ms
iter 34290: loss 9.5989, time 124.63ms
iter 34300: loss 9.4488, time 125.82ms
iter 34310: loss 10.2327, time 125.66ms
iter 34320: loss 9.2651, time 125.60ms
iter 34330: loss 9.5543, time 127.74ms
iter 34340: loss 9.2840, time 125.47ms
iter 34350: loss 8.3699, time 125.55ms
iter 34360: loss 9.7431, time 124.81ms
iter 34370: loss 8.4810, time 125.53ms
iter 34380: loss 9.4250, time 125.37ms
iter 34390: loss 9.9791, time 125.95ms
iter 34400: loss 9.6593, time 124.18ms
iter 34410: loss 9.1117, time 126.81ms
iter 34420: loss 9.7193, time 125.79ms
iter 34430: loss 8.7674, time 125.51ms
iter 34440: loss 9.3861, time 128.12ms
iter 34450: loss 9.5173, time 124.96ms
iter 34460: loss 9.1558, time 125.14ms
iter 34470: loss 9.4353, time 125.66ms
iter 34480: loss 8.8539, time 126.83ms
iter 34490: loss 9.0075, time 125.56ms
step 34500: train loss 7.8850, val loss 7.9393
saving checkpoint to out-shakespeare-char
iter 34500: loss 9.3313, time 2888.81ms
iter 34510: loss 9.2061, time 125.98ms
iter 34520: loss 9.6028, time 126.12ms
iter 34530: loss 9.1725, time 125.93ms
iter 34540: loss 9.9588, time 125.97ms
iter 34550: loss 9.1199, time 126.06ms
iter 34560: loss 9.4256, time 125.03ms
iter 34570: loss 9.6975, time 124.85ms
iter 34580: loss 9.2654, time 124.85ms
iter 34590: loss 8.6293, time 124.93ms
iter 34600: loss 10.1426, time 125.56ms
iter 34610: loss 9.6188, time 128.05ms
iter 34620: loss 9.6801, time 124.97ms
iter 34630: loss 9.7754, time 124.98ms
iter 34640: loss 9.3816, time 125.33ms
iter 34650: loss 9.1732, time 125.31ms
iter 34660: loss 9.0853, time 124.91ms
iter 34670: loss 9.4679, time 125.12ms
iter 34680: loss 9.6559, time 125.16ms
iter 34690: loss 9.2153, time 125.62ms
iter 34700: loss 8.5366, time 125.28ms
iter 34710: loss 9.6351, time 125.67ms
iter 34720: loss 9.6007, time 127.15ms
iter 34730: loss 9.1527, time 125.37ms
iter 34740: loss 10.0952, time 124.21ms
step 34750: train loss 7.9033, val loss 7.9095
saving checkpoint to out-shakespeare-char
iter 34750: loss 9.6272, time 2879.67ms
iter 34760: loss 9.3830, time 122.06ms
iter 34770: loss 9.2014, time 121.93ms
iter 34780: loss 9.4141, time 122.29ms
iter 34790: loss 9.3043, time 121.95ms
iter 34800: loss 9.1748, time 122.23ms
iter 34810: loss 9.4563, time 122.21ms
iter 34820: loss 9.1727, time 122.20ms
iter 34830: loss 9.6321, time 121.86ms
iter 34840: loss 9.0162, time 121.44ms
iter 34850: loss 10.4993, time 121.97ms
iter 34860: loss 9.2232, time 121.86ms
iter 34870: loss 8.9807, time 122.63ms
iter 34880: loss 8.8321, time 121.96ms
iter 34890: loss 10.3476, time 122.29ms
iter 34900: loss 9.4044, time 122.05ms
iter 34910: loss 9.3998, time 122.26ms
iter 34920: loss 9.4118, time 121.59ms
iter 34930: loss 9.5282, time 122.11ms
iter 34940: loss 10.1698, time 121.96ms
iter 34950: loss 9.6615, time 121.99ms
iter 34960: loss 9.0021, time 122.40ms
iter 34970: loss 9.0586, time 121.23ms
iter 34980: loss 9.1334, time 121.98ms
iter 34990: loss 9.4243, time 121.94ms
step 35000: train loss 7.9179, val loss 7.8855
saving checkpoint to out-shakespeare-char
iter 35000: loss 9.0784, time 2902.26ms
iter 35010: loss 9.4080, time 125.79ms
iter 35020: loss 9.2549, time 125.50ms
iter 35030: loss 9.8769, time 126.47ms
iter 35040: loss 9.7649, time 126.03ms
iter 35050: loss 9.2748, time 125.63ms
iter 35060: loss 9.5919, time 125.68ms
iter 35070: loss 10.1219, time 125.52ms
iter 35080: loss 9.7930, time 125.44ms
iter 35090: loss 8.5848, time 125.79ms
iter 35100: loss 9.3808, time 125.04ms
iter 35110: loss 9.1361, time 128.53ms
iter 35120: loss 9.4343, time 125.68ms
iter 35130: loss 9.2072, time 125.57ms
iter 35140: loss 9.5068, time 125.63ms
iter 35150: loss 9.3329, time 125.58ms
iter 35160: loss 8.8568, time 125.05ms
iter 35170: loss 10.4231, time 125.55ms
iter 35180: loss 8.9369, time 126.15ms
iter 35190: loss 10.0754, time 125.57ms
iter 35200: loss 9.4206, time 126.54ms
iter 35210: loss 9.0474, time 125.66ms
iter 35220: loss 9.1285, time 123.20ms
iter 35230: loss 9.0360, time 121.47ms
iter 35240: loss 8.8485, time 122.62ms
step 35250: train loss 7.8797, val loss 7.9040
saving checkpoint to out-shakespeare-char
iter 35250: loss 9.9070, time 2877.73ms
iter 35260: loss 9.2893, time 126.23ms
iter 35270: loss 9.4154, time 125.92ms
iter 35280: loss 8.8689, time 126.14ms
iter 35290: loss 9.6292, time 126.07ms
iter 35300: loss 9.2322, time 126.08ms
iter 35310: loss 9.2003, time 125.99ms
iter 35320: loss 9.7101, time 127.89ms
iter 35330: loss 10.0419, time 125.19ms
iter 35340: loss 9.2890, time 127.66ms
iter 35350: loss 9.1407, time 124.86ms
iter 35360: loss 9.7991, time 125.12ms
iter 35370: loss 9.0341, time 124.88ms
iter 35380: loss 9.3168, time 124.64ms
iter 35390: loss 10.0616, time 124.95ms
iter 35400: loss 9.6096, time 125.59ms
iter 35410: loss 9.5357, time 125.73ms
iter 35420: loss 9.5512, time 124.46ms
iter 35430: loss 9.1213, time 125.31ms
iter 35440: loss 8.4743, time 125.57ms
iter 35450: loss 9.8179, time 127.89ms
iter 35460: loss 9.3127, time 125.50ms
iter 35470: loss 9.3386, time 125.65ms
iter 35480: loss 9.1131, time 125.33ms
iter 35490: loss 8.9313, time 124.39ms
step 35500: train loss 7.8654, val loss 7.8743
saving checkpoint to out-shakespeare-char
iter 35500: loss 9.0353, time 2876.17ms
iter 35510: loss 9.3113, time 121.06ms
iter 35520: loss 9.4150, time 121.62ms
iter 35530: loss 9.4701, time 121.65ms
iter 35540: loss 9.7065, time 121.55ms
iter 35550: loss 8.9905, time 122.16ms
iter 35560: loss 9.7186, time 121.52ms
iter 35570: loss 9.3021, time 121.58ms
iter 35580: loss 9.5918, time 121.89ms
iter 35590: loss 8.9962, time 121.48ms
iter 35600: loss 9.8445, time 122.43ms
iter 35610: loss 9.0551, time 122.70ms
iter 35620: loss 8.8308, time 121.42ms
iter 35630: loss 10.3022, time 121.65ms
iter 35640: loss 9.2589, time 121.42ms
iter 35650: loss 9.4489, time 122.48ms
iter 35660: loss 9.4044, time 121.98ms
iter 35670: loss 9.5648, time 121.45ms
iter 35680: loss 9.3355, time 121.52ms
iter 35690: loss 9.3099, time 121.57ms
iter 35700: loss 9.6373, time 121.86ms
iter 35710: loss 8.9341, time 121.57ms
iter 35720: loss 9.3854, time 121.53ms
iter 35730: loss 8.8784, time 121.87ms
iter 35740: loss 9.1938, time 121.71ms
step 35750: train loss 7.9089, val loss 7.8564
saving checkpoint to out-shakespeare-char
iter 35750: loss 9.6202, time 2885.35ms
iter 35760: loss 9.1411, time 125.58ms
iter 35770: loss 9.9064, time 124.92ms
iter 35780: loss 9.7269, time 125.12ms
iter 35790: loss 9.2668, time 125.33ms
iter 35800: loss 9.2256, time 128.41ms
iter 35810: loss 9.2907, time 124.10ms
iter 35820: loss 10.3117, time 124.91ms
iter 35830: loss 8.7220, time 126.28ms
iter 35840: loss 9.2722, time 125.24ms
iter 35850: loss 9.2811, time 125.67ms
iter 35860: loss 9.5298, time 125.89ms
iter 35870: loss 9.7876, time 125.87ms
iter 35880: loss 9.4298, time 125.80ms
iter 35890: loss 8.9969, time 125.13ms
iter 35900: loss 8.7718, time 125.80ms
iter 35910: loss 9.7024, time 128.81ms
iter 35920: loss 9.4609, time 125.15ms
iter 35930: loss 9.6800, time 125.52ms
iter 35940: loss 9.6052, time 125.29ms
iter 35950: loss 9.3725, time 125.55ms
iter 35960: loss 9.6428, time 125.25ms
iter 35970: loss 8.9046, time 125.90ms
iter 35980: loss 9.2651, time 125.89ms
iter 35990: loss 9.2317, time 125.76ms
step 36000: train loss 7.8850, val loss 7.8673
saving checkpoint to out-shakespeare-char
iter 36000: loss 9.1836, time 2885.42ms
iter 36010: loss 9.2811, time 124.65ms
iter 36020: loss 10.0526, time 125.43ms
iter 36030: loss 9.3605, time 124.78ms
iter 36040: loss 9.3982, time 128.73ms
iter 36050: loss 9.3196, time 125.52ms
iter 36060: loss 9.8498, time 125.98ms
iter 36070: loss 9.2388, time 125.11ms
iter 36080: loss 9.5184, time 128.56ms
iter 36090: loss 9.2077, time 125.12ms
iter 36100: loss 9.5492, time 125.39ms
iter 36110: loss 9.3701, time 125.62ms
iter 36120: loss 9.7630, time 125.54ms
iter 36130: loss 9.4716, time 126.67ms
iter 36140: loss 9.2905, time 125.70ms
iter 36150: loss 9.7050, time 125.91ms
iter 36160: loss 9.3331, time 126.57ms
iter 36170: loss 8.8404, time 125.70ms
iter 36180: loss 9.4197, time 125.55ms
iter 36190: loss 9.4950, time 128.36ms
iter 36200: loss 9.5558, time 125.68ms
iter 36210: loss 9.1812, time 125.78ms
iter 36220: loss 9.4507, time 125.62ms
iter 36230: loss 9.5126, time 125.62ms
iter 36240: loss 9.4013, time 125.87ms
step 36250: train loss 7.8788, val loss 7.8429
saving checkpoint to out-shakespeare-char
iter 36250: loss 10.0425, time 2884.33ms
iter 36260: loss 9.8634, time 128.86ms
iter 36270: loss 9.5669, time 124.82ms
iter 36280: loss 9.0976, time 126.08ms
iter 36290: loss 9.4103, time 126.58ms
iter 36300: loss 8.8317, time 125.00ms
iter 36310: loss 8.6503, time 125.04ms
iter 36320: loss 8.9239, time 125.55ms
iter 36330: loss 9.1639, time 125.15ms
iter 36340: loss 8.4602, time 124.75ms
iter 36350: loss 9.2455, time 125.14ms
iter 36360: loss 9.1609, time 126.52ms
iter 36370: loss 8.9762, time 128.04ms
iter 36380: loss 9.1484, time 125.26ms
iter 36390: loss 9.3604, time 124.96ms
iter 36400: loss 9.2822, time 125.21ms
iter 36410: loss 9.2043, time 125.12ms
iter 36420: loss 9.7627, time 125.55ms
iter 36430: loss 9.0280, time 125.79ms
iter 36440: loss 9.3483, time 124.75ms
iter 36450: loss 8.8894, time 125.92ms
iter 36460: loss 9.5977, time 125.96ms
iter 36470: loss 8.4001, time 125.87ms
iter 36480: loss 8.8842, time 128.84ms
iter 36490: loss 9.3875, time 125.81ms
step 36500: train loss 7.8434, val loss 7.8523
saving checkpoint to out-shakespeare-char
iter 36500: loss 9.9979, time 2896.13ms
iter 36510: loss 9.7258, time 121.65ms
iter 36520: loss 9.6050, time 124.67ms
iter 36530: loss 9.2169, time 121.51ms
iter 36540: loss 9.4310, time 124.45ms
iter 36550: loss 9.6973, time 120.82ms
iter 36560: loss 9.6444, time 124.16ms
iter 36570: loss 9.2130, time 121.39ms
iter 36580: loss 9.6048, time 124.32ms
iter 36590: loss 9.9559, time 121.46ms
iter 36600: loss 9.3675, time 124.74ms
iter 36610: loss 10.2425, time 121.85ms
iter 36620: loss 9.2540, time 124.77ms
iter 36630: loss 9.4483, time 121.45ms
iter 36640: loss 9.4884, time 124.41ms
iter 36650: loss 9.4747, time 121.49ms
iter 36660: loss 9.1823, time 124.41ms
iter 36670: loss 9.2412, time 121.61ms
iter 36680: loss 10.0478, time 124.31ms
iter 36690: loss 9.5210, time 121.56ms
iter 36700: loss 9.5925, time 124.66ms
iter 36710: loss 9.6783, time 121.39ms
iter 36720: loss 8.6898, time 124.82ms
iter 36730: loss 9.7957, time 120.87ms
iter 36740: loss 9.1877, time 124.61ms
step 36750: train loss 7.8607, val loss 7.8331
saving checkpoint to out-shakespeare-char
iter 36750: loss 8.8894, time 2885.35ms
iter 36760: loss 9.7351, time 121.93ms
iter 36770: loss 9.0610, time 124.88ms
iter 36780: loss 9.5277, time 121.88ms
iter 36790: loss 9.2291, time 124.68ms
iter 36800: loss 8.3367, time 122.03ms
iter 36810: loss 8.9601, time 124.75ms
iter 36820: loss 9.5518, time 121.90ms
iter 36830: loss 9.4158, time 124.74ms
iter 36840: loss 8.9543, time 121.92ms
iter 36850: loss 9.7861, time 124.68ms
iter 36860: loss 9.5286, time 121.87ms
iter 36870: loss 10.0389, time 124.60ms
iter 36880: loss 9.5962, time 121.89ms
iter 36890: loss 9.5920, time 124.80ms
iter 36900: loss 8.7764, time 122.04ms
iter 36910: loss 9.0814, time 124.32ms
iter 36920: loss 8.6200, time 121.08ms
iter 36930: loss 9.0132, time 124.55ms
iter 36940: loss 9.1418, time 122.21ms
iter 36950: loss 9.0169, time 124.43ms
iter 36960: loss 9.8421, time 121.52ms
iter 36970: loss 9.6527, time 124.78ms
iter 36980: loss 9.3905, time 121.60ms
iter 36990: loss 9.4483, time 124.46ms
step 37000: train loss 7.8083, val loss 7.8420
saving checkpoint to out-shakespeare-char
iter 37000: loss 9.7304, time 2891.03ms
iter 37010: loss 9.4355, time 121.67ms
iter 37020: loss 9.5438, time 124.27ms
iter 37030: loss 9.0092, time 121.52ms
iter 37040: loss 9.6026, time 126.08ms
iter 37050: loss 8.8630, time 125.97ms
iter 37060: loss 9.2811, time 125.65ms
iter 37070: loss 9.8963, time 125.74ms
iter 37080: loss 9.3197, time 125.74ms
iter 37090: loss 8.8867, time 126.81ms
iter 37100: loss 9.9279, time 125.51ms
iter 37110: loss 9.6461, time 125.61ms
iter 37120: loss 9.0134, time 125.91ms
iter 37130: loss 8.9462, time 125.72ms
iter 37140: loss 8.9732, time 125.57ms
iter 37150: loss 8.9169, time 125.52ms
iter 37160: loss 9.3002, time 128.14ms
iter 37170: loss 9.7767, time 125.90ms
iter 37180: loss 9.5835, time 125.91ms
iter 37190: loss 9.7620, time 125.86ms
iter 37200: loss 8.8858, time 125.58ms
iter 37210: loss 8.7841, time 125.72ms
iter 37220: loss 9.2776, time 125.71ms
iter 37230: loss 9.9464, time 125.41ms
iter 37240: loss 8.9581, time 125.69ms
step 37250: train loss 7.8569, val loss 7.8334
saving checkpoint to out-shakespeare-char
iter 37250: loss 8.6171, time 2903.69ms
iter 37260: loss 8.9343, time 125.69ms
iter 37270: loss 8.6765, time 125.52ms
iter 37280: loss 8.8349, time 125.30ms
iter 37290: loss 9.4449, time 125.09ms
iter 37300: loss 10.1118, time 125.01ms
iter 37310: loss 8.9807, time 125.25ms
iter 37320: loss 9.6465, time 125.55ms
iter 37330: loss 8.5352, time 128.18ms
iter 37340: loss 8.9641, time 125.22ms
iter 37350: loss 9.0446, time 125.54ms
iter 37360: loss 9.2989, time 125.47ms
iter 37370: loss 9.5610, time 125.21ms
iter 37380: loss 9.6192, time 125.48ms
iter 37390: loss 9.5616, time 125.28ms
iter 37400: loss 9.6566, time 124.65ms
iter 37410: loss 9.2870, time 125.19ms
iter 37420: loss 9.8592, time 125.52ms
iter 37430: loss 9.4823, time 125.34ms
iter 37440: loss 9.0911, time 127.92ms
iter 37450: loss 9.6141, time 125.01ms
iter 37460: loss 9.6587, time 126.22ms
iter 37470: loss 9.8753, time 125.45ms
iter 37480: loss 9.1548, time 125.31ms
iter 37490: loss 8.9348, time 125.06ms
step 37500: train loss 7.8049, val loss 7.8672
saving checkpoint to out-shakespeare-char
iter 37500: loss 9.5523, time 2905.67ms
iter 37510: loss 9.9134, time 125.49ms
iter 37520: loss 9.1353, time 125.41ms
iter 37530: loss 9.3599, time 125.16ms
iter 37540: loss 9.0560, time 128.03ms
iter 37550: loss 8.8897, time 124.90ms
iter 37560: loss 8.7815, time 124.33ms
iter 37570: loss 9.6083, time 125.28ms
iter 37580: loss 9.1274, time 125.30ms
iter 37590: loss 8.5565, time 125.43ms
iter 37600: loss 9.3683, time 125.03ms
iter 37610: loss 8.8567, time 124.92ms
iter 37620: loss 8.9876, time 124.95ms
iter 37630: loss 9.1887, time 125.02ms
iter 37640: loss 8.7180, time 125.37ms
iter 37650: loss 9.2798, time 128.06ms
iter 37660: loss 9.1435, time 125.04ms
iter 37670: loss 9.1956, time 125.07ms
iter 37680: loss 9.2676, time 125.13ms
iter 37690: loss 9.5266, time 124.99ms
iter 37700: loss 8.8566, time 125.02ms
iter 37710: loss 9.0013, time 125.02ms
iter 37720: loss 9.4493, time 124.78ms
iter 37730: loss 9.0595, time 124.94ms
iter 37740: loss 8.8101, time 125.34ms
step 37750: train loss 7.8110, val loss 7.8416
saving checkpoint to out-shakespeare-char
iter 37750: loss 9.0286, time 2883.86ms
iter 37760: loss 9.6954, time 125.30ms
iter 37770: loss 9.2650, time 126.30ms
iter 37780: loss 9.7916, time 125.62ms
iter 37790: loss 9.1629, time 125.97ms
iter 37800: loss 9.4191, time 126.29ms
iter 37810: loss 9.5744, time 125.35ms
iter 37820: loss 8.7292, time 125.12ms
iter 37830: loss 9.4959, time 125.03ms
iter 37840: loss 8.7277, time 125.08ms
iter 37850: loss 9.2363, time 124.25ms
iter 37860: loss 9.4564, time 124.82ms
iter 37870: loss 8.7880, time 125.64ms
iter 37880: loss 9.5038, time 125.47ms
iter 37890: loss 9.2006, time 125.18ms
iter 37900: loss 9.6236, time 125.22ms
iter 37910: loss 9.5262, time 125.16ms
iter 37920: loss 9.1015, time 125.35ms
iter 37930: loss 9.2543, time 128.11ms
iter 37940: loss 9.0198, time 124.81ms
iter 37950: loss 9.7280, time 126.47ms
iter 37960: loss 8.8221, time 125.68ms
iter 37970: loss 9.5865, time 129.33ms
iter 37980: loss 9.3181, time 125.75ms
iter 37990: loss 8.8727, time 127.33ms
step 38000: train loss 7.8253, val loss 7.8085
saving checkpoint to out-shakespeare-char
iter 38000: loss 9.8543, time 2911.11ms
iter 38010: loss 9.9936, time 125.39ms
iter 38020: loss 10.2761, time 127.07ms
iter 38030: loss 9.4802, time 125.43ms
iter 38040: loss 9.1826, time 124.63ms
iter 38050: loss 9.0587, time 125.02ms
iter 38060: loss 8.8867, time 124.36ms
iter 38070: loss 8.8379, time 125.72ms
iter 38080: loss 8.9594, time 122.98ms
iter 38090: loss 8.2813, time 125.64ms
iter 38100: loss 9.1442, time 128.55ms
iter 38110: loss 9.3398, time 125.72ms
iter 38120: loss 9.0220, time 124.31ms
iter 38130: loss 9.1300, time 125.50ms
iter 38140: loss 9.5305, time 125.95ms
iter 38150: loss 9.6401, time 125.19ms
iter 38160: loss 9.6018, time 125.27ms
iter 38170: loss 9.6503, time 125.16ms
iter 38180: loss 8.8367, time 125.29ms
iter 38190: loss 9.8951, time 125.41ms
iter 38200: loss 9.0335, time 125.58ms
iter 38210: loss 9.5656, time 128.40ms
iter 38220: loss 9.9400, time 125.29ms
iter 38230: loss 9.1265, time 126.33ms
iter 38240: loss 9.4402, time 125.91ms
step 38250: train loss 7.8552, val loss 7.7914
saving checkpoint to out-shakespeare-char
iter 38250: loss 9.2536, time 2931.01ms
iter 38260: loss 9.0106, time 121.71ms
iter 38270: loss 9.0942, time 120.62ms
iter 38280: loss 9.1673, time 120.06ms
iter 38290: loss 8.6684, time 126.23ms
iter 38300: loss 9.4171, time 125.21ms
iter 38310: loss 9.1539, time 125.20ms
iter 38320: loss 10.0758, time 125.69ms
iter 38330: loss 9.5335, time 125.52ms
iter 38340: loss 9.4728, time 125.40ms
iter 38350: loss 9.4980, time 125.29ms
iter 38360: loss 8.9154, time 125.59ms
iter 38370: loss 9.3250, time 122.00ms
iter 38380: loss 9.2321, time 122.01ms
iter 38390: loss 8.2996, time 121.90ms
iter 38400: loss 9.2269, time 122.13ms
iter 38410: loss 9.4623, time 122.21ms
iter 38420: loss 9.5214, time 122.23ms
iter 38430: loss 9.0958, time 122.07ms
iter 38440: loss 8.7159, time 121.87ms
iter 38450: loss 9.4944, time 122.23ms
iter 38460: loss 9.5407, time 121.97ms
iter 38470: loss 9.4701, time 121.98ms
iter 38480: loss 9.6595, time 123.01ms
iter 38490: loss 9.6090, time 121.90ms
step 38500: train loss 7.7521, val loss 7.8434
saving checkpoint to out-shakespeare-char
iter 38500: loss 9.3374, time 2865.68ms
iter 38510: loss 8.9561, time 120.79ms
iter 38520: loss 9.2520, time 121.66ms
iter 38530: loss 9.0308, time 121.70ms
iter 38540: loss 8.8775, time 121.88ms
iter 38550: loss 8.4867, time 121.36ms
iter 38560: loss 9.6083, time 120.47ms
iter 38570: loss 8.9918, time 121.43ms
iter 38580: loss 9.6392, time 121.37ms
iter 38590: loss 9.5032, time 121.88ms
iter 38600: loss 10.1182, time 121.65ms
iter 38610: loss 8.8424, time 122.17ms
iter 38620: loss 9.0079, time 122.12ms
iter 38630: loss 9.2442, time 121.53ms
iter 38640: loss 8.9469, time 121.54ms
iter 38650: loss 8.9644, time 121.61ms
iter 38660: loss 8.8817, time 121.33ms
iter 38670: loss 8.6744, time 121.28ms
iter 38680: loss 9.5799, time 121.79ms
iter 38690: loss 10.0496, time 122.03ms
iter 38700: loss 9.5677, time 120.86ms
iter 38710: loss 8.5858, time 121.48ms
iter 38720: loss 9.6074, time 122.58ms
iter 38730: loss 8.9184, time 122.37ms
iter 38740: loss 9.3855, time 122.01ms
step 38750: train loss 7.7742, val loss 7.8409
saving checkpoint to out-shakespeare-char
iter 38750: loss 10.0139, time 2908.30ms
iter 38760: loss 9.2404, time 121.28ms
iter 38770: loss 8.6601, time 125.27ms
iter 38780: loss 9.2664, time 123.01ms
iter 38790: loss 9.6387, time 125.15ms
iter 38800: loss 9.3331, time 121.89ms
iter 38810: loss 9.7509, time 123.07ms
iter 38820: loss 8.9741, time 122.70ms
iter 38830: loss 10.0836, time 125.35ms
iter 38840: loss 9.3590, time 120.95ms
iter 38850: loss 9.9932, time 124.74ms
iter 38860: loss 9.9045, time 121.94ms
iter 38870: loss 9.6119, time 125.07ms
iter 38880: loss 8.5735, time 121.78ms
iter 38890: loss 9.4368, time 124.26ms
iter 38900: loss 9.0567, time 121.57ms
iter 38910: loss 9.0586, time 124.65ms
iter 38920: loss 8.8794, time 125.24ms
iter 38930: loss 9.2941, time 125.20ms
iter 38940: loss 9.7205, time 125.17ms
iter 38950: loss 9.3761, time 125.81ms
iter 38960: loss 9.6573, time 125.13ms
iter 38970: loss 8.7081, time 125.02ms
iter 38980: loss 8.7277, time 124.95ms
iter 38990: loss 9.3787, time 125.17ms
step 39000: train loss 7.8187, val loss 7.8065
saving checkpoint to out-shakespeare-char
iter 39000: loss 8.5376, time 2897.54ms
iter 39010: loss 9.8910, time 127.94ms
iter 39020: loss 9.1574, time 124.70ms
iter 39030: loss 9.3536, time 125.22ms
iter 39040: loss 9.1179, time 125.63ms
iter 39050: loss 9.8982, time 124.86ms
iter 39060: loss 9.5483, time 126.04ms
iter 39070: loss 8.9539, time 125.55ms
iter 39080: loss 9.8237, time 125.56ms
iter 39090: loss 9.2008, time 124.48ms
iter 39100: loss 9.3986, time 125.02ms
iter 39110: loss 9.3114, time 125.32ms
iter 39120: loss 9.4991, time 127.46ms
iter 39130: loss 8.9395, time 125.60ms
iter 39140: loss 8.8916, time 125.39ms
iter 39150: loss 8.9803, time 125.12ms
iter 39160: loss 9.0672, time 124.83ms
iter 39170: loss 9.7516, time 125.51ms
iter 39180: loss 10.0734, time 125.97ms
iter 39190: loss 8.6766, time 128.62ms
iter 39200: loss 8.6385, time 125.74ms
iter 39210: loss 8.8315, time 124.91ms
iter 39220: loss 9.4060, time 126.31ms
iter 39230: loss 9.0543, time 127.82ms
iter 39240: loss 8.7708, time 125.24ms
step 39250: train loss 7.7799, val loss 7.7378
saving checkpoint to out-shakespeare-char
iter 39250: loss 9.6974, time 2862.95ms
iter 39260: loss 9.4080, time 124.81ms
iter 39270: loss 9.6545, time 128.40ms
iter 39280: loss 8.9605, time 125.47ms
iter 39290: loss 9.2273, time 124.74ms
iter 39300: loss 8.5715, time 124.87ms
iter 39310: loss 9.3234, time 123.91ms
iter 39320: loss 9.3207, time 124.94ms
iter 39330: loss 8.6920, time 124.89ms
iter 39340: loss 8.4660, time 124.46ms
iter 39350: loss 9.8757, time 125.26ms
iter 39360: loss 9.0364, time 125.54ms
iter 39370: loss 9.0095, time 120.96ms
iter 39380: loss 8.8613, time 119.85ms
iter 39390: loss 9.4362, time 119.58ms
iter 39400: loss 9.6876, time 119.71ms
iter 39410: loss 9.7406, time 120.78ms
iter 39420: loss 9.5188, time 120.19ms
iter 39430: loss 9.4830, time 119.64ms
iter 39440: loss 9.0654, time 120.62ms
iter 39450: loss 8.7401, time 119.59ms
iter 39460: loss 9.0034, time 119.75ms
iter 39470: loss 9.0732, time 119.43ms
iter 39480: loss 8.5899, time 119.69ms
iter 39490: loss 9.0130, time 119.79ms
step 39500: train loss 7.7565, val loss 7.7467
saving checkpoint to out-shakespeare-char
iter 39500: loss 8.8522, time 2886.27ms
iter 39510: loss 10.0359, time 121.84ms
iter 39520: loss 9.9781, time 121.70ms
iter 39530: loss 9.2052, time 121.79ms
iter 39540: loss 8.9273, time 121.62ms
iter 39550: loss 9.6292, time 121.36ms
iter 39560: loss 9.1270, time 121.50ms
iter 39570: loss 9.7151, time 121.37ms
iter 39580: loss 9.1906, time 121.43ms
iter 39590: loss 9.8601, time 121.55ms
iter 39600: loss 10.4712, time 121.30ms
iter 39610: loss 9.2125, time 121.44ms
iter 39620: loss 9.6220, time 121.52ms
iter 39630: loss 9.0729, time 121.34ms
iter 39640: loss 9.1181, time 121.62ms
iter 39650: loss 9.4266, time 121.53ms
iter 39660: loss 9.7271, time 121.50ms
iter 39670: loss 9.4176, time 121.96ms
iter 39680: loss 9.2362, time 121.61ms
iter 39690: loss 10.4953, time 121.56ms
iter 39700: loss 9.9923, time 122.41ms
iter 39710: loss 8.5250, time 121.55ms
iter 39720: loss 8.7440, time 122.01ms
iter 39730: loss 8.6636, time 121.33ms
iter 39740: loss 9.0189, time 121.11ms
step 39750: train loss 7.7840, val loss 7.7489
saving checkpoint to out-shakespeare-char
iter 39750: loss 8.2449, time 2899.52ms
iter 39760: loss 9.0124, time 125.13ms
iter 39770: loss 8.7315, time 122.07ms
iter 39780: loss 9.3532, time 122.33ms
iter 39790: loss 9.7204, time 121.55ms
iter 39800: loss 8.9501, time 121.74ms
iter 39810: loss 9.5344, time 121.93ms
iter 39820: loss 8.7614, time 121.94ms
iter 39830: loss 8.7495, time 121.73ms
iter 39840: loss 9.3085, time 121.71ms
iter 39850: loss 8.2206, time 121.94ms
iter 39860: loss 8.8480, time 121.89ms
iter 39870: loss 9.3284, time 121.78ms
iter 39880: loss 9.3296, time 121.80ms
iter 39890: loss 9.2205, time 121.76ms
iter 39900: loss 8.6099, time 121.99ms
iter 39910: loss 9.2164, time 121.88ms
iter 39920: loss 9.6040, time 121.78ms
iter 39930: loss 8.9991, time 121.41ms
iter 39940: loss 9.6825, time 121.54ms
iter 39950: loss 9.4767, time 121.52ms
iter 39960: loss 9.0544, time 121.58ms
iter 39970: loss 9.2717, time 121.58ms
iter 39980: loss 9.2719, time 121.41ms
iter 39990: loss 9.2854, time 120.87ms
step 40000: train loss 7.7781, val loss 7.7571
saving checkpoint to out-shakespeare-char
iter 40000: loss 9.1017, time 2891.81ms
iter 40010: loss 8.9578, time 121.60ms
iter 40020: loss 9.7887, time 121.23ms
iter 40030: loss 9.5239, time 121.11ms
iter 40040: loss 9.7463, time 121.52ms
iter 40050: loss 9.1707, time 120.43ms
iter 40060: loss 9.0947, time 120.72ms
iter 40070: loss 8.8904, time 121.50ms
iter 40080: loss 9.0128, time 121.24ms
iter 40090: loss 9.6566, time 122.03ms
iter 40100: loss 9.4587, time 121.56ms
iter 40110: loss 9.0568, time 121.34ms
iter 40120: loss 9.2596, time 121.46ms
iter 40130: loss 9.2096, time 121.34ms
iter 40140: loss 9.0190, time 121.11ms
iter 40150: loss 8.2043, time 122.03ms
iter 40160: loss 9.8539, time 121.29ms
iter 40170: loss 9.3432, time 121.21ms
iter 40180: loss 9.4454, time 121.42ms
iter 40190: loss 9.4648, time 121.79ms
iter 40200: loss 9.5889, time 121.65ms
iter 40210: loss 9.5357, time 121.86ms
iter 40220: loss 8.7978, time 121.54ms
iter 40230: loss 9.9933, time 121.63ms
iter 40240: loss 9.5359, time 121.95ms
step 40250: train loss 7.7701, val loss 7.7445
saving checkpoint to out-shakespeare-char
iter 40250: loss 8.9653, time 2910.92ms
iter 40260: loss 9.5058, time 125.33ms
iter 40270: loss 8.7184, time 124.95ms
iter 40280: loss 8.7253, time 126.05ms
iter 40290: loss 9.9856, time 125.08ms
iter 40300: loss 9.0477, time 124.96ms
iter 40310: loss 8.0680, time 125.16ms
iter 40320: loss 8.8032, time 124.90ms
iter 40330: loss 8.5455, time 128.08ms
iter 40340: loss 8.7721, time 125.35ms
iter 40350: loss 8.7853, time 125.55ms
iter 40360: loss 9.3970, time 125.58ms
iter 40370: loss 9.3106, time 125.81ms
iter 40380: loss 8.9768, time 127.92ms
iter 40390: loss 9.7538, time 124.95ms
iter 40400: loss 8.5712, time 124.98ms
iter 40410: loss 8.5059, time 125.43ms
iter 40420: loss 9.4825, time 125.15ms
iter 40430: loss 9.2935, time 125.15ms
iter 40440: loss 9.2645, time 125.16ms
iter 40450: loss 9.2140, time 125.18ms
iter 40460: loss 9.7651, time 125.20ms
iter 40470: loss 9.0477, time 125.24ms
iter 40480: loss 8.6019, time 125.19ms
iter 40490: loss 9.1177, time 127.86ms
step 40500: train loss 7.7438, val loss 7.7689
saving checkpoint to out-shakespeare-char
iter 40500: loss 9.0936, time 2903.01ms
iter 40510: loss 9.4572, time 125.10ms
iter 40520: loss 8.9318, time 125.17ms
iter 40530: loss 9.3163, time 124.64ms
iter 40540: loss 9.3973, time 125.10ms
iter 40550: loss 9.1952, time 125.31ms
iter 40560: loss 8.4822, time 125.48ms
iter 40570: loss 9.4911, time 125.84ms
iter 40580: loss 9.0747, time 125.77ms
iter 40590: loss 9.2924, time 127.95ms
iter 40600: loss 8.9253, time 125.54ms
iter 40610: loss 9.3717, time 125.34ms
iter 40620: loss 9.0768, time 125.64ms
iter 40630: loss 9.4400, time 127.67ms
iter 40640: loss 9.8984, time 126.75ms
iter 40650: loss 9.1633, time 125.49ms
iter 40660: loss 8.6256, time 125.28ms
iter 40670: loss 8.8236, time 125.21ms
iter 40680: loss 9.4071, time 126.19ms
iter 40690: loss 8.8943, time 125.47ms
iter 40700: loss 8.8139, time 128.50ms
iter 40710: loss 9.6438, time 125.36ms
iter 40720: loss 9.0510, time 124.69ms
iter 40730: loss 9.4125, time 123.76ms
iter 40740: loss 8.6108, time 128.20ms
step 40750: train loss 7.7782, val loss 7.7187
saving checkpoint to out-shakespeare-char
iter 40750: loss 9.1564, time 2895.09ms
iter 40760: loss 9.6048, time 125.72ms
iter 40770: loss 8.7009, time 125.69ms
iter 40780: loss 7.9132, time 128.23ms
iter 40790: loss 9.0480, time 125.30ms
iter 40800: loss 8.8514, time 126.30ms
iter 40810: loss 9.5224, time 125.59ms
iter 40820: loss 9.8825, time 123.22ms
iter 40830: loss 8.8522, time 126.04ms
iter 40840: loss 9.0725, time 125.84ms
iter 40850: loss 9.6152, time 125.86ms
iter 40860: loss 9.6157, time 125.84ms
iter 40870: loss 8.7774, time 125.82ms
iter 40880: loss 9.0109, time 125.83ms
iter 40890: loss 8.6867, time 127.66ms
iter 40900: loss 9.1497, time 125.47ms
iter 40910: loss 9.8588, time 124.88ms
iter 40920: loss 9.1927, time 125.31ms
iter 40930: loss 9.6667, time 125.18ms
iter 40940: loss 9.1476, time 125.55ms
iter 40950: loss 9.7238, time 125.61ms
iter 40960: loss 9.1230, time 128.42ms
iter 40970: loss 9.3096, time 125.46ms
iter 40980: loss 9.1027, time 125.36ms
iter 40990: loss 9.9547, time 125.46ms
step 41000: train loss 7.7474, val loss 7.7569
saving checkpoint to out-shakespeare-char
iter 41000: loss 8.5810, time 2891.98ms
iter 41010: loss 8.7164, time 126.08ms
iter 41020: loss 9.8097, time 128.52ms
iter 41030: loss 8.6648, time 125.72ms
iter 41040: loss 9.2241, time 125.18ms
iter 41050: loss 9.2879, time 125.46ms
iter 41060: loss 9.2149, time 124.27ms
iter 41070: loss 9.7538, time 125.21ms
iter 41080: loss 8.9134, time 125.58ms
iter 41090: loss 8.9364, time 125.74ms
iter 41100: loss 9.2402, time 125.72ms
iter 41110: loss 9.7156, time 125.67ms
iter 41120: loss 8.4680, time 125.84ms
iter 41130: loss 8.7718, time 128.70ms
iter 41140: loss 8.9110, time 126.17ms
iter 41150: loss 9.6242, time 124.88ms
iter 41160: loss 9.0146, time 125.33ms
iter 41170: loss 9.4468, time 124.44ms
iter 41180: loss 8.4577, time 125.17ms
iter 41190: loss 9.4610, time 125.20ms
iter 41200: loss 8.4584, time 127.94ms
iter 41210: loss 9.1431, time 124.71ms
iter 41220: loss 9.1361, time 125.04ms
iter 41230: loss 9.0343, time 125.24ms
iter 41240: loss 9.0129, time 125.49ms
step 41250: train loss 7.7647, val loss 7.7573
saving checkpoint to out-shakespeare-char
iter 41250: loss 9.1810, time 2885.57ms
iter 41260: loss 9.6499, time 128.13ms
iter 41270: loss 9.8179, time 124.54ms
iter 41280: loss 8.9466, time 125.50ms
iter 41290: loss 8.7845, time 125.17ms
iter 41300: loss 7.6230, time 124.82ms
iter 41310: loss 9.1258, time 125.14ms
iter 41320: loss 9.3626, time 125.04ms
iter 41330: loss 9.2818, time 125.30ms
iter 41340: loss 9.5449, time 125.22ms
iter 41350: loss 8.7829, time 124.96ms
iter 41360: loss 9.2088, time 125.36ms
iter 41370: loss 9.1337, time 125.11ms
iter 41380: loss 9.3024, time 125.69ms
iter 41390: loss 9.7432, time 125.19ms
iter 41400: loss 9.0710, time 124.14ms
iter 41410: loss 9.4547, time 125.12ms
iter 41420: loss 9.0549, time 127.50ms
iter 41430: loss 9.5284, time 125.75ms
iter 41440: loss 9.2759, time 128.58ms
iter 41450: loss 9.0872, time 125.91ms
iter 41460: loss 9.1726, time 125.57ms
iter 41470: loss 9.1327, time 125.90ms
iter 41480: loss 9.2976, time 125.77ms
iter 41490: loss 9.0613, time 125.79ms
step 41500: train loss 7.7081, val loss 7.7166
saving checkpoint to out-shakespeare-char
iter 41500: loss 9.3933, time 2914.29ms
iter 41510: loss 9.5601, time 125.93ms
iter 41520: loss 8.9564, time 126.57ms
iter 41530: loss 9.3956, time 125.72ms
iter 41540: loss 9.2444, time 125.98ms
iter 41550: loss 8.5926, time 125.54ms
iter 41560: loss 8.3603, time 125.77ms
iter 41570: loss 9.5377, time 125.50ms
iter 41580: loss 9.5018, time 125.53ms
iter 41590: loss 8.6968, time 125.60ms
iter 41600: loss 9.3905, time 125.86ms
iter 41610: loss 9.7906, time 128.36ms
iter 41620: loss 8.8215, time 125.90ms
iter 41630: loss 9.0866, time 125.43ms
iter 41640: loss 9.1845, time 125.43ms
iter 41650: loss 8.5888, time 125.51ms
iter 41660: loss 9.2110, time 125.69ms
iter 41670: loss 8.9260, time 125.49ms
iter 41680: loss 9.2388, time 125.57ms
iter 41690: loss 8.6687, time 125.53ms
iter 41700: loss 8.9251, time 125.67ms
iter 41710: loss 9.3954, time 125.86ms
iter 41720: loss 8.7622, time 128.78ms
iter 41730: loss 9.1433, time 125.79ms
iter 41740: loss 8.7672, time 125.97ms
step 41750: train loss 7.6946, val loss 7.7413
saving checkpoint to out-shakespeare-char
iter 41750: loss 9.5915, time 2880.41ms
iter 41760: loss 9.0171, time 121.77ms
iter 41770: loss 8.4770, time 121.57ms
iter 41780: loss 9.3837, time 121.56ms
iter 41790: loss 8.7597, time 120.49ms
iter 41800: loss 9.4794, time 120.86ms
iter 41810: loss 9.7003, time 121.57ms
iter 41820: loss 8.9252, time 121.43ms
iter 41830: loss 8.8572, time 121.42ms
iter 41840: loss 9.2140, time 121.35ms
iter 41850: loss 9.4875, time 121.56ms
iter 41860: loss 9.0070, time 121.50ms
iter 41870: loss 9.5002, time 121.65ms
iter 41880: loss 8.5921, time 121.31ms
iter 41890: loss 9.0312, time 121.54ms
iter 41900: loss 9.2451, time 121.36ms
iter 41910: loss 9.4161, time 121.48ms
iter 41920: loss 9.9810, time 123.12ms
iter 41930: loss 8.7799, time 124.78ms
iter 41940: loss 9.2251, time 125.71ms
iter 41950: loss 9.5860, time 124.78ms
iter 41960: loss 8.7424, time 124.83ms
iter 41970: loss 9.4913, time 125.99ms
iter 41980: loss 9.5354, time 124.34ms
iter 41990: loss 9.9915, time 125.66ms
step 42000: train loss 7.7369, val loss 7.7059
saving checkpoint to out-shakespeare-char
iter 42000: loss 9.2145, time 2883.91ms
iter 42010: loss 8.6913, time 125.18ms
iter 42020: loss 9.3837, time 125.44ms
iter 42030: loss 8.8799, time 126.22ms
iter 42040: loss 8.4271, time 125.91ms
iter 42050: loss 9.1583, time 125.52ms
iter 42060: loss 9.3986, time 124.78ms
iter 42070: loss 8.6714, time 125.53ms
iter 42080: loss 9.1828, time 125.70ms
iter 42090: loss 8.8506, time 125.38ms
iter 42100: loss 9.8583, time 126.01ms
iter 42110: loss 9.3825, time 128.47ms
iter 42120: loss 8.5687, time 125.40ms
iter 42130: loss 9.0500, time 125.32ms
iter 42140: loss 9.1690, time 125.86ms
iter 42150: loss 8.7593, time 125.77ms
iter 42160: loss 8.8064, time 124.96ms
iter 42170: loss 9.8331, time 124.95ms
iter 42180: loss 9.3554, time 125.77ms
iter 42190: loss 8.9705, time 125.80ms
iter 42200: loss 8.6450, time 123.96ms
iter 42210: loss 9.2236, time 125.57ms
iter 42220: loss 9.3482, time 128.49ms
iter 42230: loss 8.5782, time 125.52ms
iter 42240: loss 8.7350, time 124.62ms
step 42250: train loss 7.7275, val loss 7.7152
saving checkpoint to out-shakespeare-char
iter 42250: loss 8.9002, time 2882.11ms
iter 42260: loss 9.0808, time 125.38ms
iter 42270: loss 8.7759, time 126.07ms
iter 42280: loss 8.6958, time 128.65ms
iter 42290: loss 8.6406, time 124.38ms
iter 42300: loss 8.8867, time 125.48ms
iter 42310: loss 8.3602, time 125.82ms
iter 42320: loss 8.2817, time 126.35ms
iter 42330: loss 8.8877, time 124.53ms
iter 42340: loss 9.0395, time 125.70ms
iter 42350: loss 8.7509, time 125.46ms
iter 42360: loss 9.1029, time 125.65ms
iter 42370: loss 8.9973, time 125.00ms
iter 42380: loss 9.8494, time 125.74ms
iter 42390: loss 8.5156, time 128.63ms
iter 42400: loss 9.9160, time 125.30ms
iter 42410: loss 9.0903, time 125.37ms
iter 42420: loss 8.9144, time 125.53ms
iter 42430: loss 8.6788, time 125.33ms
iter 42440: loss 9.1800, time 125.66ms
iter 42450: loss 9.4302, time 125.56ms
iter 42460: loss 9.0337, time 126.01ms
iter 42470: loss 8.9276, time 125.66ms
iter 42480: loss 8.7694, time 125.96ms
iter 42490: loss 8.6673, time 124.96ms
step 42500: train loss 7.6815, val loss 7.6813
saving checkpoint to out-shakespeare-char
iter 42500: loss 9.8259, time 2879.69ms
iter 42510: loss 8.5504, time 126.04ms
iter 42520: loss 9.2023, time 126.26ms
iter 42530: loss 9.3700, time 128.76ms
iter 42540: loss 9.9898, time 125.73ms
iter 42550: loss 8.8169, time 126.06ms
iter 42560: loss 9.0777, time 126.34ms
iter 42570: loss 8.9160, time 125.32ms
iter 42580: loss 9.5079, time 125.85ms
iter 42590: loss 9.5365, time 125.54ms
iter 42600: loss 8.6049, time 125.45ms
iter 42610: loss 9.0540, time 125.64ms
iter 42620: loss 8.8691, time 125.67ms
iter 42630: loss 9.4697, time 125.67ms
iter 42640: loss 9.6468, time 128.49ms
iter 42650: loss 9.5414, time 125.48ms
iter 42660: loss 8.9775, time 125.58ms
iter 42670: loss 9.6769, time 125.67ms
iter 42680: loss 9.2901, time 125.39ms
iter 42690: loss 9.4676, time 125.44ms
iter 42700: loss 9.2519, time 126.03ms
iter 42710: loss 9.7668, time 128.54ms
iter 42720: loss 8.9749, time 125.76ms
iter 42730: loss 9.2303, time 125.93ms
iter 42740: loss 9.5603, time 125.80ms
step 42750: train loss 7.6546, val loss 7.6830
saving checkpoint to out-shakespeare-char
iter 42750: loss 9.2285, time 2873.86ms
iter 42760: loss 9.5509, time 125.58ms
iter 42770: loss 9.3760, time 124.80ms
iter 42780: loss 8.9063, time 125.66ms
iter 42790: loss 9.4662, time 126.16ms
iter 42800: loss 9.1521, time 125.96ms
iter 42810: loss 9.1608, time 127.49ms
iter 42820: loss 9.9624, time 125.79ms
iter 42830: loss 9.0887, time 125.03ms
iter 42840: loss 8.9726, time 125.11ms
iter 42850: loss 9.2043, time 124.95ms
iter 42860: loss 8.5206, time 126.15ms
iter 42870: loss 8.8689, time 126.72ms
iter 42880: loss 9.3751, time 125.02ms
iter 42890: loss 9.1767, time 125.90ms
iter 42900: loss 9.2422, time 125.95ms
iter 42910: loss 9.0343, time 125.31ms
iter 42920: loss 8.9111, time 125.64ms
iter 42930: loss 9.5660, time 125.71ms
iter 42940: loss 8.9165, time 125.46ms
iter 42950: loss 8.9825, time 125.75ms
iter 42960: loss 9.5188, time 125.69ms
iter 42970: loss 8.9931, time 126.27ms
iter 42980: loss 8.9685, time 126.00ms
iter 42990: loss 9.3755, time 125.77ms
step 43000: train loss 7.7006, val loss 7.6909
saving checkpoint to out-shakespeare-char
iter 43000: loss 9.0219, time 2884.71ms
iter 43010: loss 8.5862, time 125.91ms
iter 43020: loss 9.5377, time 125.88ms
iter 43030: loss 9.0335, time 124.81ms
iter 43040: loss 8.6572, time 125.53ms
iter 43050: loss 9.2547, time 125.02ms
iter 43060: loss 9.9214, time 125.80ms
iter 43070: loss 9.1744, time 128.62ms
iter 43080: loss 8.4741, time 125.68ms
iter 43090: loss 10.1164, time 125.11ms
iter 43100: loss 8.4589, time 125.85ms
iter 43110: loss 8.7770, time 125.50ms
iter 43120: loss 9.4295, time 125.85ms
iter 43130: loss 9.2215, time 124.96ms
iter 43140: loss 9.3084, time 128.54ms
iter 43150: loss 9.4123, time 125.17ms
iter 43160: loss 10.1548, time 125.68ms
iter 43170: loss 9.0233, time 125.91ms
iter 43180: loss 8.8596, time 127.76ms
iter 43190: loss 9.1837, time 125.26ms
iter 43200: loss 9.0477, time 125.67ms
iter 43210: loss 9.1470, time 125.06ms
iter 43220: loss 9.2478, time 127.31ms
iter 43230: loss 8.6315, time 125.96ms
iter 43240: loss 9.3918, time 125.40ms
step 43250: train loss 7.6448, val loss 7.6759
saving checkpoint to out-shakespeare-char
iter 43250: loss 9.3876, time 2872.76ms
iter 43260: loss 8.7457, time 125.74ms
iter 43270: loss 8.6118, time 125.88ms
iter 43280: loss 9.3022, time 125.77ms
iter 43290: loss 9.2848, time 125.87ms
iter 43300: loss 9.3073, time 126.18ms
iter 43310: loss 8.6714, time 125.96ms
iter 43320: loss 8.5046, time 125.61ms
iter 43330: loss 8.5908, time 125.74ms
iter 43340: loss 9.5238, time 125.52ms
iter 43350: loss 9.5757, time 126.12ms
iter 43360: loss 9.2229, time 126.17ms
iter 43370: loss 9.0743, time 125.85ms
iter 43380: loss 9.8315, time 126.13ms
iter 43390: loss 9.3514, time 128.94ms
iter 43400: loss 9.6687, time 126.04ms
iter 43410: loss 8.8596, time 125.95ms
iter 43420: loss 9.0876, time 127.10ms
iter 43430: loss 9.4032, time 125.85ms
iter 43440: loss 8.6646, time 125.86ms
iter 43450: loss 8.5552, time 125.67ms
iter 43460: loss 8.9349, time 125.95ms
iter 43470: loss 8.2640, time 126.14ms
iter 43480: loss 9.7449, time 126.09ms
iter 43490: loss 8.8567, time 126.14ms
step 43500: train loss 7.6684, val loss 7.6443
saving checkpoint to out-shakespeare-char
iter 43500: loss 8.7114, time 2898.30ms
iter 43510: loss 8.8519, time 125.99ms
iter 43520: loss 8.9946, time 125.60ms
iter 43530: loss 9.1213, time 125.18ms
iter 43540: loss 8.9718, time 125.36ms
iter 43550: loss 9.4205, time 125.35ms
iter 43560: loss 9.2844, time 125.29ms
iter 43570: loss 9.0901, time 124.80ms
iter 43580: loss 9.5893, time 125.88ms
iter 43590: loss 9.1228, time 124.95ms
iter 43600: loss 9.1011, time 128.36ms
iter 43610: loss 9.2083, time 124.73ms
iter 43620: loss 8.8882, time 125.80ms
iter 43630: loss 9.6721, time 125.96ms
iter 43640: loss 9.6421, time 128.57ms
iter 43650: loss 9.0421, time 126.09ms
iter 43660: loss 9.5421, time 125.69ms
iter 43670: loss 8.9696, time 125.98ms
iter 43680: loss 8.5981, time 125.60ms
iter 43690: loss 9.0542, time 124.94ms
iter 43700: loss 9.1434, time 125.86ms
iter 43710: loss 8.8078, time 125.44ms
iter 43720: loss 9.2148, time 125.58ms
iter 43730: loss 8.9493, time 125.53ms
iter 43740: loss 8.5801, time 125.69ms
step 43750: train loss 7.7046, val loss 7.6901
saving checkpoint to out-shakespeare-char
iter 43750: loss 8.1038, time 2874.82ms
iter 43760: loss 8.6390, time 126.04ms
iter 43770: loss 9.8021, time 125.80ms
iter 43780: loss 8.8979, time 125.86ms
iter 43790: loss 9.0013, time 125.74ms
iter 43800: loss 8.9965, time 125.74ms
iter 43810: loss 8.1991, time 128.28ms
iter 43820: loss 9.4762, time 125.30ms
iter 43830: loss 8.4165, time 125.77ms
iter 43840: loss 8.8162, time 124.65ms
iter 43850: loss 9.1082, time 124.61ms
iter 43860: loss 9.0611, time 124.56ms
iter 43870: loss 9.3986, time 124.80ms
iter 43880: loss 9.0075, time 125.67ms
iter 43890: loss 9.8268, time 124.85ms
iter 43900: loss 9.9117, time 125.82ms
iter 43910: loss 9.5876, time 126.08ms
iter 43920: loss 9.5624, time 128.70ms
iter 43930: loss 8.7957, time 125.69ms
iter 43940: loss 8.9444, time 125.94ms
iter 43950: loss 9.4684, time 126.07ms
iter 43960: loss 9.1717, time 125.66ms
iter 43970: loss 9.2526, time 125.80ms
iter 43980: loss 8.9749, time 125.99ms
iter 43990: loss 9.8125, time 126.13ms
step 44000: train loss 7.6416, val loss 7.6646
saving checkpoint to out-shakespeare-char
iter 44000: loss 9.1229, time 2878.26ms
iter 44010: loss 8.8819, time 126.17ms
iter 44020: loss 9.1890, time 126.00ms
iter 44030: loss 8.6646, time 125.51ms
iter 44040: loss 9.4675, time 126.08ms
iter 44050: loss 8.9482, time 125.32ms
iter 44060: loss 9.6341, time 126.72ms
iter 44070: loss 8.9535, time 125.67ms
iter 44080: loss 8.5966, time 125.92ms
iter 44090: loss 8.2443, time 128.55ms
iter 44100: loss 9.3832, time 125.63ms
iter 44110: loss 9.1001, time 125.68ms
iter 44120: loss 8.5514, time 125.60ms
iter 44130: loss 8.9258, time 127.98ms
iter 44140: loss 9.4007, time 124.96ms
iter 44150: loss 9.1411, time 125.22ms
iter 44160: loss 9.3026, time 125.19ms
iter 44170: loss 9.7712, time 125.12ms
iter 44180: loss 9.1061, time 125.37ms
iter 44190: loss 8.9790, time 124.97ms
iter 44200: loss 9.8988, time 125.01ms
iter 44210: loss 9.0385, time 125.14ms
iter 44220: loss 9.0356, time 124.97ms
iter 44230: loss 9.1008, time 125.04ms
iter 44240: loss 8.7517, time 128.37ms
step 44250: train loss 7.6630, val loss 7.6117
saving checkpoint to out-shakespeare-char
iter 44250: loss 9.9977, time 2868.77ms
iter 44260: loss 9.0835, time 121.56ms
iter 44270: loss 9.2186, time 121.43ms
iter 44280: loss 8.6798, time 121.76ms
iter 44290: loss 8.8375, time 121.66ms
iter 44300: loss 9.0052, time 121.65ms
iter 44310: loss 8.6686, time 121.66ms
iter 44320: loss 9.0058, time 121.53ms
iter 44330: loss 8.6753, time 121.27ms
iter 44340: loss 9.7591, time 121.82ms
iter 44350: loss 9.4804, time 120.87ms
iter 44360: loss 9.4492, time 121.46ms
iter 44370: loss 9.2892, time 121.34ms
iter 44380: loss 9.3625, time 121.63ms
iter 44390: loss 9.4464, time 121.71ms
iter 44400: loss 8.8951, time 121.31ms
iter 44410: loss 8.8506, time 121.43ms
iter 44420: loss 9.4104, time 121.55ms
iter 44430: loss 9.3929, time 121.31ms
iter 44440: loss 9.1493, time 120.48ms
iter 44450: loss 9.2983, time 121.40ms
iter 44460: loss 8.3411, time 121.55ms
iter 44470: loss 8.9759, time 121.38ms
iter 44480: loss 9.3139, time 121.67ms
iter 44490: loss 9.2373, time 121.35ms
step 44500: train loss 7.6152, val loss 7.6364
saving checkpoint to out-shakespeare-char
iter 44500: loss 8.8350, time 2900.59ms
iter 44510: loss 8.7047, time 125.72ms
iter 44520: loss 9.5746, time 126.03ms
iter 44530: loss 9.6727, time 128.72ms
iter 44540: loss 9.3363, time 125.95ms
iter 44550: loss 9.5705, time 125.81ms
iter 44560: loss 9.9576, time 125.84ms
iter 44570: loss 9.1837, time 125.23ms
iter 44580: loss 9.2016, time 124.10ms
iter 44590: loss 9.5548, time 125.60ms
iter 44600: loss 9.0277, time 125.11ms
iter 44610: loss 8.7617, time 125.14ms
iter 44620: loss 9.7131, time 125.10ms
iter 44630: loss 9.1431, time 125.42ms
iter 44640: loss 9.5456, time 128.42ms
iter 44650: loss 9.2063, time 125.18ms
iter 44660: loss 8.7738, time 124.29ms
iter 44670: loss 9.0417, time 126.15ms
iter 44680: loss 9.1387, time 125.53ms
iter 44690: loss 9.5703, time 125.32ms
iter 44700: loss 8.3447, time 125.28ms
iter 44710: loss 9.3526, time 125.14ms
iter 44720: loss 9.2764, time 125.18ms
iter 44730: loss 9.0365, time 125.12ms
iter 44740: loss 9.5025, time 125.58ms
step 44750: train loss 7.6115, val loss 7.6485
saving checkpoint to out-shakespeare-char
iter 44750: loss 8.6939, time 2876.43ms
iter 44760: loss 9.2667, time 125.82ms
iter 44770: loss 9.8018, time 125.97ms
iter 44780: loss 9.5877, time 126.04ms
iter 44790: loss 8.7528, time 125.62ms
iter 44800: loss 9.2234, time 125.83ms
iter 44810: loss 8.7672, time 125.51ms
iter 44820: loss 9.3563, time 125.04ms
iter 44830: loss 9.2520, time 125.95ms
iter 44840: loss 9.0352, time 125.83ms
iter 44850: loss 8.8123, time 128.41ms
iter 44860: loss 8.9711, time 125.73ms
iter 44870: loss 9.1535, time 125.64ms
iter 44880: loss 8.4514, time 126.93ms
iter 44890: loss 8.6081, time 125.85ms
iter 44900: loss 8.7039, time 125.85ms
iter 44910: loss 8.1933, time 125.95ms
iter 44920: loss 9.0399, time 125.67ms
iter 44930: loss 9.1093, time 125.67ms
iter 44940: loss 8.8586, time 126.32ms
iter 44950: loss 8.9216, time 125.66ms
iter 44960: loss 9.1900, time 128.82ms
iter 44970: loss 9.0691, time 125.76ms
iter 44980: loss 9.3940, time 126.00ms
iter 44990: loss 8.7966, time 125.75ms
step 45000: train loss 7.6005, val loss 7.6285
saving checkpoint to out-shakespeare-char
iter 45000: loss 10.0856, time 2902.22ms
iter 45010: loss 8.7069, time 125.61ms
iter 45020: loss 9.5309, time 125.82ms
iter 45030: loss 8.6878, time 124.93ms
iter 45040: loss 9.1340, time 125.49ms
iter 45050: loss 8.3418, time 125.84ms
iter 45060: loss 9.1553, time 125.25ms
iter 45070: loss 9.4031, time 126.17ms
iter 45080: loss 8.7570, time 125.70ms
iter 45090: loss 9.2841, time 125.74ms
iter 45100: loss 8.8573, time 125.67ms
iter 45110: loss 9.3020, time 126.52ms
iter 45120: loss 8.2485, time 125.83ms
iter 45130: loss 8.5690, time 125.84ms
iter 45140: loss 8.7871, time 127.66ms
iter 45150: loss 9.2807, time 125.60ms
iter 45160: loss 9.2609, time 126.01ms
iter 45170: loss 9.3139, time 125.67ms
iter 45180: loss 9.7516, time 125.64ms
iter 45190: loss 8.5328, time 124.97ms
iter 45200: loss 8.4422, time 125.84ms
iter 45210: loss 9.5448, time 125.11ms
iter 45220: loss 8.7750, time 126.12ms
iter 45230: loss 8.8636, time 125.86ms
iter 45240: loss 9.1418, time 125.93ms
step 45250: train loss 7.6161, val loss 7.6253
saving checkpoint to out-shakespeare-char
iter 45250: loss 8.7875, time 2875.14ms
iter 45260: loss 8.9295, time 125.66ms
iter 45270: loss 9.1875, time 124.85ms
iter 45280: loss 9.5629, time 124.95ms
iter 45290: loss 9.2755, time 125.69ms
iter 45300: loss 9.4474, time 126.15ms
iter 45310: loss 8.2850, time 128.33ms
iter 45320: loss 8.8562, time 125.77ms
iter 45330: loss 9.7714, time 125.87ms
iter 45340: loss 8.9794, time 124.61ms
iter 45350: loss 9.4529, time 125.86ms
iter 45360: loss 9.5983, time 125.57ms
iter 45370: loss 9.4397, time 125.81ms
iter 45380: loss 9.6743, time 124.84ms
iter 45390: loss 8.6520, time 125.67ms
iter 45400: loss 9.3428, time 125.72ms
iter 45410: loss 8.8543, time 124.87ms
iter 45420: loss 8.6476, time 128.34ms
iter 45430: loss 9.6089, time 125.85ms
iter 45440: loss 9.1624, time 125.65ms
iter 45450: loss 9.5674, time 124.41ms
iter 45460: loss 9.6081, time 125.51ms
iter 45470: loss 9.7737, time 125.22ms
iter 45480: loss 9.2657, time 124.97ms
iter 45490: loss 9.5047, time 125.57ms
step 45500: train loss 7.5960, val loss 7.6250
saving checkpoint to out-shakespeare-char
iter 45500: loss 8.2626, time 2880.46ms
iter 45510: loss 8.9404, time 124.49ms
iter 45520: loss 9.8716, time 121.81ms
iter 45530: loss 8.9814, time 124.51ms
iter 45540: loss 8.9954, time 121.68ms
iter 45550: loss 9.8386, time 124.51ms
iter 45560: loss 8.9800, time 121.65ms
iter 45570: loss 9.1933, time 124.45ms
iter 45580: loss 9.2429, time 121.49ms
iter 45590: loss 8.7364, time 123.64ms
iter 45600: loss 9.9038, time 121.64ms
iter 45610: loss 9.0578, time 124.68ms
iter 45620: loss 9.3873, time 121.51ms
iter 45630: loss 8.7492, time 124.28ms
iter 45640: loss 8.4970, time 121.27ms
iter 45650: loss 9.2486, time 124.95ms
iter 45660: loss 9.9697, time 122.14ms
iter 45670: loss 9.2531, time 124.73ms
iter 45680: loss 9.1318, time 120.79ms
iter 45690: loss 9.6143, time 124.86ms
iter 45700: loss 9.2811, time 121.74ms
iter 45710: loss 8.5856, time 124.80ms
iter 45720: loss 9.9938, time 121.59ms
iter 45730: loss 8.4345, time 124.21ms
iter 45740: loss 8.8685, time 121.76ms
step 45750: train loss 7.6169, val loss 7.5723
saving checkpoint to out-shakespeare-char
iter 45750: loss 9.3078, time 2913.93ms
iter 45760: loss 8.3854, time 123.11ms
iter 45770: loss 8.8650, time 121.61ms
iter 45780: loss 9.5714, time 122.30ms
iter 45790: loss 8.8701, time 121.61ms
iter 45800: loss 9.1747, time 123.08ms
iter 45810: loss 9.1112, time 121.63ms
iter 45820: loss 9.0791, time 121.95ms
iter 45830: loss 9.0605, time 121.09ms
iter 45840: loss 8.9019, time 122.90ms
iter 45850: loss 9.1858, time 121.73ms
iter 45860: loss 9.1736, time 122.95ms
iter 45870: loss 9.1662, time 120.89ms
iter 45880: loss 8.5555, time 122.05ms
iter 45890: loss 9.4492, time 121.64ms
iter 45900: loss 9.1577, time 123.14ms
iter 45910: loss 8.6691, time 122.11ms
iter 45920: loss 9.4295, time 122.13ms
iter 45930: loss 8.8313, time 121.91ms
iter 45940: loss 8.8234, time 121.54ms
iter 45950: loss 9.2064, time 121.33ms
iter 45960: loss 8.6863, time 122.85ms
iter 45970: loss 9.2304, time 121.77ms
iter 45980: loss 9.0716, time 122.87ms
iter 45990: loss 9.7822, time 120.97ms
step 46000: train loss 7.5908, val loss 7.6536
saving checkpoint to out-shakespeare-char
iter 46000: loss 9.4523, time 2906.71ms
iter 46010: loss 9.9454, time 121.45ms
iter 46020: loss 9.1231, time 120.84ms
iter 46030: loss 8.7099, time 121.44ms
iter 46040: loss 9.6102, time 122.28ms
iter 46050: loss 9.1671, time 121.90ms
iter 46060: loss 9.3324, time 121.83ms
iter 46070: loss 9.3501, time 121.64ms
iter 46080: loss 8.4246, time 121.61ms
iter 46090: loss 8.5602, time 121.65ms
iter 46100: loss 8.4934, time 121.49ms
iter 46110: loss 8.7755, time 121.31ms
iter 46120: loss 9.3411, time 121.57ms
iter 46130: loss 8.4778, time 121.43ms
iter 46140: loss 9.0586, time 121.75ms
iter 46150: loss 9.1798, time 121.32ms
iter 46160: loss 8.4047, time 121.51ms
iter 46170: loss 9.1979, time 121.44ms
iter 46180: loss 9.1964, time 121.15ms
iter 46190: loss 8.9634, time 120.49ms
iter 46200: loss 9.2484, time 121.57ms
iter 46210: loss 8.0890, time 121.64ms
iter 46220: loss 9.0871, time 121.71ms
iter 46230: loss 8.8622, time 121.83ms
iter 46240: loss 8.6220, time 121.86ms
step 46250: train loss 7.5856, val loss 7.5841
saving checkpoint to out-shakespeare-char
iter 46250: loss 8.6126, time 2878.17ms
iter 46260: loss 8.7800, time 120.69ms
iter 46270: loss 9.1893, time 121.42ms
iter 46280: loss 8.8595, time 121.77ms
iter 46290: loss 8.8838, time 121.42ms
iter 46300: loss 9.3659, time 121.75ms
iter 46310: loss 9.0629, time 120.87ms
iter 46320: loss 8.9106, time 121.61ms
iter 46330: loss 9.1049, time 121.53ms
iter 46340: loss 9.6144, time 121.81ms
iter 46350: loss 8.3026, time 121.05ms
iter 46360: loss 9.0042, time 120.66ms
iter 46370: loss 10.1965, time 121.17ms
iter 46380: loss 9.5988, time 121.07ms
iter 46390: loss 9.0652, time 121.68ms
iter 46400: loss 9.7006, time 121.83ms
iter 46410: loss 9.2703, time 121.61ms
iter 46420: loss 9.2982, time 121.69ms
iter 46430: loss 8.8124, time 121.49ms
iter 46440: loss 9.1077, time 121.41ms
iter 46450: loss 9.6270, time 121.32ms
iter 46460: loss 9.1274, time 121.59ms
iter 46470: loss 9.1358, time 121.47ms
iter 46480: loss 8.9102, time 121.47ms
iter 46490: loss 9.3924, time 121.10ms
step 46500: train loss 7.5966, val loss 7.6286
saving checkpoint to out-shakespeare-char
iter 46500: loss 8.5972, time 2902.98ms
iter 46510: loss 9.2712, time 123.00ms
iter 46520: loss 8.2337, time 121.73ms
iter 46530: loss 9.3439, time 122.88ms
iter 46540: loss 8.2195, time 121.57ms
iter 46550: loss 9.5081, time 122.86ms
iter 46560: loss 8.7899, time 121.79ms
iter 46570: loss 9.0278, time 123.03ms
iter 46580: loss 8.9788, time 121.48ms
iter 46590: loss 9.1652, time 122.92ms
iter 46600: loss 8.3352, time 121.68ms
iter 46610: loss 8.7364, time 123.30ms
iter 46620: loss 8.0684, time 121.45ms
iter 46630: loss 9.2431, time 123.25ms
iter 46640: loss 8.4050, time 121.72ms
iter 46650: loss 9.0638, time 123.00ms
iter 46660: loss 9.4099, time 121.63ms
iter 46670: loss 9.2151, time 122.73ms
iter 46680: loss 9.3882, time 121.55ms
iter 46690: loss 8.8987, time 123.08ms
iter 46700: loss 8.8116, time 121.56ms
iter 46710: loss 9.5216, time 123.09ms
iter 46720: loss 9.0530, time 121.67ms
iter 46730: loss 9.1642, time 122.87ms
iter 46740: loss 9.3725, time 121.64ms
step 46750: train loss 7.6229, val loss 7.5816
saving checkpoint to out-shakespeare-char
iter 46750: loss 9.1519, time 2906.41ms
iter 46760: loss 9.3347, time 122.20ms
iter 46770: loss 8.6219, time 124.63ms
iter 46780: loss 9.3004, time 121.37ms
iter 46790: loss 8.8741, time 123.99ms
iter 46800: loss 9.4333, time 121.64ms
iter 46810: loss 9.5064, time 124.64ms
iter 46820: loss 8.3474, time 121.74ms
iter 46830: loss 9.0413, time 124.85ms
iter 46840: loss 8.5690, time 121.85ms
iter 46850: loss 8.7667, time 124.87ms
iter 46860: loss 9.5738, time 121.99ms
iter 46870: loss 9.0261, time 124.61ms
iter 46880: loss 9.4532, time 121.60ms
iter 46890: loss 8.4838, time 125.01ms
iter 46900: loss 9.1294, time 122.24ms
iter 46910: loss 9.1699, time 124.57ms
iter 46920: loss 9.2485, time 121.45ms
iter 46930: loss 8.2492, time 124.52ms
iter 46940: loss 9.0804, time 121.51ms
iter 46950: loss 8.7116, time 124.55ms
iter 46960: loss 9.5007, time 121.60ms
iter 46970: loss 9.3966, time 124.75ms
iter 46980: loss 8.2112, time 121.64ms
iter 46990: loss 9.5302, time 124.60ms
step 47000: train loss 7.6184, val loss 7.6240
saving checkpoint to out-shakespeare-char
iter 47000: loss 9.0077, time 2905.22ms
iter 47010: loss 8.8607, time 121.60ms
iter 47020: loss 8.7454, time 123.22ms
iter 47030: loss 8.9281, time 121.32ms
iter 47040: loss 9.3895, time 122.89ms
iter 47050: loss 8.8167, time 121.66ms
iter 47060: loss 9.1547, time 122.92ms
iter 47070: loss 9.2613, time 121.39ms
iter 47080: loss 9.0952, time 122.83ms
iter 47090: loss 9.3171, time 121.47ms
iter 47100: loss 8.7651, time 122.76ms
iter 47110: loss 9.1457, time 121.70ms
iter 47120: loss 9.3366, time 123.43ms
iter 47130: loss 8.9668, time 121.60ms
iter 47140: loss 9.7437, time 122.81ms
iter 47150: loss 9.0218, time 121.38ms
iter 47160: loss 8.8852, time 122.41ms
iter 47170: loss 8.9296, time 121.38ms
iter 47180: loss 8.8879, time 123.05ms
iter 47190: loss 9.2967, time 121.35ms
iter 47200: loss 9.0645, time 122.76ms
iter 47210: loss 9.3644, time 121.71ms
iter 47220: loss 8.6717, time 122.83ms
iter 47230: loss 9.5106, time 121.18ms
iter 47240: loss 8.9750, time 122.94ms
step 47250: train loss 7.6057, val loss 7.6052
saving checkpoint to out-shakespeare-char
iter 47250: loss 8.9861, time 2904.79ms
iter 47260: loss 8.8832, time 123.88ms
iter 47270: loss 8.4192, time 121.23ms
iter 47280: loss 8.9149, time 125.58ms
iter 47290: loss 8.9963, time 121.70ms
iter 47300: loss 8.2675, time 124.47ms
iter 47310: loss 8.9086, time 119.79ms
iter 47320: loss 8.5779, time 124.56ms
iter 47330: loss 9.2741, time 125.99ms
iter 47340: loss 9.4190, time 125.69ms
iter 47350: loss 9.0720, time 125.61ms
iter 47360: loss 9.4046, time 125.59ms
iter 47370: loss 8.8696, time 125.53ms
iter 47380: loss 8.8130, time 125.59ms
iter 47390: loss 8.7792, time 125.35ms
iter 47400: loss 9.1425, time 128.78ms
iter 47410: loss 9.3089, time 125.88ms
iter 47420: loss 9.4019, time 125.85ms
iter 47430: loss 8.5272, time 125.15ms
iter 47440: loss 9.2147, time 125.65ms
iter 47450: loss 9.7717, time 125.98ms
iter 47460: loss 9.4460, time 125.83ms
iter 47470: loss 8.5222, time 128.45ms
iter 47480: loss 8.7848, time 126.26ms
iter 47490: loss 9.3788, time 125.43ms
step 47500: train loss 7.5850, val loss 7.5923
saving checkpoint to out-shakespeare-char
iter 47500: loss 8.8263, time 2910.76ms
iter 47510: loss 8.5700, time 125.77ms
iter 47520: loss 9.0097, time 125.83ms
iter 47530: loss 8.8757, time 126.00ms
iter 47540: loss 9.4807, time 125.59ms
iter 47550: loss 9.1326, time 125.77ms
iter 47560: loss 9.2630, time 126.33ms
iter 47570: loss 9.6438, time 125.76ms
iter 47580: loss 9.6006, time 125.62ms
iter 47590: loss 9.7046, time 125.75ms
iter 47600: loss 8.8511, time 126.02ms
iter 47610: loss 8.9624, time 127.68ms
iter 47620: loss 8.7406, time 125.65ms
iter 47630: loss 9.2158, time 125.75ms
iter 47640: loss 9.3980, time 125.63ms
iter 47650: loss 8.9103, time 125.37ms
iter 47660: loss 9.1912, time 125.83ms
iter 47670: loss 9.0411, time 125.96ms
iter 47680: loss 9.3497, time 125.52ms
iter 47690: loss 9.3145, time 125.99ms
iter 47700: loss 9.3674, time 125.64ms
iter 47710: loss 9.3126, time 125.80ms
iter 47720: loss 8.7271, time 128.71ms
iter 47730: loss 8.9609, time 125.57ms
iter 47740: loss 8.9398, time 125.22ms
step 47750: train loss 7.5641, val loss 7.5703
saving checkpoint to out-shakespeare-char
iter 47750: loss 8.7889, time 2911.25ms
iter 47760: loss 9.2772, time 126.47ms
iter 47770: loss 8.4677, time 125.86ms
iter 47780: loss 8.7239, time 125.53ms
iter 47790: loss 9.3693, time 125.34ms
iter 47800: loss 9.4179, time 125.51ms
iter 47810: loss 8.6671, time 125.69ms
iter 47820: loss 8.9816, time 125.78ms
iter 47830: loss 9.0233, time 126.60ms
iter 47840: loss 9.0229, time 126.25ms
iter 47850: loss 9.5670, time 126.35ms
iter 47860: loss 9.2869, time 125.64ms
iter 47870: loss 8.4559, time 125.43ms
iter 47880: loss 8.8074, time 125.89ms
iter 47890: loss 9.3930, time 126.05ms
iter 47900: loss 7.7403, time 125.80ms
iter 47910: loss 8.6787, time 125.80ms
iter 47920: loss 8.9161, time 125.82ms
iter 47930: loss 8.9436, time 126.14ms
iter 47940: loss 8.8468, time 126.64ms
iter 47950: loss 9.5331, time 125.93ms
iter 47960: loss 9.3755, time 126.42ms
iter 47970: loss 8.8518, time 126.21ms
iter 47980: loss 8.9512, time 130.70ms
iter 47990: loss 8.6734, time 125.77ms
step 48000: train loss 7.5592, val loss 7.5303
saving checkpoint to out-shakespeare-char
iter 48000: loss 8.3214, time 2900.53ms
iter 48010: loss 8.8968, time 129.04ms
iter 48020: loss 9.0420, time 125.11ms
iter 48030: loss 9.0360, time 125.80ms
iter 48040: loss 8.6098, time 125.81ms
iter 48050: loss 9.0989, time 125.72ms
iter 48060: loss 8.5806, time 126.00ms
iter 48070: loss 8.9660, time 125.62ms
iter 48080: loss 9.4184, time 127.09ms
iter 48090: loss 9.6604, time 125.58ms
iter 48100: loss 8.9206, time 125.87ms
iter 48110: loss 8.9604, time 125.85ms
iter 48120: loss 9.4377, time 124.78ms
iter 48130: loss 8.9203, time 125.66ms
iter 48140: loss 8.6986, time 124.83ms
iter 48150: loss 8.6189, time 127.16ms
iter 48160: loss 9.0200, time 127.81ms
iter 48170: loss 9.1758, time 125.60ms
iter 48180: loss 9.0845, time 125.64ms
iter 48190: loss 8.9489, time 125.85ms
iter 48200: loss 9.4385, time 128.93ms
iter 48210: loss 8.5535, time 125.95ms
iter 48220: loss 9.6336, time 126.02ms
iter 48230: loss 8.9488, time 125.98ms
iter 48240: loss 9.3646, time 125.89ms
step 48250: train loss 7.5612, val loss 7.5256
saving checkpoint to out-shakespeare-char
iter 48250: loss 8.6090, time 2904.94ms
iter 48260: loss 8.8191, time 121.65ms
iter 48270: loss 8.8626, time 121.98ms
iter 48280: loss 8.9051, time 121.47ms
iter 48290: loss 8.5908, time 120.89ms
iter 48300: loss 8.8506, time 121.85ms
iter 48310: loss 9.5843, time 120.62ms
iter 48320: loss 8.3814, time 121.77ms
iter 48330: loss 8.8631, time 121.78ms
iter 48340: loss 8.8991, time 121.55ms
iter 48350: loss 9.1091, time 121.78ms
iter 48360: loss 9.3940, time 121.63ms
iter 48370: loss 8.7008, time 121.77ms
iter 48380: loss 9.2816, time 120.36ms
iter 48390: loss 9.5268, time 121.64ms
iter 48400: loss 8.6571, time 120.41ms
iter 48410: loss 9.4641, time 121.71ms
iter 48420: loss 8.8689, time 121.42ms
iter 48430: loss 8.4936, time 121.42ms
iter 48440: loss 9.5898, time 121.62ms
iter 48450: loss 9.7154, time 121.09ms
iter 48460: loss 9.3909, time 121.42ms
iter 48470: loss 9.3908, time 121.49ms
iter 48480: loss 8.4875, time 122.00ms
iter 48490: loss 9.4917, time 121.59ms
step 48500: train loss 7.5660, val loss 7.5142
saving checkpoint to out-shakespeare-char
iter 48500: loss 8.7080, time 2881.04ms
iter 48510: loss 8.9414, time 125.31ms
iter 48520: loss 9.0047, time 125.28ms
iter 48530: loss 9.0905, time 125.26ms
iter 48540: loss 8.9596, time 125.18ms
iter 48550: loss 9.0049, time 124.50ms
iter 48560: loss 8.6344, time 125.77ms
iter 48570: loss 8.3532, time 129.21ms
iter 48580: loss 9.6674, time 127.33ms
iter 48590: loss 9.0783, time 125.90ms
iter 48600: loss 9.0989, time 124.17ms
iter 48610: loss 8.8983, time 125.63ms
iter 48620: loss 8.6936, time 125.48ms
iter 48630: loss 9.1641, time 125.12ms
iter 48640: loss 9.4307, time 124.87ms
iter 48650: loss 8.5114, time 125.23ms
iter 48660: loss 9.1108, time 125.20ms
iter 48670: loss 8.9781, time 125.06ms
iter 48680: loss 9.1073, time 128.42ms
iter 48690: loss 9.4722, time 125.37ms
iter 48700: loss 9.0276, time 124.27ms
iter 48710: loss 8.9191, time 124.33ms
iter 48720: loss 8.8286, time 125.04ms
iter 48730: loss 8.3312, time 125.08ms
iter 48740: loss 9.4162, time 125.12ms
step 48750: train loss 7.5387, val loss 7.4899
saving checkpoint to out-shakespeare-char
iter 48750: loss 9.0686, time 2876.19ms
iter 48760: loss 8.4288, time 125.01ms
iter 48770: loss 8.8362, time 123.81ms
iter 48780: loss 9.2763, time 126.23ms
iter 48790: loss 8.6134, time 125.13ms
iter 48800: loss 9.3945, time 124.91ms
iter 48810: loss 8.7896, time 124.01ms
iter 48820: loss 8.5704, time 124.79ms
iter 48830: loss 8.8572, time 124.77ms
iter 48840: loss 9.6981, time 126.08ms
iter 48850: loss 8.9696, time 123.81ms
iter 48860: loss 9.0648, time 124.63ms
iter 48870: loss 9.0371, time 124.86ms
iter 48880: loss 9.1826, time 127.79ms
iter 48890: loss 9.7124, time 124.62ms
iter 48900: loss 9.3254, time 124.84ms
iter 48910: loss 9.0727, time 124.80ms
iter 48920: loss 9.4975, time 125.07ms
iter 48930: loss 9.0463, time 125.01ms
iter 48940: loss 8.9340, time 124.89ms
iter 48950: loss 8.9376, time 124.36ms
iter 48960: loss 8.6372, time 125.43ms
iter 48970: loss 8.6974, time 125.01ms
iter 48980: loss 8.5480, time 124.95ms
iter 48990: loss 9.1056, time 128.35ms
step 49000: train loss 7.5293, val loss 7.5466
saving checkpoint to out-shakespeare-char
iter 49000: loss 8.9747, time 2885.53ms
iter 49010: loss 9.2942, time 124.63ms
iter 49020: loss 8.5301, time 128.45ms
iter 49030: loss 8.5975, time 125.34ms
iter 49040: loss 8.5514, time 125.36ms
iter 49050: loss 8.4551, time 124.88ms
iter 49060: loss 9.1243, time 125.06ms
iter 49070: loss 9.0586, time 124.34ms
iter 49080: loss 8.3109, time 125.12ms
iter 49090: loss 8.8545, time 124.03ms
iter 49100: loss 9.4545, time 125.12ms
iter 49110: loss 9.5123, time 124.59ms
iter 49120: loss 8.2455, time 125.16ms
iter 49130: loss 8.8037, time 128.20ms
iter 49140: loss 9.5667, time 124.97ms
iter 49150: loss 9.2701, time 125.43ms
iter 49160: loss 8.6122, time 124.42ms
iter 49170: loss 9.7318, time 124.95ms
iter 49180: loss 8.4872, time 124.78ms
iter 49190: loss 8.4429, time 125.24ms
iter 49200: loss 8.8935, time 124.22ms
iter 49210: loss 9.2567, time 125.90ms
iter 49220: loss 9.3543, time 125.18ms
iter 49230: loss 8.5120, time 125.45ms
iter 49240: loss 9.0051, time 127.97ms
step 49250: train loss 7.5302, val loss 7.4743
saving checkpoint to out-shakespeare-char
iter 49250: loss 9.0254, time 2907.43ms
iter 49260: loss 9.1394, time 128.31ms
iter 49270: loss 8.6750, time 125.88ms
iter 49280: loss 8.5892, time 125.04ms
iter 49290: loss 8.5822, time 125.58ms
iter 49300: loss 9.3575, time 125.03ms
iter 49310: loss 9.4869, time 125.44ms
iter 49320: loss 8.6892, time 125.08ms
iter 49330: loss 8.6179, time 124.94ms
iter 49340: loss 8.7134, time 125.09ms
iter 49350: loss 9.4114, time 125.29ms
iter 49360: loss 8.6223, time 124.87ms
iter 49370: loss 9.3116, time 127.94ms
iter 49380: loss 9.1443, time 125.06ms
iter 49390: loss 9.0575, time 124.88ms
iter 49400: loss 9.2240, time 125.09ms
iter 49410: loss 8.1536, time 124.99ms
iter 49420: loss 9.2541, time 125.18ms
iter 49430: loss 9.4767, time 125.17ms
iter 49440: loss 8.7919, time 124.44ms
iter 49450: loss 9.3911, time 125.13ms
iter 49460: loss 9.1753, time 125.04ms
iter 49470: loss 9.0425, time 125.21ms
iter 49480: loss 9.1116, time 128.25ms
iter 49490: loss 10.0359, time 124.25ms
step 49500: train loss 7.5198, val loss 7.5024
saving checkpoint to out-shakespeare-char
iter 49500: loss 9.3879, time 2902.39ms
iter 49510: loss 8.4690, time 119.95ms
iter 49520: loss 9.1515, time 120.49ms
iter 49530: loss 8.9636, time 119.62ms
iter 49540: loss 9.1775, time 119.82ms
iter 49550: loss 9.2289, time 119.52ms
iter 49560: loss 9.0371, time 119.43ms
iter 49570: loss 8.6470, time 119.58ms
iter 49580: loss 9.6449, time 121.09ms
iter 49590: loss 8.4493, time 120.71ms
iter 49600: loss 9.1066, time 120.58ms
iter 49610: loss 9.3082, time 119.55ms
iter 49620: loss 9.4191, time 121.28ms
iter 49630: loss 9.3872, time 119.56ms
iter 49640: loss 8.7007, time 121.61ms
iter 49650: loss 9.2741, time 120.42ms
iter 49660: loss 9.2124, time 120.59ms
iter 49670: loss 9.1216, time 119.40ms
iter 49680: loss 8.7420, time 120.83ms
iter 49690: loss 8.9588, time 120.62ms
iter 49700: loss 8.4503, time 120.59ms
iter 49710: loss 8.5082, time 119.51ms
iter 49720: loss 9.2360, time 120.67ms
iter 49730: loss 9.1580, time 119.64ms
iter 49740: loss 9.2115, time 120.66ms
step 49750: train loss 7.4881, val loss 7.4812
saving checkpoint to out-shakespeare-char
iter 49750: loss 8.7401, time 2904.21ms
iter 49760: loss 8.7156, time 119.88ms
iter 49770: loss 9.3285, time 119.64ms
iter 49780: loss 9.2914, time 119.80ms
iter 49790: loss 8.6593, time 119.70ms
iter 49800: loss 8.8856, time 120.90ms
iter 49810: loss 9.0091, time 119.36ms
iter 49820: loss 8.6435, time 119.61ms
iter 49830: loss 8.1572, time 120.36ms
iter 49840: loss 8.3222, time 119.59ms
iter 49850: loss 9.7024, time 119.44ms
iter 49860: loss 8.7122, time 120.99ms
iter 49870: loss 8.7270, time 119.51ms
iter 49880: loss 9.1334, time 119.72ms
iter 49890: loss 8.5767, time 120.34ms
iter 49900: loss 8.3456, time 120.51ms
iter 49910: loss 9.3212, time 119.64ms
iter 49920: loss 9.2948, time 119.83ms
iter 49930: loss 8.4396, time 119.50ms
iter 49940: loss 9.4639, time 119.85ms
iter 49950: loss 9.8201, time 119.99ms
iter 49960: loss 8.9481, time 120.07ms
iter 49970: loss 8.8162, time 121.44ms
iter 49980: loss 8.2643, time 121.16ms
iter 49990: loss 9.0237, time 122.46ms
step 50000: train loss 7.4937, val loss 7.4800
saving checkpoint to out-shakespeare-char
iter 50000: loss 8.3481, time 2893.97ms
iter 50010: loss 9.5909, time 121.98ms
iter 50020: loss 8.7438, time 121.89ms
iter 50030: loss 8.7742, time 121.72ms
iter 50040: loss 9.1237, time 121.20ms
iter 50050: loss 9.0116, time 121.49ms
iter 50060: loss 8.7141, time 120.96ms
iter 50070: loss 8.7915, time 121.97ms
iter 50080: loss 8.6125, time 121.75ms
iter 50090: loss 9.1695, time 121.36ms
iter 50100: loss 8.8428, time 121.87ms
iter 50110: loss 9.3335, time 121.66ms
iter 50120: loss 9.3122, time 121.48ms
iter 50130: loss 8.2080, time 121.78ms
iter 50140: loss 8.5039, time 122.04ms
iter 50150: loss 9.0040, time 121.78ms
iter 50160: loss 8.5175, time 121.93ms
iter 50170: loss 8.8854, time 121.57ms
iter 50180: loss 8.3994, time 121.61ms
iter 50190: loss 9.5890, time 122.20ms
iter 50200: loss 8.9650, time 121.85ms
iter 50210: loss 9.0320, time 121.62ms
iter 50220: loss 9.0379, time 121.05ms
iter 50230: loss 9.6426, time 121.71ms
iter 50240: loss 9.2434, time 121.61ms
step 50250: train loss 7.5258, val loss 7.4697
saving checkpoint to out-shakespeare-char
iter 50250: loss 9.4919, time 2909.78ms
iter 50260: loss 9.4264, time 126.02ms
iter 50270: loss 8.5127, time 125.85ms
iter 50280: loss 8.3762, time 126.13ms
iter 50290: loss 9.1798, time 126.10ms
iter 50300: loss 9.0988, time 125.74ms
iter 50310: loss 8.7789, time 125.45ms
iter 50320: loss 9.0609, time 126.36ms
iter 50330: loss 8.9679, time 126.25ms
iter 50340: loss 9.0327, time 125.87ms
iter 50350: loss 9.2735, time 125.28ms
iter 50360: loss 8.7798, time 128.46ms
iter 50370: loss 8.6049, time 125.31ms
iter 50380: loss 8.3451, time 125.63ms
iter 50390: loss 9.2309, time 125.60ms
iter 50400: loss 9.1762, time 124.39ms
iter 50410: loss 8.6710, time 125.34ms
iter 50420: loss 8.9793, time 125.48ms
iter 50430: loss 9.1702, time 124.13ms
iter 50440: loss 8.8905, time 126.04ms
iter 50450: loss 9.5065, time 125.33ms
iter 50460: loss 8.9620, time 125.81ms
iter 50470: loss 9.4274, time 129.05ms
iter 50480: loss 8.9620, time 126.17ms
iter 50490: loss 8.2945, time 126.00ms
step 50500: train loss 7.5353, val loss 7.5053
saving checkpoint to out-shakespeare-char
iter 50500: loss 9.0603, time 2892.53ms
iter 50510: loss 8.7776, time 126.07ms
iter 50520: loss 8.8482, time 125.95ms
iter 50530: loss 8.6984, time 125.57ms
iter 50540: loss 9.3417, time 125.44ms
iter 50550: loss 8.6619, time 125.71ms
iter 50560: loss 8.9463, time 125.40ms
iter 50570: loss 8.8323, time 125.10ms
iter 50580: loss 8.7919, time 125.22ms
iter 50590: loss 8.9410, time 124.94ms
iter 50600: loss 8.9840, time 128.20ms
iter 50610: loss 8.9052, time 125.26ms
iter 50620: loss 9.7004, time 125.17ms
iter 50630: loss 8.4070, time 125.69ms
iter 50640: loss 8.9204, time 125.28ms
iter 50650: loss 8.4978, time 125.17ms
iter 50660: loss 9.1887, time 126.91ms
iter 50670: loss 9.8975, time 124.84ms
iter 50680: loss 9.1957, time 125.78ms
iter 50690: loss 9.5148, time 125.45ms
iter 50700: loss 8.0876, time 125.72ms
iter 50710: loss 8.9045, time 125.83ms
iter 50720: loss 8.0454, time 125.41ms
iter 50730: loss 9.0780, time 125.69ms
iter 50740: loss 7.9258, time 125.72ms
step 50750: train loss 7.4956, val loss 7.5056
saving checkpoint to out-shakespeare-char
iter 50750: loss 9.1041, time 2900.19ms
iter 50760: loss 8.8203, time 121.75ms
iter 50770: loss 8.4470, time 121.28ms
iter 50780: loss 8.8071, time 121.60ms
iter 50790: loss 9.2143, time 121.35ms
iter 50800: loss 8.6488, time 121.44ms
iter 50810: loss 9.1067, time 121.36ms
iter 50820: loss 8.2694, time 121.27ms
iter 50830: loss 9.2269, time 120.88ms
iter 50840: loss 9.3307, time 121.08ms
iter 50850: loss 9.4397, time 122.14ms
iter 50860: loss 8.7482, time 121.41ms
iter 50870: loss 8.1359, time 121.37ms
iter 50880: loss 8.8558, time 121.48ms
iter 50890: loss 9.0280, time 121.47ms
iter 50900: loss 9.0720, time 121.37ms
iter 50910: loss 8.9096, time 121.55ms
iter 50920: loss 9.5848, time 121.51ms
iter 50930: loss 9.3717, time 121.26ms
iter 50940: loss 8.3578, time 121.51ms
iter 50950: loss 9.0634, time 121.99ms
iter 50960: loss 9.0518, time 121.60ms
iter 50970: loss 9.1202, time 121.83ms
iter 50980: loss 8.9373, time 121.28ms
iter 50990: loss 9.2053, time 121.34ms
step 51000: train loss 7.4842, val loss 7.4816
saving checkpoint to out-shakespeare-char
iter 51000: loss 8.9078, time 2886.07ms
iter 51010: loss 8.2078, time 121.53ms
iter 51020: loss 8.9081, time 124.59ms
iter 51030: loss 9.0796, time 121.53ms
iter 51040: loss 8.5237, time 124.18ms
iter 51050: loss 9.0937, time 121.40ms
iter 51060: loss 9.2547, time 125.09ms
iter 51070: loss 8.2871, time 121.53ms
iter 51080: loss 9.1235, time 124.33ms
iter 51090: loss 9.3390, time 121.46ms
iter 51100: loss 8.7820, time 123.88ms
iter 51110: loss 9.4088, time 121.67ms
iter 51120: loss 8.6593, time 124.20ms
iter 51130: loss 8.9245, time 121.51ms
iter 51140: loss 8.6733, time 124.17ms
iter 51150: loss 8.8409, time 121.20ms
iter 51160: loss 8.7288, time 124.42ms
iter 51170: loss 9.4928, time 121.48ms
iter 51180: loss 9.0989, time 124.47ms
iter 51190: loss 8.9740, time 121.61ms
iter 51200: loss 9.2580, time 123.72ms
iter 51210: loss 8.7516, time 121.54ms
iter 51220: loss 8.6524, time 124.73ms
iter 51230: loss 9.1508, time 121.83ms
iter 51240: loss 8.9282, time 124.97ms
step 51250: train loss 7.5025, val loss 7.4140
saving checkpoint to out-shakespeare-char
iter 51250: loss 9.4097, time 2883.71ms
iter 51260: loss 9.5283, time 121.52ms
iter 51270: loss 8.7598, time 124.17ms
iter 51280: loss 9.1612, time 121.47ms
iter 51290: loss 9.4382, time 124.38ms
iter 51300: loss 9.0622, time 121.72ms
iter 51310: loss 9.0953, time 124.21ms
iter 51320: loss 8.7792, time 121.40ms
iter 51330: loss 8.5041, time 124.47ms
iter 51340: loss 8.6440, time 121.44ms
iter 51350: loss 9.4166, time 124.59ms
iter 51360: loss 9.3376, time 121.39ms
iter 51370: loss 9.0259, time 124.18ms
iter 51380: loss 9.3334, time 121.34ms
iter 51390: loss 8.6403, time 124.34ms
iter 51400: loss 9.2232, time 121.28ms
iter 51410: loss 9.2145, time 124.10ms
iter 51420: loss 8.1573, time 121.41ms
iter 51430: loss 8.5043, time 124.28ms
iter 51440: loss 8.3645, time 121.26ms
iter 51450: loss 8.9308, time 124.25ms
iter 51460: loss 8.7176, time 121.30ms
iter 51470: loss 9.3753, time 124.29ms
iter 51480: loss 8.9972, time 121.35ms
iter 51490: loss 8.8279, time 124.07ms
step 51500: train loss 7.4951, val loss 7.5142
saving checkpoint to out-shakespeare-char
iter 51500: loss 8.8132, time 2881.13ms
iter 51510: loss 8.5373, time 125.51ms
iter 51520: loss 8.8326, time 125.20ms
iter 51530: loss 9.7476, time 125.25ms
iter 51540: loss 8.9043, time 125.46ms
iter 51550: loss 8.6955, time 125.16ms
iter 51560: loss 8.5050, time 125.03ms
iter 51570: loss 9.1552, time 125.33ms
iter 51580: loss 9.0195, time 125.36ms
iter 51590: loss 8.9070, time 125.36ms
iter 51600: loss 8.8151, time 125.51ms
iter 51610: loss 8.6926, time 128.11ms
iter 51620: loss 8.8961, time 125.28ms
iter 51630: loss 8.9350, time 125.32ms
iter 51640: loss 8.9448, time 125.44ms
iter 51650: loss 8.6250, time 124.11ms
iter 51660: loss 8.6163, time 125.27ms
iter 51670: loss 8.3185, time 125.14ms
iter 51680: loss 9.0502, time 124.63ms
iter 51690: loss 9.2299, time 124.51ms
iter 51700: loss 8.2047, time 124.49ms
iter 51710: loss 8.9780, time 125.13ms
iter 51720: loss 9.0205, time 125.51ms
iter 51730: loss 8.8046, time 125.42ms
iter 51740: loss 9.5452, time 124.91ms
step 51750: train loss 7.4364, val loss 7.4727
saving checkpoint to out-shakespeare-char
iter 51750: loss 8.6907, time 2895.41ms
iter 51760: loss 9.1705, time 127.44ms
iter 51770: loss 8.5502, time 121.62ms
iter 51780: loss 9.0698, time 125.57ms
iter 51790: loss 9.0702, time 125.84ms
iter 51800: loss 8.5608, time 126.00ms
iter 51810: loss 9.3311, time 126.67ms
iter 51820: loss 8.6014, time 125.69ms
iter 51830: loss 9.2221, time 125.90ms
iter 51840: loss 10.1164, time 125.81ms
iter 51850: loss 8.6074, time 126.25ms
iter 51860: loss 9.3527, time 125.77ms
iter 51870: loss 8.6948, time 125.23ms
iter 51880: loss 9.4708, time 125.95ms
iter 51890: loss 9.3451, time 128.80ms
iter 51900: loss 9.0370, time 125.59ms
iter 51910: loss 9.6396, time 125.73ms
iter 51920: loss 8.6754, time 125.90ms
iter 51930: loss 8.9925, time 125.63ms
iter 51940: loss 9.2852, time 125.65ms
iter 51950: loss 8.1293, time 125.99ms
iter 51960: loss 8.5519, time 125.53ms
iter 51970: loss 9.1255, time 125.03ms
iter 51980: loss 9.2008, time 125.47ms
iter 51990: loss 8.5247, time 125.88ms
step 52000: train loss 7.4657, val loss 7.4946
saving checkpoint to out-shakespeare-char
iter 52000: loss 8.2990, time 2909.35ms
iter 52010: loss 9.1740, time 125.73ms
iter 52020: loss 8.8110, time 125.76ms
iter 52030: loss 9.1682, time 124.89ms
iter 52040: loss 8.4509, time 125.48ms
iter 52050: loss 8.7985, time 125.82ms
iter 52060: loss 9.0779, time 125.90ms
iter 52070: loss 8.6104, time 128.64ms
iter 52080: loss 8.3575, time 125.55ms
iter 52090: loss 8.6240, time 125.81ms
iter 52100: loss 8.7580, time 125.80ms
iter 52110: loss 8.5425, time 125.50ms
iter 52120: loss 9.2254, time 125.77ms
iter 52130: loss 8.7069, time 124.98ms
iter 52140: loss 8.5154, time 125.72ms
iter 52150: loss 8.6612, time 125.86ms
iter 52160: loss 8.9234, time 125.64ms
iter 52170: loss 9.1304, time 125.70ms
iter 52180: loss 9.1165, time 128.93ms
iter 52190: loss 9.2695, time 125.63ms
iter 52200: loss 9.1620, time 125.50ms
iter 52210: loss 8.8409, time 125.91ms
iter 52220: loss 9.0929, time 125.26ms
iter 52230: loss 9.4209, time 125.92ms
iter 52240: loss 9.1889, time 119.57ms
step 52250: train loss 7.4306, val loss 7.4047
saving checkpoint to out-shakespeare-char
iter 52250: loss 8.2725, time 2884.91ms
iter 52260: loss 8.6380, time 119.68ms
iter 52270: loss 8.7665, time 119.61ms
iter 52280: loss 8.6214, time 119.99ms
iter 52290: loss 8.9939, time 119.54ms
iter 52300: loss 9.0367, time 119.75ms
iter 52310: loss 9.2194, time 119.60ms
iter 52320: loss 8.5693, time 120.94ms
iter 52330: loss 9.1942, time 119.92ms
iter 52340: loss 8.7698, time 119.49ms
iter 52350: loss 9.0328, time 119.65ms
iter 52360: loss 8.9491, time 119.58ms
iter 52370: loss 8.4170, time 119.52ms
iter 52380: loss 9.0677, time 119.68ms
iter 52390: loss 8.9878, time 119.78ms
iter 52400: loss 8.9392, time 119.58ms
iter 52410: loss 8.7474, time 119.53ms
iter 52420: loss 8.3865, time 120.17ms
iter 52430: loss 8.8580, time 119.64ms
iter 52440: loss 8.8868, time 119.78ms
iter 52450: loss 9.7534, time 119.58ms
iter 52460: loss 9.1403, time 119.84ms
iter 52470: loss 8.8334, time 119.62ms
iter 52480: loss 8.7135, time 121.10ms
iter 52490: loss 8.8133, time 120.62ms
step 52500: train loss 7.4543, val loss 7.4798
saving checkpoint to out-shakespeare-char
iter 52500: loss 8.9336, time 2879.71ms
iter 52510: loss 9.0894, time 125.11ms
iter 52520: loss 8.7620, time 121.81ms
iter 52530: loss 7.8247, time 124.34ms
iter 52540: loss 9.3657, time 121.63ms
iter 52550: loss 8.3469, time 124.88ms
iter 52560: loss 9.0779, time 122.08ms
iter 52570: loss 8.6217, time 124.38ms
iter 52580: loss 8.8877, time 121.55ms
iter 52590: loss 9.0342, time 124.54ms
iter 52600: loss 9.3615, time 121.61ms
iter 52610: loss 8.7933, time 124.27ms
iter 52620: loss 8.6039, time 121.49ms
iter 52630: loss 8.6861, time 124.65ms
iter 52640: loss 8.9626, time 121.69ms
iter 52650: loss 8.4946, time 125.51ms
iter 52660: loss 8.9900, time 121.64ms
iter 52670: loss 8.5233, time 124.44ms
iter 52680: loss 8.3740, time 121.59ms
iter 52690: loss 8.2710, time 124.24ms
iter 52700: loss 9.2367, time 121.65ms
iter 52710: loss 9.1517, time 124.69ms
iter 52720: loss 8.6226, time 120.64ms
iter 52730: loss 9.2848, time 124.34ms
iter 52740: loss 8.9073, time 122.11ms
step 52750: train loss 7.4564, val loss 7.4761
saving checkpoint to out-shakespeare-char
iter 52750: loss 9.5671, time 2884.55ms
iter 52760: loss 8.9524, time 121.10ms
iter 52770: loss 8.9348, time 120.04ms
iter 52780: loss 9.5147, time 121.41ms
iter 52790: loss 8.3344, time 119.83ms
iter 52800: loss 8.5598, time 121.12ms
iter 52810: loss 9.2994, time 119.93ms
iter 52820: loss 8.6137, time 121.06ms
iter 52830: loss 9.2729, time 120.03ms
iter 52840: loss 9.5074, time 120.98ms
iter 52850: loss 8.7453, time 121.09ms
iter 52860: loss 9.0675, time 121.26ms
iter 52870: loss 9.4281, time 120.07ms
iter 52880: loss 8.8827, time 121.14ms
iter 52890: loss 8.7514, time 120.15ms
iter 52900: loss 9.1552, time 121.34ms
iter 52910: loss 8.6374, time 119.90ms
iter 52920: loss 9.0107, time 121.32ms
iter 52930: loss 8.6204, time 120.06ms
iter 52940: loss 9.2509, time 121.08ms
iter 52950: loss 8.0199, time 121.28ms
iter 52960: loss 9.3083, time 121.91ms
iter 52970: loss 8.2979, time 120.15ms
iter 52980: loss 8.7584, time 121.04ms
iter 52990: loss 9.6884, time 119.92ms
step 53000: train loss 7.4509, val loss 7.4422
saving checkpoint to out-shakespeare-char
iter 53000: loss 8.4147, time 2890.34ms
iter 53010: loss 8.9123, time 120.60ms
iter 53020: loss 9.3051, time 120.51ms
iter 53030: loss 8.9785, time 120.66ms
iter 53040: loss 8.5634, time 119.64ms
iter 53050: loss 8.4308, time 121.04ms
iter 53060: loss 9.1778, time 119.86ms
iter 53070: loss 9.1136, time 120.73ms
iter 53080: loss 9.3620, time 119.55ms
iter 53090: loss 8.9954, time 120.45ms
iter 53100: loss 8.7235, time 121.46ms
iter 53110: loss 8.9304, time 120.83ms
iter 53120: loss 9.4452, time 119.51ms
iter 53130: loss 9.2868, time 121.26ms
iter 53140: loss 9.3035, time 119.50ms
iter 53150: loss 8.8468, time 122.01ms
iter 53160: loss 9.7535, time 119.91ms
iter 53170: loss 8.3014, time 121.78ms
iter 53180: loss 8.6273, time 120.08ms
iter 53190: loss 8.4465, time 120.87ms
iter 53200: loss 8.9209, time 119.55ms
iter 53210: loss 9.3728, time 121.05ms
iter 53220: loss 9.1418, time 119.48ms
iter 53230: loss 9.0974, time 120.91ms
iter 53240: loss 9.2372, time 119.48ms
step 53250: train loss 7.4571, val loss 7.4210
saving checkpoint to out-shakespeare-char
iter 53250: loss 8.9125, time 2898.52ms
iter 53260: loss 8.5659, time 121.93ms
iter 53270: loss 8.0966, time 122.27ms
iter 53280: loss 9.1898, time 122.81ms
iter 53290: loss 8.1494, time 123.04ms
iter 53300: loss 8.7124, time 121.55ms
iter 53310: loss 8.1037, time 122.31ms
iter 53320: loss 9.1127, time 122.05ms
iter 53330: loss 8.7658, time 122.20ms
iter 53340: loss 9.5512, time 122.39ms
iter 53350: loss 9.0151, time 121.92ms
iter 53360: loss 8.8857, time 122.56ms
iter 53370: loss 9.1587, time 121.81ms
iter 53380: loss 8.8915, time 122.52ms
iter 53390: loss 8.7203, time 122.47ms
iter 53400: loss 9.2597, time 122.87ms
iter 53410: loss 9.0911, time 121.87ms
iter 53420: loss 9.1606, time 122.49ms
iter 53430: loss 9.5292, time 121.87ms
iter 53440: loss 8.8500, time 122.04ms
iter 53450: loss 8.5622, time 122.20ms
iter 53460: loss 9.2086, time 123.22ms
iter 53470: loss 9.7433, time 121.42ms
iter 53480: loss 8.3928, time 123.46ms
iter 53490: loss 8.9684, time 121.86ms
step 53500: train loss 7.3656, val loss 7.4395
saving checkpoint to out-shakespeare-char
iter 53500: loss 8.7242, time 2899.59ms
iter 53510: loss 8.9022, time 122.45ms
iter 53520: loss 8.1444, time 121.51ms
iter 53530: loss 8.7440, time 121.70ms
iter 53540: loss 9.2003, time 121.61ms
iter 53550: loss 9.7461, time 122.21ms
iter 53560: loss 9.0903, time 121.70ms
iter 53570: loss 8.0550, time 121.43ms
iter 53580: loss 9.2837, time 121.51ms
iter 53590: loss 8.5667, time 121.64ms
iter 53600: loss 8.2268, time 121.58ms
iter 53610: loss 8.6363, time 121.53ms
iter 53620: loss 9.6794, time 120.71ms
iter 53630: loss 8.3930, time 121.80ms
iter 53640: loss 8.9632, time 121.52ms
iter 53650: loss 8.5733, time 121.30ms
iter 53660: loss 9.0350, time 121.61ms
iter 53670: loss 8.5100, time 121.45ms
iter 53680: loss 9.4509, time 120.91ms
iter 53690: loss 8.9780, time 121.50ms
iter 53700: loss 9.2664, time 121.46ms
iter 53710: loss 9.0211, time 121.29ms
iter 53720: loss 8.9365, time 121.92ms
iter 53730: loss 8.1045, time 121.27ms
iter 53740: loss 8.8022, time 121.57ms
step 53750: train loss 7.4374, val loss 7.4655
saving checkpoint to out-shakespeare-char
iter 53750: loss 8.8221, time 2894.92ms
iter 53760: loss 9.9868, time 121.40ms
iter 53770: loss 9.3300, time 121.46ms
iter 53780: loss 9.7687, time 121.67ms
iter 53790: loss 8.6771, time 121.43ms
iter 53800: loss 9.1393, time 121.64ms
iter 53810: loss 8.7179, time 121.55ms
iter 53820: loss 9.3684, time 121.56ms
iter 53830: loss 8.8029, time 120.79ms
iter 53840: loss 8.5267, time 122.29ms
iter 53850: loss 8.1930, time 121.65ms
iter 53860: loss 8.5097, time 122.08ms
iter 53870: loss 8.5503, time 121.41ms
iter 53880: loss 9.3230, time 121.51ms
iter 53890: loss 8.7606, time 121.53ms
iter 53900: loss 8.2572, time 121.59ms
iter 53910: loss 8.9897, time 121.64ms
iter 53920: loss 9.1366, time 121.59ms
iter 53930: loss 8.6056, time 121.58ms
iter 53940: loss 7.9158, time 121.43ms
iter 53950: loss 8.5800, time 120.59ms
iter 53960: loss 9.0622, time 121.42ms
iter 53970: loss 8.6539, time 121.58ms
iter 53980: loss 9.1452, time 121.62ms
iter 53990: loss 9.6117, time 121.49ms
step 54000: train loss 7.4502, val loss 7.4981
saving checkpoint to out-shakespeare-char
iter 54000: loss 8.3186, time 2898.35ms
iter 54010: loss 8.7671, time 121.30ms
iter 54020: loss 8.6509, time 123.52ms
iter 54030: loss 8.5670, time 121.54ms
iter 54040: loss 8.9429, time 122.79ms
iter 54050: loss 8.9340, time 121.55ms
iter 54060: loss 9.6623, time 123.26ms
iter 54070: loss 7.7992, time 121.98ms
iter 54080: loss 9.1016, time 122.89ms
iter 54090: loss 8.6469, time 121.51ms
iter 54100: loss 8.6206, time 123.20ms
iter 54110: loss 8.6692, time 121.55ms
iter 54120: loss 8.9063, time 123.19ms
iter 54130: loss 8.8312, time 121.48ms
iter 54140: loss 8.4697, time 122.78ms
iter 54150: loss 9.8641, time 121.07ms
iter 54160: loss 9.0576, time 122.80ms
iter 54170: loss 9.0267, time 121.48ms
iter 54180: loss 9.0689, time 122.86ms
iter 54190: loss 8.1865, time 121.26ms
iter 54200: loss 9.0786, time 123.16ms
iter 54210: loss 8.1166, time 121.67ms
iter 54220: loss 9.5452, time 123.02ms
iter 54230: loss 8.8750, time 121.40ms
iter 54240: loss 9.1010, time 122.72ms
step 54250: train loss 7.4233, val loss 7.4753
saving checkpoint to out-shakespeare-char
iter 54250: loss 8.8229, time 2884.11ms
iter 54260: loss 8.9786, time 125.68ms
iter 54270: loss 8.7130, time 121.47ms
iter 54280: loss 8.9545, time 124.21ms
iter 54290: loss 8.2656, time 121.33ms
iter 54300: loss 9.0218, time 124.53ms
iter 54310: loss 9.1233, time 121.42ms
iter 54320: loss 9.5382, time 124.38ms
iter 54330: loss 9.1527, time 121.39ms
iter 54340: loss 8.4632, time 124.11ms
iter 54350: loss 9.3361, time 121.67ms
iter 54360: loss 9.0555, time 124.56ms
iter 54370: loss 8.0848, time 121.28ms
iter 54380: loss 8.8149, time 123.82ms
iter 54390: loss 9.1524, time 121.55ms
iter 54400: loss 8.6943, time 124.25ms
iter 54410: loss 8.2966, time 121.52ms
iter 54420: loss 8.8960, time 124.62ms
iter 54430: loss 8.8198, time 122.59ms
iter 54440: loss 8.6982, time 123.35ms
iter 54450: loss 9.0748, time 121.42ms
iter 54460: loss 8.3386, time 124.10ms
iter 54470: loss 8.7763, time 120.79ms
iter 54480: loss 9.0607, time 124.10ms
iter 54490: loss 8.5370, time 121.43ms
step 54500: train loss 7.4268, val loss 7.3760
saving checkpoint to out-shakespeare-char
iter 54500: loss 9.2113, time 2893.82ms
iter 54510: loss 8.9043, time 126.07ms
iter 54520: loss 9.0233, time 127.77ms
iter 54530: loss 8.6954, time 125.59ms
iter 54540: loss 8.3805, time 125.43ms
iter 54550: loss 9.0008, time 126.58ms
iter 54560: loss 8.7255, time 128.54ms
iter 54570: loss 8.7809, time 125.36ms
iter 54580: loss 9.2480, time 125.11ms
iter 54590: loss 9.0560, time 125.34ms
iter 54600: loss 8.8499, time 125.74ms
iter 54610: loss 9.5249, time 125.41ms
iter 54620: loss 9.3247, time 125.34ms
iter 54630: loss 8.8953, time 125.62ms
iter 54640: loss 8.0155, time 125.64ms
iter 54650: loss 8.5943, time 124.36ms
iter 54660: loss 8.5875, time 125.72ms
iter 54670: loss 8.9557, time 129.30ms
iter 54680: loss 9.3580, time 126.01ms
iter 54690: loss 8.3081, time 126.07ms
iter 54700: loss 8.4953, time 126.04ms
iter 54710: loss 8.4443, time 126.05ms
iter 54720: loss 9.2973, time 125.37ms
iter 54730: loss 8.8952, time 125.95ms
iter 54740: loss 9.2879, time 126.08ms
step 54750: train loss 7.3878, val loss 7.3804
saving checkpoint to out-shakespeare-char
iter 54750: loss 8.6444, time 2902.28ms
iter 54760: loss 8.7918, time 126.18ms
iter 54770: loss 8.9119, time 126.07ms
iter 54780: loss 9.1179, time 126.31ms
iter 54790: loss 8.7356, time 126.78ms
iter 54800: loss 8.5218, time 128.37ms
iter 54810: loss 8.1028, time 125.86ms
iter 54820: loss 8.1832, time 128.82ms
iter 54830: loss 8.5452, time 125.63ms
iter 54840: loss 9.2286, time 124.38ms
iter 54850: loss 8.2831, time 124.60ms
iter 54860: loss 9.2032, time 125.05ms
iter 54870: loss 7.8779, time 124.93ms
iter 54880: loss 8.7043, time 125.90ms
iter 54890: loss 9.1460, time 125.61ms
iter 54900: loss 8.4833, time 125.61ms
iter 54910: loss 8.1424, time 128.54ms
iter 54920: loss 8.6435, time 124.74ms
iter 54930: loss 9.5721, time 124.89ms
iter 54940: loss 9.3534, time 124.99ms
iter 54950: loss 9.0363, time 127.74ms
iter 54960: loss 9.1338, time 124.97ms
iter 54970: loss 8.7436, time 124.98ms
iter 54980: loss 9.3622, time 125.15ms
iter 54990: loss 9.0122, time 125.33ms
step 55000: train loss 7.3786, val loss 7.3684
saving checkpoint to out-shakespeare-char
iter 55000: loss 8.5552, time 2876.59ms
iter 55010: loss 8.1598, time 125.88ms
iter 55020: loss 8.6836, time 125.63ms
iter 55030: loss 8.2577, time 125.96ms
iter 55040: loss 10.0400, time 125.57ms
iter 55050: loss 9.1376, time 126.40ms
iter 55060: loss 8.8671, time 125.59ms
iter 55070: loss 8.6072, time 126.00ms
iter 55080: loss 8.9779, time 125.27ms
iter 55090: loss 8.7866, time 125.47ms
iter 55100: loss 8.8211, time 125.46ms
iter 55110: loss 9.0985, time 125.55ms
iter 55120: loss 8.5844, time 126.13ms
iter 55130: loss 8.5538, time 129.33ms
iter 55140: loss 9.0337, time 125.89ms
iter 55150: loss 8.5407, time 126.02ms
iter 55160: loss 8.9916, time 125.86ms
iter 55170: loss 8.2926, time 126.04ms
iter 55180: loss 8.4436, time 125.86ms
iter 55190: loss 8.5200, time 125.74ms
iter 55200: loss 9.1050, time 125.99ms
iter 55210: loss 7.6655, time 125.94ms
iter 55220: loss 8.8261, time 125.88ms
iter 55230: loss 8.9887, time 126.54ms
iter 55240: loss 9.1717, time 128.17ms
step 55250: train loss 7.3649, val loss 7.4007
saving checkpoint to out-shakespeare-char
iter 55250: loss 9.4515, time 2893.67ms
iter 55260: loss 8.8702, time 126.03ms
iter 55270: loss 9.4732, time 126.00ms
iter 55280: loss 9.3377, time 125.78ms
iter 55290: loss 8.7562, time 125.83ms
iter 55300: loss 9.1610, time 125.43ms
iter 55310: loss 9.0824, time 125.44ms
iter 55320: loss 8.8384, time 125.95ms
iter 55330: loss 8.6032, time 125.47ms
iter 55340: loss 8.4327, time 128.57ms
iter 55350: loss 9.4745, time 125.68ms
iter 55360: loss 9.6231, time 125.65ms
iter 55370: loss 9.6474, time 125.47ms
iter 55380: loss 8.6540, time 124.61ms
iter 55390: loss 8.3935, time 125.65ms
iter 55400: loss 9.1657, time 126.21ms
iter 55410: loss 8.9747, time 125.51ms
iter 55420: loss 8.7733, time 125.51ms
iter 55430: loss 8.0015, time 124.90ms
iter 55440: loss 9.7594, time 124.79ms
iter 55450: loss 8.8175, time 128.73ms
iter 55460: loss 8.9743, time 125.97ms
iter 55470: loss 8.9798, time 125.49ms
iter 55480: loss 9.0604, time 125.81ms
iter 55490: loss 8.7163, time 125.65ms
step 55500: train loss 7.3700, val loss 7.3873
saving checkpoint to out-shakespeare-char
iter 55500: loss 8.4620, time 2895.87ms
iter 55510: loss 8.9950, time 121.80ms
iter 55520: loss 9.0253, time 122.27ms
iter 55530: loss 9.1650, time 120.86ms
iter 55540: loss 9.1420, time 122.96ms
iter 55550: loss 8.6633, time 121.77ms
iter 55560: loss 8.7220, time 121.78ms
iter 55570: loss 9.3299, time 122.21ms
iter 55580: loss 8.7280, time 121.85ms
iter 55590: loss 9.2943, time 121.63ms
iter 55600: loss 8.2058, time 121.69ms
iter 55610: loss 8.6701, time 121.88ms
iter 55620: loss 8.8662, time 121.56ms
iter 55630: loss 8.1621, time 122.12ms
iter 55640: loss 9.3932, time 121.48ms
iter 55650: loss 8.2470, time 121.69ms
iter 55660: loss 8.9161, time 121.75ms
iter 55670: loss 8.7342, time 121.69ms
iter 55680: loss 8.7179, time 121.97ms
iter 55690: loss 8.6689, time 121.72ms
iter 55700: loss 9.0626, time 120.69ms
iter 55710: loss 8.6130, time 121.89ms
iter 55720: loss 9.6706, time 121.72ms
iter 55730: loss 8.4334, time 121.80ms
iter 55740: loss 9.2457, time 120.94ms
step 55750: train loss 7.3555, val loss 7.3988
saving checkpoint to out-shakespeare-char
iter 55750: loss 8.6389, time 2899.59ms
iter 55760: loss 8.6425, time 121.68ms
iter 55770: loss 9.2141, time 121.76ms
iter 55780: loss 8.9942, time 121.70ms
iter 55790: loss 8.4798, time 124.65ms
iter 55800: loss 8.8578, time 121.58ms
iter 55810: loss 8.8463, time 124.27ms
iter 55820: loss 8.6656, time 121.60ms
iter 55830: loss 9.0168, time 124.58ms
iter 55840: loss 8.6014, time 121.68ms
iter 55850: loss 8.9905, time 124.41ms
iter 55860: loss 8.6986, time 121.74ms
iter 55870: loss 9.0508, time 124.48ms
iter 55880: loss 8.8717, time 121.67ms
iter 55890: loss 8.3823, time 124.53ms
iter 55900: loss 8.5947, time 121.16ms
iter 55910: loss 8.7940, time 124.47ms
iter 55920: loss 8.8547, time 121.66ms
iter 55930: loss 9.0325, time 124.34ms
iter 55940: loss 7.8361, time 121.89ms
iter 55950: loss 8.3883, time 124.43ms
iter 55960: loss 9.1619, time 121.53ms
iter 55970: loss 9.0677, time 124.43ms
iter 55980: loss 8.7239, time 122.18ms
iter 55990: loss 9.6158, time 124.39ms
step 56000: train loss 7.4200, val loss 7.4133
saving checkpoint to out-shakespeare-char
iter 56000: loss 8.7590, time 2886.84ms
iter 56010: loss 8.5841, time 121.69ms
iter 56020: loss 8.8543, time 122.69ms
iter 56030: loss 9.1802, time 121.61ms
iter 56040: loss 9.2349, time 122.74ms
iter 56050: loss 8.7535, time 121.71ms
iter 56060: loss 8.5203, time 122.03ms
iter 56070: loss 9.3273, time 122.03ms
iter 56080: loss 8.8205, time 122.86ms
iter 56090: loss 8.4711, time 121.69ms
iter 56100: loss 8.8383, time 122.73ms
iter 56110: loss 8.3834, time 121.58ms
iter 56120: loss 8.5978, time 122.86ms
iter 56130: loss 8.5197, time 121.69ms
iter 56140: loss 8.7459, time 123.05ms
iter 56150: loss 8.8012, time 121.53ms
iter 56160: loss 8.4470, time 123.45ms
iter 56170: loss 8.3406, time 121.62ms
iter 56180: loss 9.0029, time 122.80ms
iter 56190: loss 8.6331, time 121.84ms
iter 56200: loss 8.8821, time 122.84ms
iter 56210: loss 8.0104, time 121.69ms
iter 56220: loss 8.8352, time 122.97ms
iter 56230: loss 9.3711, time 121.76ms
iter 56240: loss 8.7264, time 124.04ms
step 56250: train loss 7.4000, val loss 7.3565
saving checkpoint to out-shakespeare-char
iter 56250: loss 9.0942, time 2905.53ms
iter 56260: loss 9.2267, time 121.94ms
iter 56270: loss 7.9591, time 121.80ms
iter 56280: loss 8.0397, time 121.90ms
iter 56290: loss 8.6153, time 121.89ms
iter 56300: loss 9.1376, time 121.97ms
iter 56310: loss 8.9116, time 120.94ms
iter 56320: loss 8.8601, time 121.49ms
iter 56330: loss 9.4005, time 122.16ms
iter 56340: loss 9.0903, time 122.25ms
iter 56350: loss 8.8579, time 121.83ms
iter 56360: loss 8.8859, time 122.54ms
iter 56370: loss 8.4469, time 121.81ms
iter 56380: loss 9.2026, time 121.84ms
iter 56390: loss 8.0817, time 121.75ms
iter 56400: loss 7.9551, time 121.80ms
iter 56410: loss 8.7773, time 121.80ms
iter 56420: loss 9.5048, time 121.92ms
iter 56430: loss 9.0996, time 122.28ms
iter 56440: loss 9.1934, time 122.05ms
iter 56450: loss 8.2828, time 121.55ms
iter 56460: loss 8.3295, time 121.07ms
iter 56470: loss 9.2315, time 121.88ms
iter 56480: loss 8.5699, time 121.79ms
iter 56490: loss 9.0145, time 121.78ms
step 56500: train loss 7.4034, val loss 7.3331
saving checkpoint to out-shakespeare-char
iter 56500: loss 8.7777, time 2903.50ms
iter 56510: loss 9.2025, time 122.67ms
iter 56520: loss 8.2568, time 121.15ms
iter 56530: loss 9.0092, time 122.69ms
iter 56540: loss 8.5004, time 121.53ms
iter 56550: loss 7.9655, time 123.12ms
iter 56560: loss 8.2956, time 121.75ms
iter 56570: loss 8.8563, time 122.62ms
iter 56580: loss 9.0873, time 121.53ms
iter 56590: loss 9.3627, time 122.65ms
iter 56600: loss 9.0930, time 121.76ms
iter 56610: loss 9.1693, time 122.66ms
iter 56620: loss 9.0351, time 121.44ms
iter 56630: loss 8.7065, time 122.57ms
iter 56640: loss 8.5767, time 121.52ms
iter 56650: loss 9.5109, time 122.74ms
iter 56660: loss 8.6298, time 121.66ms
iter 56670: loss 7.8879, time 121.83ms
iter 56680: loss 8.8731, time 121.73ms
iter 56690: loss 8.5867, time 123.71ms
iter 56700: loss 9.4479, time 121.66ms
iter 56710: loss 8.3788, time 122.69ms
iter 56720: loss 8.8741, time 121.97ms
iter 56730: loss 9.1269, time 122.60ms
iter 56740: loss 8.5008, time 121.47ms
step 56750: train loss 7.3683, val loss 7.3483
saving checkpoint to out-shakespeare-char
iter 56750: loss 8.8535, time 2900.94ms
iter 56760: loss 9.0782, time 121.80ms
iter 56770: loss 8.5394, time 121.45ms
iter 56780: loss 8.2067, time 121.67ms
iter 56790: loss 8.2449, time 121.73ms
iter 56800: loss 9.1608, time 121.45ms
iter 56810: loss 8.8609, time 120.95ms
iter 56820: loss 8.7356, time 121.64ms
iter 56830: loss 8.8651, time 121.48ms
iter 56840: loss 8.8527, time 121.55ms
iter 56850: loss 8.3195, time 121.70ms
iter 56860: loss 8.2029, time 120.90ms
iter 56870: loss 8.7188, time 121.54ms
iter 56880: loss 8.2199, time 121.49ms
iter 56890: loss 8.9216, time 121.69ms
iter 56900: loss 8.6298, time 121.46ms
iter 56910: loss 9.3605, time 121.71ms
iter 56920: loss 9.1778, time 121.69ms
iter 56930: loss 8.9916, time 120.94ms
iter 56940: loss 8.5564, time 121.47ms
iter 56950: loss 8.0722, time 121.72ms
iter 56960: loss 8.9307, time 121.38ms
iter 56970: loss 9.0347, time 121.55ms
iter 56980: loss 8.9862, time 120.87ms
iter 56990: loss 8.1539, time 121.23ms
step 57000: train loss 7.3043, val loss 7.3751
saving checkpoint to out-shakespeare-char
iter 57000: loss 9.0500, time 2887.48ms
iter 57010: loss 8.1776, time 119.87ms
iter 57020: loss 9.6022, time 120.04ms
iter 57030: loss 9.0843, time 120.74ms
iter 57040: loss 8.7230, time 120.12ms
iter 57050: loss 8.5030, time 121.43ms
iter 57060: loss 9.3820, time 119.94ms
iter 57070: loss 9.0488, time 119.88ms
iter 57080: loss 8.8306, time 121.10ms
iter 57090: loss 8.5423, time 120.24ms
iter 57100: loss 8.9446, time 120.06ms
iter 57110: loss 8.4237, time 120.69ms
iter 57120: loss 9.2141, time 119.96ms
iter 57130: loss 8.9585, time 119.64ms
iter 57140: loss 8.4828, time 120.83ms
iter 57150: loss 8.7082, time 119.81ms
iter 57160: loss 8.9910, time 122.16ms
iter 57170: loss 8.9598, time 121.78ms
iter 57180: loss 8.8564, time 121.60ms
iter 57190: loss 8.9450, time 122.40ms
iter 57200: loss 8.2030, time 122.01ms
iter 57210: loss 8.7417, time 121.44ms
iter 57220: loss 9.1699, time 122.95ms
iter 57230: loss 9.1880, time 121.74ms
iter 57240: loss 8.2620, time 121.96ms
step 57250: train loss 7.3458, val loss 7.3217
saving checkpoint to out-shakespeare-char
iter 57250: loss 9.2711, time 2895.72ms
iter 57260: loss 8.4635, time 120.58ms
iter 57270: loss 9.5269, time 124.49ms
iter 57280: loss 8.9820, time 120.48ms
iter 57290: loss 8.4383, time 124.19ms
iter 57300: loss 8.9275, time 121.69ms
iter 57310: loss 8.1198, time 124.57ms
iter 57320: loss 9.0488, time 121.48ms
iter 57330: loss 8.8508, time 124.29ms
iter 57340: loss 9.0727, time 121.72ms
iter 57350: loss 8.0838, time 124.62ms
iter 57360: loss 9.0062, time 121.70ms
iter 57370: loss 8.0977, time 124.74ms
iter 57380: loss 9.0101, time 120.80ms
iter 57390: loss 8.2187, time 124.84ms
iter 57400: loss 9.4640, time 121.63ms
iter 57410: loss 8.1000, time 124.41ms
iter 57420: loss 8.7731, time 121.74ms
iter 57430: loss 9.1722, time 124.36ms
iter 57440: loss 8.7327, time 120.59ms
iter 57450: loss 8.4081, time 124.61ms
iter 57460: loss 8.8992, time 121.70ms
iter 57470: loss 8.7880, time 123.75ms
iter 57480: loss 9.3224, time 121.62ms
iter 57490: loss 9.0380, time 124.73ms
step 57500: train loss 7.3903, val loss 7.3632
saving checkpoint to out-shakespeare-char
iter 57500: loss 9.3127, time 2899.47ms
iter 57510: loss 8.6023, time 128.17ms
iter 57520: loss 8.8671, time 124.48ms
iter 57530: loss 8.4922, time 125.25ms
iter 57540: loss 9.1573, time 125.37ms
iter 57550: loss 8.8540, time 125.97ms
iter 57560: loss 8.6899, time 124.97ms
iter 57570: loss 8.3698, time 124.97ms
iter 57580: loss 8.8569, time 124.89ms
iter 57590: loss 8.3613, time 124.94ms
iter 57600: loss 9.0265, time 123.92ms
iter 57610: loss 9.1256, time 126.00ms
iter 57620: loss 8.2929, time 128.14ms
iter 57630: loss 9.5940, time 125.46ms
iter 57640: loss 8.7485, time 124.79ms
iter 57650: loss 8.5896, time 125.09ms
iter 57660: loss 8.7921, time 125.13ms
iter 57670: loss 9.1886, time 124.73ms
iter 57680: loss 8.6164, time 124.34ms
iter 57690: loss 9.0791, time 128.78ms
iter 57700: loss 9.2006, time 125.32ms
iter 57710: loss 8.6349, time 124.39ms
iter 57720: loss 8.4794, time 125.26ms
iter 57730: loss 8.8290, time 125.39ms
iter 57740: loss 8.4008, time 125.94ms
step 57750: train loss 7.3662, val loss 7.3454
saving checkpoint to out-shakespeare-char
iter 57750: loss 8.1371, time 2913.40ms
iter 57760: loss 7.9911, time 120.13ms
iter 57770: loss 8.5862, time 119.61ms
iter 57780: loss 8.8558, time 119.70ms
iter 57790: loss 7.8917, time 119.70ms
iter 57800: loss 9.5966, time 119.23ms
iter 57810: loss 9.0005, time 119.59ms
iter 57820: loss 9.0837, time 120.83ms
iter 57830: loss 9.7007, time 121.30ms
iter 57840: loss 8.8197, time 121.48ms
iter 57850: loss 8.4609, time 121.15ms
iter 57860: loss 9.3332, time 121.54ms
iter 57870: loss 8.5950, time 121.54ms
iter 57880: loss 8.5157, time 121.60ms
iter 57890: loss 8.9363, time 121.45ms
iter 57900: loss 8.3983, time 121.37ms
iter 57910: loss 8.8283, time 121.77ms
iter 57920: loss 8.6442, time 121.52ms
iter 57930: loss 9.3343, time 121.58ms
iter 57940: loss 8.4394, time 121.40ms
iter 57950: loss 9.1874, time 121.61ms
iter 57960: loss 9.5579, time 121.54ms
iter 57970: loss 8.9530, time 121.53ms
iter 57980: loss 8.6998, time 121.49ms
iter 57990: loss 9.3777, time 122.73ms
step 58000: train loss 7.3472, val loss 7.3240
saving checkpoint to out-shakespeare-char
iter 58000: loss 8.9547, time 2891.58ms
iter 58010: loss 9.2283, time 122.01ms
iter 58020: loss 9.0823, time 123.19ms
iter 58030: loss 8.5558, time 121.87ms
iter 58040: loss 8.2664, time 121.70ms
iter 58050: loss 8.6519, time 121.57ms
iter 58060: loss 8.8231, time 121.53ms
iter 58070: loss 7.9151, time 121.68ms
iter 58080: loss 8.5166, time 122.49ms
iter 58090: loss 8.5389, time 121.57ms
iter 58100: loss 8.2187, time 121.76ms
iter 58110: loss 9.4023, time 121.76ms
iter 58120: loss 8.4744, time 122.02ms
iter 58130: loss 8.7801, time 121.62ms
iter 58140: loss 9.2417, time 121.53ms
iter 58150: loss 9.1240, time 120.70ms
iter 58160: loss 8.0856, time 121.70ms
iter 58170: loss 8.2422, time 121.64ms
iter 58180: loss 8.0080, time 121.67ms
iter 58190: loss 9.0032, time 121.56ms
iter 58200: loss 8.6036, time 121.93ms
iter 58210: loss 8.8028, time 121.69ms
iter 58220: loss 8.0945, time 121.69ms
iter 58230: loss 8.8618, time 121.55ms
iter 58240: loss 8.5520, time 119.95ms
step 58250: train loss 7.3294, val loss 7.3497
saving checkpoint to out-shakespeare-char
iter 58250: loss 8.6974, time 2880.28ms
iter 58260: loss 8.4745, time 121.52ms
iter 58270: loss 8.9180, time 121.92ms
iter 58280: loss 9.3522, time 121.55ms
iter 58290: loss 9.2890, time 122.69ms
iter 58300: loss 8.0893, time 121.47ms
iter 58310: loss 8.5903, time 121.53ms
iter 58320: loss 9.3893, time 121.55ms
iter 58330: loss 8.3592, time 122.02ms
iter 58340: loss 8.6172, time 121.82ms
iter 58350: loss 8.8131, time 121.69ms
iter 58360: loss 8.2315, time 121.62ms
iter 58370: loss 9.1865, time 121.47ms
iter 58380: loss 8.4668, time 121.75ms
iter 58390: loss 9.1228, time 121.73ms
iter 58400: loss 8.7837, time 121.59ms
iter 58410: loss 8.9106, time 121.40ms
iter 58420: loss 9.3485, time 121.61ms
iter 58430: loss 8.8157, time 121.28ms
iter 58440: loss 8.3929, time 121.58ms
iter 58450: loss 8.7342, time 122.92ms
iter 58460: loss 8.4526, time 121.60ms
iter 58470: loss 8.8582, time 122.98ms
iter 58480: loss 8.5938, time 119.21ms
iter 58490: loss 8.1106, time 122.09ms
step 58500: train loss 7.3570, val loss 7.3167
saving checkpoint to out-shakespeare-char
iter 58500: loss 7.7272, time 2898.95ms
iter 58510: loss 8.1920, time 121.45ms
iter 58520: loss 8.6522, time 121.51ms
iter 58530: loss 8.8417, time 121.40ms
iter 58540: loss 8.9799, time 121.38ms
iter 58550: loss 9.2543, time 121.23ms
iter 58560: loss 9.1556, time 121.35ms
iter 58570: loss 9.1468, time 121.83ms
iter 58580: loss 8.8762, time 121.35ms
iter 58590: loss 8.8632, time 121.57ms
iter 58600: loss 8.4101, time 121.55ms
iter 58610: loss 8.9788, time 121.32ms
iter 58620: loss 9.1278, time 121.43ms
iter 58630: loss 8.7044, time 121.51ms
iter 58640: loss 9.1729, time 121.36ms
iter 58650: loss 8.6038, time 121.38ms
iter 58660: loss 7.9426, time 121.65ms
iter 58670: loss 8.9127, time 121.19ms
iter 58680: loss 9.0013, time 121.93ms
iter 58690: loss 8.9733, time 121.43ms
iter 58700: loss 8.7056, time 121.45ms
iter 58710: loss 8.2133, time 121.52ms
iter 58720: loss 8.3397, time 122.10ms
iter 58730: loss 8.9482, time 121.68ms
iter 58740: loss 8.9929, time 121.60ms
step 58750: train loss 7.2986, val loss 7.3736
saving checkpoint to out-shakespeare-char
iter 58750: loss 9.4029, time 2893.51ms
iter 58760: loss 9.2292, time 121.76ms
iter 58770: loss 8.8626, time 126.33ms
iter 58780: loss 8.6066, time 121.69ms
iter 58790: loss 9.3384, time 123.06ms
iter 58800: loss 9.3059, time 121.65ms
iter 58810: loss 9.0201, time 123.14ms
iter 58820: loss 8.9688, time 121.88ms
iter 58830: loss 8.4903, time 122.81ms
iter 58840: loss 8.5503, time 120.61ms
iter 58850: loss 8.4326, time 122.98ms
iter 58860: loss 8.6566, time 122.02ms
iter 58870: loss 8.5116, time 123.11ms
iter 58880: loss 7.9672, time 121.62ms
iter 58890: loss 8.9253, time 122.90ms
iter 58900: loss 9.2583, time 121.67ms
iter 58910: loss 9.7443, time 123.29ms
iter 58920: loss 8.4704, time 122.09ms
iter 58930: loss 8.7643, time 123.38ms
iter 58940: loss 9.5264, time 120.70ms
iter 58950: loss 8.9365, time 123.20ms
iter 58960: loss 8.4903, time 122.09ms
iter 58970: loss 8.4751, time 122.76ms
iter 58980: loss 9.2086, time 121.70ms
iter 58990: loss 9.0503, time 122.78ms
step 59000: train loss 7.3370, val loss 7.2677
saving checkpoint to out-shakespeare-char
iter 59000: loss 9.4354, time 2910.35ms
iter 59010: loss 9.1259, time 123.11ms
iter 59020: loss 8.8826, time 124.43ms
iter 59030: loss 8.7095, time 124.34ms
iter 59040: loss 7.9009, time 124.77ms
iter 59050: loss 8.4007, time 125.07ms
iter 59060: loss 8.2870, time 125.33ms
iter 59070: loss 8.1372, time 124.78ms
iter 59080: loss 9.1333, time 124.68ms
iter 59090: loss 8.4125, time 125.31ms
iter 59100: loss 8.4526, time 124.80ms
iter 59110: loss 8.7713, time 126.03ms
iter 59120: loss 8.4983, time 125.45ms
iter 59130: loss 8.4771, time 128.41ms
iter 59140: loss 8.4755, time 125.32ms
iter 59150: loss 9.3951, time 125.30ms
iter 59160: loss 8.9050, time 125.36ms
iter 59170: loss 8.4760, time 125.29ms
iter 59180: loss 8.3191, time 124.75ms
iter 59190: loss 8.1189, time 125.35ms
iter 59200: loss 8.8545, time 125.21ms
iter 59210: loss 8.4293, time 125.31ms
iter 59220: loss 9.4159, time 125.46ms
iter 59230: loss 9.1011, time 125.42ms
iter 59240: loss 8.6383, time 128.35ms
step 59250: train loss 7.3538, val loss 7.3127
saving checkpoint to out-shakespeare-char
iter 59250: loss 8.4653, time 2906.87ms
iter 59260: loss 9.0955, time 125.29ms
iter 59270: loss 9.0898, time 125.19ms
iter 59280: loss 8.3602, time 124.84ms
iter 59290: loss 8.0058, time 125.39ms
iter 59300: loss 8.6756, time 125.61ms
iter 59310: loss 7.9921, time 127.03ms
iter 59320: loss 9.4451, time 125.27ms
iter 59330: loss 8.4185, time 125.18ms
iter 59340: loss 8.8619, time 125.37ms
iter 59350: loss 8.6901, time 125.23ms
iter 59360: loss 8.7895, time 125.18ms
iter 59370: loss 7.7481, time 124.65ms
iter 59380: loss 8.4619, time 125.01ms
iter 59390: loss 8.8654, time 125.41ms
iter 59400: loss 9.0908, time 125.24ms
iter 59410: loss 8.8229, time 125.25ms
iter 59420: loss 9.1118, time 125.21ms
iter 59430: loss 8.4061, time 127.84ms
iter 59440: loss 8.8606, time 125.66ms
iter 59450: loss 8.2532, time 125.10ms
iter 59460: loss 9.6299, time 125.09ms
iter 59470: loss 8.4706, time 125.56ms
iter 59480: loss 8.9809, time 125.67ms
iter 59490: loss 8.6191, time 125.64ms
step 59500: train loss 7.2998, val loss 7.3432
saving checkpoint to out-shakespeare-char
iter 59500: loss 8.2364, time 2897.22ms
iter 59510: loss 9.0359, time 125.43ms
iter 59520: loss 8.1929, time 125.03ms
iter 59530: loss 8.5610, time 124.82ms
iter 59540: loss 9.0891, time 124.76ms
iter 59550: loss 8.8832, time 124.91ms
iter 59560: loss 8.2218, time 124.88ms
iter 59570: loss 8.9180, time 124.99ms
iter 59580: loss 9.1309, time 125.19ms
iter 59590: loss 8.6342, time 125.15ms
iter 59600: loss 8.8857, time 127.85ms
iter 59610: loss 8.9701, time 124.75ms
iter 59620: loss 9.0885, time 124.86ms
iter 59630: loss 8.6476, time 124.76ms
iter 59640: loss 8.6480, time 124.91ms
iter 59650: loss 8.7336, time 124.97ms
iter 59660: loss 8.7421, time 124.91ms
iter 59670: loss 8.7903, time 124.61ms
iter 59680: loss 9.3300, time 124.79ms
iter 59690: loss 8.4770, time 124.98ms
iter 59700: loss 8.8655, time 125.16ms
iter 59710: loss 8.9845, time 128.01ms
iter 59720: loss 8.3330, time 124.93ms
iter 59730: loss 8.6554, time 124.72ms
iter 59740: loss 8.9549, time 125.01ms
step 59750: train loss 7.3196, val loss 7.3164
saving checkpoint to out-shakespeare-char
iter 59750: loss 8.8320, time 2894.55ms
iter 59760: loss 8.2057, time 125.56ms
iter 59770: loss 9.0719, time 125.36ms
iter 59780: loss 9.4317, time 124.36ms
iter 59790: loss 8.6343, time 125.62ms
iter 59800: loss 8.5941, time 125.49ms
iter 59810: loss 9.0041, time 125.38ms
iter 59820: loss 8.7074, time 125.38ms
iter 59830: loss 9.0155, time 125.26ms
iter 59840: loss 8.6818, time 125.67ms
iter 59850: loss 8.5954, time 124.76ms
iter 59860: loss 8.6812, time 125.35ms
iter 59870: loss 7.9679, time 125.29ms
iter 59880: loss 9.0769, time 127.82ms
iter 59890: loss 9.5813, time 124.99ms
iter 59900: loss 8.9656, time 124.99ms
iter 59910: loss 9.0122, time 125.21ms
iter 59920: loss 9.2201, time 124.98ms
iter 59930: loss 8.2845, time 125.06ms
iter 59940: loss 7.9802, time 125.00ms
iter 59950: loss 9.2658, time 125.09ms
iter 59960: loss 8.3498, time 124.95ms
iter 59970: loss 8.8200, time 125.43ms
iter 59980: loss 8.5068, time 125.17ms
iter 59990: loss 9.3531, time 128.01ms
step 60000: train loss 7.2633, val loss 7.3002
saving checkpoint to out-shakespeare-char
iter 60000: loss 9.3050, time 2897.98ms
iter 60010: loss 8.2899, time 124.69ms
iter 60020: loss 8.8917, time 121.55ms
iter 60030: loss 8.2582, time 124.52ms
iter 60040: loss 8.7837, time 121.68ms
iter 60050: loss 8.9629, time 124.78ms
iter 60060: loss 7.9611, time 121.63ms
iter 60070: loss 9.4202, time 124.55ms
iter 60080: loss 8.1933, time 121.64ms
iter 60090: loss 9.2926, time 124.97ms
iter 60100: loss 8.1531, time 121.78ms
iter 60110: loss 8.9108, time 124.96ms
iter 60120: loss 8.9569, time 121.69ms
iter 60130: loss 8.9298, time 124.48ms
iter 60140: loss 8.8964, time 121.76ms
iter 60150: loss 8.9129, time 124.61ms
iter 60160: loss 8.3506, time 121.76ms
iter 60170: loss 8.1434, time 124.41ms
iter 60180: loss 9.0613, time 121.77ms
iter 60190: loss 9.1130, time 124.59ms
iter 60200: loss 9.3416, time 121.68ms
iter 60210: loss 8.3421, time 124.83ms
iter 60220: loss 8.7571, time 121.64ms
iter 60230: loss 8.7975, time 124.67ms
iter 60240: loss 8.8909, time 121.59ms
step 60250: train loss 7.2884, val loss 7.3464
saving checkpoint to out-shakespeare-char
iter 60250: loss 9.1555, time 2894.14ms
iter 60260: loss 9.0810, time 122.57ms
iter 60270: loss 8.8292, time 121.52ms
iter 60280: loss 8.1973, time 122.65ms
iter 60290: loss 8.2862, time 121.45ms
iter 60300: loss 8.6528, time 122.50ms
iter 60310: loss 8.6218, time 121.50ms
iter 60320: loss 8.8875, time 122.61ms
iter 60330: loss 8.4714, time 121.55ms
iter 60340: loss 8.3192, time 122.54ms
iter 60350: loss 8.0298, time 121.40ms
iter 60360: loss 9.2620, time 122.58ms
iter 60370: loss 8.4379, time 120.74ms
iter 60380: loss 8.7696, time 122.65ms
iter 60390: loss 8.5326, time 121.43ms
iter 60400: loss 8.2596, time 122.05ms
iter 60410: loss 8.0471, time 121.53ms
iter 60420: loss 8.1573, time 122.60ms
iter 60430: loss 9.2036, time 121.61ms
iter 60440: loss 8.8570, time 122.63ms
iter 60450: loss 9.0445, time 121.65ms
iter 60460: loss 8.7394, time 121.62ms
iter 60470: loss 8.8211, time 121.54ms
iter 60480: loss 8.5803, time 122.07ms
iter 60490: loss 8.3997, time 121.58ms
step 60500: train loss 7.2264, val loss 7.3180
saving checkpoint to out-shakespeare-char
iter 60500: loss 9.2373, time 2887.76ms
iter 60510: loss 8.7416, time 121.57ms
iter 60520: loss 9.1244, time 121.54ms
iter 60530: loss 8.5851, time 121.56ms
iter 60540: loss 8.8689, time 121.53ms
iter 60550: loss 8.8195, time 121.25ms
iter 60560: loss 7.7195, time 121.49ms
iter 60570: loss 8.9397, time 121.61ms
iter 60580: loss 8.1676, time 121.56ms
iter 60590: loss 8.6673, time 121.53ms
iter 60600: loss 8.4639, time 121.50ms
iter 60610: loss 8.3769, time 121.43ms
iter 60620: loss 8.3921, time 121.47ms
iter 60630: loss 8.4227, time 121.52ms
iter 60640: loss 8.2659, time 121.64ms
iter 60650: loss 8.5478, time 121.43ms
iter 60660: loss 8.0499, time 121.53ms
iter 60670: loss 8.3588, time 121.82ms
iter 60680: loss 8.2335, time 121.53ms
iter 60690: loss 8.1893, time 121.46ms
iter 60700: loss 8.5294, time 121.54ms
iter 60710: loss 8.5954, time 121.95ms
iter 60720: loss 8.3582, time 121.34ms
iter 60730: loss 9.2377, time 122.31ms
iter 60740: loss 8.5354, time 121.75ms
step 60750: train loss 7.2327, val loss 7.2764
saving checkpoint to out-shakespeare-char
iter 60750: loss 8.8340, time 2883.93ms
iter 60760: loss 8.4263, time 121.75ms
iter 60770: loss 8.5216, time 121.56ms
iter 60780: loss 8.2229, time 121.47ms
iter 60790: loss 9.2392, time 121.72ms
iter 60800: loss 8.6228, time 121.40ms
iter 60810: loss 8.6709, time 121.64ms
iter 60820: loss 8.3292, time 121.50ms
iter 60830: loss 8.8051, time 121.51ms
iter 60840: loss 8.6610, time 121.46ms
iter 60850: loss 8.5450, time 121.42ms
iter 60860: loss 8.9101, time 120.39ms
iter 60870: loss 8.8105, time 121.43ms
iter 60880: loss 8.6554, time 121.44ms
iter 60890: loss 8.5197, time 121.42ms
iter 60900: loss 8.3838, time 121.41ms
iter 60910: loss 8.8149, time 121.44ms
iter 60920: loss 9.2253, time 121.49ms
iter 60930: loss 8.2118, time 121.49ms
iter 60940: loss 8.8411, time 121.54ms
iter 60950: loss 8.0768, time 121.44ms
iter 60960: loss 9.0980, time 121.27ms
iter 60970: loss 8.9376, time 121.49ms
iter 60980: loss 8.7616, time 121.38ms
iter 60990: loss 8.2828, time 121.44ms
step 61000: train loss 7.2620, val loss 7.2434
saving checkpoint to out-shakespeare-char
iter 61000: loss 8.2126, time 2882.99ms
iter 61010: loss 8.9664, time 121.63ms
iter 61020: loss 9.1178, time 122.90ms
iter 61030: loss 8.6247, time 121.51ms
iter 61040: loss 8.5464, time 122.62ms
iter 61050: loss 7.8744, time 121.54ms
iter 61060: loss 8.4993, time 122.62ms
iter 61070: loss 8.1969, time 121.53ms
iter 61080: loss 8.4538, time 122.66ms
iter 61090: loss 8.1208, time 121.59ms
iter 61100: loss 8.3934, time 121.61ms
iter 61110: loss 8.6939, time 121.75ms
iter 61120: loss 8.2808, time 122.78ms
iter 61130: loss 7.9952, time 121.57ms
iter 61140: loss 8.5950, time 122.67ms
iter 61150: loss 9.6552, time 121.56ms
iter 61160: loss 9.2327, time 122.66ms
iter 61170: loss 8.5020, time 121.69ms
iter 61180: loss 8.5151, time 122.84ms
iter 61190: loss 8.3239, time 121.74ms
iter 61200: loss 9.4160, time 122.70ms
iter 61210: loss 8.7468, time 122.42ms
iter 61220: loss 8.2848, time 122.68ms
iter 61230: loss 9.1083, time 121.21ms
iter 61240: loss 8.8372, time 122.76ms
step 61250: train loss 7.2533, val loss 7.2880
saving checkpoint to out-shakespeare-char
iter 61250: loss 8.5428, time 2894.19ms
iter 61260: loss 8.4892, time 121.76ms
iter 61270: loss 8.6647, time 123.21ms
iter 61280: loss 8.0968, time 121.62ms
iter 61290: loss 8.8825, time 122.82ms
iter 61300: loss 8.6076, time 121.72ms
iter 61310: loss 8.8740, time 122.81ms
iter 61320: loss 9.5100, time 121.67ms
iter 61330: loss 8.9619, time 122.73ms
iter 61340: loss 8.8957, time 120.94ms
iter 61350: loss 9.6211, time 121.11ms
iter 61360: loss 8.8598, time 121.81ms
iter 61370: loss 9.1078, time 120.93ms
iter 61380: loss 8.7013, time 121.72ms
iter 61390: loss 8.5330, time 121.26ms
iter 61400: loss 8.2153, time 121.48ms
iter 61410: loss 7.8726, time 121.34ms
iter 61420: loss 8.9473, time 121.86ms
iter 61430: loss 8.9885, time 121.25ms
iter 61440: loss 9.1277, time 121.64ms
iter 61450: loss 9.0997, time 121.05ms
iter 61460: loss 9.2710, time 121.61ms
iter 61470: loss 8.8093, time 120.98ms
iter 61480: loss 8.9029, time 121.81ms
iter 61490: loss 8.4081, time 121.11ms
step 61500: train loss 7.2584, val loss 7.2373
saving checkpoint to out-shakespeare-char
iter 61500: loss 8.9578, time 2899.97ms
iter 61510: loss 8.8750, time 121.67ms
iter 61520: loss 8.2843, time 121.62ms
iter 61530: loss 8.8643, time 121.72ms
iter 61540: loss 8.9492, time 121.59ms
iter 61550: loss 8.2804, time 121.68ms
iter 61560: loss 9.0901, time 120.79ms
iter 61570: loss 8.2505, time 121.62ms
iter 61580: loss 9.2893, time 121.50ms
iter 61590: loss 8.8598, time 121.67ms
iter 61600: loss 8.0494, time 121.64ms
iter 61610: loss 7.9907, time 121.52ms
iter 61620: loss 8.4194, time 121.55ms
iter 61630: loss 9.7027, time 121.66ms
iter 61640: loss 8.4870, time 121.49ms
iter 61650: loss 8.0428, time 121.44ms
iter 61660: loss 8.0531, time 121.61ms
iter 61670: loss 9.6224, time 121.67ms
iter 61680: loss 8.6646, time 121.61ms
iter 61690: loss 8.9500, time 121.51ms
iter 61700: loss 8.3054, time 121.59ms
iter 61710: loss 9.0296, time 121.63ms
iter 61720: loss 8.5157, time 121.42ms
iter 61730: loss 8.5823, time 121.26ms
iter 61740: loss 8.8325, time 121.10ms
step 61750: train loss 7.2989, val loss 7.2491
saving checkpoint to out-shakespeare-char
iter 61750: loss 8.8388, time 2901.01ms
iter 61760: loss 8.7803, time 124.60ms
iter 61770: loss 8.6124, time 121.91ms
iter 61780: loss 8.4007, time 124.96ms
iter 61790: loss 8.1919, time 121.77ms
iter 61800: loss 9.1608, time 125.09ms
iter 61810: loss 9.1110, time 121.80ms
iter 61820: loss 8.8863, time 124.58ms
iter 61830: loss 9.2601, time 121.70ms
iter 61840: loss 8.5697, time 124.63ms
iter 61850: loss 7.7875, time 121.80ms
iter 61860: loss 7.9652, time 124.58ms
iter 61870: loss 8.4411, time 121.06ms
iter 61880: loss 8.6968, time 124.70ms
iter 61890: loss 8.0727, time 121.80ms
iter 61900: loss 8.9077, time 124.77ms
iter 61910: loss 8.6062, time 121.83ms
iter 61920: loss 8.5176, time 124.85ms
iter 61930: loss 8.7590, time 121.84ms
iter 61940: loss 8.7054, time 124.78ms
iter 61950: loss 8.8295, time 122.08ms
iter 61960: loss 9.4001, time 124.65ms
iter 61970: loss 8.5754, time 121.87ms
iter 61980: loss 7.6782, time 124.60ms
iter 61990: loss 9.3230, time 121.97ms
step 62000: train loss 7.2753, val loss 7.2005
saving checkpoint to out-shakespeare-char
iter 62000: loss 7.7515, time 2899.71ms
iter 62010: loss 9.0448, time 125.35ms
iter 62020: loss 8.8433, time 123.35ms
iter 62030: loss 8.1498, time 125.17ms
iter 62040: loss 9.1102, time 126.02ms
iter 62050: loss 8.7490, time 125.58ms
iter 62060: loss 9.3163, time 125.19ms
iter 62070: loss 8.0952, time 124.75ms
iter 62080: loss 8.3817, time 125.42ms
iter 62090: loss 8.6278, time 124.99ms
iter 62100: loss 8.3809, time 128.26ms
iter 62110: loss 8.6626, time 125.35ms
iter 62120: loss 9.1672, time 126.37ms
iter 62130: loss 8.5669, time 125.72ms
iter 62140: loss 9.2907, time 125.24ms
iter 62150: loss 9.3899, time 125.21ms
iter 62160: loss 8.3634, time 125.49ms
iter 62170: loss 9.0224, time 125.38ms
iter 62180: loss 8.7738, time 124.41ms
iter 62190: loss 9.1401, time 125.38ms
iter 62200: loss 8.5926, time 125.41ms
iter 62210: loss 8.7554, time 128.38ms
iter 62220: loss 7.8323, time 125.31ms
iter 62230: loss 8.2716, time 125.46ms
iter 62240: loss 8.6750, time 125.53ms
step 62250: train loss 7.2175, val loss 7.2773
saving checkpoint to out-shakespeare-char
iter 62250: loss 8.8126, time 2894.06ms
iter 62260: loss 8.7044, time 124.23ms
iter 62270: loss 9.7316, time 120.58ms
iter 62280: loss 8.4697, time 122.18ms
iter 62290: loss 8.7440, time 120.80ms
iter 62300: loss 8.8926, time 122.91ms
iter 62310: loss 8.4368, time 121.33ms
iter 62320: loss 8.5096, time 122.52ms
iter 62330: loss 8.4289, time 120.67ms
iter 62340: loss 8.5284, time 122.70ms
iter 62350: loss 8.4393, time 121.20ms
iter 62360: loss 7.8231, time 121.93ms
iter 62370: loss 7.8880, time 122.39ms
iter 62380: loss 8.6969, time 122.13ms
iter 62390: loss 9.0361, time 120.82ms
iter 62400: loss 8.1081, time 122.61ms
iter 62410: loss 9.3776, time 120.87ms
iter 62420: loss 8.0136, time 121.03ms
iter 62430: loss 9.2747, time 121.05ms
iter 62440: loss 8.9974, time 121.27ms
iter 62450: loss 9.1190, time 121.34ms
iter 62460: loss 8.5740, time 120.71ms
iter 62470: loss 8.5039, time 120.82ms
iter 62480: loss 8.4130, time 121.36ms
iter 62490: loss 8.1915, time 120.14ms
step 62500: train loss 7.2156, val loss 7.2031
saving checkpoint to out-shakespeare-char
iter 62500: loss 8.4417, time 2894.59ms
iter 62510: loss 9.0354, time 122.35ms
iter 62520: loss 8.5614, time 121.51ms
iter 62530: loss 9.1241, time 122.00ms
iter 62540: loss 8.9116, time 121.30ms
iter 62550: loss 8.2497, time 121.68ms
iter 62560: loss 8.8276, time 121.36ms
iter 62570: loss 8.5690, time 121.37ms
iter 62580: loss 8.7805, time 121.23ms
iter 62590: loss 8.0813, time 122.71ms
iter 62600: loss 8.7203, time 121.59ms
iter 62610: loss 9.1707, time 122.61ms
iter 62620: loss 9.1759, time 121.53ms
iter 62630: loss 9.5490, time 122.62ms
iter 62640: loss 8.6916, time 121.58ms
iter 62650: loss 8.2419, time 122.76ms
iter 62660: loss 8.8334, time 121.46ms
iter 62670: loss 8.6818, time 122.65ms
iter 62680: loss 9.0183, time 121.38ms
iter 62690: loss 8.8630, time 122.54ms
iter 62700: loss 9.0382, time 121.51ms
iter 62710: loss 8.8489, time 122.64ms
iter 62720: loss 8.9816, time 121.47ms
iter 62730: loss 9.2845, time 122.56ms
iter 62740: loss 8.0905, time 121.01ms
step 62750: train loss 7.2805, val loss 7.2856
saving checkpoint to out-shakespeare-char
iter 62750: loss 8.6023, time 2897.16ms
iter 62760: loss 8.5278, time 125.19ms
iter 62770: loss 8.7727, time 125.35ms
iter 62780: loss 8.5922, time 128.19ms
iter 62790: loss 8.4153, time 125.54ms
iter 62800: loss 8.6959, time 125.76ms
iter 62810: loss 8.4416, time 125.66ms
iter 62820: loss 8.4856, time 125.89ms
iter 62830: loss 8.4107, time 124.34ms
iter 62840: loss 9.4421, time 125.63ms
iter 62850: loss 8.3945, time 125.83ms
iter 62860: loss 8.1043, time 125.47ms
iter 62870: loss 8.4541, time 125.67ms
iter 62880: loss 8.6188, time 125.85ms
iter 62890: loss 8.6246, time 128.50ms
iter 62900: loss 8.5142, time 125.82ms
iter 62910: loss 8.1651, time 125.63ms
iter 62920: loss 9.0452, time 122.70ms
iter 62930: loss 8.7330, time 121.62ms
iter 62940: loss 9.0991, time 123.30ms
iter 62950: loss 9.1481, time 122.12ms
iter 62960: loss 8.4779, time 122.92ms
iter 62970: loss 8.9450, time 121.55ms
iter 62980: loss 8.8817, time 123.03ms
iter 62990: loss 8.2688, time 121.55ms
step 63000: train loss 7.2786, val loss 7.1794
saving checkpoint to out-shakespeare-char
iter 63000: loss 8.5876, time 2909.58ms
iter 63010: loss 8.4509, time 121.93ms
iter 63020: loss 8.2434, time 124.37ms
iter 63030: loss 8.4325, time 121.92ms
iter 63040: loss 8.4514, time 123.94ms
iter 63050: loss 8.9614, time 122.91ms
iter 63060: loss 8.5196, time 123.85ms
iter 63070: loss 8.8731, time 121.94ms
iter 63080: loss 8.6861, time 123.45ms
iter 63090: loss 8.8072, time 121.91ms
iter 63100: loss 9.0498, time 123.83ms
iter 63110: loss 8.8924, time 122.14ms
iter 63120: loss 8.9379, time 123.47ms
iter 63130: loss 8.6637, time 122.05ms
iter 63140: loss 9.1591, time 123.68ms
iter 63150: loss 8.3677, time 122.68ms
iter 63160: loss 8.6356, time 123.94ms
iter 63170: loss 8.6429, time 121.89ms
iter 63180: loss 9.1103, time 123.22ms
iter 63190: loss 8.3170, time 122.04ms
iter 63200: loss 8.6138, time 124.49ms
iter 63210: loss 8.5121, time 121.86ms
iter 63220: loss 9.3227, time 123.65ms
iter 63230: loss 8.3749, time 121.91ms
iter 63240: loss 8.5237, time 124.01ms
step 63250: train loss 7.1852, val loss 7.1789
saving checkpoint to out-shakespeare-char
iter 63250: loss 8.9116, time 2885.16ms
iter 63260: loss 8.0520, time 121.91ms
iter 63270: loss 8.9283, time 123.08ms
iter 63280: loss 9.2090, time 121.89ms
iter 63290: loss 8.6000, time 123.06ms
iter 63300: loss 8.2751, time 121.86ms
iter 63310: loss 8.8083, time 122.95ms
iter 63320: loss 8.9716, time 121.85ms
iter 63330: loss 8.2743, time 123.00ms
iter 63340: loss 8.3385, time 121.81ms
iter 63350: loss 8.5504, time 123.15ms
iter 63360: loss 8.4283, time 121.93ms
iter 63370: loss 8.2335, time 123.40ms
iter 63380: loss 8.7127, time 121.32ms
iter 63390: loss 8.8135, time 122.72ms
iter 63400: loss 8.1148, time 121.70ms
iter 63410: loss 8.3185, time 123.41ms
iter 63420: loss 8.3627, time 122.07ms
iter 63430: loss 9.0717, time 122.95ms
iter 63440: loss 7.8965, time 122.06ms
iter 63450: loss 9.1000, time 123.07ms
iter 63460: loss 8.7361, time 121.57ms
iter 63470: loss 8.8027, time 123.21ms
iter 63480: loss 8.4131, time 121.59ms
iter 63490: loss 8.5829, time 121.47ms
step 63500: train loss 7.1935, val loss 7.1896
saving checkpoint to out-shakespeare-char
iter 63500: loss 8.6853, time 2887.08ms
iter 63510: loss 8.2617, time 122.00ms
iter 63520: loss 8.7049, time 121.92ms
iter 63530: loss 8.5206, time 121.89ms
iter 63540: loss 9.1087, time 121.82ms
iter 63550: loss 8.3475, time 121.78ms
iter 63560: loss 8.3762, time 121.71ms
iter 63570: loss 8.8198, time 121.90ms
iter 63580: loss 7.9766, time 121.75ms
iter 63590: loss 8.8485, time 121.99ms
iter 63600: loss 9.3594, time 121.85ms
iter 63610: loss 8.2366, time 121.82ms
iter 63620: loss 8.8751, time 121.77ms
iter 63630: loss 8.9861, time 121.95ms
iter 63640: loss 9.2201, time 121.63ms
iter 63650: loss 8.9596, time 120.65ms
iter 63660: loss 8.0912, time 121.75ms
iter 63670: loss 8.0745, time 121.89ms
iter 63680: loss 8.2565, time 121.86ms
iter 63690: loss 8.8952, time 122.00ms
iter 63700: loss 8.2540, time 121.80ms
iter 63710: loss 8.1485, time 121.96ms
iter 63720: loss 8.1956, time 121.79ms
iter 63730: loss 9.1758, time 121.91ms
iter 63740: loss 8.6464, time 121.79ms
step 63750: train loss 7.2467, val loss 7.1684
saving checkpoint to out-shakespeare-char
iter 63750: loss 8.8341, time 2894.27ms
iter 63760: loss 8.1433, time 122.74ms
iter 63770: loss 8.8411, time 121.55ms
iter 63780: loss 8.0762, time 122.98ms
iter 63790: loss 9.6284, time 122.56ms
iter 63800: loss 8.7074, time 122.68ms
iter 63810: loss 9.0935, time 121.42ms
iter 63820: loss 8.0033, time 121.93ms
iter 63830: loss 8.4911, time 121.53ms
iter 63840: loss 8.7467, time 122.55ms
iter 63850: loss 8.2791, time 120.69ms
iter 63860: loss 8.1530, time 122.61ms
iter 63870: loss 8.8549, time 121.78ms
iter 63880: loss 8.4573, time 122.70ms
iter 63890: loss 8.7931, time 121.69ms
iter 63900: loss 8.6995, time 122.63ms
iter 63910: loss 8.5446, time 121.52ms
iter 63920: loss 8.8033, time 122.95ms
iter 63930: loss 8.2796, time 121.63ms
iter 63940: loss 7.9349, time 122.68ms
iter 63950: loss 8.5293, time 121.53ms
iter 63960: loss 9.1287, time 122.65ms
iter 63970: loss 7.9660, time 121.48ms
iter 63980: loss 8.5613, time 122.66ms
iter 63990: loss 8.4143, time 121.63ms
step 64000: train loss 7.2218, val loss 7.1876
saving checkpoint to out-shakespeare-char
iter 64000: loss 9.2467, time 2892.11ms
iter 64010: loss 8.1893, time 121.76ms
iter 64020: loss 7.9905, time 124.29ms
iter 64030: loss 8.5631, time 121.68ms
iter 64040: loss 8.2735, time 124.23ms
iter 64050: loss 8.6021, time 121.45ms
iter 64060: loss 8.3115, time 124.33ms
iter 64070: loss 7.9592, time 121.69ms
iter 64080: loss 7.8257, time 124.16ms
iter 64090: loss 8.8558, time 121.31ms
iter 64100: loss 8.1386, time 124.15ms
iter 64110: loss 8.6912, time 121.40ms
iter 64120: loss 8.3796, time 124.35ms
iter 64130: loss 9.6232, time 121.58ms
iter 64140: loss 8.4078, time 124.15ms
iter 64150: loss 9.1004, time 121.38ms
iter 64160: loss 8.7827, time 123.19ms
iter 64170: loss 8.6281, time 121.08ms
iter 64180: loss 8.1913, time 124.70ms
iter 64190: loss 8.1544, time 121.36ms
iter 64200: loss 8.0040, time 124.26ms
iter 64210: loss 8.2499, time 121.39ms
iter 64220: loss 8.3540, time 124.17ms
iter 64230: loss 8.8683, time 121.43ms
iter 64240: loss 8.2284, time 124.09ms
step 64250: train loss 7.1801, val loss 7.1813
saving checkpoint to out-shakespeare-char
iter 64250: loss 9.0099, time 2890.30ms
iter 64260: loss 8.6982, time 122.03ms
iter 64270: loss 7.8316, time 123.31ms
iter 64280: loss 7.8814, time 121.71ms
iter 64290: loss 9.3510, time 123.47ms
iter 64300: loss 8.4875, time 122.09ms
iter 64310: loss 9.2120, time 123.45ms
iter 64320: loss 8.4868, time 122.04ms
iter 64330: loss 8.3779, time 123.10ms
iter 64340: loss 8.9047, time 122.22ms
iter 64350: loss 8.3280, time 122.31ms
iter 64360: loss 7.8666, time 122.05ms
iter 64370: loss 8.8108, time 123.46ms
iter 64380: loss 8.6634, time 120.94ms
iter 64390: loss 8.9581, time 123.90ms
iter 64400: loss 8.3148, time 122.08ms
iter 64410: loss 8.0751, time 123.22ms
iter 64420: loss 8.9684, time 121.88ms
iter 64430: loss 8.8034, time 123.53ms
iter 64440: loss 9.1379, time 122.87ms
iter 64450: loss 8.6466, time 122.78ms
iter 64460: loss 8.7315, time 120.36ms
iter 64470: loss 7.6938, time 124.53ms
iter 64480: loss 8.5195, time 122.58ms
iter 64490: loss 8.5786, time 123.63ms
step 64500: train loss 7.2073, val loss 7.1905
saving checkpoint to out-shakespeare-char
iter 64500: loss 8.2114, time 2897.90ms
iter 64510: loss 8.2401, time 124.43ms
iter 64520: loss 8.3721, time 121.80ms
iter 64530: loss 7.7977, time 124.87ms
iter 64540: loss 8.9552, time 121.08ms
iter 64550: loss 8.2370, time 125.63ms
iter 64560: loss 8.7415, time 121.73ms
iter 64570: loss 8.6931, time 124.46ms
iter 64580: loss 8.6963, time 121.50ms
iter 64590: loss 8.6106, time 124.48ms
iter 64600: loss 8.4383, time 121.97ms
iter 64610: loss 8.2200, time 121.46ms
iter 64620: loss 8.6462, time 121.25ms
iter 64630: loss 8.5705, time 122.75ms
iter 64640: loss 8.7661, time 121.52ms
iter 64650: loss 9.1939, time 122.99ms
iter 64660: loss 8.6349, time 121.67ms
iter 64670: loss 9.3117, time 122.72ms
iter 64680: loss 8.1775, time 121.72ms
iter 64690: loss 7.8939, time 123.81ms
iter 64700: loss 8.4622, time 122.58ms
iter 64710: loss 8.2589, time 123.15ms
iter 64720: loss 9.1208, time 121.51ms
iter 64730: loss 9.0476, time 122.90ms
iter 64740: loss 8.3878, time 121.64ms
step 64750: train loss 7.1952, val loss 7.1624
saving checkpoint to out-shakespeare-char
iter 64750: loss 8.3399, time 2896.45ms
iter 64760: loss 9.1421, time 122.80ms
iter 64770: loss 8.6194, time 121.67ms
iter 64780: loss 8.8840, time 121.92ms
iter 64790: loss 8.5120, time 121.53ms
iter 64800: loss 9.2751, time 121.47ms
iter 64810: loss 8.2793, time 122.07ms
iter 64820: loss 8.3976, time 121.36ms
iter 64830: loss 8.2595, time 121.72ms
iter 64840: loss 8.8564, time 121.24ms
iter 64850: loss 8.7209, time 121.41ms
iter 64860: loss 8.8272, time 121.44ms
iter 64870: loss 8.5059, time 121.37ms
iter 64880: loss 8.0000, time 120.56ms
iter 64890: loss 8.6066, time 120.68ms
iter 64900: loss 8.3203, time 121.41ms
iter 64910: loss 9.2267, time 121.59ms
iter 64920: loss 7.4651, time 122.13ms
iter 64930: loss 8.1996, time 121.46ms
iter 64940: loss 8.6197, time 120.60ms
iter 64950: loss 8.2444, time 121.59ms
iter 64960: loss 8.8032, time 121.57ms
iter 64970: loss 7.9197, time 121.64ms
iter 64980: loss 8.9318, time 121.59ms
iter 64990: loss 8.5210, time 121.51ms
step 65000: train loss 7.1424, val loss 7.1890
saving checkpoint to out-shakespeare-char
iter 65000: loss 8.3692, time 2884.27ms
iter 65010: loss 9.0275, time 121.42ms
iter 65020: loss 9.2434, time 121.56ms
iter 65030: loss 8.1891, time 121.86ms
iter 65040: loss 7.7959, time 121.20ms
iter 65050: loss 8.2973, time 121.17ms
iter 65060: loss 8.3596, time 121.38ms
iter 65070: loss 8.6802, time 121.32ms
iter 65080: loss 8.7006, time 121.33ms
iter 65090: loss 8.8668, time 121.65ms
iter 65100: loss 8.3310, time 121.52ms
iter 65110: loss 8.6050, time 121.38ms
iter 65120: loss 8.8840, time 120.90ms
iter 65130: loss 8.1728, time 120.88ms
iter 65140: loss 8.1966, time 121.29ms
iter 65150: loss 8.7031, time 121.21ms
iter 65160: loss 9.4092, time 121.41ms
iter 65170: loss 8.5281, time 121.43ms
iter 65180: loss 9.2242, time 121.59ms
iter 65190: loss 8.6826, time 121.30ms
iter 65200: loss 9.2727, time 121.05ms
iter 65210: loss 8.4968, time 121.54ms
iter 65220: loss 8.3196, time 121.33ms
iter 65230: loss 8.3068, time 121.53ms
iter 65240: loss 7.9802, time 121.40ms
step 65250: train loss 7.1753, val loss 7.1828
saving checkpoint to out-shakespeare-char
iter 65250: loss 8.4499, time 2885.11ms
iter 65260: loss 8.0023, time 121.10ms
iter 65270: loss 8.9461, time 120.92ms
iter 65280: loss 8.5763, time 120.94ms
iter 65290: loss 8.4714, time 121.48ms
iter 65300: loss 8.6534, time 121.25ms
iter 65310: loss 9.0611, time 125.94ms
iter 65320: loss 8.9069, time 125.69ms
iter 65330: loss 8.5973, time 126.03ms
iter 65340: loss 9.0572, time 125.46ms
iter 65350: loss 9.2869, time 125.94ms
iter 65360: loss 8.5264, time 125.90ms
iter 65370: loss 8.2623, time 125.87ms
iter 65380: loss 8.1963, time 128.88ms
iter 65390: loss 8.2562, time 121.82ms
iter 65400: loss 8.4196, time 121.82ms
iter 65410: loss 8.2102, time 121.84ms
iter 65420: loss 9.3197, time 121.34ms
iter 65430: loss 8.1413, time 121.64ms
iter 65440: loss 8.3507, time 121.73ms
iter 65450: loss 8.5771, time 121.69ms
iter 65460: loss 8.9722, time 121.77ms
iter 65470: loss 8.3495, time 121.37ms
iter 65480: loss 8.4935, time 121.78ms
iter 65490: loss 8.4206, time 121.71ms
step 65500: train loss 7.1936, val loss 7.1370
saving checkpoint to out-shakespeare-char
iter 65500: loss 9.0305, time 2894.90ms
iter 65510: loss 8.1288, time 125.97ms
iter 65520: loss 7.9557, time 125.15ms
iter 65530: loss 8.6730, time 124.99ms
iter 65540: loss 8.8522, time 124.05ms
iter 65550: loss 8.6651, time 124.97ms
iter 65560: loss 9.2419, time 128.68ms
iter 65570: loss 8.8210, time 125.36ms
iter 65580: loss 8.2678, time 125.02ms
iter 65590: loss 9.2519, time 125.04ms
iter 65600: loss 8.1729, time 124.95ms
iter 65610: loss 9.0347, time 125.13ms
iter 65620: loss 8.6987, time 125.24ms
iter 65630: loss 8.5546, time 125.20ms
iter 65640: loss 8.0981, time 125.26ms
iter 65650: loss 9.1682, time 125.33ms
iter 65660: loss 9.2904, time 125.12ms
iter 65670: loss 9.3437, time 128.21ms
iter 65680: loss 8.6339, time 125.22ms
iter 65690: loss 8.4517, time 125.21ms
iter 65700: loss 8.9140, time 125.21ms
iter 65710: loss 8.8202, time 124.87ms
iter 65720: loss 7.3438, time 125.12ms
iter 65730: loss 8.7410, time 125.16ms
iter 65740: loss 7.9595, time 125.12ms
step 65750: train loss 7.1488, val loss 7.1669
saving checkpoint to out-shakespeare-char
iter 65750: loss 9.0778, time 2901.40ms
iter 65760: loss 9.0306, time 125.41ms
iter 65770: loss 9.1754, time 125.10ms
iter 65780: loss 7.5815, time 125.07ms
iter 65790: loss 9.3249, time 124.91ms
iter 65800: loss 8.9983, time 126.34ms
iter 65810: loss 9.1440, time 125.30ms
iter 65820: loss 8.9325, time 125.07ms
iter 65830: loss 8.9132, time 125.59ms
iter 65840: loss 8.2271, time 128.12ms
iter 65850: loss 8.1707, time 125.03ms
iter 65860: loss 8.6499, time 124.97ms
iter 65870: loss 7.9680, time 124.66ms
iter 65880: loss 8.4687, time 125.36ms
iter 65890: loss 8.8419, time 125.06ms
iter 65900: loss 7.8662, time 124.93ms
iter 65910: loss 8.7468, time 126.39ms
iter 65920: loss 9.0658, time 126.30ms
iter 65930: loss 8.8643, time 126.18ms
iter 65940: loss 8.3726, time 126.00ms
iter 65950: loss 9.5124, time 128.81ms
iter 65960: loss 8.8527, time 125.82ms
iter 65970: loss 8.9008, time 125.92ms
iter 65980: loss 8.0503, time 126.23ms
iter 65990: loss 8.6371, time 125.93ms
step 66000: train loss 7.1338, val loss 7.1521
saving checkpoint to out-shakespeare-char
iter 66000: loss 8.3529, time 2897.56ms
iter 66010: loss 8.3701, time 126.17ms
iter 66020: loss 8.2410, time 125.61ms
iter 66030: loss 7.9126, time 125.22ms
iter 66040: loss 7.8412, time 125.80ms
iter 66050: loss 8.1109, time 128.67ms
iter 66060: loss 8.0686, time 125.35ms
iter 66070: loss 8.1641, time 125.28ms
iter 66080: loss 8.7291, time 125.77ms
iter 66090: loss 8.9314, time 125.04ms
iter 66100: loss 9.5282, time 125.45ms
iter 66110: loss 8.5437, time 125.46ms
iter 66120: loss 7.6871, time 125.30ms
iter 66130: loss 8.3603, time 125.35ms
iter 66140: loss 8.7698, time 125.52ms
iter 66150: loss 8.7722, time 126.18ms
iter 66160: loss 9.2760, time 128.40ms
iter 66170: loss 7.9347, time 125.51ms
iter 66180: loss 8.5495, time 125.50ms
iter 66190: loss 8.3289, time 125.53ms
iter 66200: loss 8.3061, time 126.06ms
iter 66210: loss 8.6505, time 126.22ms
iter 66220: loss 8.6947, time 125.97ms
iter 66230: loss 9.3018, time 125.67ms
iter 66240: loss 8.5841, time 126.00ms
step 66250: train loss 7.1249, val loss 7.1444
saving checkpoint to out-shakespeare-char
iter 66250: loss 8.6423, time 2915.39ms
iter 66260: loss 8.5917, time 125.89ms
iter 66270: loss 9.0515, time 125.66ms
iter 66280: loss 8.2311, time 125.83ms
iter 66290: loss 8.6400, time 127.55ms
iter 66300: loss 8.2977, time 125.61ms
iter 66310: loss 8.2389, time 125.83ms
iter 66320: loss 8.0329, time 125.87ms
iter 66330: loss 8.5969, time 125.75ms
iter 66340: loss 9.0438, time 125.80ms
iter 66350: loss 9.4468, time 125.52ms
iter 66360: loss 7.9805, time 126.39ms
iter 66370: loss 9.0025, time 125.85ms
iter 66380: loss 7.9791, time 125.72ms
iter 66390: loss 8.8835, time 125.33ms
iter 66400: loss 8.9066, time 125.23ms
iter 66410: loss 8.3238, time 125.58ms
iter 66420: loss 8.8906, time 125.62ms
iter 66430: loss 8.5476, time 126.43ms
iter 66440: loss 8.7286, time 128.46ms
iter 66450: loss 9.2769, time 126.04ms
iter 66460: loss 9.1435, time 125.63ms
iter 66470: loss 8.7530, time 125.93ms
iter 66480: loss 8.5395, time 125.85ms
iter 66490: loss 9.1180, time 123.64ms
step 66500: train loss 7.2043, val loss 7.1144
saving checkpoint to out-shakespeare-char
iter 66500: loss 9.1867, time 2896.59ms
iter 66510: loss 8.6998, time 121.76ms
iter 66520: loss 8.4245, time 121.29ms
iter 66530: loss 7.8843, time 121.87ms
iter 66540: loss 8.3679, time 121.96ms
iter 66550: loss 8.7943, time 121.87ms
iter 66560: loss 8.5673, time 121.86ms
iter 66570: loss 8.1987, time 122.00ms
iter 66580: loss 9.0874, time 121.63ms
iter 66590: loss 8.6669, time 121.11ms
iter 66600: loss 8.4425, time 121.62ms
iter 66610: loss 8.1644, time 121.61ms
iter 66620: loss 8.4589, time 121.75ms
iter 66630: loss 9.1127, time 121.74ms
iter 66640: loss 8.3354, time 121.67ms
iter 66650: loss 8.3853, time 121.71ms
iter 66660: loss 9.4346, time 121.68ms
iter 66670: loss 9.2416, time 121.76ms
iter 66680: loss 8.1732, time 121.06ms
iter 66690: loss 7.9966, time 121.52ms
iter 66700: loss 8.8410, time 121.47ms
iter 66710: loss 7.8071, time 121.62ms
iter 66720: loss 7.8565, time 121.62ms
iter 66730: loss 9.1120, time 121.20ms
iter 66740: loss 8.9933, time 121.87ms
step 66750: train loss 7.1452, val loss 7.1602
saving checkpoint to out-shakespeare-char
iter 66750: loss 8.1880, time 2892.69ms
iter 66760: loss 8.3081, time 125.65ms
iter 66770: loss 8.7316, time 128.20ms
iter 66780: loss 8.4618, time 125.09ms
iter 66790: loss 8.8613, time 125.61ms
iter 66800: loss 8.7083, time 125.28ms
iter 66810: loss 8.2241, time 125.41ms
iter 66820: loss 8.4825, time 125.49ms
iter 66830: loss 8.5482, time 125.66ms
iter 66840: loss 8.4250, time 125.25ms
iter 66850: loss 8.7837, time 125.42ms
iter 66860: loss 8.3544, time 125.97ms
iter 66870: loss 9.2039, time 125.77ms
iter 66880: loss 7.9140, time 128.42ms
iter 66890: loss 9.1692, time 125.69ms
iter 66900: loss 8.4203, time 125.71ms
iter 66910: loss 7.7476, time 125.74ms
iter 66920: loss 8.5443, time 129.03ms
iter 66930: loss 8.8802, time 125.81ms
iter 66940: loss 8.2073, time 125.88ms
iter 66950: loss 8.6119, time 125.14ms
iter 66960: loss 8.6060, time 125.35ms
iter 66970: loss 8.4014, time 125.57ms
iter 66980: loss 8.5902, time 125.42ms
iter 66990: loss 9.1504, time 124.45ms
step 67000: train loss 7.0876, val loss 7.1264
saving checkpoint to out-shakespeare-char
iter 67000: loss 9.2860, time 2871.59ms
iter 67010: loss 9.0173, time 125.09ms
iter 67020: loss 8.9716, time 125.07ms
iter 67030: loss 7.9847, time 124.98ms
iter 67040: loss 8.3864, time 125.33ms
iter 67050: loss 9.0548, time 123.78ms
iter 67060: loss 8.7639, time 125.25ms
iter 67070: loss 8.7682, time 125.49ms
iter 67080: loss 8.7398, time 125.13ms
iter 67090: loss 8.8679, time 124.46ms
iter 67100: loss 8.1550, time 125.30ms
iter 67110: loss 8.5267, time 125.29ms
iter 67120: loss 8.0067, time 125.64ms
iter 67130: loss 8.3836, time 128.67ms
iter 67140: loss 9.0587, time 125.36ms
iter 67150: loss 8.2960, time 125.35ms
iter 67160: loss 7.6862, time 125.64ms
iter 67170: loss 8.6134, time 126.01ms
iter 67180: loss 8.7402, time 125.25ms
iter 67190: loss 8.8476, time 125.48ms
iter 67200: loss 8.7317, time 128.36ms
iter 67210: loss 8.8464, time 125.31ms
iter 67220: loss 9.3295, time 125.31ms
iter 67230: loss 8.3127, time 125.62ms
iter 67240: loss 9.0978, time 125.24ms
step 67250: train loss 7.1298, val loss 7.1128
saving checkpoint to out-shakespeare-char
iter 67250: loss 8.5006, time 2887.83ms
iter 67260: loss 9.0700, time 125.83ms
iter 67270: loss 7.8745, time 124.32ms
iter 67280: loss 9.0532, time 125.38ms
iter 67290: loss 8.5879, time 125.01ms
iter 67300: loss 8.4153, time 128.36ms
iter 67310: loss 7.9935, time 124.69ms
iter 67320: loss 9.3121, time 124.63ms
iter 67330: loss 8.6166, time 125.68ms
iter 67340: loss 9.2318, time 127.35ms
iter 67350: loss 8.5934, time 125.56ms
iter 67360: loss 8.5657, time 125.39ms
iter 67370: loss 7.8978, time 126.18ms
iter 67380: loss 8.6616, time 124.34ms
iter 67390: loss 9.2215, time 125.36ms
iter 67400: loss 8.3358, time 125.70ms
iter 67410: loss 8.7591, time 128.32ms
iter 67420: loss 8.3393, time 125.61ms
iter 67430: loss 8.1598, time 128.61ms
iter 67440: loss 8.1662, time 124.84ms
iter 67450: loss 8.3267, time 127.36ms
iter 67460: loss 9.2413, time 124.75ms
iter 67470: loss 9.4197, time 125.54ms
iter 67480: loss 8.7664, time 124.51ms
iter 67490: loss 8.3030, time 125.17ms
step 67500: train loss 7.1291, val loss 7.1696
saving checkpoint to out-shakespeare-char
iter 67500: loss 8.6883, time 2891.16ms
iter 67510: loss 7.9688, time 128.13ms
iter 67520: loss 9.0467, time 124.69ms
iter 67530: loss 8.6769, time 124.97ms
iter 67540: loss 8.2685, time 125.34ms
iter 67550: loss 8.3321, time 125.33ms
iter 67560: loss 8.9380, time 125.10ms
iter 67570: loss 9.3458, time 125.11ms
iter 67580: loss 9.0503, time 125.15ms
iter 67590: loss 8.4029, time 125.46ms
iter 67600: loss 8.7199, time 125.72ms
iter 67610: loss 8.6304, time 125.57ms
iter 67620: loss 8.6262, time 125.49ms
iter 67630: loss 7.9047, time 126.18ms
iter 67640: loss 7.7497, time 125.46ms
iter 67650: loss 8.7370, time 125.88ms
iter 67660: loss 8.8533, time 128.88ms
iter 67670: loss 8.6454, time 125.71ms
iter 67680: loss 8.1096, time 125.75ms
iter 67690: loss 8.9009, time 125.91ms
iter 67700: loss 8.8491, time 128.73ms
iter 67710: loss 9.3918, time 126.34ms
iter 67720: loss 8.2333, time 125.56ms
iter 67730: loss 9.0158, time 125.48ms
iter 67740: loss 8.2571, time 125.26ms
step 67750: train loss 7.0891, val loss 7.0957
saving checkpoint to out-shakespeare-char
iter 67750: loss 7.9169, time 2906.73ms
iter 67760: loss 8.7133, time 125.74ms
iter 67770: loss 8.3740, time 125.49ms
iter 67780: loss 8.1184, time 125.81ms
iter 67790: loss 9.8572, time 125.79ms
iter 67800: loss 8.9144, time 125.90ms
iter 67810: loss 8.0276, time 125.99ms
iter 67820: loss 8.5005, time 126.00ms
iter 67830: loss 7.9788, time 125.41ms
iter 67840: loss 8.8596, time 125.65ms
iter 67850: loss 7.7084, time 128.28ms
iter 67860: loss 8.1357, time 125.68ms
iter 67870: loss 8.9325, time 125.68ms
iter 67880: loss 9.2972, time 125.43ms
iter 67890: loss 8.6508, time 125.61ms
iter 67900: loss 7.7557, time 126.04ms
iter 67910: loss 7.8772, time 125.63ms
iter 67920: loss 8.4774, time 125.96ms
iter 67930: loss 7.7741, time 125.64ms
iter 67940: loss 7.7927, time 124.53ms
iter 67950: loss 8.0184, time 125.37ms
iter 67960: loss 8.2143, time 125.17ms
iter 67970: loss 9.3502, time 124.80ms
iter 67980: loss 8.2886, time 125.36ms
iter 67990: loss 9.3312, time 125.40ms
step 68000: train loss 7.1307, val loss 7.1331
saving checkpoint to out-shakespeare-char
iter 68000: loss 7.9303, time 2885.48ms
iter 68010: loss 8.9912, time 125.27ms
iter 68020: loss 8.5523, time 126.04ms
iter 68030: loss 7.8944, time 125.74ms
iter 68040: loss 8.7262, time 125.64ms
iter 68050: loss 8.7825, time 125.23ms
iter 68060: loss 8.9710, time 125.43ms
iter 68070: loss 8.3889, time 125.16ms
iter 68080: loss 8.7160, time 125.33ms
iter 68090: loss 8.3346, time 125.36ms
iter 68100: loss 8.6716, time 128.60ms
iter 68110: loss 8.2586, time 125.34ms
iter 68120: loss 8.8924, time 125.29ms
iter 68130: loss 8.5011, time 125.96ms
iter 68140: loss 8.2262, time 125.43ms
iter 68150: loss 8.5982, time 124.85ms
iter 68160: loss 8.8112, time 125.29ms
iter 68170: loss 8.9566, time 125.82ms
iter 68180: loss 8.5711, time 125.73ms
iter 68190: loss 8.6691, time 123.57ms
iter 68200: loss 8.9684, time 125.38ms
iter 68210: loss 7.6276, time 128.33ms
iter 68220: loss 8.3943, time 125.57ms
iter 68230: loss 8.2986, time 125.43ms
iter 68240: loss 8.4651, time 124.66ms
step 68250: train loss 7.1463, val loss 7.0856
saving checkpoint to out-shakespeare-char
iter 68250: loss 8.3762, time 2889.75ms
iter 68260: loss 8.8181, time 121.79ms
iter 68270: loss 8.1129, time 121.57ms
iter 68280: loss 9.4342, time 123.17ms
iter 68290: loss 9.1590, time 122.04ms
iter 68300: loss 8.9708, time 121.55ms
iter 68310: loss 8.0102, time 121.66ms
iter 68320: loss 8.5727, time 121.55ms
iter 68330: loss 8.5188, time 121.55ms
iter 68340: loss 8.1521, time 121.14ms
iter 68350: loss 7.5566, time 121.57ms
iter 68360: loss 8.4454, time 121.67ms
iter 68370: loss 8.3729, time 121.68ms
iter 68380: loss 8.0774, time 121.60ms
iter 68390: loss 8.4576, time 121.81ms
iter 68400: loss 9.0457, time 121.71ms
iter 68410: loss 8.9339, time 121.70ms
iter 68420: loss 9.0375, time 121.66ms
iter 68430: loss 8.8226, time 121.65ms
iter 68440: loss 8.9327, time 121.71ms
iter 68450: loss 8.3403, time 121.96ms
iter 68460: loss 8.8062, time 122.26ms
iter 68470: loss 8.3015, time 121.71ms
iter 68480: loss 8.0708, time 121.61ms
iter 68490: loss 8.2049, time 121.77ms
step 68500: train loss 7.0642, val loss 7.1019
saving checkpoint to out-shakespeare-char
iter 68500: loss 8.3580, time 2911.80ms
iter 68510: loss 7.7970, time 125.48ms
iter 68520: loss 8.7822, time 125.48ms
iter 68530: loss 8.9041, time 128.19ms
iter 68540: loss 8.1618, time 125.01ms
iter 68550: loss 8.3045, time 125.52ms
iter 68560: loss 8.3798, time 125.34ms
iter 68570: loss 9.1376, time 125.14ms
iter 68580: loss 8.6237, time 124.58ms
iter 68590: loss 8.2160, time 124.87ms
iter 68600: loss 7.8968, time 128.07ms
iter 68610: loss 8.7021, time 125.03ms
iter 68620: loss 8.5846, time 124.86ms
iter 68630: loss 8.3281, time 125.34ms
iter 68640: loss 8.1911, time 125.04ms
iter 68650: loss 8.8493, time 124.95ms
iter 68660: loss 9.2878, time 124.96ms
iter 68670: loss 7.9886, time 124.80ms
iter 68680: loss 8.8693, time 125.69ms
iter 68690: loss 8.1170, time 125.06ms
iter 68700: loss 9.2592, time 125.08ms
iter 68710: loss 8.5185, time 127.91ms
iter 68720: loss 8.2539, time 124.87ms
iter 68730: loss 8.5930, time 124.76ms
iter 68740: loss 9.0264, time 125.24ms
step 68750: train loss 7.0622, val loss 7.1024
saving checkpoint to out-shakespeare-char
iter 68750: loss 9.3335, time 2868.31ms
iter 68760: loss 7.3508, time 125.40ms
iter 68770: loss 8.9446, time 125.24ms
iter 68780: loss 8.1940, time 128.07ms
iter 68790: loss 9.2546, time 124.94ms
iter 68800: loss 8.4645, time 125.46ms
iter 68810: loss 8.5602, time 121.77ms
iter 68820: loss 8.6265, time 121.32ms
iter 68830: loss 8.7734, time 122.31ms
iter 68840: loss 8.3825, time 121.69ms
iter 68850: loss 7.5672, time 122.72ms
iter 68860: loss 9.0512, time 121.56ms
iter 68870: loss 9.4485, time 122.75ms
iter 68880: loss 8.8715, time 121.63ms
iter 68890: loss 7.8829, time 122.58ms
iter 68900: loss 9.1606, time 121.78ms
iter 68910: loss 8.6333, time 122.80ms
iter 68920: loss 7.9609, time 121.66ms
iter 68930: loss 8.6416, time 122.67ms
iter 68940: loss 8.5092, time 121.54ms
iter 68950: loss 7.8748, time 122.54ms
iter 68960: loss 7.9553, time 121.74ms
iter 68970: loss 8.5584, time 122.75ms
iter 68980: loss 9.1572, time 121.52ms
iter 68990: loss 8.9492, time 122.73ms
step 69000: train loss 7.0346, val loss 7.0410
saving checkpoint to out-shakespeare-char
iter 69000: loss 8.1540, time 2888.84ms
iter 69010: loss 8.6058, time 121.37ms
iter 69020: loss 8.8752, time 121.13ms
iter 69030: loss 9.1178, time 121.63ms
iter 69040: loss 8.0832, time 121.60ms
iter 69050: loss 8.9302, time 121.60ms
iter 69060: loss 8.2896, time 121.90ms
iter 69070: loss 8.0936, time 121.68ms
iter 69080: loss 7.9178, time 121.52ms
iter 69090: loss 8.1345, time 121.53ms
iter 69100: loss 8.1843, time 121.66ms
iter 69110: loss 8.6078, time 122.25ms
iter 69120: loss 8.0368, time 121.15ms
iter 69130: loss 8.8931, time 121.56ms
iter 69140: loss 7.9157, time 121.74ms
iter 69150: loss 8.0784, time 121.69ms
iter 69160: loss 8.9802, time 121.54ms
iter 69170: loss 8.6212, time 121.62ms
iter 69180: loss 8.2561, time 121.60ms
iter 69190: loss 9.3792, time 121.85ms
iter 69200: loss 8.0830, time 121.02ms
iter 69210: loss 8.6003, time 121.69ms
iter 69220: loss 8.6724, time 120.83ms
iter 69230: loss 8.2887, time 121.65ms
iter 69240: loss 8.6184, time 121.61ms
step 69250: train loss 7.0730, val loss 7.0855
saving checkpoint to out-shakespeare-char
iter 69250: loss 8.5029, time 2894.80ms
iter 69260: loss 8.9754, time 122.11ms
iter 69270: loss 7.7980, time 121.61ms
iter 69280: loss 8.3791, time 121.83ms
iter 69290: loss 8.1790, time 121.61ms
iter 69300: loss 8.3865, time 121.73ms
iter 69310: loss 8.5836, time 121.56ms
iter 69320: loss 9.6018, time 122.19ms
iter 69330: loss 8.0545, time 121.72ms
iter 69340: loss 8.0534, time 121.74ms
iter 69350: loss 8.9754, time 121.69ms
iter 69360: loss 8.7866, time 121.76ms
iter 69370: loss 8.4618, time 122.05ms
iter 69380: loss 8.0145, time 122.33ms
iter 69390: loss 8.1135, time 121.51ms
iter 69400: loss 8.9080, time 123.70ms
iter 69410: loss 8.5884, time 121.89ms
iter 69420: loss 8.3012, time 122.68ms
iter 69430: loss 8.8057, time 121.65ms
iter 69440: loss 8.1165, time 121.72ms
iter 69450: loss 8.2676, time 121.57ms
iter 69460: loss 9.1464, time 121.85ms
iter 69470: loss 8.1784, time 121.72ms
iter 69480: loss 8.8710, time 122.92ms
iter 69490: loss 9.1704, time 121.84ms
step 69500: train loss 7.0527, val loss 7.0538
saving checkpoint to out-shakespeare-char
iter 69500: loss 8.4062, time 2898.46ms
iter 69510: loss 8.7274, time 121.81ms
iter 69520: loss 9.4846, time 122.09ms
iter 69530: loss 8.6659, time 121.01ms
iter 69540: loss 8.3800, time 121.76ms
iter 69550: loss 8.0352, time 121.73ms
iter 69560: loss 8.0974, time 121.97ms
iter 69570: loss 9.1042, time 125.77ms
iter 69580: loss 8.6556, time 125.41ms
iter 69590: loss 8.0393, time 124.43ms
iter 69600: loss 8.4507, time 125.79ms
iter 69610: loss 7.5151, time 127.98ms
iter 69620: loss 8.3168, time 129.77ms
iter 69630: loss 8.3507, time 125.41ms
iter 69640: loss 8.7065, time 129.26ms
iter 69650: loss 8.3762, time 125.96ms
iter 69660: loss 8.2786, time 125.86ms
iter 69670: loss 9.0860, time 122.60ms
iter 69680: loss 8.4934, time 120.30ms
iter 69690: loss 8.8822, time 121.84ms
iter 69700: loss 8.6841, time 121.37ms
iter 69710: loss 8.3630, time 121.99ms
iter 69720: loss 8.9619, time 121.64ms
iter 69730: loss 8.5217, time 121.73ms
iter 69740: loss 8.6063, time 121.92ms
step 69750: train loss 7.0605, val loss 7.0317
saving checkpoint to out-shakespeare-char
iter 69750: loss 8.7156, time 2888.75ms
iter 69760: loss 8.4558, time 121.79ms
iter 69770: loss 7.5293, time 121.68ms
iter 69780: loss 7.9178, time 121.74ms
iter 69790: loss 8.7073, time 121.63ms
iter 69800: loss 7.9110, time 121.70ms
iter 69810: loss 8.2237, time 121.06ms
iter 69820: loss 8.1408, time 121.35ms
iter 69830: loss 8.2661, time 121.61ms
iter 69840: loss 7.8315, time 120.36ms
iter 69850: loss 8.7673, time 121.65ms
iter 69860: loss 8.1140, time 121.99ms
iter 69870: loss 8.9973, time 122.32ms
iter 69880: loss 8.4276, time 121.03ms
iter 69890: loss 8.7306, time 121.66ms
iter 69900: loss 8.7825, time 121.93ms
iter 69910: loss 8.3174, time 122.09ms
iter 69920: loss 8.5360, time 121.62ms
iter 69930: loss 8.3652, time 126.58ms
iter 69940: loss 8.4664, time 125.79ms
iter 69950: loss 8.8266, time 124.99ms
iter 69960: loss 8.7096, time 128.98ms
iter 69970: loss 8.5827, time 128.44ms
iter 69980: loss 9.2073, time 125.04ms
iter 69990: loss 8.2908, time 124.84ms
step 70000: train loss 7.0362, val loss 7.0554
saving checkpoint to out-shakespeare-char
iter 70000: loss 8.2546, time 2900.31ms
iter 70010: loss 7.9985, time 125.92ms
iter 70020: loss 8.8209, time 124.80ms
iter 70030: loss 8.3220, time 125.04ms
iter 70040: loss 9.0841, time 125.33ms
iter 70050: loss 8.1099, time 125.22ms
iter 70060: loss 7.9612, time 125.42ms
iter 70070: loss 8.2020, time 127.09ms
iter 70080: loss 7.6883, time 125.18ms
iter 70090: loss 8.4447, time 125.10ms
iter 70100: loss 9.3161, time 125.62ms
iter 70110: loss 8.2810, time 125.31ms
iter 70120: loss 9.3987, time 125.49ms
iter 70130: loss 8.1863, time 125.04ms
iter 70140: loss 8.5818, time 125.64ms
iter 70150: loss 7.6538, time 124.48ms
iter 70160: loss 8.6196, time 125.17ms
iter 70170: loss 8.8703, time 125.61ms
iter 70180: loss 8.6857, time 128.11ms
iter 70190: loss 8.2771, time 125.02ms
iter 70200: loss 8.2788, time 124.66ms
iter 70210: loss 8.2955, time 125.12ms
iter 70220: loss 8.5929, time 125.18ms
iter 70230: loss 8.5882, time 124.41ms
iter 70240: loss 8.4896, time 124.94ms
step 70250: train loss 7.1256, val loss 7.1037
saving checkpoint to out-shakespeare-char
iter 70250: loss 7.9850, time 2894.24ms
iter 70260: loss 7.9945, time 125.83ms
iter 70270: loss 8.7990, time 125.45ms
iter 70280: loss 8.7133, time 128.07ms
iter 70290: loss 9.1810, time 125.27ms
iter 70300: loss 8.9552, time 125.14ms
iter 70310: loss 8.1715, time 125.30ms
iter 70320: loss 9.1764, time 124.33ms
iter 70330: loss 9.6899, time 125.08ms
iter 70340: loss 7.9699, time 125.29ms
iter 70350: loss 8.1700, time 125.63ms
iter 70360: loss 7.9401, time 125.23ms
iter 70370: loss 8.8389, time 124.83ms
iter 70380: loss 8.9228, time 125.28ms
iter 70390: loss 8.0936, time 128.44ms
iter 70400: loss 9.1244, time 125.25ms
iter 70410: loss 8.3627, time 125.08ms
iter 70420: loss 7.3059, time 125.41ms
iter 70430: loss 8.1417, time 125.16ms
iter 70440: loss 8.4056, time 125.29ms
iter 70450: loss 8.4362, time 125.16ms
iter 70460: loss 8.3906, time 125.18ms
iter 70470: loss 8.7555, time 125.06ms
iter 70480: loss 9.1737, time 125.72ms
iter 70490: loss 8.4255, time 125.30ms
step 70500: train loss 7.0282, val loss 7.0429
saving checkpoint to out-shakespeare-char
iter 70500: loss 7.9134, time 2912.94ms
iter 70510: loss 8.7885, time 125.29ms
iter 70520: loss 8.4454, time 125.62ms
iter 70530: loss 8.9077, time 125.29ms
iter 70540: loss 8.5904, time 125.94ms
iter 70550: loss 7.8714, time 125.74ms
iter 70560: loss 7.8328, time 125.72ms
iter 70570: loss 7.8391, time 128.68ms
iter 70580: loss 8.4809, time 125.65ms
iter 70590: loss 8.6461, time 125.84ms
iter 70600: loss 8.1417, time 125.80ms
iter 70610: loss 8.5116, time 125.74ms
iter 70620: loss 7.9513, time 125.55ms
iter 70630: loss 7.6510, time 125.65ms
iter 70640: loss 8.9487, time 125.38ms
iter 70650: loss 9.0026, time 125.80ms
iter 70660: loss 8.5214, time 125.19ms
iter 70670: loss 8.4443, time 125.41ms
iter 70680: loss 8.2796, time 127.51ms
iter 70690: loss 7.8929, time 125.20ms
iter 70700: loss 7.6489, time 125.08ms
iter 70710: loss 8.3995, time 125.36ms
iter 70720: loss 8.8300, time 125.27ms
iter 70730: loss 8.6698, time 125.35ms
iter 70740: loss 7.9465, time 125.74ms
step 70750: train loss 7.0518, val loss 7.0565
saving checkpoint to out-shakespeare-char
iter 70750: loss 8.5395, time 2899.45ms
iter 70760: loss 8.2225, time 125.63ms
iter 70770: loss 8.5343, time 125.76ms
iter 70780: loss 8.3522, time 125.56ms
iter 70790: loss 7.8741, time 125.43ms
iter 70800: loss 8.3312, time 124.77ms
iter 70810: loss 7.9347, time 125.43ms
iter 70820: loss 8.5863, time 125.38ms
iter 70830: loss 8.0015, time 125.49ms
iter 70840: loss 9.0185, time 125.13ms
iter 70850: loss 8.7981, time 128.45ms
iter 70860: loss 8.6610, time 125.42ms
iter 70870: loss 8.7059, time 126.30ms
iter 70880: loss 8.2942, time 125.19ms
iter 70890: loss 8.4991, time 125.89ms
iter 70900: loss 8.6780, time 126.03ms
iter 70910: loss 8.1531, time 125.99ms
iter 70920: loss 7.5425, time 125.89ms
iter 70930: loss 8.4322, time 125.67ms
iter 70940: loss 8.0642, time 125.97ms
iter 70950: loss 8.6736, time 126.00ms
iter 70960: loss 8.4425, time 127.84ms
iter 70970: loss 8.2438, time 125.75ms
iter 70980: loss 8.2421, time 124.89ms
iter 70990: loss 8.6045, time 127.95ms
step 71000: train loss 7.0417, val loss 7.0560
saving checkpoint to out-shakespeare-char
iter 71000: loss 8.4984, time 2907.68ms
iter 71010: loss 7.8229, time 125.73ms
iter 71020: loss 8.6802, time 124.08ms
iter 71030: loss 8.4029, time 124.63ms
iter 71040: loss 8.1941, time 124.16ms
iter 71050: loss 8.5237, time 126.12ms
iter 71060: loss 8.7569, time 125.45ms
iter 71070: loss 8.0732, time 124.70ms
iter 71080: loss 9.2213, time 124.58ms
iter 71090: loss 8.5554, time 125.33ms
iter 71100: loss 7.8767, time 126.52ms
iter 71110: loss 7.8142, time 125.46ms
iter 71120: loss 8.4183, time 125.28ms
iter 71130: loss 9.0395, time 128.44ms
iter 71140: loss 8.2801, time 125.18ms
iter 71150: loss 8.3358, time 125.36ms
iter 71160: loss 8.2394, time 125.27ms
iter 71170: loss 8.0276, time 128.27ms
iter 71180: loss 8.5565, time 125.67ms
iter 71190: loss 8.7233, time 126.45ms
iter 71200: loss 8.6161, time 125.67ms
iter 71210: loss 8.2601, time 125.60ms
iter 71220: loss 7.6016, time 125.34ms
iter 71230: loss 8.7213, time 125.32ms
iter 71240: loss 8.7531, time 125.27ms
step 71250: train loss 7.0783, val loss 7.0458
saving checkpoint to out-shakespeare-char
iter 71250: loss 8.4395, time 2899.52ms
iter 71260: loss 8.5905, time 125.18ms
iter 71270: loss 7.9026, time 125.35ms
iter 71280: loss 9.0287, time 125.44ms
iter 71290: loss 8.1566, time 125.65ms
iter 71300: loss 9.0099, time 124.85ms
iter 71310: loss 7.7638, time 128.36ms
iter 71320: loss 8.1855, time 125.30ms
iter 71330: loss 8.8808, time 125.95ms
iter 71340: loss 7.6193, time 125.25ms
iter 71350: loss 7.9745, time 128.56ms
iter 71360: loss 8.4179, time 125.19ms
iter 71370: loss 8.9187, time 125.56ms
iter 71380: loss 8.3494, time 126.00ms
iter 71390: loss 8.6029, time 125.35ms
iter 71400: loss 8.1134, time 124.52ms
iter 71410: loss 7.4725, time 124.45ms
iter 71420: loss 7.9348, time 128.08ms
iter 71430: loss 8.2835, time 124.77ms
iter 71440: loss 9.0781, time 124.06ms
iter 71450: loss 8.3473, time 124.71ms
iter 71460: loss 8.8715, time 125.24ms
iter 71470: loss 9.0364, time 125.23ms
iter 71480: loss 8.7363, time 124.16ms
iter 71490: loss 7.3975, time 124.92ms
step 71500: train loss 7.0122, val loss 6.9567
saving checkpoint to out-shakespeare-char
iter 71500: loss 8.0843, time 2902.78ms
iter 71510: loss 8.7441, time 125.11ms
iter 71520: loss 8.7571, time 124.72ms
iter 71530: loss 7.9501, time 125.20ms
iter 71540: loss 8.3240, time 125.58ms
iter 71550: loss 8.7361, time 128.10ms
iter 71560: loss 9.0450, time 125.30ms
iter 71570: loss 8.1807, time 125.12ms
iter 71580: loss 8.1284, time 125.16ms
iter 71590: loss 8.5281, time 125.27ms
iter 71600: loss 8.5566, time 125.66ms
iter 71610: loss 7.9298, time 125.39ms
iter 71620: loss 8.8694, time 128.57ms
iter 71630: loss 7.7864, time 126.08ms
iter 71640: loss 9.2818, time 124.29ms
iter 71650: loss 8.1616, time 125.82ms
iter 71660: loss 8.8500, time 125.31ms
iter 71670: loss 9.0017, time 125.65ms
iter 71680: loss 8.6296, time 125.73ms
iter 71690: loss 8.8532, time 128.31ms
iter 71700: loss 8.8062, time 125.50ms
iter 71710: loss 8.9362, time 125.42ms
iter 71720: loss 8.4662, time 125.71ms
iter 71730: loss 9.1252, time 126.32ms
iter 71740: loss 8.5790, time 125.24ms
step 71750: train loss 7.0655, val loss 7.0309
saving checkpoint to out-shakespeare-char
iter 71750: loss 8.1418, time 2906.67ms
iter 71760: loss 8.6247, time 125.73ms
iter 71770: loss 8.5657, time 124.90ms
iter 71780: loss 8.6271, time 125.32ms
iter 71790: loss 8.0409, time 125.30ms
iter 71800: loss 8.7920, time 125.60ms
iter 71810: loss 9.0867, time 124.81ms
iter 71820: loss 8.3890, time 125.45ms
iter 71830: loss 8.6076, time 127.36ms
iter 71840: loss 8.4788, time 126.06ms
iter 71850: loss 8.6149, time 125.84ms
iter 71860: loss 7.8709, time 128.38ms
iter 71870: loss 8.6364, time 126.31ms
iter 71880: loss 9.0152, time 125.06ms
iter 71890: loss 8.1148, time 125.90ms
iter 71900: loss 9.0664, time 125.87ms
iter 71910: loss 9.4524, time 126.01ms
iter 71920: loss 7.8326, time 125.22ms
iter 71930: loss 8.0182, time 125.39ms
iter 71940: loss 7.8778, time 125.17ms
iter 71950: loss 8.8614, time 125.39ms
iter 71960: loss 8.3381, time 126.18ms
iter 71970: loss 8.2760, time 128.38ms
iter 71980: loss 8.7150, time 125.55ms
iter 71990: loss 8.4562, time 125.83ms
step 72000: train loss 6.9630, val loss 7.0570
saving checkpoint to out-shakespeare-char
iter 72000: loss 9.0256, time 2884.99ms
iter 72010: loss 8.6058, time 125.69ms
iter 72020: loss 8.0697, time 126.34ms
iter 72030: loss 8.6989, time 125.64ms
iter 72040: loss 8.1787, time 125.48ms
iter 72050: loss 8.1825, time 125.17ms
iter 72060: loss 8.7722, time 125.76ms
iter 72070: loss 8.9288, time 125.80ms
iter 72080: loss 8.4709, time 125.83ms
iter 72090: loss 8.4282, time 125.66ms
iter 72100: loss 8.8135, time 128.04ms
iter 72110: loss 9.0945, time 126.40ms
iter 72120: loss 8.4848, time 124.94ms
iter 72130: loss 8.6669, time 124.83ms
iter 72140: loss 8.1926, time 124.86ms
iter 72150: loss 9.0596, time 125.50ms
iter 72160: loss 8.5077, time 124.81ms
iter 72170: loss 8.4335, time 128.45ms
iter 72180: loss 8.7764, time 125.59ms
iter 72190: loss 9.4856, time 125.16ms
iter 72200: loss 7.3943, time 124.83ms
iter 72210: loss 7.9506, time 125.09ms
iter 72220: loss 9.1298, time 125.19ms
iter 72230: loss 8.4869, time 125.17ms
iter 72240: loss 8.5086, time 123.95ms
step 72250: train loss 7.0171, val loss 7.0343
saving checkpoint to out-shakespeare-char
iter 72250: loss 8.7516, time 2890.64ms
iter 72260: loss 9.0392, time 125.29ms
iter 72270: loss 8.0198, time 125.63ms
iter 72280: loss 7.8089, time 125.47ms
iter 72290: loss 8.3688, time 125.78ms
iter 72300: loss 8.4834, time 128.12ms
iter 72310: loss 8.3063, time 125.50ms
iter 72320: loss 8.6258, time 125.12ms
iter 72330: loss 8.4045, time 125.29ms
iter 72340: loss 8.7269, time 125.61ms
iter 72350: loss 7.8370, time 125.08ms
iter 72360: loss 8.4480, time 125.12ms
iter 72370: loss 8.4504, time 125.43ms
iter 72380: loss 7.5853, time 125.19ms
iter 72390: loss 8.9411, time 125.23ms
iter 72400: loss 8.7714, time 125.17ms
iter 72410: loss 9.0188, time 128.34ms
iter 72420: loss 9.5755, time 125.21ms
iter 72430: loss 8.0706, time 125.20ms
iter 72440: loss 8.3703, time 125.32ms
iter 72450: loss 8.4271, time 125.18ms
iter 72460: loss 8.1216, time 125.17ms
iter 72470: loss 8.7189, time 125.20ms
iter 72480: loss 8.9354, time 125.26ms
iter 72490: loss 8.6324, time 125.57ms
step 72500: train loss 7.0475, val loss 7.0287
saving checkpoint to out-shakespeare-char
iter 72500: loss 8.3473, time 2899.30ms
iter 72510: loss 8.6109, time 125.53ms
iter 72520: loss 8.3363, time 125.27ms
iter 72530: loss 7.8996, time 125.26ms
iter 72540: loss 8.3318, time 124.72ms
iter 72550: loss 8.2153, time 124.76ms
iter 72560: loss 8.6568, time 125.15ms
iter 72570: loss 7.8030, time 124.94ms
iter 72580: loss 8.9058, time 125.32ms
iter 72590: loss 8.4069, time 125.22ms
iter 72600: loss 8.8183, time 125.18ms
iter 72610: loss 8.2046, time 125.95ms
iter 72620: loss 8.5500, time 125.77ms
iter 72630: loss 8.5574, time 125.73ms
iter 72640: loss 7.9514, time 125.77ms
iter 72650: loss 8.1973, time 126.04ms
iter 72660: loss 7.9341, time 125.74ms
iter 72670: loss 9.7141, time 125.71ms
iter 72680: loss 8.3749, time 126.01ms
iter 72690: loss 8.6484, time 126.00ms
iter 72700: loss 8.2048, time 125.71ms
iter 72710: loss 8.2715, time 128.84ms
iter 72720: loss 7.8773, time 125.76ms
iter 72730: loss 8.5117, time 127.18ms
iter 72740: loss 8.7936, time 125.83ms
step 72750: train loss 6.9924, val loss 7.0268
saving checkpoint to out-shakespeare-char
iter 72750: loss 8.1319, time 2869.57ms
iter 72760: loss 8.0453, time 125.67ms
iter 72770: loss 8.3522, time 126.00ms
iter 72780: loss 8.1897, time 125.98ms
iter 72790: loss 8.4122, time 126.75ms
iter 72800: loss 8.0790, time 125.77ms
iter 72810: loss 7.5156, time 126.27ms
iter 72820: loss 8.6023, time 128.87ms
iter 72830: loss 8.4346, time 126.07ms
iter 72840: loss 8.4689, time 125.84ms
iter 72850: loss 8.5314, time 125.77ms
iter 72860: loss 8.4146, time 125.89ms
iter 72870: loss 8.7475, time 125.71ms
iter 72880: loss 8.2416, time 125.70ms
iter 72890: loss 8.1636, time 125.74ms
iter 72900: loss 8.2800, time 125.60ms
iter 72910: loss 7.6095, time 126.10ms
iter 72920: loss 8.2234, time 125.84ms
iter 72930: loss 7.6649, time 125.89ms
iter 72940: loss 8.7387, time 125.70ms
iter 72950: loss 8.9756, time 126.18ms
iter 72960: loss 8.1358, time 125.77ms
iter 72970: loss 8.4172, time 125.53ms
iter 72980: loss 8.2202, time 125.48ms
iter 72990: loss 8.5702, time 125.56ms
step 73000: train loss 7.0471, val loss 7.0115
saving checkpoint to out-shakespeare-char
iter 73000: loss 8.7350, time 2880.96ms
iter 73010: loss 9.1353, time 126.10ms
iter 73020: loss 8.6084, time 126.13ms
iter 73030: loss 8.5624, time 126.05ms
iter 73040: loss 7.7856, time 128.98ms
iter 73050: loss 8.7582, time 125.74ms
iter 73060: loss 8.2461, time 125.73ms
iter 73070: loss 8.2077, time 125.70ms
iter 73080: loss 7.9211, time 125.76ms
iter 73090: loss 7.7352, time 126.08ms
iter 73100: loss 8.3452, time 125.60ms
iter 73110: loss 8.5877, time 125.51ms
iter 73120: loss 8.5766, time 125.22ms
iter 73130: loss 7.7669, time 125.94ms
iter 73140: loss 8.1705, time 125.83ms
iter 73150: loss 8.0355, time 128.56ms
iter 73160: loss 8.4107, time 125.86ms
iter 73170: loss 10.0781, time 125.92ms
iter 73180: loss 8.1996, time 126.03ms
iter 73190: loss 8.4216, time 125.83ms
iter 73200: loss 7.7979, time 125.96ms
iter 73210: loss 8.0973, time 125.79ms
iter 73220: loss 8.3939, time 125.61ms
iter 73230: loss 8.4041, time 125.78ms
iter 73240: loss 8.5111, time 126.18ms
step 73250: train loss 7.0008, val loss 6.9694
saving checkpoint to out-shakespeare-char
iter 73250: loss 8.2045, time 2881.23ms
iter 73260: loss 7.9543, time 126.14ms
iter 73270: loss 8.0850, time 126.24ms
iter 73280: loss 7.9662, time 128.66ms
iter 73290: loss 8.6393, time 125.76ms
iter 73300: loss 9.1991, time 125.72ms
iter 73310: loss 8.5443, time 126.27ms
iter 73320: loss 8.1852, time 125.74ms
iter 73330: loss 7.9078, time 125.43ms
iter 73340: loss 7.8269, time 125.66ms
iter 73350: loss 8.0061, time 126.10ms
iter 73360: loss 7.9811, time 125.98ms
iter 73370: loss 8.4275, time 125.94ms
iter 73380: loss 8.1785, time 126.05ms
iter 73390: loss 8.1924, time 128.88ms
iter 73400: loss 8.2303, time 125.74ms
iter 73410: loss 8.9620, time 126.04ms
iter 73420: loss 8.0597, time 125.87ms
iter 73430: loss 8.8348, time 125.72ms
iter 73440: loss 8.6904, time 125.99ms
iter 73450: loss 7.9353, time 125.75ms
iter 73460: loss 8.3700, time 125.74ms
iter 73470: loss 8.2081, time 125.74ms
iter 73480: loss 9.1539, time 125.70ms
iter 73490: loss 7.7255, time 125.86ms
step 73500: train loss 6.9951, val loss 6.9455
saving checkpoint to out-shakespeare-char
iter 73500: loss 8.5760, time 2868.78ms
iter 73510: loss 8.0331, time 126.03ms
iter 73520: loss 8.6057, time 125.88ms
iter 73530: loss 8.1115, time 125.84ms
iter 73540: loss 7.9389, time 125.79ms
iter 73550: loss 8.5271, time 125.73ms
iter 73560: loss 8.3973, time 125.78ms
iter 73570: loss 8.0859, time 129.05ms
iter 73580: loss 8.8232, time 125.59ms
iter 73590: loss 8.7615, time 126.08ms
iter 73600: loss 8.5496, time 125.98ms
iter 73610: loss 8.6761, time 125.23ms
iter 73620: loss 8.6659, time 125.17ms
iter 73630: loss 8.7473, time 125.01ms
iter 73640: loss 8.2062, time 125.61ms
iter 73650: loss 8.0907, time 125.70ms
iter 73660: loss 8.4539, time 125.77ms
iter 73670: loss 8.4043, time 125.82ms
iter 73680: loss 8.1771, time 128.71ms
iter 73690: loss 7.4985, time 125.83ms
iter 73700: loss 8.5719, time 125.72ms
iter 73710: loss 8.5323, time 126.01ms
iter 73720: loss 8.2209, time 125.69ms
iter 73730: loss 8.8325, time 125.59ms
iter 73740: loss 8.4087, time 125.60ms
step 73750: train loss 7.0065, val loss 7.0743
saving checkpoint to out-shakespeare-char
iter 73750: loss 8.4773, time 2862.38ms
iter 73760: loss 8.4523, time 124.83ms
iter 73770: loss 8.7732, time 125.18ms
iter 73780: loss 8.1707, time 125.00ms
iter 73790: loss 9.3440, time 124.64ms
iter 73800: loss 8.9811, time 125.59ms
iter 73810: loss 9.0738, time 128.54ms
iter 73820: loss 8.7006, time 125.73ms
iter 73830: loss 8.7562, time 125.82ms
iter 73840: loss 8.6759, time 125.57ms
iter 73850: loss 7.7718, time 125.75ms
iter 73860: loss 8.3372, time 125.70ms
iter 73870: loss 8.8145, time 125.61ms
iter 73880: loss 8.2385, time 125.14ms
iter 73890: loss 8.1227, time 125.81ms
iter 73900: loss 8.6355, time 125.47ms
iter 73910: loss 9.0066, time 126.15ms
iter 73920: loss 8.7322, time 128.79ms
iter 73930: loss 7.7940, time 125.70ms
iter 73940: loss 7.4595, time 125.87ms
iter 73950: loss 8.1224, time 124.53ms
iter 73960: loss 8.3663, time 126.04ms
iter 73970: loss 8.1921, time 125.74ms
iter 73980: loss 7.9042, time 126.22ms
iter 73990: loss 8.6921, time 128.49ms
step 74000: train loss 7.0212, val loss 7.0092
saving checkpoint to out-shakespeare-char
iter 74000: loss 8.2349, time 2897.70ms
iter 74010: loss 8.2097, time 126.56ms
iter 74020: loss 8.4247, time 125.88ms
iter 74030: loss 8.5172, time 125.76ms
iter 74040: loss 8.1163, time 125.54ms
iter 74050: loss 8.6371, time 131.71ms
iter 74060: loss 8.6989, time 125.42ms
iter 74070: loss 7.9828, time 125.62ms
iter 74080: loss 9.6302, time 125.51ms
iter 74090: loss 8.4906, time 125.51ms
iter 74100: loss 8.5316, time 125.48ms
iter 74110: loss 7.9694, time 124.43ms
iter 74120: loss 8.3579, time 124.99ms
iter 74130: loss 8.5835, time 128.14ms
iter 74140: loss 8.3788, time 125.02ms
iter 74150: loss 8.4598, time 125.19ms
iter 74160: loss 8.6880, time 125.25ms
iter 74170: loss 8.1374, time 125.47ms
iter 74180: loss 8.1136, time 126.00ms
iter 74190: loss 8.2530, time 125.27ms
iter 74200: loss 8.6779, time 125.41ms
iter 74210: loss 8.4252, time 125.88ms
iter 74220: loss 8.1772, time 124.57ms
iter 74230: loss 8.3406, time 125.58ms
iter 74240: loss 8.9003, time 128.25ms
step 74250: train loss 7.0058, val loss 6.9196
saving checkpoint to out-shakespeare-char
iter 74250: loss 8.0675, time 2885.98ms
iter 74260: loss 8.9512, time 125.70ms
iter 74270: loss 8.3950, time 125.81ms
iter 74280: loss 8.3555, time 128.93ms
iter 74290: loss 9.0879, time 126.18ms
iter 74300: loss 7.7101, time 125.99ms
iter 74310: loss 8.3438, time 126.06ms
iter 74320: loss 8.6785, time 126.04ms
iter 74330: loss 7.6413, time 125.49ms
iter 74340: loss 8.0132, time 125.61ms
iter 74350: loss 8.5289, time 125.26ms
iter 74360: loss 9.4292, time 125.60ms
iter 74370: loss 8.5811, time 125.47ms
iter 74380: loss 8.2473, time 126.16ms
iter 74390: loss 8.2696, time 128.27ms
iter 74400: loss 9.0548, time 125.53ms
iter 74410: loss 8.8622, time 126.13ms
iter 74420: loss 8.2428, time 126.06ms
iter 74430: loss 8.6522, time 125.85ms
iter 74440: loss 8.3881, time 125.62ms
iter 74450: loss 8.7581, time 125.83ms
iter 74460: loss 8.0652, time 125.96ms
iter 74470: loss 8.1331, time 125.87ms
iter 74480: loss 8.9113, time 126.14ms
iter 74490: loss 8.4862, time 126.34ms
step 74500: train loss 6.9683, val loss 6.9749
saving checkpoint to out-shakespeare-char
iter 74500: loss 7.4375, time 2888.21ms
iter 74510: loss 8.0650, time 124.50ms
iter 74520: loss 8.0613, time 121.62ms
iter 74530: loss 8.7165, time 121.91ms
iter 74540: loss 8.1969, time 121.69ms
iter 74550: loss 8.3159, time 121.45ms
iter 74560: loss 7.2579, time 122.14ms
iter 74570: loss 7.5750, time 121.49ms
iter 74580: loss 7.7078, time 121.88ms
iter 74590: loss 8.5881, time 121.43ms
iter 74600: loss 8.0188, time 121.89ms
iter 74610: loss 8.1520, time 121.53ms
iter 74620: loss 7.6008, time 121.56ms
iter 74630: loss 7.9470, time 121.64ms
iter 74640: loss 8.5354, time 121.85ms
iter 74650: loss 8.6462, time 121.67ms
iter 74660: loss 8.3615, time 121.51ms
iter 74670: loss 8.1437, time 121.49ms
iter 74680: loss 8.2846, time 121.67ms
iter 74690: loss 9.0022, time 122.68ms
iter 74700: loss 7.8105, time 121.68ms
iter 74710: loss 8.5854, time 121.70ms
iter 74720: loss 9.2139, time 121.73ms
iter 74730: loss 8.2690, time 121.77ms
iter 74740: loss 7.6629, time 121.79ms
step 74750: train loss 7.0418, val loss 6.9201
saving checkpoint to out-shakespeare-char
iter 74750: loss 8.8101, time 2894.56ms
iter 74760: loss 8.2509, time 121.42ms
iter 74770: loss 8.6090, time 122.60ms
iter 74780: loss 8.3992, time 121.56ms
iter 74790: loss 8.0614, time 123.02ms
iter 74800: loss 8.0937, time 121.64ms
iter 74810: loss 8.4577, time 122.57ms
iter 74820: loss 8.6228, time 121.59ms
iter 74830: loss 8.0470, time 122.65ms
iter 74840: loss 8.0592, time 121.67ms
iter 74850: loss 7.4594, time 122.80ms
iter 74860: loss 8.7091, time 121.12ms
iter 74870: loss 7.7644, time 122.62ms
iter 74880: loss 8.1414, time 121.44ms
iter 74890: loss 8.0852, time 122.68ms
iter 74900: loss 8.7607, time 121.93ms
iter 74910: loss 8.7694, time 122.63ms
iter 74920: loss 8.1247, time 121.46ms
iter 74930: loss 8.1827, time 121.56ms
iter 74940: loss 7.9346, time 121.48ms
iter 74950: loss 8.7259, time 122.50ms
iter 74960: loss 8.0513, time 121.50ms
iter 74970: loss 8.0857, time 122.61ms
iter 74980: loss 8.0063, time 121.56ms
iter 74990: loss 8.7197, time 122.72ms
step 75000: train loss 6.9957, val loss 7.0034
saving checkpoint to out-shakespeare-char
iter 75000: loss 8.7744, time 2893.30ms
iter 75010: loss 8.3014, time 121.57ms
iter 75020: loss 8.7494, time 121.60ms
iter 75030: loss 8.6854, time 121.45ms
iter 75040: loss 8.3940, time 121.33ms
iter 75050: loss 8.2560, time 121.30ms
iter 75060: loss 8.5391, time 121.35ms
iter 75070: loss 7.7481, time 121.65ms
iter 75080: loss 8.3618, time 121.41ms
iter 75090: loss 8.3530, time 121.50ms
iter 75100: loss 8.1451, time 120.70ms
iter 75110: loss 8.3029, time 121.43ms
iter 75120: loss 8.4126, time 122.01ms
iter 75130: loss 8.0967, time 121.42ms
iter 75140: loss 8.7275, time 121.25ms
iter 75150: loss 7.8149, time 121.52ms
iter 75160: loss 8.0523, time 121.24ms
iter 75170: loss 8.6987, time 121.50ms
iter 75180: loss 8.1693, time 121.31ms
iter 75190: loss 7.9170, time 120.35ms
iter 75200: loss 8.3073, time 121.37ms
iter 75210: loss 8.9179, time 121.53ms
iter 75220: loss 7.6541, time 121.09ms
iter 75230: loss 8.7769, time 121.64ms
iter 75240: loss 8.8561, time 121.50ms
step 75250: train loss 6.9466, val loss 6.9540
saving checkpoint to out-shakespeare-char
iter 75250: loss 8.3157, time 2893.31ms
iter 75260: loss 9.3641, time 121.43ms
iter 75270: loss 8.7349, time 124.28ms
iter 75280: loss 8.5258, time 121.57ms
iter 75290: loss 8.0170, time 124.43ms
iter 75300: loss 8.2689, time 121.28ms
iter 75310: loss 8.4895, time 124.20ms
iter 75320: loss 8.5229, time 121.39ms
iter 75330: loss 8.3336, time 124.20ms
iter 75340: loss 8.1470, time 121.65ms
iter 75350: loss 8.2672, time 124.31ms
iter 75360: loss 8.1970, time 121.35ms
iter 75370: loss 7.6679, time 123.98ms
iter 75380: loss 8.3400, time 121.56ms
iter 75390: loss 8.2395, time 124.43ms
iter 75400: loss 8.4375, time 120.25ms
iter 75410: loss 8.3677, time 124.20ms
iter 75420: loss 8.8823, time 121.46ms
iter 75430: loss 8.3579, time 123.79ms
iter 75440: loss 7.9340, time 121.49ms
iter 75450: loss 7.9710, time 124.24ms
iter 75460: loss 8.1023, time 121.48ms
iter 75470: loss 9.0355, time 123.96ms
iter 75480: loss 7.9799, time 121.53ms
iter 75490: loss 8.6569, time 124.32ms
step 75500: train loss 6.9482, val loss 6.9466
saving checkpoint to out-shakespeare-char
iter 75500: loss 8.7658, time 2885.51ms
iter 75510: loss 8.8008, time 121.52ms
iter 75520: loss 8.1703, time 124.28ms
iter 75530: loss 8.1065, time 119.72ms
iter 75540: loss 7.6727, time 124.33ms
iter 75550: loss 8.2307, time 121.48ms
iter 75560: loss 7.8743, time 123.83ms
iter 75570: loss 8.1315, time 121.54ms
iter 75580: loss 8.1865, time 124.48ms
iter 75590: loss 7.6394, time 121.50ms
iter 75600: loss 8.2086, time 124.53ms
iter 75610: loss 8.7098, time 121.41ms
iter 75620: loss 7.9230, time 122.87ms
iter 75630: loss 7.8885, time 121.56ms
iter 75640: loss 8.6694, time 124.27ms
iter 75650: loss 8.5535, time 121.58ms
iter 75660: loss 9.0531, time 124.20ms
iter 75670: loss 8.3605, time 121.40ms
iter 75680: loss 8.0982, time 124.29ms
iter 75690: loss 8.7525, time 121.69ms
iter 75700: loss 8.4652, time 124.47ms
iter 75710: loss 8.3746, time 120.09ms
iter 75720: loss 8.4084, time 124.40ms
iter 75730: loss 9.5597, time 121.54ms
iter 75740: loss 8.2289, time 124.41ms
step 75750: train loss 6.9475, val loss 6.9556
saving checkpoint to out-shakespeare-char
iter 75750: loss 8.4767, time 2884.88ms
iter 75760: loss 8.7243, time 121.45ms
iter 75770: loss 9.0956, time 122.67ms
iter 75780: loss 8.7961, time 121.43ms
iter 75790: loss 7.7991, time 122.24ms
iter 75800: loss 7.8617, time 121.52ms
iter 75810: loss 8.7535, time 121.97ms
iter 75820: loss 7.9257, time 121.29ms
iter 75830: loss 7.3148, time 121.18ms
iter 75840: loss 8.2931, time 121.56ms
iter 75850: loss 7.9117, time 120.66ms
iter 75860: loss 7.8739, time 119.96ms
iter 75870: loss 9.0227, time 121.27ms
iter 75880: loss 8.3505, time 121.37ms
iter 75890: loss 7.7836, time 122.42ms
iter 75900: loss 8.3884, time 121.37ms
iter 75910: loss 9.0941, time 121.38ms
iter 75920: loss 8.5757, time 121.39ms
iter 75930: loss 7.7270, time 121.58ms
iter 75940: loss 8.4024, time 121.48ms
iter 75950: loss 8.9222, time 121.38ms
iter 75960: loss 7.8068, time 121.49ms
iter 75970: loss 8.6332, time 121.73ms
iter 75980: loss 8.6666, time 121.35ms
iter 75990: loss 8.8278, time 121.15ms
step 76000: train loss 6.9772, val loss 6.8704
saving checkpoint to out-shakespeare-char
iter 76000: loss 9.0726, time 2885.17ms
iter 76010: loss 8.6698, time 122.36ms
iter 76020: loss 7.7201, time 121.40ms
iter 76030: loss 7.5279, time 124.26ms
iter 76040: loss 8.4826, time 121.37ms
iter 76050: loss 8.1257, time 124.16ms
iter 76060: loss 8.2087, time 121.53ms
iter 76070: loss 8.6070, time 124.26ms
iter 76080: loss 8.0203, time 121.52ms
iter 76090: loss 7.4115, time 124.42ms
iter 76100: loss 9.5301, time 118.52ms
iter 76110: loss 8.6654, time 124.37ms
iter 76120: loss 8.1952, time 121.35ms
iter 76130: loss 8.6062, time 124.52ms
iter 76140: loss 8.3013, time 121.40ms
iter 76150: loss 9.0225, time 123.77ms
iter 76160: loss 8.6030, time 121.52ms
iter 76170: loss 8.4487, time 124.37ms
iter 76180: loss 9.1384, time 121.83ms
iter 76190: loss 8.1971, time 122.76ms
iter 76200: loss 8.2902, time 121.39ms
iter 76210: loss 8.1679, time 124.18ms
iter 76220: loss 7.8367, time 121.46ms
iter 76230: loss 8.1811, time 124.65ms
iter 76240: loss 8.9764, time 121.49ms
step 76250: train loss 6.9446, val loss 6.9726
saving checkpoint to out-shakespeare-char
iter 76250: loss 8.7648, time 2892.14ms
iter 76260: loss 8.6376, time 121.88ms
iter 76270: loss 8.7613, time 121.78ms
iter 76280: loss 7.7036, time 121.40ms
iter 76290: loss 8.3899, time 119.92ms
iter 76300: loss 8.5500, time 121.40ms
iter 76310: loss 8.5062, time 121.44ms
iter 76320: loss 8.6352, time 120.57ms
iter 76330: loss 8.2555, time 121.79ms
iter 76340: loss 8.4369, time 121.56ms
iter 76350: loss 8.2656, time 121.41ms
iter 76360: loss 8.8683, time 121.52ms
iter 76370: loss 8.9511, time 121.59ms
iter 76380: loss 7.7009, time 120.20ms
iter 76390: loss 8.3469, time 121.37ms
iter 76400: loss 8.1351, time 121.54ms
iter 76410: loss 8.6458, time 121.43ms
iter 76420: loss 8.5851, time 121.40ms
iter 76430: loss 8.4590, time 121.38ms
iter 76440: loss 8.1337, time 121.36ms
iter 76450: loss 8.6660, time 121.79ms
iter 76460: loss 8.2182, time 121.45ms
iter 76470: loss 8.2476, time 120.96ms
iter 76480: loss 8.7731, time 121.15ms
iter 76490: loss 8.5261, time 121.40ms
step 76500: train loss 6.9229, val loss 6.9408
saving checkpoint to out-shakespeare-char
iter 76500: loss 7.9855, time 2890.38ms
iter 76510: loss 9.2667, time 121.59ms
iter 76520: loss 8.5129, time 121.95ms
iter 76530: loss 7.9659, time 121.50ms
iter 76540: loss 8.1957, time 121.41ms
iter 76550: loss 8.2740, time 121.31ms
iter 76560: loss 7.9412, time 121.31ms
iter 76570: loss 8.4308, time 121.90ms
iter 76580: loss 8.4223, time 121.41ms
iter 76590: loss 8.9352, time 121.51ms
iter 76600: loss 7.8724, time 121.33ms
iter 76610: loss 8.3541, time 121.49ms
iter 76620: loss 8.2236, time 119.36ms
iter 76630: loss 8.5138, time 121.35ms
iter 76640: loss 7.9847, time 121.43ms
iter 76650: loss 8.8781, time 121.44ms
iter 76660: loss 8.3283, time 121.69ms
iter 76670: loss 8.8383, time 121.46ms
iter 76680: loss 8.3101, time 121.32ms
iter 76690: loss 8.2308, time 121.38ms
iter 76700: loss 7.9851, time 121.36ms
iter 76710: loss 8.0495, time 121.42ms
iter 76720: loss 8.9308, time 121.33ms
iter 76730: loss 8.6020, time 121.53ms
iter 76740: loss 8.2880, time 121.68ms
step 76750: train loss 6.8874, val loss 6.9530
saving checkpoint to out-shakespeare-char
iter 76750: loss 8.8971, time 2878.18ms
iter 76760: loss 8.5066, time 121.75ms
iter 76770: loss 8.5719, time 121.89ms
iter 76780: loss 8.9422, time 121.35ms
iter 76790: loss 9.3403, time 121.33ms
iter 76800: loss 8.8665, time 122.30ms
iter 76810: loss 8.7312, time 120.82ms
iter 76820: loss 8.0152, time 121.49ms
iter 76830: loss 7.5934, time 121.71ms
iter 76840: loss 8.1889, time 121.64ms
iter 76850: loss 8.0343, time 121.53ms
iter 76860: loss 8.3932, time 122.18ms
iter 76870: loss 7.9843, time 121.54ms
iter 76880: loss 8.1171, time 120.65ms
iter 76890: loss 8.6806, time 121.54ms
iter 76900: loss 9.3634, time 121.38ms
iter 76910: loss 8.1399, time 120.66ms
iter 76920: loss 8.1942, time 121.89ms
iter 76930: loss 8.1061, time 121.37ms
iter 76940: loss 8.0513, time 122.06ms
iter 76950: loss 8.6287, time 120.79ms
iter 76960: loss 8.5804, time 121.45ms
iter 76970: loss 8.2137, time 121.58ms
iter 76980: loss 8.3696, time 122.24ms
iter 76990: loss 9.3349, time 121.56ms
step 77000: train loss 6.9452, val loss 6.9126
saving checkpoint to out-shakespeare-char
iter 77000: loss 8.8201, time 2896.33ms
iter 77010: loss 8.1772, time 121.46ms
iter 77020: loss 8.5330, time 124.31ms
iter 77030: loss 8.5515, time 121.36ms
iter 77040: loss 8.2703, time 124.15ms
iter 77050: loss 8.4144, time 121.32ms
iter 77060: loss 8.4637, time 124.27ms
iter 77070: loss 8.6580, time 121.41ms
iter 77080: loss 8.7909, time 123.95ms
iter 77090: loss 8.1457, time 122.11ms
iter 77100: loss 8.6893, time 125.13ms
iter 77110: loss 8.3861, time 121.03ms
iter 77120: loss 8.1256, time 124.33ms
iter 77130: loss 8.2806, time 121.48ms
iter 77140: loss 7.8179, time 124.27ms
iter 77150: loss 7.4372, time 121.59ms
iter 77160: loss 7.9878, time 123.68ms
iter 77170: loss 8.7472, time 121.56ms
iter 77180: loss 8.5630, time 124.34ms
iter 77190: loss 8.5088, time 121.50ms
iter 77200: loss 8.5632, time 124.36ms
iter 77210: loss 7.6841, time 121.56ms
iter 77220: loss 8.2540, time 124.30ms
iter 77230: loss 8.1096, time 121.49ms
iter 77240: loss 9.1627, time 124.73ms
step 77250: train loss 6.8935, val loss 6.9609
saving checkpoint to out-shakespeare-char
iter 77250: loss 8.2919, time 2910.92ms
iter 77260: loss 8.4451, time 128.92ms
iter 77270: loss 8.8265, time 126.05ms
iter 77280: loss 7.9952, time 125.81ms
iter 77290: loss 8.2898, time 126.58ms
iter 77300: loss 8.3840, time 126.19ms
iter 77310: loss 8.5770, time 125.26ms
iter 77320: loss 9.0792, time 125.76ms
iter 77330: loss 8.0879, time 125.85ms
iter 77340: loss 7.9375, time 125.64ms
iter 77350: loss 8.1144, time 125.69ms
iter 77360: loss 8.7075, time 125.85ms
iter 77370: loss 7.7703, time 128.34ms
iter 77380: loss 7.8556, time 125.65ms
iter 77390: loss 7.6949, time 126.91ms
iter 77400: loss 8.7616, time 125.80ms
iter 77410: loss 8.7137, time 125.56ms
iter 77420: loss 8.5109, time 125.66ms
iter 77430: loss 8.6375, time 125.66ms
iter 77440: loss 8.9247, time 123.43ms
iter 77450: loss 8.0779, time 125.51ms
iter 77460: loss 7.1345, time 125.78ms
iter 77470: loss 7.4871, time 125.32ms
iter 77480: loss 8.9501, time 127.86ms
iter 77490: loss 9.0399, time 125.55ms
step 77500: train loss 6.8634, val loss 6.9308
saving checkpoint to out-shakespeare-char
iter 77500: loss 8.3685, time 2882.55ms
iter 77510: loss 8.9152, time 125.77ms
iter 77520: loss 9.4862, time 125.16ms
iter 77530: loss 8.0848, time 126.07ms
iter 77540: loss 8.1422, time 125.30ms
iter 77550: loss 8.0588, time 125.37ms
iter 77560: loss 8.1264, time 125.46ms
iter 77570: loss 8.2622, time 125.79ms
iter 77580: loss 8.6789, time 125.75ms
iter 77590: loss 8.3032, time 125.96ms
iter 77600: loss 8.8840, time 125.57ms
iter 77610: loss 8.7719, time 125.61ms
iter 77620: loss 8.2141, time 125.73ms
iter 77630: loss 8.6071, time 125.45ms
iter 77640: loss 7.9889, time 125.50ms
iter 77650: loss 8.4039, time 128.79ms
iter 77660: loss 8.5161, time 125.70ms
iter 77670: loss 9.1468, time 125.40ms
iter 77680: loss 7.9098, time 125.55ms
iter 77690: loss 8.6315, time 125.77ms
iter 77700: loss 8.6130, time 125.14ms
iter 77710: loss 8.1663, time 125.47ms
iter 77720: loss 8.2503, time 126.03ms
iter 77730: loss 8.5545, time 129.16ms
iter 77740: loss 8.9720, time 126.00ms
step 77750: train loss 6.9567, val loss 6.9186
saving checkpoint to out-shakespeare-char
iter 77750: loss 7.7826, time 2893.55ms
iter 77760: loss 8.4992, time 128.59ms
iter 77770: loss 7.7905, time 124.35ms
iter 77780: loss 7.9427, time 125.39ms
iter 77790: loss 8.5160, time 125.53ms
iter 77800: loss 8.1974, time 125.47ms
iter 77810: loss 8.3498, time 125.08ms
iter 77820: loss 8.3251, time 125.58ms
iter 77830: loss 8.0858, time 125.43ms
iter 77840: loss 8.3674, time 125.54ms
iter 77850: loss 8.3698, time 125.45ms
iter 77860: loss 8.1474, time 126.02ms
iter 77870: loss 8.0911, time 125.33ms
iter 77880: loss 8.5247, time 124.90ms
iter 77890: loss 8.4884, time 125.62ms
iter 77900: loss 9.2486, time 125.45ms
iter 77910: loss 8.4155, time 125.24ms
iter 77920: loss 8.3636, time 125.82ms
iter 77930: loss 8.2989, time 125.47ms
iter 77940: loss 8.4050, time 126.70ms
iter 77950: loss 7.8454, time 125.62ms
iter 77960: loss 8.5080, time 125.36ms
iter 77970: loss 7.6428, time 125.64ms
iter 77980: loss 8.3274, time 125.09ms
iter 77990: loss 8.5438, time 125.69ms
step 78000: train loss 6.8295, val loss 6.9012
saving checkpoint to out-shakespeare-char
iter 78000: loss 8.3077, time 2882.08ms
iter 78010: loss 7.3304, time 125.07ms
iter 78020: loss 8.9198, time 124.60ms
iter 78030: loss 8.7782, time 125.51ms
iter 78040: loss 7.9909, time 127.43ms
iter 78050: loss 7.9555, time 124.58ms
iter 78060: loss 8.8051, time 125.86ms
iter 78070: loss 8.5812, time 125.46ms
iter 78080: loss 8.5486, time 125.10ms
iter 78090: loss 7.8128, time 124.11ms
iter 78100: loss 8.0656, time 125.18ms
iter 78110: loss 8.2041, time 125.08ms
iter 78120: loss 8.4691, time 125.10ms
iter 78130: loss 8.0653, time 123.11ms
iter 78140: loss 7.9981, time 125.78ms
iter 78150: loss 7.8791, time 128.13ms
iter 78160: loss 8.2731, time 125.05ms
iter 78170: loss 8.3239, time 123.09ms
iter 78180: loss 8.4096, time 124.92ms
iter 78190: loss 8.1905, time 125.08ms
iter 78200: loss 8.3967, time 125.30ms
iter 78210: loss 8.5899, time 124.50ms
iter 78220: loss 8.5371, time 125.48ms
iter 78230: loss 8.1087, time 125.37ms
iter 78240: loss 8.2849, time 125.26ms
step 78250: train loss 6.9069, val loss 6.8682
saving checkpoint to out-shakespeare-char
iter 78250: loss 8.5583, time 2899.58ms
iter 78260: loss 8.1159, time 125.79ms
iter 78270: loss 8.5717, time 125.90ms
iter 78280: loss 7.7388, time 125.72ms
iter 78290: loss 8.2431, time 126.27ms
iter 78300: loss 8.5493, time 125.90ms
iter 78310: loss 7.9529, time 126.42ms
iter 78320: loss 7.8516, time 125.56ms
iter 78330: loss 8.9063, time 126.17ms
iter 78340: loss 7.8655, time 125.43ms
iter 78350: loss 8.5734, time 125.59ms
iter 78360: loss 8.0515, time 128.55ms
iter 78370: loss 8.0422, time 125.37ms
iter 78380: loss 7.6624, time 125.81ms
iter 78390: loss 8.3695, time 125.64ms
iter 78400: loss 7.6421, time 125.43ms
iter 78410: loss 8.2783, time 125.99ms
iter 78420: loss 8.6556, time 125.45ms
iter 78430: loss 8.4915, time 125.83ms
iter 78440: loss 8.6798, time 125.84ms
iter 78450: loss 8.3883, time 122.98ms
iter 78460: loss 8.4652, time 125.76ms
iter 78470: loss 9.0937, time 125.50ms
iter 78480: loss 8.4748, time 125.81ms
iter 78490: loss 8.4379, time 125.87ms
step 78500: train loss 6.8814, val loss 6.8589
saving checkpoint to out-shakespeare-char
iter 78500: loss 7.7141, time 2900.32ms
iter 78510: loss 8.4128, time 122.59ms
iter 78520: loss 8.9580, time 121.55ms
iter 78530: loss 8.1750, time 122.69ms
iter 78540: loss 7.9157, time 121.53ms
iter 78550: loss 8.1437, time 122.58ms
iter 78560: loss 8.3328, time 121.35ms
iter 78570: loss 7.4567, time 122.28ms
iter 78580: loss 8.3188, time 121.56ms
iter 78590: loss 8.4983, time 122.72ms
iter 78600: loss 7.7898, time 121.52ms
iter 78610: loss 8.5214, time 122.71ms
iter 78620: loss 8.3787, time 121.56ms
iter 78630: loss 7.4956, time 122.63ms
iter 78640: loss 7.9611, time 121.50ms
iter 78650: loss 7.7534, time 122.54ms
iter 78660: loss 7.7119, time 121.80ms
iter 78670: loss 8.1840, time 122.59ms
iter 78680: loss 8.9477, time 122.08ms
iter 78690: loss 8.2458, time 122.66ms
iter 78700: loss 8.1809, time 121.43ms
iter 78710: loss 7.8316, time 122.75ms
iter 78720: loss 8.5741, time 121.61ms
iter 78730: loss 7.7938, time 122.65ms
iter 78740: loss 7.5269, time 123.29ms
step 78750: train loss 6.8762, val loss 6.8917
saving checkpoint to out-shakespeare-char
iter 78750: loss 8.5135, time 2881.70ms
iter 78760: loss 8.8064, time 121.94ms
iter 78770: loss 8.9728, time 121.55ms
iter 78780: loss 8.5244, time 121.59ms
iter 78790: loss 8.1275, time 121.64ms
iter 78800: loss 8.4110, time 121.59ms
iter 78810: loss 7.5486, time 121.69ms
iter 78820: loss 8.1577, time 121.57ms
iter 78830: loss 8.5113, time 121.56ms
iter 78840: loss 8.4024, time 121.58ms
iter 78850: loss 7.8372, time 121.59ms
iter 78860: loss 8.6451, time 121.64ms
iter 78870: loss 8.0105, time 121.61ms
iter 78880: loss 8.0324, time 121.55ms
iter 78890: loss 8.9158, time 121.36ms
iter 78900: loss 8.2212, time 121.97ms
iter 78910: loss 8.3923, time 122.20ms
iter 78920: loss 7.6825, time 121.60ms
iter 78930: loss 8.5252, time 121.37ms
iter 78940: loss 8.1140, time 121.51ms
iter 78950: loss 7.9646, time 121.64ms
iter 78960: loss 8.9797, time 121.56ms
iter 78970: loss 8.3651, time 121.62ms
iter 78980: loss 8.0675, time 121.61ms
iter 78990: loss 8.2411, time 121.52ms
step 79000: train loss 6.9153, val loss 6.8319
saving checkpoint to out-shakespeare-char
iter 79000: loss 8.5663, time 2890.70ms
iter 79010: loss 8.8016, time 126.09ms
iter 79020: loss 8.1660, time 126.06ms
iter 79030: loss 8.8294, time 125.90ms
iter 79040: loss 9.2784, time 126.03ms
iter 79050: loss 8.3850, time 125.65ms
iter 79060: loss 8.3728, time 125.74ms
iter 79070: loss 8.4635, time 125.74ms
iter 79080: loss 7.9628, time 126.40ms
iter 79090: loss 7.9821, time 125.89ms
iter 79100: loss 7.7958, time 128.94ms
iter 79110: loss 8.5860, time 125.74ms
iter 79120: loss 8.2747, time 125.65ms
iter 79130: loss 8.1948, time 125.69ms
iter 79140: loss 7.6459, time 125.67ms
iter 79150: loss 7.5692, time 126.38ms
iter 79160: loss 8.1557, time 126.05ms
iter 79170: loss 8.4035, time 125.62ms
iter 79180: loss 8.0885, time 125.73ms
iter 79190: loss 8.2856, time 125.83ms
iter 79200: loss 7.9468, time 125.91ms
iter 79210: loss 7.7974, time 128.76ms
iter 79220: loss 7.7032, time 125.69ms
iter 79230: loss 7.4550, time 125.06ms
iter 79240: loss 9.2811, time 125.89ms
step 79250: train loss 6.8928, val loss 6.8500
saving checkpoint to out-shakespeare-char
iter 79250: loss 8.2318, time 2875.72ms
iter 79260: loss 8.6722, time 125.02ms
iter 79270: loss 8.2119, time 125.52ms
iter 79280: loss 8.5295, time 125.57ms
iter 79290: loss 8.2525, time 124.84ms
iter 79300: loss 7.8293, time 125.22ms
iter 79310: loss 8.6356, time 128.48ms
iter 79320: loss 7.7600, time 125.68ms
iter 79330: loss 7.8030, time 124.48ms
iter 79340: loss 8.5751, time 125.39ms
iter 79350: loss 8.8980, time 125.60ms
iter 79360: loss 7.9364, time 125.07ms
iter 79370: loss 7.5652, time 125.70ms
iter 79380: loss 8.6412, time 128.38ms
iter 79390: loss 9.1434, time 124.71ms
iter 79400: loss 8.4607, time 125.63ms
iter 79410: loss 7.7898, time 125.61ms
iter 79420: loss 7.7905, time 124.70ms
iter 79430: loss 8.0449, time 125.57ms
iter 79440: loss 8.5327, time 125.70ms
iter 79450: loss 8.1012, time 125.45ms
iter 79460: loss 8.3070, time 125.74ms
iter 79470: loss 8.6642, time 124.38ms
iter 79480: loss 9.1696, time 126.78ms
iter 79490: loss 8.6151, time 124.93ms
step 79500: train loss 6.8774, val loss 6.8556
saving checkpoint to out-shakespeare-char
iter 79500: loss 8.6000, time 2842.44ms
iter 79510: loss 8.5356, time 125.62ms
iter 79520: loss 7.9437, time 125.84ms
iter 79530: loss 8.2951, time 125.99ms
iter 79540: loss 7.8238, time 125.34ms
iter 79550: loss 8.3152, time 128.86ms
iter 79560: loss 8.4471, time 126.45ms
iter 79570: loss 7.4607, time 125.75ms
iter 79580: loss 8.5376, time 125.90ms
iter 79590: loss 9.0445, time 125.88ms
iter 79600: loss 8.2772, time 125.88ms
iter 79610: loss 8.6572, time 126.04ms
iter 79620: loss 7.9138, time 126.00ms
iter 79630: loss 8.5969, time 125.91ms
iter 79640: loss 8.9598, time 125.74ms
iter 79650: loss 8.8223, time 125.56ms
iter 79660: loss 8.7459, time 128.46ms
iter 79670: loss 7.5022, time 125.53ms
iter 79680: loss 8.0578, time 125.59ms
iter 79690: loss 7.6352, time 126.33ms
iter 79700: loss 7.5728, time 125.48ms
iter 79710: loss 8.2533, time 125.75ms
iter 79720: loss 8.7184, time 125.77ms
iter 79730: loss 8.5555, time 125.74ms
iter 79740: loss 7.6090, time 125.52ms
step 79750: train loss 6.8985, val loss 6.8893
saving checkpoint to out-shakespeare-char
iter 79750: loss 7.6035, time 2884.22ms
iter 79760: loss 8.6193, time 125.83ms
iter 79770: loss 8.5918, time 128.46ms
iter 79780: loss 8.6144, time 124.98ms
iter 79790: loss 7.6542, time 125.59ms
iter 79800: loss 8.0828, time 125.81ms
iter 79810: loss 8.5951, time 125.94ms
iter 79820: loss 7.9476, time 125.95ms
iter 79830: loss 8.6677, time 125.94ms
iter 79840: loss 8.1892, time 124.44ms
iter 79850: loss 8.3324, time 125.79ms
iter 79860: loss 8.5930, time 125.48ms
iter 79870: loss 7.5657, time 125.30ms
iter 79880: loss 8.4077, time 128.29ms
iter 79890: loss 8.4290, time 125.63ms
iter 79900: loss 8.6206, time 125.20ms
iter 79910: loss 8.7388, time 125.76ms
iter 79920: loss 8.5898, time 125.47ms
iter 79930: loss 8.3834, time 125.40ms
iter 79940: loss 8.7313, time 125.51ms
iter 79950: loss 8.9471, time 125.86ms
iter 79960: loss 8.0282, time 125.44ms
iter 79970: loss 8.2212, time 125.56ms
iter 79980: loss 7.7776, time 125.84ms
iter 79990: loss 8.1614, time 128.91ms
step 80000: train loss 6.8375, val loss 6.8320
saving checkpoint to out-shakespeare-char
iter 80000: loss 7.9047, time 2899.44ms
iter 80010: loss 7.7505, time 125.56ms
iter 80020: loss 7.5950, time 125.05ms
iter 80030: loss 8.7493, time 125.71ms
iter 80040: loss 7.7850, time 125.66ms
iter 80050: loss 9.3022, time 125.50ms
iter 80060: loss 7.6304, time 126.01ms
iter 80070: loss 8.1210, time 125.65ms
iter 80080: loss 7.1520, time 125.82ms
iter 80090: loss 7.1938, time 128.37ms
iter 80100: loss 7.8077, time 125.97ms
iter 80110: loss 7.7304, time 125.78ms
iter 80120: loss 8.2182, time 125.92ms
iter 80130: loss 8.4296, time 128.58ms
iter 80140: loss 8.4360, time 125.06ms
iter 80150: loss 8.6917, time 124.75ms
iter 80160: loss 8.6286, time 125.65ms
iter 80170: loss 7.5079, time 125.69ms
iter 80180: loss 7.6018, time 125.62ms
iter 80190: loss 7.8724, time 125.26ms
iter 80200: loss 8.2527, time 125.99ms
iter 80210: loss 7.6671, time 125.58ms
iter 80220: loss 8.5860, time 126.03ms
iter 80230: loss 8.6089, time 125.82ms
iter 80240: loss 7.7790, time 128.57ms
step 80250: train loss 6.8425, val loss 6.8414
saving checkpoint to out-shakespeare-char
iter 80250: loss 8.3008, time 2887.85ms
iter 80260: loss 7.9700, time 125.86ms
iter 80270: loss 8.4118, time 125.67ms
iter 80280: loss 8.1485, time 125.70ms
iter 80290: loss 8.3844, time 125.85ms
iter 80300: loss 8.4809, time 128.79ms
iter 80310: loss 8.0767, time 125.18ms
iter 80320: loss 9.0695, time 125.54ms
iter 80330: loss 8.0294, time 125.82ms
iter 80340: loss 8.1266, time 125.68ms
iter 80350: loss 8.4612, time 125.74ms
iter 80360: loss 8.5969, time 125.76ms
iter 80370: loss 8.6047, time 125.63ms
iter 80380: loss 8.1205, time 128.80ms
iter 80390: loss 8.9196, time 125.60ms
iter 80400: loss 8.6528, time 125.76ms
iter 80410: loss 7.6376, time 125.72ms
iter 80420: loss 9.5335, time 128.94ms
iter 80430: loss 7.5405, time 125.78ms
iter 80440: loss 8.5652, time 125.70ms
iter 80450: loss 7.7056, time 125.88ms
iter 80460: loss 8.3921, time 128.48ms
iter 80470: loss 8.1692, time 124.72ms
iter 80480: loss 8.2822, time 125.72ms
iter 80490: loss 8.5802, time 125.93ms
step 80500: train loss 6.8847, val loss 6.8794
saving checkpoint to out-shakespeare-char
iter 80500: loss 8.4690, time 2890.12ms
iter 80510: loss 7.8340, time 125.64ms
iter 80520: loss 8.6325, time 125.81ms
iter 80530: loss 8.3003, time 129.14ms
iter 80540: loss 7.8438, time 125.81ms
iter 80550: loss 8.7483, time 125.79ms
iter 80560: loss 7.4079, time 125.78ms
iter 80570: loss 8.2197, time 126.05ms
iter 80580: loss 8.5944, time 124.69ms
iter 80590: loss 8.2430, time 125.68ms
iter 80600: loss 8.4211, time 125.77ms
iter 80610: loss 9.1078, time 125.92ms
iter 80620: loss 7.4210, time 125.84ms
iter 80630: loss 8.6522, time 125.98ms
iter 80640: loss 8.6454, time 128.55ms
iter 80650: loss 8.2151, time 125.77ms
iter 80660: loss 8.9673, time 125.60ms
iter 80670: loss 7.9834, time 125.19ms
iter 80680: loss 8.2973, time 125.52ms
iter 80690: loss 8.7356, time 125.90ms
iter 80700: loss 8.0429, time 125.80ms
iter 80710: loss 7.7285, time 125.79ms
iter 80720: loss 8.0734, time 125.50ms
iter 80730: loss 7.9049, time 125.75ms
iter 80740: loss 8.0183, time 125.69ms
step 80750: train loss 6.8198, val loss 6.8439
saving checkpoint to out-shakespeare-char
iter 80750: loss 8.4994, time 2889.31ms
iter 80760: loss 8.2376, time 125.81ms
iter 80770: loss 7.6824, time 125.63ms
iter 80780: loss 8.1865, time 125.79ms
iter 80790: loss 8.5282, time 125.58ms
iter 80800: loss 8.2647, time 125.17ms
iter 80810: loss 8.6502, time 125.51ms
iter 80820: loss 8.3714, time 125.59ms
iter 80830: loss 7.9624, time 125.50ms
iter 80840: loss 7.8706, time 126.07ms
iter 80850: loss 8.8460, time 128.69ms
iter 80860: loss 8.2929, time 124.65ms
iter 80870: loss 7.9975, time 125.63ms
iter 80880: loss 8.4343, time 125.83ms
iter 80890: loss 8.8313, time 125.75ms
iter 80900: loss 8.9351, time 125.77ms
iter 80910: loss 7.6539, time 125.60ms
iter 80920: loss 7.9347, time 125.50ms
iter 80930: loss 8.3862, time 125.73ms
iter 80940: loss 8.1778, time 125.64ms
iter 80950: loss 8.6492, time 124.99ms
iter 80960: loss 8.9694, time 125.62ms
iter 80970: loss 7.8710, time 125.67ms
iter 80980: loss 8.9869, time 125.74ms
iter 80990: loss 8.9161, time 125.80ms
step 81000: train loss 6.8638, val loss 6.8939
saving checkpoint to out-shakespeare-char
iter 81000: loss 7.7872, time 2882.87ms
iter 81010: loss 9.2248, time 125.76ms
iter 81020: loss 8.2792, time 125.61ms
iter 81030: loss 7.3935, time 125.67ms
iter 81040: loss 8.5671, time 126.04ms
iter 81050: loss 8.6001, time 125.57ms
iter 81060: loss 8.5871, time 128.55ms
iter 81070: loss 8.3123, time 125.90ms
iter 81080: loss 8.3817, time 125.73ms
iter 81090: loss 8.4295, time 125.92ms
iter 81100: loss 7.9936, time 125.80ms
iter 81110: loss 9.0265, time 126.10ms
iter 81120: loss 7.6297, time 125.94ms
iter 81130: loss 8.4727, time 125.52ms
iter 81140: loss 8.0893, time 126.00ms
iter 81150: loss 8.1124, time 126.16ms
iter 81160: loss 8.0330, time 126.09ms
iter 81170: loss 8.2027, time 128.51ms
iter 81180: loss 8.4417, time 125.85ms
iter 81190: loss 8.1106, time 125.55ms
iter 81200: loss 8.5568, time 125.72ms
iter 81210: loss 8.1172, time 125.88ms
iter 81220: loss 8.0667, time 126.11ms
iter 81230: loss 7.7425, time 126.42ms
iter 81240: loss 8.0406, time 126.03ms
step 81250: train loss 6.8481, val loss 6.8482
saving checkpoint to out-shakespeare-char
iter 81250: loss 8.3624, time 2900.61ms
iter 81260: loss 8.5438, time 125.92ms
iter 81270: loss 9.2271, time 124.94ms
iter 81280: loss 8.4009, time 125.92ms
iter 81290: loss 8.2983, time 125.77ms
iter 81300: loss 7.5425, time 125.86ms
iter 81310: loss 8.3440, time 128.60ms
iter 81320: loss 8.7127, time 125.17ms
iter 81330: loss 7.9382, time 125.79ms
iter 81340: loss 8.6488, time 131.83ms
iter 81350: loss 8.2579, time 128.95ms
iter 81360: loss 8.0300, time 125.41ms
iter 81370: loss 9.4360, time 125.76ms
iter 81380: loss 8.3476, time 125.57ms
iter 81390: loss 7.7643, time 125.59ms
iter 81400: loss 8.6399, time 125.54ms
iter 81410: loss 8.4022, time 125.89ms
iter 81420: loss 8.3671, time 126.16ms
iter 81430: loss 8.4798, time 125.98ms
iter 81440: loss 7.8949, time 125.84ms
iter 81450: loss 7.9020, time 125.96ms
iter 81460: loss 7.4143, time 128.75ms
iter 81470: loss 8.4598, time 126.10ms
iter 81480: loss 9.0192, time 126.01ms
iter 81490: loss 7.8456, time 125.85ms
step 81500: train loss 6.8201, val loss 6.8354
saving checkpoint to out-shakespeare-char
iter 81500: loss 7.9727, time 2888.54ms
iter 81510: loss 9.1906, time 129.08ms
iter 81520: loss 8.0506, time 125.08ms
iter 81530: loss 8.1498, time 126.18ms
iter 81540: loss 8.4781, time 126.25ms
iter 81550: loss 8.1280, time 126.76ms
iter 81560: loss 8.4102, time 125.80ms
iter 81570: loss 8.2068, time 125.69ms
iter 81580: loss 8.3942, time 125.60ms
iter 81590: loss 8.5230, time 125.69ms
iter 81600: loss 9.4044, time 125.97ms
iter 81610: loss 7.8518, time 125.54ms
iter 81620: loss 8.3721, time 128.86ms
iter 81630: loss 8.2043, time 125.62ms
iter 81640: loss 7.7591, time 126.02ms
iter 81650: loss 8.2493, time 126.42ms
iter 81660: loss 7.8787, time 125.73ms
iter 81670: loss 7.8045, time 125.43ms
iter 81680: loss 7.8674, time 125.49ms
iter 81690: loss 7.7571, time 125.40ms
iter 81700: loss 7.9774, time 125.32ms
iter 81710: loss 7.9251, time 124.95ms
iter 81720: loss 8.3748, time 125.17ms
iter 81730: loss 8.3988, time 128.02ms
iter 81740: loss 8.4483, time 125.03ms
step 81750: train loss 6.7909, val loss 6.8398
saving checkpoint to out-shakespeare-char
iter 81750: loss 8.4182, time 2887.80ms
iter 81760: loss 8.5969, time 121.73ms
iter 81770: loss 8.3108, time 122.81ms
iter 81780: loss 8.6643, time 122.02ms
iter 81790: loss 8.5753, time 122.75ms
iter 81800: loss 7.3287, time 121.69ms
iter 81810: loss 8.1735, time 123.01ms
iter 81820: loss 8.7668, time 121.77ms
iter 81830: loss 8.0321, time 123.06ms
iter 81840: loss 7.5795, time 121.61ms
iter 81850: loss 8.2825, time 122.19ms
iter 81860: loss 8.4516, time 121.70ms
iter 81870: loss 7.3955, time 122.05ms
iter 81880: loss 8.1347, time 121.84ms
iter 81890: loss 7.9461, time 122.79ms
iter 81900: loss 9.4686, time 121.71ms
iter 81910: loss 7.7643, time 122.79ms
iter 81920: loss 8.3536, time 121.72ms
iter 81930: loss 7.9150, time 122.76ms
iter 81940: loss 8.1065, time 121.69ms
iter 81950: loss 8.4990, time 122.87ms
iter 81960: loss 8.4607, time 121.69ms
iter 81970: loss 7.6567, time 123.55ms
iter 81980: loss 8.7658, time 121.60ms
iter 81990: loss 7.9408, time 122.21ms
step 82000: train loss 6.8199, val loss 6.7823
saving checkpoint to out-shakespeare-char
iter 82000: loss 7.8532, time 2888.87ms
iter 82010: loss 8.1293, time 121.80ms
iter 82020: loss 8.3470, time 121.96ms
iter 82030: loss 8.0421, time 121.73ms
iter 82040: loss 7.8072, time 121.80ms
iter 82050: loss 9.3484, time 122.06ms
iter 82060: loss 8.8926, time 121.90ms
iter 82070: loss 8.5828, time 122.11ms
iter 82080: loss 7.9737, time 121.90ms
iter 82090: loss 8.0962, time 121.80ms
iter 82100: loss 7.6571, time 120.95ms
iter 82110: loss 8.5348, time 121.82ms
iter 82120: loss 8.2233, time 122.04ms
iter 82130: loss 8.1361, time 122.05ms
iter 82140: loss 8.0304, time 121.99ms
iter 82150: loss 7.7521, time 122.09ms
iter 82160: loss 8.8432, time 121.82ms
iter 82170: loss 8.4908, time 121.97ms
iter 82180: loss 7.8716, time 121.81ms
iter 82190: loss 8.2449, time 121.03ms
iter 82200: loss 7.5220, time 121.62ms
iter 82210: loss 8.2221, time 121.98ms
iter 82220: loss 7.6509, time 121.79ms
iter 82230: loss 8.1756, time 122.33ms
iter 82240: loss 8.4629, time 121.78ms
step 82250: train loss 6.7710, val loss 6.8408
saving checkpoint to out-shakespeare-char
iter 82250: loss 7.8704, time 2889.30ms
iter 82260: loss 8.6140, time 126.06ms
iter 82270: loss 9.2823, time 125.79ms
iter 82280: loss 8.1089, time 126.08ms
iter 82290: loss 8.5138, time 126.22ms
iter 82300: loss 8.6069, time 128.99ms
iter 82310: loss 8.0998, time 125.88ms
iter 82320: loss 8.7927, time 125.48ms
iter 82330: loss 7.6740, time 125.38ms
iter 82340: loss 8.0518, time 125.08ms
iter 82350: loss 8.1457, time 125.10ms
iter 82360: loss 8.1917, time 125.14ms
iter 82370: loss 7.2904, time 125.04ms
iter 82380: loss 8.3220, time 125.08ms
iter 82390: loss 8.7756, time 124.58ms
iter 82400: loss 8.3751, time 127.42ms
iter 82410: loss 8.9873, time 121.25ms
iter 82420: loss 7.7087, time 121.77ms
iter 82430: loss 7.8977, time 121.22ms
iter 82440: loss 8.4714, time 121.80ms
iter 82450: loss 7.2890, time 122.10ms
iter 82460: loss 7.8019, time 121.54ms
iter 82470: loss 7.6446, time 121.49ms
iter 82480: loss 7.5352, time 121.57ms
iter 82490: loss 7.6576, time 121.38ms
step 82500: train loss 6.8139, val loss 6.8671
saving checkpoint to out-shakespeare-char
iter 82500: loss 7.8231, time 2880.86ms
iter 82510: loss 8.0129, time 121.77ms
iter 82520: loss 8.8634, time 119.87ms
iter 82530: loss 8.2873, time 120.43ms
iter 82540: loss 8.1419, time 121.28ms
iter 82550: loss 7.2524, time 121.44ms
iter 82560: loss 8.5434, time 121.29ms
iter 82570: loss 7.8172, time 121.44ms
iter 82580: loss 8.5135, time 121.77ms
iter 82590: loss 8.1872, time 121.45ms
iter 82600: loss 7.9131, time 120.59ms
iter 82610: loss 8.0197, time 121.58ms
iter 82620: loss 7.7240, time 121.51ms
iter 82630: loss 8.1268, time 121.77ms
iter 82640: loss 8.4441, time 121.58ms
iter 82650: loss 8.4360, time 120.28ms
iter 82660: loss 9.0667, time 121.55ms
iter 82670: loss 8.2593, time 120.00ms
iter 82680: loss 8.2505, time 120.37ms
iter 82690: loss 8.5453, time 121.36ms
iter 82700: loss 8.8718, time 121.57ms
iter 82710: loss 8.0588, time 121.99ms
iter 82720: loss 8.4999, time 121.65ms
iter 82730: loss 8.5324, time 121.46ms
iter 82740: loss 8.5998, time 119.59ms
step 82750: train loss 6.8236, val loss 6.8159
saving checkpoint to out-shakespeare-char
iter 82750: loss 8.1761, time 2889.38ms
iter 82760: loss 8.3223, time 121.75ms
iter 82770: loss 8.1945, time 121.56ms
iter 82780: loss 8.4549, time 121.54ms
iter 82790: loss 8.2106, time 121.68ms
iter 82800: loss 7.8461, time 122.77ms
iter 82810: loss 7.7668, time 121.60ms
iter 82820: loss 8.3077, time 121.87ms
iter 82830: loss 8.1844, time 121.60ms
iter 82840: loss 8.1006, time 121.57ms
iter 82850: loss 8.5810, time 121.46ms
iter 82860: loss 7.6471, time 121.58ms
iter 82870: loss 7.5931, time 121.53ms
iter 82880: loss 8.9687, time 121.59ms
iter 82890: loss 7.9960, time 121.60ms
iter 82900: loss 8.0032, time 122.48ms
iter 82910: loss 8.1398, time 121.41ms
iter 82920: loss 8.0303, time 121.76ms
iter 82930: loss 8.1882, time 122.08ms
iter 82940: loss 9.4715, time 121.00ms
iter 82950: loss 8.1952, time 121.64ms
iter 82960: loss 8.9683, time 121.89ms
iter 82970: loss 8.1835, time 121.42ms
iter 82980: loss 8.3022, time 121.43ms
iter 82990: loss 8.1479, time 121.62ms
step 83000: train loss 6.8266, val loss 6.8221
saving checkpoint to out-shakespeare-char
iter 83000: loss 8.1586, time 2890.81ms
iter 83010: loss 8.4457, time 121.53ms
iter 83020: loss 8.3907, time 121.92ms
iter 83030: loss 7.5541, time 121.65ms
iter 83040: loss 7.9510, time 121.70ms
iter 83050: loss 8.2556, time 121.50ms
iter 83060: loss 8.1318, time 121.65ms
iter 83070: loss 7.8703, time 121.34ms
iter 83080: loss 8.3188, time 120.69ms
iter 83090: loss 8.5908, time 121.77ms
iter 83100: loss 7.9047, time 121.54ms
iter 83110: loss 8.2322, time 120.85ms
iter 83120: loss 7.3924, time 121.61ms
iter 83130: loss 9.3076, time 121.44ms
iter 83140: loss 7.6055, time 121.37ms
iter 83150: loss 8.1066, time 121.65ms
iter 83160: loss 7.8163, time 121.59ms
iter 83170: loss 8.7607, time 121.96ms
iter 83180: loss 8.6876, time 121.53ms
iter 83190: loss 8.3615, time 121.57ms
iter 83200: loss 7.6727, time 121.82ms
iter 83210: loss 8.2700, time 121.32ms
iter 83220: loss 8.5631, time 121.21ms
iter 83230: loss 8.1943, time 121.33ms
iter 83240: loss 8.3765, time 121.83ms
step 83250: train loss 6.8022, val loss 6.8096
saving checkpoint to out-shakespeare-char
iter 83250: loss 8.4871, time 2909.65ms
iter 83260: loss 7.9620, time 121.63ms
iter 83270: loss 8.2456, time 123.24ms
iter 83280: loss 8.6337, time 121.66ms
iter 83290: loss 8.3750, time 123.24ms
iter 83300: loss 7.8761, time 121.53ms
iter 83310: loss 8.1010, time 123.05ms
iter 83320: loss 9.0764, time 121.67ms
iter 83330: loss 9.0087, time 122.87ms
iter 83340: loss 8.1378, time 121.69ms
iter 83350: loss 8.3233, time 122.69ms
iter 83360: loss 7.5380, time 121.67ms
iter 83370: loss 7.7903, time 123.38ms
iter 83380: loss 7.9324, time 120.39ms
iter 83390: loss 7.8963, time 123.41ms
iter 83400: loss 8.3086, time 120.79ms
iter 83410: loss 7.4428, time 123.48ms
iter 83420: loss 8.5111, time 121.82ms
iter 83430: loss 8.2802, time 121.24ms
iter 83440: loss 8.1743, time 120.84ms
iter 83450: loss 7.7927, time 122.92ms
iter 83460: loss 8.0054, time 121.76ms
iter 83470: loss 8.5416, time 122.80ms
iter 83480: loss 8.5559, time 121.60ms
iter 83490: loss 8.5229, time 124.42ms
step 83500: train loss 6.7827, val loss 6.8056
saving checkpoint to out-shakespeare-char
iter 83500: loss 8.2674, time 2900.01ms
iter 83510: loss 7.7466, time 125.92ms
iter 83520: loss 7.5744, time 125.45ms
iter 83530: loss 7.7644, time 127.44ms
iter 83540: loss 8.5640, time 125.19ms
iter 83550: loss 7.3943, time 125.44ms
iter 83560: loss 8.8620, time 125.31ms
iter 83570: loss 8.2078, time 125.29ms
iter 83580: loss 8.2507, time 125.07ms
iter 83590: loss 7.5378, time 126.07ms
iter 83600: loss 8.4791, time 125.38ms
iter 83610: loss 8.4378, time 125.61ms
iter 83620: loss 8.4663, time 125.55ms
iter 83630: loss 8.0242, time 124.59ms
iter 83640: loss 8.0497, time 127.65ms
iter 83650: loss 8.2368, time 124.53ms
iter 83660: loss 8.0434, time 125.17ms
iter 83670: loss 8.1583, time 125.29ms
iter 83680: loss 8.9993, time 125.20ms
iter 83690: loss 8.1506, time 124.63ms
iter 83700: loss 9.0605, time 124.36ms
iter 83710: loss 8.1053, time 125.25ms
iter 83720: loss 8.2942, time 124.51ms
iter 83730: loss 8.5213, time 125.25ms
iter 83740: loss 8.4850, time 125.43ms
step 83750: train loss 6.7759, val loss 6.8099
saving checkpoint to out-shakespeare-char
iter 83750: loss 7.0109, time 2895.20ms
iter 83760: loss 7.6875, time 125.90ms
iter 83770: loss 8.2035, time 125.58ms
iter 83780: loss 8.5230, time 128.64ms
iter 83790: loss 8.8037, time 125.57ms
iter 83800: loss 8.3183, time 125.36ms
iter 83810: loss 8.3209, time 126.07ms
iter 83820: loss 8.5236, time 125.75ms
iter 83830: loss 7.7323, time 125.65ms
iter 83840: loss 8.6099, time 125.00ms
iter 83850: loss 8.4793, time 125.06ms
iter 83860: loss 8.3419, time 125.59ms
iter 83870: loss 7.3086, time 126.18ms
iter 83880: loss 7.9748, time 125.42ms
iter 83890: loss 8.3464, time 125.36ms
iter 83900: loss 7.9690, time 128.78ms
iter 83910: loss 8.0923, time 125.27ms
iter 83920: loss 8.9830, time 126.91ms
iter 83930: loss 8.4765, time 126.09ms
iter 83940: loss 8.2171, time 125.60ms
iter 83950: loss 8.1977, time 125.18ms
iter 83960: loss 7.3446, time 126.10ms
iter 83970: loss 7.7679, time 125.32ms
iter 83980: loss 8.3109, time 125.64ms
iter 83990: loss 7.7473, time 125.16ms
step 84000: train loss 6.7978, val loss 6.8457
saving checkpoint to out-shakespeare-char
iter 84000: loss 7.5056, time 2896.83ms
iter 84010: loss 6.9277, time 124.87ms
iter 84020: loss 8.7093, time 125.74ms
iter 84030: loss 8.1081, time 125.31ms
iter 84040: loss 8.3018, time 125.38ms
iter 84050: loss 8.7327, time 126.88ms
iter 84060: loss 8.8145, time 128.29ms
iter 84070: loss 7.6036, time 127.74ms
iter 84080: loss 8.8124, time 125.36ms
iter 84090: loss 8.2541, time 125.56ms
iter 84100: loss 8.7165, time 125.65ms
iter 84110: loss 7.9678, time 125.24ms
iter 84120: loss 8.3109, time 125.93ms
iter 84130: loss 8.3004, time 125.82ms
iter 84140: loss 8.3017, time 125.47ms
iter 84150: loss 8.3924, time 125.53ms
iter 84160: loss 8.4433, time 124.85ms
iter 84170: loss 7.9010, time 125.34ms
iter 84180: loss 8.2432, time 125.64ms
iter 84190: loss 8.2582, time 125.65ms
iter 84200: loss 8.2918, time 125.14ms
iter 84210: loss 8.6685, time 125.39ms
iter 84220: loss 9.0000, time 128.19ms
iter 84230: loss 8.2191, time 125.13ms
iter 84240: loss 8.1682, time 123.85ms
step 84250: train loss 6.6931, val loss 6.7307
saving checkpoint to out-shakespeare-char
iter 84250: loss 8.0960, time 2876.51ms
iter 84260: loss 7.9182, time 123.71ms
iter 84270: loss 7.9547, time 124.24ms
iter 84280: loss 8.2188, time 126.91ms
iter 84290: loss 8.5517, time 125.45ms
iter 84300: loss 8.3376, time 125.37ms
iter 84310: loss 9.1641, time 125.49ms
iter 84320: loss 7.8164, time 125.29ms
iter 84330: loss 7.9078, time 125.46ms
iter 84340: loss 8.1396, time 125.27ms
iter 84350: loss 7.7870, time 128.22ms
iter 84360: loss 8.2884, time 125.68ms
iter 84370: loss 7.2124, time 125.15ms
iter 84380: loss 8.5907, time 125.68ms
iter 84390: loss 8.0810, time 128.29ms
iter 84400: loss 8.0956, time 126.39ms
iter 84410: loss 8.0169, time 124.99ms
iter 84420: loss 7.9890, time 124.92ms
iter 84430: loss 8.3303, time 125.20ms
iter 84440: loss 6.7731, time 124.78ms
iter 84450: loss 8.2213, time 125.27ms
iter 84460: loss 7.9285, time 125.07ms
iter 84470: loss 7.8468, time 125.52ms
iter 84480: loss 8.9949, time 124.92ms
iter 84490: loss 8.3136, time 125.16ms
step 84500: train loss 6.7342, val loss 6.7380
saving checkpoint to out-shakespeare-char
iter 84500: loss 8.6325, time 2879.91ms
iter 84510: loss 7.8418, time 124.97ms
iter 84520: loss 8.3294, time 124.94ms
iter 84530: loss 8.9953, time 124.31ms
iter 84540: loss 7.5731, time 124.96ms
iter 84550: loss 8.8525, time 125.18ms
iter 84560: loss 8.4539, time 127.97ms
iter 84570: loss 8.2586, time 124.89ms
iter 84580: loss 8.3530, time 125.07ms
iter 84590: loss 7.7776, time 125.26ms
iter 84600: loss 8.1845, time 124.97ms
iter 84610: loss 8.1588, time 125.06ms
iter 84620: loss 7.8930, time 125.07ms
iter 84630: loss 7.4426, time 123.47ms
iter 84640: loss 8.4527, time 124.74ms
iter 84650: loss 9.0884, time 124.95ms
iter 84660: loss 8.8768, time 125.23ms
iter 84670: loss 8.7094, time 127.93ms
iter 84680: loss 8.2081, time 125.13ms
iter 84690: loss 8.7119, time 125.65ms
iter 84700: loss 8.6175, time 125.09ms
iter 84710: loss 6.8617, time 124.91ms
iter 84720: loss 8.2087, time 124.22ms
iter 84730: loss 8.0365, time 125.41ms
iter 84740: loss 8.2309, time 124.73ms
step 84750: train loss 6.7525, val loss 6.8293
saving checkpoint to out-shakespeare-char
iter 84750: loss 7.8477, time 2901.96ms
iter 84760: loss 8.1245, time 125.36ms
iter 84770: loss 8.3619, time 125.50ms
iter 84780: loss 8.4928, time 125.53ms
iter 84790: loss 8.7932, time 124.92ms
iter 84800: loss 8.5873, time 125.07ms
iter 84810: loss 8.3876, time 125.64ms
iter 84820: loss 8.1172, time 128.19ms
iter 84830: loss 7.5592, time 125.32ms
iter 84840: loss 8.2854, time 125.48ms
iter 84850: loss 7.7304, time 125.08ms
iter 84860: loss 7.7643, time 125.43ms
iter 84870: loss 8.4519, time 124.45ms
iter 84880: loss 8.6899, time 125.37ms
iter 84890: loss 7.5408, time 124.91ms
iter 84900: loss 7.8041, time 125.10ms
iter 84910: loss 7.8716, time 125.42ms
iter 84920: loss 8.5122, time 125.21ms
iter 84930: loss 8.7133, time 127.77ms
iter 84940: loss 7.4985, time 124.95ms
iter 84950: loss 7.7189, time 123.75ms
iter 84960: loss 8.1815, time 124.69ms
iter 84970: loss 7.5653, time 124.19ms
iter 84980: loss 8.0273, time 125.09ms
iter 84990: loss 8.6173, time 124.92ms
step 85000: train loss 6.7807, val loss 6.7898
saving checkpoint to out-shakespeare-char
iter 85000: loss 8.1752, time 2850.60ms
iter 85010: loss 7.4327, time 125.90ms
iter 85020: loss 7.7047, time 126.65ms
iter 85030: loss 8.1538, time 125.63ms
iter 85040: loss 7.1074, time 125.74ms
iter 85050: loss 7.7659, time 125.75ms
iter 85060: loss 7.9505, time 124.75ms
iter 85070: loss 8.8026, time 125.64ms
iter 85080: loss 8.3276, time 125.73ms
iter 85090: loss 8.8325, time 125.82ms
iter 85100: loss 7.2164, time 128.46ms
iter 85110: loss 7.9079, time 126.37ms
iter 85120: loss 8.0046, time 127.14ms
iter 85130: loss 9.3508, time 125.12ms
iter 85140: loss 8.0174, time 125.79ms
iter 85150: loss 7.9581, time 126.03ms
iter 85160: loss 8.1270, time 125.70ms
iter 85170: loss 7.7583, time 127.05ms
iter 85180: loss 8.5383, time 124.69ms
iter 85190: loss 8.3991, time 124.85ms
iter 85200: loss 7.5724, time 125.41ms
iter 85210: loss 7.5703, time 127.34ms
iter 85220: loss 8.1155, time 122.75ms
iter 85230: loss 8.2412, time 122.26ms
iter 85240: loss 9.3184, time 121.57ms
step 85250: train loss 6.7920, val loss 6.7430
saving checkpoint to out-shakespeare-char
iter 85250: loss 7.5141, time 2848.74ms
iter 85260: loss 8.1458, time 121.71ms
iter 85270: loss 7.6169, time 121.99ms
iter 85280: loss 7.5585, time 121.93ms
iter 85290: loss 8.1522, time 121.95ms
iter 85300: loss 7.9833, time 121.92ms
iter 85310: loss 7.5651, time 121.99ms
iter 85320: loss 8.8971, time 121.04ms
iter 85330: loss 7.9309, time 122.09ms
iter 85340: loss 8.0595, time 120.89ms
iter 85350: loss 7.6074, time 121.97ms
iter 85360: loss 7.6753, time 121.83ms
iter 85370: loss 8.2089, time 121.87ms
iter 85380: loss 8.0968, time 121.91ms
iter 85390: loss 8.5563, time 122.36ms
iter 85400: loss 7.5981, time 121.84ms
iter 85410: loss 7.5203, time 121.77ms
iter 85420: loss 7.6762, time 122.04ms
iter 85430: loss 7.4437, time 120.98ms
iter 85440: loss 7.5376, time 121.61ms
iter 85450: loss 7.4536, time 121.91ms
iter 85460: loss 8.7618, time 121.75ms
iter 85470: loss 7.8066, time 121.97ms
iter 85480: loss 8.9116, time 121.87ms
iter 85490: loss 8.2159, time 122.35ms
step 85500: train loss 6.7686, val loss 6.7249
saving checkpoint to out-shakespeare-char
iter 85500: loss 8.6210, time 2887.72ms
iter 85510: loss 7.4802, time 121.93ms
iter 85520: loss 7.9875, time 122.86ms
iter 85530: loss 8.4296, time 121.90ms
iter 85540: loss 8.7210, time 122.92ms
iter 85550: loss 7.7338, time 122.64ms
iter 85560: loss 8.3946, time 122.91ms
iter 85570: loss 7.9952, time 121.84ms
iter 85580: loss 7.7686, time 122.54ms
iter 85590: loss 7.7225, time 120.93ms
iter 85600: loss 7.7904, time 122.25ms
iter 85610: loss 8.4914, time 121.99ms
iter 85620: loss 8.1447, time 122.98ms
iter 85630: loss 7.6357, time 121.80ms
iter 85640: loss 7.4746, time 123.19ms
iter 85650: loss 8.0965, time 122.37ms
iter 85660: loss 7.6012, time 123.18ms
iter 85670: loss 7.2969, time 122.04ms
iter 85680: loss 7.1946, time 122.73ms
iter 85690: loss 7.5848, time 120.74ms
iter 85700: loss 8.4608, time 122.19ms
iter 85710: loss 7.9080, time 121.86ms
iter 85720: loss 8.1171, time 122.68ms
iter 85730: loss 7.6983, time 122.37ms
iter 85740: loss 7.5726, time 123.25ms
step 85750: train loss 6.7311, val loss 6.7485
saving checkpoint to out-shakespeare-char
iter 85750: loss 7.7563, time 2907.27ms
iter 85760: loss 8.4814, time 125.37ms
iter 85770: loss 7.8929, time 122.37ms
iter 85780: loss 7.9257, time 124.34ms
iter 85790: loss 7.9827, time 121.16ms
iter 85800: loss 7.9125, time 124.89ms
iter 85810: loss 8.2983, time 122.17ms
iter 85820: loss 8.7040, time 124.94ms
iter 85830: loss 9.2518, time 121.38ms
iter 85840: loss 8.4834, time 125.01ms
iter 85850: loss 7.5886, time 121.72ms
iter 85860: loss 8.0923, time 123.95ms
iter 85870: loss 7.9249, time 121.51ms
iter 85880: loss 8.2693, time 125.03ms
iter 85890: loss 7.2161, time 122.41ms
iter 85900: loss 8.2225, time 124.74ms
iter 85910: loss 8.2299, time 122.23ms
iter 85920: loss 7.7900, time 125.10ms
iter 85930: loss 7.9417, time 121.80ms
iter 85940: loss 8.4663, time 123.95ms
iter 85950: loss 7.9389, time 120.93ms
iter 85960: loss 8.0724, time 125.65ms
iter 85970: loss 7.1402, time 121.94ms
iter 85980: loss 9.2488, time 124.69ms
iter 85990: loss 7.4876, time 121.44ms
step 86000: train loss 6.7249, val loss 6.7226
saving checkpoint to out-shakespeare-char
iter 86000: loss 7.7033, time 2892.24ms
iter 86010: loss 7.4681, time 121.80ms
iter 86020: loss 7.8689, time 121.96ms
iter 86030: loss 8.7529, time 121.60ms
iter 86040: loss 8.9101, time 121.23ms
iter 86050: loss 7.1917, time 121.18ms
iter 86060: loss 7.9255, time 120.98ms
iter 86070: loss 8.6270, time 121.03ms
iter 86080: loss 8.8011, time 121.31ms
iter 86090: loss 8.1038, time 121.78ms
iter 86100: loss 8.1127, time 121.99ms
iter 86110: loss 8.3269, time 121.91ms
iter 86120: loss 8.6282, time 121.60ms
iter 86130: loss 8.4360, time 121.72ms
iter 86140: loss 7.8235, time 120.74ms
iter 86150: loss 8.3127, time 120.38ms
iter 86160: loss 7.2956, time 121.40ms
iter 86170: loss 7.6713, time 120.81ms
iter 86180: loss 8.2324, time 120.97ms
iter 86190: loss 8.6381, time 121.93ms
iter 86200: loss 8.0111, time 121.95ms
iter 86210: loss 7.4644, time 121.77ms
iter 86220: loss 8.3537, time 121.99ms
iter 86230: loss 8.5244, time 121.48ms
iter 86240: loss 7.6563, time 121.88ms
step 86250: train loss 6.7100, val loss 6.7372
saving checkpoint to out-shakespeare-char
iter 86250: loss 8.1616, time 2890.63ms
iter 86260: loss 9.0221, time 124.80ms
iter 86270: loss 7.7034, time 126.17ms
iter 86280: loss 8.4223, time 125.74ms
iter 86290: loss 7.9859, time 128.55ms
iter 86300: loss 7.6640, time 124.74ms
iter 86310: loss 7.8645, time 125.22ms
iter 86320: loss 7.7043, time 125.53ms
iter 86330: loss 7.4721, time 125.44ms
iter 86340: loss 8.4070, time 125.67ms
iter 86350: loss 8.1689, time 125.66ms
iter 86360: loss 8.5089, time 125.50ms
iter 86370: loss 8.0015, time 125.45ms
iter 86380: loss 7.4395, time 125.77ms
iter 86390: loss 8.2117, time 126.02ms
iter 86400: loss 8.4718, time 125.14ms
iter 86410: loss 7.7777, time 125.83ms
iter 86420: loss 8.1457, time 125.63ms
iter 86430: loss 8.4839, time 125.56ms
iter 86440: loss 7.8826, time 126.09ms
iter 86450: loss 8.6275, time 127.79ms
iter 86460: loss 7.9612, time 125.51ms
iter 86470: loss 8.9252, time 125.43ms
iter 86480: loss 8.7474, time 125.84ms
iter 86490: loss 7.7848, time 124.65ms
step 86500: train loss 6.7339, val loss 6.7538
saving checkpoint to out-shakespeare-char
iter 86500: loss 7.6751, time 2900.38ms
iter 86510: loss 7.1957, time 125.94ms
iter 86520: loss 8.3990, time 128.65ms
iter 86530: loss 7.8942, time 125.73ms
iter 86540: loss 7.9067, time 125.66ms
iter 86550: loss 8.0779, time 125.16ms
iter 86560: loss 8.0795, time 125.56ms
iter 86570: loss 8.7531, time 125.56ms
iter 86580: loss 8.0885, time 125.62ms
iter 86590: loss 8.2950, time 124.59ms
iter 86600: loss 7.8096, time 125.51ms
iter 86610: loss 8.4405, time 125.48ms
iter 86620: loss 8.0186, time 125.60ms
iter 86630: loss 8.5175, time 128.48ms
iter 86640: loss 8.1921, time 125.57ms
iter 86650: loss 8.5218, time 125.47ms
iter 86660: loss 7.9719, time 125.95ms
iter 86670: loss 7.3471, time 128.65ms
iter 86680: loss 7.7645, time 125.73ms
iter 86690: loss 8.7739, time 124.54ms
iter 86700: loss 8.1504, time 125.66ms
iter 86710: loss 8.2010, time 126.19ms
iter 86720: loss 8.9287, time 125.36ms
iter 86730: loss 8.3871, time 125.65ms
iter 86740: loss 8.7133, time 125.30ms
step 86750: train loss 6.7318, val loss 6.7572
saving checkpoint to out-shakespeare-char
iter 86750: loss 8.0335, time 2895.31ms
iter 86760: loss 8.1498, time 126.05ms
iter 86770: loss 8.3210, time 125.72ms
iter 86780: loss 8.1167, time 125.54ms
iter 86790: loss 7.5250, time 123.29ms
iter 86800: loss 8.1285, time 125.91ms
iter 86810: loss 8.1022, time 125.81ms
iter 86820: loss 8.1197, time 125.23ms
iter 86830: loss 7.6355, time 125.19ms
iter 86840: loss 8.2572, time 127.91ms
iter 86850: loss 8.3735, time 124.68ms
iter 86860: loss 8.3819, time 125.02ms
iter 86870: loss 8.6924, time 126.69ms
iter 86880: loss 7.9551, time 125.24ms
iter 86890: loss 7.3390, time 125.55ms
iter 86900: loss 8.6150, time 126.36ms
iter 86910: loss 7.7616, time 125.84ms
iter 86920: loss 7.5952, time 125.36ms
iter 86930: loss 8.0907, time 125.51ms
iter 86940: loss 7.9643, time 125.74ms
iter 86950: loss 8.5012, time 128.31ms
iter 86960: loss 8.1015, time 124.94ms
iter 86970: loss 8.3669, time 124.63ms
iter 86980: loss 8.7326, time 125.22ms
iter 86990: loss 7.9473, time 124.70ms
step 87000: train loss 6.7368, val loss 6.7259
saving checkpoint to out-shakespeare-char
iter 87000: loss 9.0368, time 2882.27ms
iter 87010: loss 8.4557, time 125.25ms
iter 87020: loss 7.3450, time 125.45ms
iter 87030: loss 8.1390, time 125.61ms
iter 87040: loss 8.4824, time 125.65ms
iter 87050: loss 8.0679, time 125.55ms
iter 87060: loss 8.5076, time 126.36ms
iter 87070: loss 8.1031, time 126.14ms
iter 87080: loss 7.7350, time 128.63ms
iter 87090: loss 7.9424, time 125.96ms
iter 87100: loss 8.2363, time 125.16ms
iter 87110: loss 7.9964, time 125.90ms
iter 87120: loss 7.4991, time 126.08ms
iter 87130: loss 7.9599, time 125.68ms
iter 87140: loss 7.5373, time 126.17ms
iter 87150: loss 8.0076, time 125.88ms
iter 87160: loss 8.0546, time 126.01ms
iter 87170: loss 8.1968, time 125.55ms
iter 87180: loss 7.2431, time 126.39ms
iter 87190: loss 7.7003, time 128.83ms
iter 87200: loss 8.1436, time 125.51ms
iter 87210: loss 7.6633, time 125.76ms
iter 87220: loss 8.6784, time 126.41ms
iter 87230: loss 7.3915, time 125.59ms
iter 87240: loss 8.1363, time 126.24ms
step 87250: train loss 6.6783, val loss 6.7385
saving checkpoint to out-shakespeare-char
iter 87250: loss 7.6698, time 2896.56ms
iter 87260: loss 8.0129, time 126.34ms
iter 87270: loss 8.2561, time 126.21ms
iter 87280: loss 7.4020, time 126.15ms
iter 87290: loss 8.5460, time 126.25ms
iter 87300: loss 7.7738, time 126.11ms
iter 87310: loss 8.4997, time 125.77ms
iter 87320: loss 7.7759, time 125.47ms
iter 87330: loss 7.2714, time 126.79ms
iter 87340: loss 8.1791, time 125.82ms
iter 87350: loss 8.3412, time 125.61ms
iter 87360: loss 7.9113, time 128.49ms
iter 87370: loss 7.7827, time 126.03ms
iter 87380: loss 8.7859, time 125.81ms
iter 87390: loss 9.0699, time 125.74ms
iter 87400: loss 8.1581, time 125.65ms
iter 87410: loss 7.9554, time 125.94ms
iter 87420: loss 7.2501, time 125.65ms
iter 87430: loss 7.8803, time 124.96ms
iter 87440: loss 7.7094, time 125.53ms
iter 87450: loss 8.5435, time 125.79ms
iter 87460: loss 7.7330, time 125.74ms
iter 87470: loss 7.9974, time 128.76ms
iter 87480: loss 8.4098, time 125.61ms
iter 87490: loss 8.0400, time 125.68ms
step 87500: train loss 6.7493, val loss 6.7454
saving checkpoint to out-shakespeare-char
iter 87500: loss 8.2294, time 2906.44ms
iter 87510: loss 7.8293, time 124.76ms
iter 87520: loss 8.3555, time 125.83ms
iter 87530: loss 8.9145, time 124.92ms
iter 87540: loss 8.9296, time 126.14ms
iter 87550: loss 7.7551, time 126.47ms
iter 87560: loss 7.7937, time 126.42ms
iter 87570: loss 8.0826, time 128.85ms
iter 87580: loss 7.8579, time 126.00ms
iter 87590: loss 7.9886, time 125.64ms
iter 87600: loss 8.7377, time 125.90ms
iter 87610: loss 7.7813, time 126.08ms
iter 87620: loss 7.5595, time 125.72ms
iter 87630: loss 8.1718, time 125.87ms
iter 87640: loss 8.0106, time 126.03ms
iter 87650: loss 7.3731, time 125.83ms
iter 87660: loss 7.7711, time 125.58ms
iter 87670: loss 7.7217, time 126.05ms
iter 87680: loss 7.9697, time 128.90ms
iter 87690: loss 8.1406, time 126.08ms
iter 87700: loss 7.6170, time 125.18ms
iter 87710: loss 7.8158, time 125.24ms
iter 87720: loss 8.5794, time 124.98ms
iter 87730: loss 7.8356, time 125.23ms
iter 87740: loss 8.7051, time 124.76ms
step 87750: train loss 6.7234, val loss 6.7004
saving checkpoint to out-shakespeare-char
iter 87750: loss 8.0890, time 2881.08ms
iter 87760: loss 8.1409, time 121.50ms
iter 87770: loss 8.5693, time 120.04ms
iter 87780: loss 8.0829, time 121.29ms
iter 87790: loss 7.4080, time 122.00ms
iter 87800: loss 8.4240, time 123.07ms
iter 87810: loss 8.2296, time 121.90ms
iter 87820: loss 7.5810, time 122.16ms
iter 87830: loss 8.1372, time 121.24ms
iter 87840: loss 8.0027, time 123.47ms
iter 87850: loss 8.0723, time 121.56ms
iter 87860: loss 8.0170, time 123.15ms
iter 87870: loss 8.3225, time 122.00ms
iter 87880: loss 8.3524, time 123.18ms
iter 87890: loss 7.6690, time 121.96ms
iter 87900: loss 8.1444, time 122.84ms
iter 87910: loss 7.6796, time 121.09ms
iter 87920: loss 7.6248, time 122.38ms
iter 87930: loss 7.9222, time 122.03ms
iter 87940: loss 8.2205, time 122.66ms
iter 87950: loss 9.1194, time 121.94ms
iter 87960: loss 8.1774, time 124.88ms
iter 87970: loss 8.3130, time 121.92ms
iter 87980: loss 8.7147, time 123.05ms
iter 87990: loss 8.0954, time 121.89ms
step 88000: train loss 6.6811, val loss 6.7006
saving checkpoint to out-shakespeare-char
iter 88000: loss 7.6364, time 2885.27ms
iter 88010: loss 7.9566, time 121.11ms
iter 88020: loss 6.9915, time 121.51ms
iter 88030: loss 7.8776, time 121.62ms
iter 88040: loss 8.2012, time 119.78ms
iter 88050: loss 7.8390, time 119.76ms
iter 88060: loss 7.6130, time 119.82ms
iter 88070: loss 8.1725, time 119.74ms
iter 88080: loss 8.5156, time 120.01ms
iter 88090: loss 8.1394, time 119.88ms
iter 88100: loss 8.1500, time 119.81ms
iter 88110: loss 8.4597, time 119.87ms
iter 88120: loss 8.6058, time 120.81ms
iter 88130: loss 7.3478, time 119.82ms
iter 88140: loss 8.3126, time 120.08ms
iter 88150: loss 8.1860, time 119.66ms
iter 88160: loss 7.8290, time 119.74ms
iter 88170: loss 7.3560, time 119.96ms
iter 88180: loss 8.9907, time 119.82ms
iter 88190: loss 8.4020, time 119.88ms
iter 88200: loss 8.2372, time 120.20ms
iter 88210: loss 7.8884, time 119.86ms
iter 88220: loss 8.2348, time 119.83ms
iter 88230: loss 7.9230, time 119.84ms
iter 88240: loss 8.3481, time 119.82ms
step 88250: train loss 6.6934, val loss 6.6678
saving checkpoint to out-shakespeare-char
iter 88250: loss 8.1892, time 2897.64ms
iter 88260: loss 7.8788, time 120.91ms
iter 88270: loss 7.8144, time 120.46ms
iter 88280: loss 7.4176, time 119.61ms
iter 88290: loss 8.0788, time 120.19ms
iter 88300: loss 8.0198, time 119.73ms
iter 88310: loss 7.6728, time 119.88ms
iter 88320: loss 8.5528, time 119.56ms
iter 88330: loss 8.7286, time 119.82ms
iter 88340: loss 8.4151, time 119.98ms
iter 88350: loss 8.0949, time 119.73ms
iter 88360: loss 7.8349, time 120.64ms
iter 88370: loss 7.6116, time 119.69ms
iter 88380: loss 7.8455, time 119.67ms
iter 88390: loss 7.9545, time 120.09ms
iter 88400: loss 8.2161, time 119.49ms
iter 88410: loss 8.8956, time 119.74ms
iter 88420: loss 7.3559, time 119.70ms
iter 88430: loss 8.3325, time 119.72ms
iter 88440: loss 7.8957, time 119.64ms
iter 88450: loss 7.9033, time 119.90ms
iter 88460: loss 8.2355, time 119.91ms
iter 88470: loss 8.8332, time 120.84ms
iter 88480: loss 7.9711, time 119.70ms
iter 88490: loss 8.2713, time 119.74ms
step 88500: train loss 6.6603, val loss 6.6698
saving checkpoint to out-shakespeare-char
iter 88500: loss 8.2819, time 2888.68ms
iter 88510: loss 8.0126, time 120.19ms
iter 88520: loss 7.9559, time 120.94ms
iter 88530: loss 7.9153, time 119.68ms
iter 88540: loss 7.7468, time 120.79ms
iter 88550: loss 8.1134, time 119.72ms
iter 88560: loss 8.2508, time 120.99ms
iter 88570: loss 8.1642, time 120.29ms
iter 88580: loss 8.2622, time 121.21ms
iter 88590: loss 7.2535, time 119.78ms
iter 88600: loss 8.0715, time 121.23ms
iter 88610: loss 7.7348, time 119.97ms
iter 88620: loss 8.1412, time 120.88ms
iter 88630: loss 8.6351, time 119.58ms
iter 88640: loss 7.8502, time 120.82ms
iter 88650: loss 7.8708, time 119.68ms
iter 88660: loss 8.1359, time 121.21ms
iter 88670: loss 7.7393, time 120.28ms
iter 88680: loss 8.0430, time 120.90ms
iter 88690: loss 7.8242, time 120.03ms
iter 88700: loss 8.1813, time 120.85ms
iter 88710: loss 8.3032, time 119.94ms
iter 88720: loss 8.0773, time 121.32ms
iter 88730: loss 8.3608, time 119.37ms
iter 88740: loss 7.9005, time 120.82ms
step 88750: train loss 6.6820, val loss 6.6713
saving checkpoint to out-shakespeare-char
iter 88750: loss 7.3073, time 2895.96ms
iter 88760: loss 7.8140, time 120.07ms
iter 88770: loss 8.0164, time 119.80ms
iter 88780: loss 8.1563, time 121.15ms
iter 88790: loss 7.7199, time 119.64ms
iter 88800: loss 7.9151, time 119.74ms
iter 88810: loss 8.8591, time 119.82ms
iter 88820: loss 8.6107, time 119.87ms
iter 88830: loss 7.7799, time 121.11ms
iter 88840: loss 7.0618, time 119.96ms
iter 88850: loss 8.7134, time 119.76ms
iter 88860: loss 7.8179, time 119.71ms
iter 88870: loss 7.8103, time 119.98ms
iter 88880: loss 7.7753, time 120.94ms
iter 88890: loss 8.5462, time 119.96ms
iter 88900: loss 9.0599, time 119.87ms
iter 88910: loss 8.4157, time 120.13ms
iter 88920: loss 8.5419, time 119.79ms
iter 88930: loss 8.9110, time 120.54ms
iter 88940: loss 8.0684, time 120.66ms
iter 88950: loss 8.7579, time 119.56ms
iter 88960: loss 7.6410, time 119.57ms
iter 88970: loss 8.3438, time 119.42ms
iter 88980: loss 7.8624, time 120.46ms
iter 88990: loss 8.3148, time 119.91ms
step 89000: train loss 6.6480, val loss 6.6414
saving checkpoint to out-shakespeare-char
iter 89000: loss 8.1099, time 2888.05ms
iter 89010: loss 8.4610, time 125.03ms
iter 89020: loss 7.9577, time 125.25ms
iter 89030: loss 7.9794, time 123.75ms
iter 89040: loss 7.9995, time 125.11ms
iter 89050: loss 8.2423, time 124.92ms
iter 89060: loss 8.3076, time 125.11ms
iter 89070: loss 7.0988, time 125.00ms
iter 89080: loss 8.0931, time 125.38ms
iter 89090: loss 7.8110, time 128.11ms
iter 89100: loss 8.0044, time 125.07ms
iter 89110: loss 8.8525, time 125.19ms
iter 89120: loss 7.7809, time 124.85ms
iter 89130: loss 7.9390, time 125.43ms
iter 89140: loss 7.9151, time 125.72ms
iter 89150: loss 7.8988, time 125.25ms
iter 89160: loss 7.8961, time 125.13ms
iter 89170: loss 8.2428, time 124.94ms
iter 89180: loss 8.3561, time 124.89ms
iter 89190: loss 7.9703, time 125.12ms
iter 89200: loss 7.2189, time 127.67ms
iter 89210: loss 8.1132, time 125.20ms
iter 89220: loss 7.9649, time 125.01ms
iter 89230: loss 7.6051, time 125.30ms
iter 89240: loss 8.1913, time 125.27ms
step 89250: train loss 6.6837, val loss 6.7369
saving checkpoint to out-shakespeare-char
iter 89250: loss 8.2703, time 2891.46ms
iter 89260: loss 6.6100, time 128.13ms
iter 89270: loss 8.6969, time 125.20ms
iter 89280: loss 8.4264, time 125.57ms
iter 89290: loss 8.2170, time 125.35ms
iter 89300: loss 7.9921, time 124.97ms
iter 89310: loss 7.7833, time 124.87ms
iter 89320: loss 8.0223, time 125.16ms
iter 89330: loss 8.2898, time 124.88ms
iter 89340: loss 7.9101, time 125.27ms
iter 89350: loss 8.6415, time 125.83ms
iter 89360: loss 7.9925, time 125.10ms
iter 89370: loss 7.3564, time 128.26ms
iter 89380: loss 7.8200, time 125.03ms
iter 89390: loss 7.7205, time 125.18ms
iter 89400: loss 7.9466, time 125.16ms
iter 89410: loss 8.2467, time 125.45ms
iter 89420: loss 7.6353, time 125.24ms
iter 89430: loss 7.9466, time 124.89ms
iter 89440: loss 7.5234, time 124.84ms
iter 89450: loss 7.6392, time 124.98ms
iter 89460: loss 7.5487, time 125.21ms
iter 89470: loss 8.8536, time 125.30ms
iter 89480: loss 8.3184, time 127.89ms
iter 89490: loss 7.7127, time 125.18ms
step 89500: train loss 6.7014, val loss 6.6782
saving checkpoint to out-shakespeare-char
iter 89500: loss 7.4962, time 2872.57ms
iter 89510: loss 7.5575, time 125.81ms
iter 89520: loss 8.5728, time 125.16ms
iter 89530: loss 8.4107, time 124.99ms
iter 89540: loss 8.1675, time 125.12ms
iter 89550: loss 8.1438, time 124.84ms
iter 89560: loss 8.8856, time 125.66ms
iter 89570: loss 7.3688, time 126.11ms
iter 89580: loss 8.4751, time 125.07ms
iter 89590: loss 7.8910, time 125.54ms
iter 89600: loss 7.8813, time 124.12ms
iter 89610: loss 9.2580, time 125.18ms
iter 89620: loss 7.9139, time 128.07ms
iter 89630: loss 7.6325, time 124.88ms
iter 89640: loss 7.8744, time 124.18ms
iter 89650: loss 8.3239, time 125.36ms
iter 89660: loss 7.7224, time 125.11ms
iter 89670: loss 7.2968, time 125.01ms
iter 89680: loss 8.3553, time 124.88ms
iter 89690: loss 7.4542, time 125.42ms
iter 89700: loss 7.9723, time 125.39ms
iter 89710: loss 7.5492, time 125.45ms
iter 89720: loss 8.9744, time 125.83ms
iter 89730: loss 8.2510, time 128.42ms
iter 89740: loss 8.0541, time 125.37ms
step 89750: train loss 6.6876, val loss 6.7199
saving checkpoint to out-shakespeare-char
iter 89750: loss 8.3364, time 2883.85ms
iter 89760: loss 8.5558, time 124.99ms
iter 89770: loss 7.9717, time 124.98ms
iter 89780: loss 8.3312, time 125.14ms
iter 89790: loss 8.1709, time 128.07ms
iter 89800: loss 8.4425, time 125.06ms
iter 89810: loss 7.8119, time 125.16ms
iter 89820: loss 8.1243, time 126.44ms
iter 89830: loss 8.3291, time 125.43ms
iter 89840: loss 8.6141, time 125.34ms
iter 89850: loss 7.0663, time 124.56ms
iter 89860: loss 7.2006, time 125.60ms
iter 89870: loss 7.9791, time 125.66ms
iter 89880: loss 7.9859, time 125.29ms
iter 89890: loss 8.7054, time 125.36ms
iter 89900: loss 8.4090, time 125.43ms
iter 89910: loss 8.3919, time 128.32ms
iter 89920: loss 8.2685, time 125.57ms
iter 89930: loss 7.5542, time 124.79ms
iter 89940: loss 7.8576, time 125.34ms
iter 89950: loss 7.5935, time 126.06ms
iter 89960: loss 7.5413, time 124.76ms
iter 89970: loss 8.2576, time 125.54ms
iter 89980: loss 8.7113, time 125.54ms
iter 89990: loss 8.2205, time 125.75ms
step 90000: train loss 6.6974, val loss 6.6463
saving checkpoint to out-shakespeare-char
iter 90000: loss 8.5006, time 2878.71ms
iter 90010: loss 7.2138, time 123.98ms
iter 90020: loss 7.8027, time 125.68ms
iter 90030: loss 8.2510, time 125.82ms
iter 90040: loss 8.1712, time 125.08ms
iter 90050: loss 7.9443, time 125.28ms
iter 90060: loss 7.7942, time 125.21ms
iter 90070: loss 8.2462, time 125.98ms
iter 90080: loss 7.9095, time 128.75ms
iter 90090: loss 7.9576, time 125.63ms
iter 90100: loss 7.8900, time 126.95ms
iter 90110: loss 8.6122, time 128.22ms
iter 90120: loss 7.8483, time 126.24ms
iter 90130: loss 8.0652, time 126.34ms
iter 90140: loss 8.0004, time 126.27ms
iter 90150: loss 7.5180, time 124.57ms
iter 90160: loss 8.5035, time 125.31ms
iter 90170: loss 7.6453, time 124.58ms
iter 90180: loss 7.8849, time 124.89ms
iter 90190: loss 8.4937, time 125.24ms
iter 90200: loss 8.2383, time 129.03ms
iter 90210: loss 7.8363, time 126.14ms
iter 90220: loss 8.3142, time 126.15ms
iter 90230: loss 8.0479, time 124.66ms
iter 90240: loss 7.2240, time 124.03ms
step 90250: train loss 6.6704, val loss 6.6380
saving checkpoint to out-shakespeare-char
iter 90250: loss 8.4547, time 2867.86ms
iter 90260: loss 8.6182, time 128.51ms
iter 90270: loss 9.1609, time 125.30ms
iter 90280: loss 7.7786, time 124.65ms
iter 90290: loss 8.2887, time 125.42ms
iter 90300: loss 7.6156, time 125.48ms
iter 90310: loss 8.0440, time 125.33ms
iter 90320: loss 7.5027, time 125.50ms
iter 90330: loss 7.5332, time 125.94ms
iter 90340: loss 7.7026, time 125.36ms
iter 90350: loss 8.1254, time 125.41ms
iter 90360: loss 8.3827, time 125.66ms
iter 90370: loss 7.7741, time 128.10ms
iter 90380: loss 7.8319, time 121.15ms
iter 90390: loss 7.5006, time 123.17ms
iter 90400: loss 8.4470, time 121.63ms
iter 90410: loss 8.3721, time 123.46ms
iter 90420: loss 7.2518, time 122.94ms
iter 90430: loss 7.9858, time 123.28ms
iter 90440: loss 7.7092, time 121.96ms
iter 90450: loss 7.0860, time 122.33ms
iter 90460: loss 8.3451, time 121.62ms
iter 90470: loss 7.4498, time 123.11ms
iter 90480: loss 8.3991, time 121.70ms
iter 90490: loss 8.1331, time 123.06ms
step 90500: train loss 6.6715, val loss 6.7109
saving checkpoint to out-shakespeare-char
iter 90500: loss 8.5622, time 2863.12ms
iter 90510: loss 8.1109, time 121.72ms
iter 90520: loss 8.7234, time 123.11ms
iter 90530: loss 8.3755, time 121.49ms
iter 90540: loss 8.2942, time 122.69ms
iter 90550: loss 8.4758, time 121.45ms
iter 90560: loss 8.2941, time 122.77ms
iter 90570: loss 8.1758, time 121.65ms
iter 90580: loss 7.4413, time 122.67ms
iter 90590: loss 7.7572, time 121.60ms
iter 90600: loss 7.1285, time 122.61ms
iter 90610: loss 7.5766, time 121.63ms
iter 90620: loss 8.0052, time 122.62ms
iter 90630: loss 8.7951, time 121.85ms
iter 90640: loss 8.2838, time 122.53ms
iter 90650: loss 7.6235, time 122.13ms
iter 90660: loss 7.5645, time 122.70ms
iter 90670: loss 7.1602, time 121.59ms
iter 90680: loss 8.8351, time 122.75ms
iter 90690: loss 7.7989, time 121.60ms
iter 90700: loss 8.4584, time 122.65ms
iter 90710: loss 7.8533, time 121.71ms
iter 90720: loss 7.4848, time 122.23ms
iter 90730: loss 8.0141, time 121.96ms
iter 90740: loss 8.2202, time 122.47ms
step 90750: train loss 6.6337, val loss 6.6255
saving checkpoint to out-shakespeare-char
iter 90750: loss 7.9600, time 2874.78ms
iter 90760: loss 8.2598, time 124.64ms
iter 90770: loss 8.3450, time 121.39ms
iter 90780: loss 8.4201, time 124.44ms
iter 90790: loss 7.9668, time 121.91ms
iter 90800: loss 8.0534, time 124.49ms
iter 90810: loss 7.6429, time 120.73ms
iter 90820: loss 8.3753, time 124.51ms
iter 90830: loss 8.8782, time 121.31ms
iter 90840: loss 8.0835, time 124.56ms
iter 90850: loss 8.1206, time 122.20ms
iter 90860: loss 8.1081, time 125.01ms
iter 90870: loss 7.7181, time 121.72ms
iter 90880: loss 7.5430, time 124.49ms
iter 90890: loss 8.0860, time 122.07ms
iter 90900: loss 8.2492, time 124.38ms
iter 90910: loss 7.4346, time 121.64ms
iter 90920: loss 7.9332, time 124.22ms
iter 90930: loss 7.8806, time 121.48ms
iter 90940: loss 7.5577, time 124.26ms
iter 90950: loss 7.6633, time 121.55ms
iter 90960: loss 8.1934, time 124.28ms
iter 90970: loss 7.9540, time 121.45ms
iter 90980: loss 7.3942, time 124.66ms
iter 90990: loss 8.1117, time 120.69ms
step 91000: train loss 6.6693, val loss 6.6718
saving checkpoint to out-shakespeare-char
iter 91000: loss 7.6898, time 2875.35ms
iter 91010: loss 8.1307, time 122.71ms
iter 91020: loss 8.2818, time 120.30ms
iter 91030: loss 8.5984, time 123.04ms
iter 91040: loss 7.6927, time 121.49ms
iter 91050: loss 8.7217, time 123.22ms
iter 91060: loss 7.8667, time 121.96ms
iter 91070: loss 7.7999, time 123.20ms
iter 91080: loss 7.7986, time 122.26ms
iter 91090: loss 8.2839, time 123.39ms
iter 91100: loss 8.0717, time 122.50ms
iter 91110: loss 7.5832, time 123.11ms
iter 91120: loss 8.2684, time 122.10ms
iter 91130: loss 7.5972, time 123.20ms
iter 91140: loss 8.1417, time 121.88ms
iter 91150: loss 8.0349, time 123.13ms
iter 91160: loss 7.4812, time 121.98ms
iter 91170: loss 8.0166, time 122.76ms
iter 91180: loss 7.9722, time 122.36ms
iter 91190: loss 7.6094, time 123.09ms
iter 91200: loss 6.9166, time 121.98ms
iter 91210: loss 8.4052, time 123.04ms
iter 91220: loss 7.8001, time 121.90ms
iter 91230: loss 7.6628, time 123.15ms
iter 91240: loss 7.3292, time 121.76ms
step 91250: train loss 6.6376, val loss 6.6548
saving checkpoint to out-shakespeare-char
iter 91250: loss 7.9476, time 2909.11ms
iter 91260: loss 8.3863, time 125.06ms
iter 91270: loss 8.7494, time 125.60ms
iter 91280: loss 7.6712, time 125.21ms
iter 91290: loss 7.9654, time 125.05ms
iter 91300: loss 6.7672, time 125.16ms
iter 91310: loss 8.5308, time 124.97ms
iter 91320: loss 8.0287, time 124.80ms
iter 91330: loss 8.2631, time 126.29ms
iter 91340: loss 8.5630, time 124.69ms
iter 91350: loss 7.5570, time 125.79ms
iter 91360: loss 8.0406, time 127.94ms
iter 91370: loss 8.2305, time 125.15ms
iter 91380: loss 8.0050, time 125.53ms
iter 91390: loss 7.8683, time 125.10ms
iter 91400: loss 7.9609, time 125.04ms
iter 91410: loss 7.5676, time 125.02ms
iter 91420: loss 8.7052, time 125.05ms
iter 91430: loss 7.7371, time 123.33ms
iter 91440: loss 7.9009, time 121.27ms
iter 91450: loss 7.9832, time 122.56ms
iter 91460: loss 8.7375, time 121.50ms
iter 91470: loss 7.8330, time 123.81ms
iter 91480: loss 8.0238, time 121.56ms
iter 91490: loss 7.5670, time 123.44ms
step 91500: train loss 6.6693, val loss 6.6631
saving checkpoint to out-shakespeare-char
iter 91500: loss 8.1654, time 2890.14ms
iter 91510: loss 7.9768, time 125.68ms
iter 91520: loss 7.6378, time 125.63ms
iter 91530: loss 7.8651, time 125.61ms
iter 91540: loss 8.1098, time 125.64ms
iter 91550: loss 7.9503, time 125.71ms
iter 91560: loss 7.1682, time 125.73ms
iter 91570: loss 7.7920, time 124.73ms
iter 91580: loss 7.5154, time 124.76ms
iter 91590: loss 7.5547, time 128.25ms
iter 91600: loss 7.9770, time 125.68ms
iter 91610: loss 8.9099, time 125.61ms
iter 91620: loss 8.7897, time 125.82ms
iter 91630: loss 7.8584, time 128.61ms
iter 91640: loss 7.5153, time 125.72ms
iter 91650: loss 7.7750, time 125.63ms
iter 91660: loss 7.7453, time 125.06ms
iter 91670: loss 8.3037, time 125.56ms
iter 91680: loss 8.3924, time 125.41ms
iter 91690: loss 8.1009, time 125.91ms
iter 91700: loss 8.4036, time 124.98ms
iter 91710: loss 8.3173, time 125.30ms
iter 91720: loss 7.4250, time 125.24ms
iter 91730: loss 8.9777, time 125.02ms
iter 91740: loss 8.1689, time 128.25ms
step 91750: train loss 6.6112, val loss 6.6961
saving checkpoint to out-shakespeare-char
iter 91750: loss 7.9616, time 2905.12ms
iter 91760: loss 7.7478, time 125.27ms
iter 91770: loss 7.4689, time 124.99ms
iter 91780: loss 7.6031, time 125.52ms
iter 91790: loss 8.1654, time 124.60ms
iter 91800: loss 7.7424, time 125.59ms
iter 91810: loss 7.2110, time 125.17ms
iter 91820: loss 8.2854, time 125.36ms
iter 91830: loss 7.9780, time 126.15ms
iter 91840: loss 8.1725, time 124.21ms
iter 91850: loss 8.7169, time 125.10ms
iter 91860: loss 7.0600, time 124.87ms
iter 91870: loss 8.2010, time 125.03ms
iter 91880: loss 7.5709, time 128.02ms
iter 91890: loss 7.2478, time 124.36ms
iter 91900: loss 7.2080, time 125.38ms
iter 91910: loss 8.7843, time 125.31ms
iter 91920: loss 7.8418, time 124.97ms
iter 91930: loss 8.8695, time 125.30ms
iter 91940: loss 8.7306, time 125.60ms
iter 91950: loss 8.5164, time 125.66ms
iter 91960: loss 8.3855, time 125.36ms
iter 91970: loss 7.8975, time 125.98ms
iter 91980: loss 7.7810, time 126.06ms
iter 91990: loss 8.2301, time 129.55ms
step 92000: train loss 6.6851, val loss 6.6596
saving checkpoint to out-shakespeare-char
iter 92000: loss 7.8335, time 2903.67ms
iter 92010: loss 7.8803, time 125.96ms
iter 92020: loss 8.3581, time 125.61ms
iter 92030: loss 7.7168, time 125.46ms
iter 92040: loss 8.9142, time 125.53ms
iter 92050: loss 7.8267, time 125.49ms
iter 92060: loss 7.9553, time 125.80ms
iter 92070: loss 7.5400, time 125.20ms
iter 92080: loss 7.5956, time 125.65ms
iter 92090: loss 7.1897, time 128.44ms
iter 92100: loss 8.3801, time 124.91ms
iter 92110: loss 8.1550, time 124.93ms
iter 92120: loss 8.1202, time 125.53ms
iter 92130: loss 7.7660, time 124.85ms
iter 92140: loss 8.3767, time 125.41ms
iter 92150: loss 8.0065, time 126.17ms
iter 92160: loss 8.0917, time 125.61ms
iter 92170: loss 7.6410, time 127.09ms
iter 92180: loss 7.4579, time 124.95ms
iter 92190: loss 8.4974, time 125.29ms
iter 92200: loss 7.5748, time 128.39ms
iter 92210: loss 7.2680, time 125.72ms
iter 92220: loss 7.6229, time 125.97ms
iter 92230: loss 7.7815, time 126.05ms
iter 92240: loss 8.6567, time 128.97ms
step 92250: train loss 6.6827, val loss 6.6364
saving checkpoint to out-shakespeare-char
iter 92250: loss 7.7880, time 2893.45ms
iter 92260: loss 8.2636, time 125.47ms
iter 92270: loss 7.5791, time 125.56ms
iter 92280: loss 7.9094, time 126.17ms
iter 92290: loss 8.1703, time 126.39ms
iter 92300: loss 8.1903, time 128.64ms
iter 92310: loss 8.1178, time 124.94ms
iter 92320: loss 8.3451, time 125.20ms
iter 92330: loss 8.8559, time 124.77ms
iter 92340: loss 8.1335, time 126.06ms
iter 92350: loss 8.5546, time 125.92ms
iter 92360: loss 8.4212, time 125.62ms
iter 92370: loss 8.4976, time 125.31ms
iter 92380: loss 8.7297, time 125.46ms
iter 92390: loss 8.3935, time 125.83ms
iter 92400: loss 7.6305, time 125.64ms
iter 92410: loss 7.4696, time 128.20ms
iter 92420: loss 7.3541, time 125.46ms
iter 92430: loss 8.2116, time 125.54ms
iter 92440: loss 7.8349, time 126.24ms
iter 92450: loss 7.8754, time 125.53ms
iter 92460: loss 8.4789, time 125.48ms
iter 92470: loss 7.7087, time 125.44ms
iter 92480: loss 8.1874, time 125.17ms
iter 92490: loss 8.6859, time 125.65ms
step 92500: train loss 6.6226, val loss 6.6294
saving checkpoint to out-shakespeare-char
iter 92500: loss 8.3138, time 2893.83ms
iter 92510: loss 8.1651, time 125.10ms
iter 92520: loss 8.6120, time 124.88ms
iter 92530: loss 8.0973, time 125.09ms
iter 92540: loss 7.6430, time 125.76ms
iter 92550: loss 7.3472, time 125.42ms
iter 92560: loss 7.9975, time 125.54ms
iter 92570: loss 7.5004, time 125.39ms
iter 92580: loss 8.2657, time 128.41ms
iter 92590: loss 8.7069, time 125.19ms
iter 92600: loss 8.0324, time 125.61ms
iter 92610: loss 7.4829, time 124.79ms
iter 92620: loss 7.5918, time 125.79ms
iter 92630: loss 8.3644, time 125.23ms
iter 92640: loss 7.1268, time 125.57ms
iter 92650: loss 7.2488, time 125.47ms
iter 92660: loss 8.2667, time 125.56ms
iter 92670: loss 7.6068, time 125.64ms
iter 92680: loss 7.8940, time 126.28ms
iter 92690: loss 8.1506, time 125.46ms
iter 92700: loss 7.6447, time 124.50ms
iter 92710: loss 7.7469, time 125.84ms
iter 92720: loss 7.6563, time 129.64ms
iter 92730: loss 7.9081, time 125.52ms
iter 92740: loss 6.8401, time 125.59ms
step 92750: train loss 6.6258, val loss 6.6435
saving checkpoint to out-shakespeare-char
iter 92750: loss 8.2201, time 2872.83ms
iter 92760: loss 7.3412, time 125.94ms
iter 92770: loss 8.4890, time 126.18ms
iter 92780: loss 7.8225, time 128.36ms
iter 92790: loss 7.3514, time 125.62ms
iter 92800: loss 8.9166, time 125.81ms
iter 92810: loss 7.5473, time 125.83ms
iter 92820: loss 7.8859, time 128.40ms
iter 92830: loss 7.8658, time 125.85ms
iter 92840: loss 8.5473, time 126.61ms
iter 92850: loss 7.6775, time 125.72ms
iter 92860: loss 7.6964, time 128.86ms
iter 92870: loss 7.2689, time 125.92ms
iter 92880: loss 8.4560, time 125.64ms
iter 92890: loss 7.7187, time 125.65ms
iter 92900: loss 7.3453, time 124.91ms
iter 92910: loss 8.2020, time 125.29ms
iter 92920: loss 7.8873, time 125.34ms
iter 92930: loss 7.9183, time 125.76ms
iter 92940: loss 7.7686, time 125.39ms
iter 92950: loss 9.2976, time 128.12ms
iter 92960: loss 8.3276, time 125.24ms
iter 92970: loss 7.8561, time 125.15ms
iter 92980: loss 7.9568, time 125.44ms
iter 92990: loss 7.9740, time 125.47ms
step 93000: train loss 6.5957, val loss 6.7052
saving checkpoint to out-shakespeare-char
iter 93000: loss 7.7209, time 2889.91ms
iter 93010: loss 7.8513, time 130.47ms
iter 93020: loss 8.3195, time 125.70ms
iter 93030: loss 7.7494, time 125.45ms
iter 93040: loss 7.4114, time 125.78ms
iter 93050: loss 7.0962, time 127.67ms
iter 93060: loss 7.6758, time 125.64ms
iter 93070: loss 7.8478, time 125.39ms
iter 93080: loss 7.3692, time 125.36ms
iter 93090: loss 7.3119, time 125.73ms
iter 93100: loss 8.8256, time 125.45ms
iter 93110: loss 8.5416, time 125.78ms
iter 93120: loss 7.4810, time 129.96ms
iter 93130: loss 7.3088, time 126.35ms
iter 93140: loss 8.3110, time 126.27ms
iter 93150: loss 8.1511, time 125.92ms
iter 93160: loss 8.3228, time 125.41ms
iter 93170: loss 7.5863, time 125.54ms
iter 93180: loss 8.7812, time 126.45ms
iter 93190: loss 7.5977, time 125.01ms
iter 93200: loss 8.0695, time 125.63ms
iter 93210: loss 7.6588, time 125.20ms
iter 93220: loss 7.9335, time 125.92ms
iter 93230: loss 7.3831, time 128.76ms
iter 93240: loss 7.2127, time 125.65ms
step 93250: train loss 6.6053, val loss 6.6095
saving checkpoint to out-shakespeare-char
iter 93250: loss 8.6901, time 2885.17ms
iter 93260: loss 7.8711, time 129.22ms
iter 93270: loss 8.1773, time 125.09ms
iter 93280: loss 8.7456, time 126.42ms
iter 93290: loss 7.2488, time 126.33ms
iter 93300: loss 7.5942, time 125.66ms
iter 93310: loss 8.5163, time 125.76ms
iter 93320: loss 7.6207, time 126.06ms
iter 93330: loss 7.2925, time 125.65ms
iter 93340: loss 7.4671, time 121.95ms
iter 93350: loss 7.5686, time 121.42ms
iter 93360: loss 8.3031, time 121.97ms
iter 93370: loss 7.9000, time 121.81ms
iter 93380: loss 8.2044, time 121.93ms
iter 93390: loss 7.8620, time 121.70ms
iter 93400: loss 7.5671, time 122.11ms
iter 93410: loss 7.4453, time 121.91ms
iter 93420: loss 7.7944, time 121.82ms
iter 93430: loss 7.8733, time 121.77ms
iter 93440: loss 8.0818, time 121.94ms
iter 93450: loss 7.3624, time 121.89ms
iter 93460: loss 7.9810, time 122.07ms
iter 93470: loss 7.2886, time 121.79ms
iter 93480: loss 7.5503, time 121.94ms
iter 93490: loss 8.3070, time 121.80ms
step 93500: train loss 6.6402, val loss 6.5354
saving checkpoint to out-shakespeare-char
iter 93500: loss 7.9744, time 2912.01ms
iter 93510: loss 8.2125, time 123.19ms
iter 93520: loss 7.6439, time 122.03ms
iter 93530: loss 8.1374, time 123.32ms
iter 93540: loss 8.1865, time 121.83ms
iter 93550: loss 7.7080, time 123.23ms
iter 93560: loss 7.8887, time 121.82ms
iter 93570: loss 8.1351, time 123.10ms
iter 93580: loss 8.1573, time 121.75ms
iter 93590: loss 8.0100, time 123.18ms
iter 93600: loss 8.2543, time 122.39ms
iter 93610: loss 8.0241, time 123.28ms
iter 93620: loss 7.5828, time 121.90ms
iter 93630: loss 7.3212, time 122.92ms
iter 93640: loss 8.5289, time 122.07ms
iter 93650: loss 7.9904, time 123.08ms
iter 93660: loss 8.3836, time 120.86ms
iter 93670: loss 8.2019, time 121.99ms
iter 93680: loss 8.1618, time 120.94ms
iter 93690: loss 8.6670, time 123.23ms
iter 93700: loss 7.8599, time 121.86ms
iter 93710: loss 8.1383, time 122.85ms
iter 93720: loss 8.0816, time 121.85ms
iter 93730: loss 8.1553, time 122.58ms
iter 93740: loss 7.3401, time 121.86ms
step 93750: train loss 6.5680, val loss 6.5911
saving checkpoint to out-shakespeare-char
iter 93750: loss 8.3373, time 2879.02ms
iter 93760: loss 8.7552, time 122.76ms
iter 93770: loss 8.2014, time 121.86ms
iter 93780: loss 8.7784, time 121.84ms
iter 93790: loss 7.6662, time 121.88ms
iter 93800: loss 8.3588, time 121.90ms
iter 93810: loss 7.9086, time 121.76ms
iter 93820: loss 7.5580, time 121.77ms
iter 93830: loss 7.6327, time 121.82ms
iter 93840: loss 8.0695, time 121.90ms
iter 93850: loss 7.6268, time 121.94ms
iter 93860: loss 8.5794, time 121.77ms
iter 93870: loss 8.5755, time 121.82ms
iter 93880: loss 7.3994, time 123.06ms
iter 93890: loss 7.2297, time 125.14ms
iter 93900: loss 7.4337, time 124.91ms
iter 93910: loss 7.8715, time 125.09ms
iter 93920: loss 7.1943, time 125.55ms
iter 93930: loss 7.7649, time 128.22ms
iter 93940: loss 7.4613, time 125.22ms
iter 93950: loss 7.9534, time 125.35ms
iter 93960: loss 8.3603, time 126.49ms
iter 93970: loss 7.7670, time 126.01ms
iter 93980: loss 7.8573, time 126.03ms
iter 93990: loss 8.2667, time 125.80ms
step 94000: train loss 6.6128, val loss 6.6221
saving checkpoint to out-shakespeare-char
iter 94000: loss 8.3769, time 2918.27ms
iter 94010: loss 8.0155, time 126.07ms
iter 94020: loss 7.6608, time 128.58ms
iter 94030: loss 8.0020, time 125.92ms
iter 94040: loss 7.5805, time 126.30ms
iter 94050: loss 7.9194, time 126.40ms
iter 94060: loss 7.9943, time 125.89ms
iter 94070: loss 7.6862, time 125.96ms
iter 94080: loss 7.7867, time 125.96ms
iter 94090: loss 7.6893, time 125.85ms
iter 94100: loss 8.0723, time 126.07ms
iter 94110: loss 7.6736, time 126.14ms
iter 94120: loss 8.4693, time 125.78ms
iter 94130: loss 8.2363, time 129.06ms
iter 94140: loss 7.4336, time 125.91ms
iter 94150: loss 8.2953, time 125.39ms
iter 94160: loss 7.7835, time 125.80ms
iter 94170: loss 7.9298, time 125.91ms
iter 94180: loss 8.1553, time 125.80ms
iter 94190: loss 7.7594, time 125.78ms
iter 94200: loss 7.8609, time 125.56ms
iter 94210: loss 8.1253, time 125.70ms
iter 94220: loss 7.9528, time 125.83ms
iter 94230: loss 8.3473, time 126.35ms
iter 94240: loss 8.4304, time 128.61ms
step 94250: train loss 6.5623, val loss 6.5943
saving checkpoint to out-shakespeare-char
iter 94250: loss 8.5117, time 2892.49ms
iter 94260: loss 7.8341, time 126.31ms
iter 94270: loss 8.0347, time 125.88ms
iter 94280: loss 7.6618, time 125.72ms
iter 94290: loss 8.3495, time 125.97ms
iter 94300: loss 7.6413, time 125.64ms
iter 94310: loss 8.3042, time 125.74ms
iter 94320: loss 8.3772, time 125.95ms
iter 94330: loss 8.5434, time 125.92ms
iter 94340: loss 8.5634, time 128.65ms
iter 94350: loss 6.5448, time 124.67ms
iter 94360: loss 7.5130, time 126.23ms
iter 94370: loss 8.3151, time 126.03ms
iter 94380: loss 7.7723, time 125.90ms
iter 94390: loss 7.7251, time 126.19ms
iter 94400: loss 8.3018, time 125.94ms
iter 94410: loss 7.8483, time 126.60ms
iter 94420: loss 7.5832, time 126.04ms
iter 94430: loss 8.2722, time 126.06ms
iter 94440: loss 7.8821, time 125.88ms
iter 94450: loss 6.9860, time 125.65ms
iter 94460: loss 8.8445, time 125.82ms
iter 94470: loss 8.6260, time 126.01ms
iter 94480: loss 8.2435, time 126.13ms
iter 94490: loss 8.1681, time 125.08ms
step 94500: train loss 6.5771, val loss 6.5778
saving checkpoint to out-shakespeare-char
iter 94500: loss 7.3379, time 2903.83ms
iter 94510: loss 8.1994, time 125.78ms
iter 94520: loss 7.4745, time 125.46ms
iter 94530: loss 8.1206, time 124.61ms
iter 94540: loss 8.5722, time 126.33ms
iter 94550: loss 7.4669, time 125.18ms
iter 94560: loss 8.3535, time 125.30ms
iter 94570: loss 8.1830, time 126.13ms
iter 94580: loss 8.5182, time 126.07ms
iter 94590: loss 7.2034, time 125.60ms
iter 94600: loss 8.0259, time 128.93ms
iter 94610: loss 8.0248, time 126.12ms
iter 94620: loss 7.9769, time 126.06ms
iter 94630: loss 8.2502, time 125.54ms
iter 94640: loss 7.6509, time 125.15ms
iter 94650: loss 8.0379, time 125.35ms
iter 94660: loss 8.4141, time 125.26ms
iter 94670: loss 8.0865, time 125.26ms
iter 94680: loss 7.5935, time 124.96ms
iter 94690: loss 8.1327, time 125.50ms
iter 94700: loss 8.6859, time 125.86ms
iter 94710: loss 7.9912, time 128.32ms
iter 94720: loss 7.8401, time 124.96ms
iter 94730: loss 7.8032, time 124.74ms
iter 94740: loss 7.7501, time 125.75ms
step 94750: train loss 6.5605, val loss 6.5833
saving checkpoint to out-shakespeare-char
iter 94750: loss 8.2678, time 2889.54ms
iter 94760: loss 8.1886, time 125.57ms
iter 94770: loss 7.6297, time 125.88ms
iter 94780: loss 8.7116, time 125.66ms
iter 94790: loss 7.5859, time 125.72ms
iter 94800: loss 8.1713, time 126.03ms
iter 94810: loss 7.8504, time 125.76ms
iter 94820: loss 8.1048, time 125.47ms
iter 94830: loss 7.7399, time 128.63ms
iter 94840: loss 8.0835, time 125.95ms
iter 94850: loss 7.6278, time 125.96ms
iter 94860: loss 6.6762, time 125.25ms
iter 94870: loss 8.2056, time 128.67ms
iter 94880: loss 7.8564, time 126.25ms
iter 94890: loss 7.8061, time 125.98ms
iter 94900: loss 8.5677, time 125.76ms
iter 94910: loss 8.0754, time 125.82ms
iter 94920: loss 7.7791, time 125.86ms
iter 94930: loss 7.5162, time 125.83ms
iter 94940: loss 7.6597, time 124.73ms
iter 94950: loss 7.9586, time 125.83ms
iter 94960: loss 7.9173, time 125.70ms
iter 94970: loss 8.4527, time 125.66ms
iter 94980: loss 7.3237, time 128.57ms
iter 94990: loss 8.3376, time 125.69ms
step 95000: train loss 6.6202, val loss 6.6181
saving checkpoint to out-shakespeare-char
iter 95000: loss 8.3373, time 2868.94ms
iter 95010: loss 7.5755, time 128.78ms
iter 95020: loss 8.9543, time 125.71ms
iter 95030: loss 8.3339, time 125.68ms
iter 95040: loss 7.6111, time 126.05ms
iter 95050: loss 7.8142, time 125.85ms
iter 95060: loss 7.7770, time 126.02ms
iter 95070: loss 7.6510, time 125.25ms
iter 95080: loss 7.5102, time 124.16ms
iter 95090: loss 7.5963, time 125.26ms
iter 95100: loss 8.5762, time 125.40ms
iter 95110: loss 7.6485, time 125.49ms
iter 95120: loss 7.8315, time 128.91ms
iter 95130: loss 7.7942, time 125.91ms
iter 95140: loss 7.8938, time 125.98ms
iter 95150: loss 8.2082, time 124.54ms
iter 95160: loss 8.2174, time 125.70ms
iter 95170: loss 7.8479, time 125.94ms
iter 95180: loss 7.2659, time 125.53ms
iter 95190: loss 7.3819, time 124.40ms
iter 95200: loss 6.9937, time 125.63ms
iter 95210: loss 7.5014, time 125.34ms
iter 95220: loss 7.8900, time 125.54ms
iter 95230: loss 7.6323, time 128.26ms
iter 95240: loss 7.0131, time 125.25ms
step 95250: train loss 6.5405, val loss 6.5653
saving checkpoint to out-shakespeare-char
iter 95250: loss 8.3053, time 2915.75ms
iter 95260: loss 7.6865, time 125.48ms
iter 95270: loss 7.6070, time 125.25ms
iter 95280: loss 7.9194, time 125.41ms
iter 95290: loss 7.8277, time 125.67ms
iter 95300: loss 7.9965, time 128.38ms
iter 95310: loss 8.3559, time 125.31ms
iter 95320: loss 7.8881, time 125.87ms
iter 95330: loss 7.6262, time 125.38ms
iter 95340: loss 8.8092, time 125.68ms
iter 95350: loss 7.1840, time 125.72ms
iter 95360: loss 7.8519, time 125.44ms
iter 95370: loss 6.8451, time 125.29ms
iter 95380: loss 8.0417, time 125.27ms
iter 95390: loss 7.4201, time 125.57ms
iter 95400: loss 7.2939, time 125.88ms
iter 95410: loss 7.7181, time 128.82ms
iter 95420: loss 8.1983, time 125.90ms
iter 95430: loss 7.1418, time 125.66ms
iter 95440: loss 7.6517, time 125.80ms
iter 95450: loss 7.2800, time 125.83ms
iter 95460: loss 7.7456, time 125.80ms
iter 95470: loss 7.9144, time 125.80ms
iter 95480: loss 8.0370, time 126.10ms
iter 95490: loss 8.6459, time 125.97ms
step 95500: train loss 6.5742, val loss 6.5113
saving checkpoint to out-shakespeare-char
iter 95500: loss 7.7349, time 2891.14ms
iter 95510: loss 7.5770, time 125.66ms
iter 95520: loss 7.5214, time 125.39ms
iter 95530: loss 8.2068, time 125.53ms
iter 95540: loss 7.6970, time 125.11ms
iter 95550: loss 8.1506, time 125.31ms
iter 95560: loss 8.4075, time 125.34ms
iter 95570: loss 8.0011, time 125.58ms
iter 95580: loss 8.5177, time 125.72ms
iter 95590: loss 8.4935, time 125.73ms
iter 95600: loss 7.5007, time 128.30ms
iter 95610: loss 7.2362, time 125.16ms
iter 95620: loss 7.1422, time 125.50ms
iter 95630: loss 8.5152, time 125.18ms
iter 95640: loss 8.5974, time 125.31ms
iter 95650: loss 8.1937, time 125.52ms
iter 95660: loss 9.0890, time 126.02ms
iter 95670: loss 7.7735, time 125.60ms
iter 95680: loss 8.0900, time 125.79ms
iter 95690: loss 7.8827, time 125.86ms
iter 95700: loss 8.0791, time 125.98ms
iter 95710: loss 7.3434, time 128.61ms
iter 95720: loss 8.5210, time 125.59ms
iter 95730: loss 8.8093, time 125.84ms
iter 95740: loss 7.9570, time 125.78ms
step 95750: train loss 6.5345, val loss 6.5825
saving checkpoint to out-shakespeare-char
iter 95750: loss 8.4527, time 2880.09ms
iter 95760: loss 7.9747, time 121.98ms
iter 95770: loss 8.4352, time 121.91ms
iter 95780: loss 8.3077, time 121.88ms
iter 95790: loss 8.3354, time 121.93ms
iter 95800: loss 7.5958, time 121.77ms
iter 95810: loss 8.7670, time 121.79ms
iter 95820: loss 8.0129, time 121.12ms
iter 95830: loss 7.2143, time 121.95ms
iter 95840: loss 7.7367, time 122.83ms
iter 95850: loss 8.0245, time 121.85ms
iter 95860: loss 7.4741, time 121.84ms
iter 95870: loss 8.1207, time 121.81ms
iter 95880: loss 8.0729, time 121.59ms
iter 95890: loss 7.8601, time 121.90ms
iter 95900: loss 7.6641, time 121.76ms
iter 95910: loss 7.4431, time 121.85ms
iter 95920: loss 8.3364, time 121.71ms
iter 95930: loss 7.2835, time 121.85ms
iter 95940: loss 7.9760, time 122.03ms
iter 95950: loss 8.6077, time 121.82ms
iter 95960: loss 8.5290, time 121.87ms
iter 95970: loss 7.5654, time 122.14ms
iter 95980: loss 8.2204, time 121.74ms
iter 95990: loss 7.7143, time 121.91ms
step 96000: train loss 6.5250, val loss 6.5763
saving checkpoint to out-shakespeare-char
iter 96000: loss 7.7845, time 2897.78ms
iter 96010: loss 7.5192, time 121.33ms
iter 96020: loss 8.4633, time 122.63ms
iter 96030: loss 7.6238, time 121.27ms
iter 96040: loss 7.6433, time 121.63ms
iter 96050: loss 7.3731, time 121.29ms
iter 96060: loss 8.3456, time 121.36ms
iter 96070: loss 6.8976, time 121.39ms
iter 96080: loss 8.0731, time 121.32ms
iter 96090: loss 7.7932, time 121.33ms
iter 96100: loss 7.6949, time 121.61ms
iter 96110: loss 7.6398, time 121.32ms
iter 96120: loss 7.6246, time 121.37ms
iter 96130: loss 7.6427, time 121.43ms
iter 96140: loss 7.3455, time 121.54ms
iter 96150: loss 8.0015, time 121.28ms
iter 96160: loss 7.6333, time 121.48ms
iter 96170: loss 7.5699, time 121.00ms
iter 96180: loss 8.1219, time 121.71ms
iter 96190: loss 8.2092, time 121.66ms
iter 96200: loss 7.9057, time 120.68ms
iter 96210: loss 7.3252, time 120.78ms
iter 96220: loss 8.1738, time 121.55ms
iter 96230: loss 8.2195, time 120.21ms
iter 96240: loss 8.0757, time 120.42ms
step 96250: train loss 6.5214, val loss 6.5402
saving checkpoint to out-shakespeare-char
iter 96250: loss 7.5378, time 2904.77ms
iter 96260: loss 8.2804, time 120.51ms
iter 96270: loss 8.5554, time 122.85ms
iter 96280: loss 7.9068, time 121.41ms
iter 96290: loss 8.2787, time 122.71ms
iter 96300: loss 7.9440, time 121.67ms
iter 96310: loss 7.6423, time 122.87ms
iter 96320: loss 8.2014, time 121.82ms
iter 96330: loss 7.7500, time 122.73ms
iter 96340: loss 8.0354, time 121.35ms
iter 96350: loss 8.2912, time 122.72ms
iter 96360: loss 7.0153, time 121.28ms
iter 96370: loss 7.6883, time 122.38ms
iter 96380: loss 8.3373, time 121.56ms
iter 96390: loss 7.2672, time 123.06ms
iter 96400: loss 7.5606, time 121.35ms
iter 96410: loss 7.7119, time 123.47ms
iter 96420: loss 6.5249, time 121.80ms
iter 96430: loss 7.5131, time 122.56ms
iter 96440: loss 8.1728, time 121.38ms
iter 96450: loss 8.2846, time 122.79ms
iter 96460: loss 7.6439, time 121.23ms
iter 96470: loss 7.9773, time 122.68ms
iter 96480: loss 7.2732, time 121.38ms
iter 96490: loss 8.1392, time 122.74ms
step 96500: train loss 6.6074, val loss 6.5428
saving checkpoint to out-shakespeare-char
iter 96500: loss 8.0954, time 2902.81ms
iter 96510: loss 8.3920, time 121.59ms
iter 96520: loss 7.5511, time 121.57ms
iter 96530: loss 7.7904, time 121.37ms
iter 96540: loss 7.6530, time 121.59ms
iter 96550: loss 7.7662, time 120.48ms
iter 96560: loss 8.4345, time 121.88ms
iter 96570: loss 8.3526, time 121.47ms
iter 96580: loss 7.6718, time 121.39ms
iter 96590: loss 7.7504, time 122.32ms
iter 96600: loss 7.9791, time 121.36ms
iter 96610: loss 8.4107, time 124.67ms
iter 96620: loss 7.5798, time 121.44ms
iter 96630: loss 8.7328, time 124.50ms
iter 96640: loss 7.8218, time 121.80ms
iter 96650: loss 8.1034, time 124.45ms
iter 96660: loss 8.7473, time 121.27ms
iter 96670: loss 7.6949, time 124.53ms
iter 96680: loss 7.6756, time 121.39ms
iter 96690: loss 7.9389, time 124.31ms
iter 96700: loss 8.0224, time 121.42ms
iter 96710: loss 8.1253, time 124.37ms
iter 96720: loss 8.3954, time 121.25ms
iter 96730: loss 7.7518, time 124.33ms
iter 96740: loss 8.3113, time 121.23ms
step 96750: train loss 6.5720, val loss 6.5587
saving checkpoint to out-shakespeare-char
iter 96750: loss 8.1555, time 2901.61ms
iter 96760: loss 7.6961, time 122.72ms
iter 96770: loss 7.3209, time 121.38ms
iter 96780: loss 8.9770, time 122.68ms
iter 96790: loss 7.6745, time 121.52ms
iter 96800: loss 8.0097, time 122.93ms
iter 96810: loss 7.6923, time 121.65ms
iter 96820: loss 7.5608, time 122.73ms
iter 96830: loss 7.8653, time 121.48ms
iter 96840: loss 8.1549, time 123.13ms
iter 96850: loss 7.8868, time 121.35ms
iter 96860: loss 7.6569, time 122.62ms
iter 96870: loss 8.5058, time 121.36ms
iter 96880: loss 8.3871, time 122.57ms
iter 96890: loss 7.7787, time 121.27ms
iter 96900: loss 7.3793, time 122.80ms
iter 96910: loss 7.8059, time 121.55ms
iter 96920: loss 7.3183, time 122.74ms
iter 96930: loss 8.1969, time 121.56ms
iter 96940: loss 7.9254, time 122.67ms
iter 96950: loss 8.2436, time 121.54ms
iter 96960: loss 8.2815, time 122.79ms
iter 96970: loss 7.9840, time 121.39ms
iter 96980: loss 8.2043, time 122.78ms
iter 96990: loss 7.6992, time 120.97ms
step 97000: train loss 6.5285, val loss 6.5341
saving checkpoint to out-shakespeare-char
iter 97000: loss 8.5172, time 2903.39ms
iter 97010: loss 7.8113, time 121.41ms
iter 97020: loss 8.0269, time 121.33ms
iter 97030: loss 8.5263, time 121.34ms
iter 97040: loss 7.6723, time 121.97ms
iter 97050: loss 7.9600, time 121.51ms
iter 97060: loss 7.5822, time 121.51ms
iter 97070: loss 9.1053, time 121.35ms
iter 97080: loss 7.6683, time 121.68ms
iter 97090: loss 8.2599, time 121.41ms
iter 97100: loss 7.3555, time 121.52ms
iter 97110: loss 7.4972, time 121.27ms
iter 97120: loss 8.0620, time 121.11ms
iter 97130: loss 8.4691, time 121.45ms
iter 97140: loss 7.9137, time 121.50ms
iter 97150: loss 8.1555, time 120.97ms
iter 97160: loss 8.0845, time 121.18ms
iter 97170: loss 7.9343, time 121.28ms
iter 97180: loss 8.5343, time 121.47ms
iter 97190: loss 7.3656, time 121.45ms
iter 97200: loss 7.3499, time 121.42ms
iter 97210: loss 8.2052, time 121.37ms
iter 97220: loss 8.2391, time 121.69ms
iter 97230: loss 7.6254, time 121.44ms
iter 97240: loss 7.3480, time 121.34ms
step 97250: train loss 6.5648, val loss 6.5224
saving checkpoint to out-shakespeare-char
iter 97250: loss 7.3437, time 2901.90ms
iter 97260: loss 7.9062, time 121.61ms
iter 97270: loss 7.9752, time 123.14ms
iter 97280: loss 8.2833, time 121.88ms
iter 97290: loss 7.7876, time 122.88ms
iter 97300: loss 7.8020, time 121.40ms
iter 97310: loss 8.1410, time 123.58ms
iter 97320: loss 7.7158, time 121.40ms
iter 97330: loss 8.0555, time 122.71ms
iter 97340: loss 7.8640, time 121.96ms
iter 97350: loss 8.2333, time 123.63ms
iter 97360: loss 8.6547, time 121.27ms
iter 97370: loss 7.9194, time 123.34ms
iter 97380: loss 7.8416, time 121.34ms
iter 97390: loss 8.5754, time 122.11ms
iter 97400: loss 7.7918, time 121.41ms
iter 97410: loss 8.3514, time 122.69ms
iter 97420: loss 6.9401, time 121.49ms
iter 97430: loss 7.6799, time 122.65ms
iter 97440: loss 7.2080, time 121.37ms
iter 97450: loss 7.9196, time 122.72ms
iter 97460: loss 8.1024, time 121.23ms
iter 97470: loss 7.7214, time 122.97ms
iter 97480: loss 8.2844, time 121.28ms
iter 97490: loss 8.2934, time 122.72ms
step 97500: train loss 6.5576, val loss 6.5925
saving checkpoint to out-shakespeare-char
iter 97500: loss 7.4184, time 2889.18ms
iter 97510: loss 7.4468, time 121.61ms
iter 97520: loss 7.4990, time 122.88ms
iter 97530: loss 7.0364, time 121.20ms
iter 97540: loss 7.9193, time 121.15ms
iter 97550: loss 7.3578, time 121.39ms
iter 97560: loss 8.3263, time 121.25ms
iter 97570: loss 7.5719, time 120.92ms
iter 97580: loss 8.4339, time 121.32ms
iter 97590: loss 8.0800, time 121.56ms
iter 97600: loss 7.4730, time 121.31ms
iter 97610: loss 7.0277, time 121.40ms
iter 97620: loss 7.1573, time 121.17ms
iter 97630: loss 7.7703, time 120.49ms
iter 97640: loss 8.3863, time 121.32ms
iter 97650: loss 8.0000, time 121.57ms
iter 97660: loss 7.3673, time 121.18ms
iter 97670: loss 7.8336, time 120.86ms
iter 97680: loss 8.0454, time 120.90ms
iter 97690: loss 8.0252, time 121.29ms
iter 97700: loss 7.6747, time 121.20ms
iter 97710: loss 7.7567, time 121.64ms
iter 97720: loss 8.0018, time 120.86ms
iter 97730: loss 7.7749, time 121.20ms
iter 97740: loss 8.0214, time 121.49ms
step 97750: train loss 6.5485, val loss 6.5768
saving checkpoint to out-shakespeare-char
iter 97750: loss 8.0104, time 2892.86ms
iter 97760: loss 7.8056, time 125.29ms
iter 97770: loss 7.5685, time 120.81ms
iter 97780: loss 7.7615, time 124.43ms
iter 97790: loss 7.5548, time 121.42ms
iter 97800: loss 7.5780, time 124.63ms
iter 97810: loss 8.3240, time 121.28ms
iter 97820: loss 7.7486, time 124.41ms
iter 97830: loss 8.1224, time 120.99ms
iter 97840: loss 7.8474, time 124.77ms
iter 97850: loss 8.3567, time 121.10ms
iter 97860: loss 7.9990, time 125.18ms
iter 97870: loss 7.8330, time 121.60ms
iter 97880: loss 8.3932, time 124.58ms
iter 97890: loss 8.0701, time 122.36ms
iter 97900: loss 7.8452, time 125.33ms
iter 97910: loss 8.0594, time 122.05ms
iter 97920: loss 7.7116, time 124.70ms
iter 97930: loss 8.3771, time 121.60ms
iter 97940: loss 7.7596, time 124.71ms
iter 97950: loss 8.4626, time 121.44ms
iter 97960: loss 7.9761, time 127.82ms
iter 97970: loss 7.7669, time 125.01ms
iter 97980: loss 7.4351, time 124.97ms
iter 97990: loss 7.4253, time 125.18ms
step 98000: train loss 6.5297, val loss 6.5427
saving checkpoint to out-shakespeare-char
iter 98000: loss 7.8320, time 2878.64ms
iter 98010: loss 8.1990, time 122.10ms
iter 98020: loss 7.7046, time 121.09ms
iter 98030: loss 8.2945, time 122.73ms
iter 98040: loss 8.6581, time 120.92ms
iter 98050: loss 8.5447, time 122.85ms
iter 98060: loss 7.9713, time 120.71ms
iter 98070: loss 8.0935, time 123.12ms
iter 98080: loss 7.3881, time 121.85ms
iter 98090: loss 7.9703, time 123.26ms
iter 98100: loss 7.7429, time 121.78ms
iter 98110: loss 7.6300, time 122.35ms
iter 98120: loss 6.7819, time 121.81ms
iter 98130: loss 7.2728, time 123.12ms
iter 98140: loss 8.4532, time 121.62ms
iter 98150: loss 7.8090, time 122.89ms
iter 98160: loss 7.2771, time 121.77ms
iter 98170: loss 7.7568, time 122.98ms
iter 98180: loss 7.5232, time 121.82ms
iter 98190: loss 8.2230, time 123.28ms
iter 98200: loss 7.9632, time 121.73ms
iter 98210: loss 7.4478, time 123.34ms
iter 98220: loss 7.9229, time 121.74ms
iter 98230: loss 7.5939, time 123.54ms
iter 98240: loss 7.6083, time 121.34ms
step 98250: train loss 6.5396, val loss 6.4808
saving checkpoint to out-shakespeare-char
iter 98250: loss 7.5737, time 2896.34ms
iter 98260: loss 8.2973, time 120.96ms
iter 98270: loss 8.5065, time 121.12ms
iter 98280: loss 8.0782, time 121.74ms
iter 98290: loss 7.0489, time 121.99ms
iter 98300: loss 9.0769, time 121.85ms
iter 98310: loss 7.8685, time 122.50ms
iter 98320: loss 7.7274, time 121.74ms
iter 98330: loss 7.1965, time 121.92ms
iter 98340: loss 7.7020, time 122.47ms
iter 98350: loss 7.9684, time 121.90ms
iter 98360: loss 7.6261, time 120.83ms
iter 98370: loss 7.2347, time 121.86ms
iter 98380: loss 7.4991, time 121.84ms
iter 98390: loss 7.9350, time 122.02ms
iter 98400: loss 7.9480, time 121.87ms
iter 98410: loss 8.0525, time 121.95ms
iter 98420: loss 8.0810, time 121.81ms
iter 98430: loss 7.9669, time 121.94ms
iter 98440: loss 7.7837, time 121.72ms
iter 98450: loss 7.8337, time 121.43ms
iter 98460: loss 7.2857, time 121.82ms
iter 98470: loss 7.9869, time 122.08ms
iter 98480: loss 7.7687, time 121.96ms
iter 98490: loss 8.5138, time 121.91ms
step 98500: train loss 6.5044, val loss 6.5254
saving checkpoint to out-shakespeare-char
iter 98500: loss 7.9532, time 2905.57ms
iter 98510: loss 7.9902, time 121.09ms
iter 98520: loss 8.4949, time 123.24ms
iter 98530: loss 7.9301, time 121.79ms
iter 98540: loss 7.7167, time 123.18ms
iter 98550: loss 7.7800, time 121.83ms
iter 98560: loss 7.5390, time 123.09ms
iter 98570: loss 7.5701, time 121.60ms
iter 98580: loss 7.6566, time 122.88ms
iter 98590: loss 7.3167, time 121.73ms
iter 98600: loss 8.2043, time 123.30ms
iter 98610: loss 7.9413, time 122.28ms
iter 98620: loss 7.9973, time 123.14ms
iter 98630: loss 7.2414, time 121.97ms
iter 98640: loss 8.4046, time 121.93ms
iter 98650: loss 7.5203, time 121.84ms
iter 98660: loss 7.2583, time 123.05ms
iter 98670: loss 7.6820, time 121.78ms
iter 98680: loss 6.8933, time 122.77ms
iter 98690: loss 7.3579, time 121.84ms
iter 98700: loss 8.6893, time 123.34ms
iter 98710: loss 7.7741, time 121.73ms
iter 98720: loss 8.6605, time 123.04ms
iter 98730: loss 8.3134, time 121.93ms
iter 98740: loss 8.0376, time 123.21ms
step 98750: train loss 6.5285, val loss 6.5948
saving checkpoint to out-shakespeare-char
iter 98750: loss 7.9388, time 2891.15ms
iter 98760: loss 7.1604, time 121.36ms
iter 98770: loss 9.1739, time 122.49ms
iter 98780: loss 8.2934, time 121.35ms
iter 98790: loss 7.6573, time 121.26ms
iter 98800: loss 8.0176, time 120.30ms
iter 98810: loss 8.2154, time 121.20ms
iter 98820: loss 7.4104, time 121.21ms
iter 98830: loss 7.7470, time 121.62ms
iter 98840: loss 7.4834, time 121.27ms
iter 98850: loss 8.7926, time 121.71ms
iter 98860: loss 6.9315, time 122.18ms
iter 98870: loss 8.1656, time 121.36ms
iter 98880: loss 6.8140, time 120.02ms
iter 98890: loss 7.6673, time 121.20ms
iter 98900: loss 7.4680, time 121.51ms
iter 98910: loss 6.8255, time 121.47ms
iter 98920: loss 7.8275, time 121.26ms
iter 98930: loss 7.2933, time 121.28ms
iter 98940: loss 8.0526, time 121.24ms
iter 98950: loss 7.3935, time 121.27ms
iter 98960: loss 8.8355, time 125.49ms
iter 98970: loss 7.9037, time 125.09ms
iter 98980: loss 7.9974, time 125.02ms
iter 98990: loss 7.8594, time 124.90ms
step 99000: train loss 6.5091, val loss 6.4843
saving checkpoint to out-shakespeare-char
iter 99000: loss 7.7931, time 2880.28ms
iter 99010: loss 8.1485, time 125.13ms
iter 99020: loss 8.1616, time 127.28ms
iter 99030: loss 7.3518, time 125.30ms
iter 99040: loss 8.3465, time 124.15ms
iter 99050: loss 7.8403, time 124.90ms
iter 99060: loss 7.6220, time 125.09ms
iter 99070: loss 8.3401, time 123.70ms
iter 99080: loss 7.1856, time 125.52ms
iter 99090: loss 7.9027, time 125.08ms
iter 99100: loss 8.3179, time 125.07ms
iter 99110: loss 8.2101, time 126.02ms
iter 99120: loss 8.2699, time 125.08ms
iter 99130: loss 8.1600, time 128.12ms
iter 99140: loss 7.5227, time 124.46ms
iter 99150: loss 7.8352, time 123.48ms
iter 99160: loss 8.7272, time 124.38ms
iter 99170: loss 8.0705, time 125.02ms
iter 99180: loss 8.0925, time 125.26ms
iter 99190: loss 8.0287, time 124.12ms
iter 99200: loss 7.8892, time 125.23ms
iter 99210: loss 7.9871, time 123.44ms
iter 99220: loss 8.1595, time 124.76ms
iter 99230: loss 7.4193, time 125.18ms
iter 99240: loss 8.0906, time 127.05ms
step 99250: train loss 6.5592, val loss 6.4659
saving checkpoint to out-shakespeare-char
iter 99250: loss 7.7679, time 2911.87ms
iter 99260: loss 7.8701, time 125.26ms
iter 99270: loss 7.2846, time 124.23ms
iter 99280: loss 7.4729, time 124.91ms
iter 99290: loss 7.6721, time 124.53ms
iter 99300: loss 7.3348, time 124.28ms
iter 99310: loss 7.4685, time 124.93ms
iter 99320: loss 8.0909, time 124.91ms
iter 99330: loss 8.8449, time 123.31ms
iter 99340: loss 8.4323, time 124.53ms
iter 99350: loss 8.3756, time 125.16ms
iter 99360: loss 7.8950, time 125.32ms
iter 99370: loss 7.0036, time 125.78ms
iter 99380: loss 8.7766, time 125.18ms
iter 99390: loss 7.6509, time 127.02ms
iter 99400: loss 7.2415, time 125.87ms
iter 99410: loss 7.8086, time 129.16ms
iter 99420: loss 7.6746, time 125.63ms
iter 99430: loss 8.5124, time 125.75ms
iter 99440: loss 7.6038, time 124.78ms
iter 99450: loss 7.7626, time 125.62ms
iter 99460: loss 7.2367, time 125.64ms
iter 99470: loss 7.6903, time 125.85ms
iter 99480: loss 7.3548, time 125.87ms
iter 99490: loss 8.0260, time 125.76ms
step 99500: train loss 6.4832, val loss 6.5769
saving checkpoint to out-shakespeare-char
iter 99500: loss 7.7417, time 2890.88ms
iter 99510: loss 7.3291, time 128.82ms
iter 99520: loss 7.5468, time 125.68ms
iter 99530: loss 8.2772, time 125.77ms
iter 99540: loss 7.1916, time 126.24ms
iter 99550: loss 7.3866, time 125.00ms
iter 99560: loss 7.2207, time 124.96ms
iter 99570: loss 7.2820, time 124.29ms
iter 99580: loss 8.1090, time 124.41ms
iter 99590: loss 8.3194, time 124.94ms
iter 99600: loss 7.7526, time 125.12ms
iter 99610: loss 7.3174, time 124.79ms
iter 99620: loss 7.1981, time 127.47ms
iter 99630: loss 8.3760, time 125.19ms
iter 99640: loss 7.9243, time 125.16ms
iter 99650: loss 8.0909, time 124.49ms
iter 99660: loss 8.1396, time 124.23ms
iter 99670: loss 8.0816, time 124.93ms
iter 99680: loss 7.8658, time 124.91ms
iter 99690: loss 8.0085, time 124.35ms
iter 99700: loss 7.2945, time 124.62ms
iter 99710: loss 7.5193, time 124.52ms
iter 99720: loss 7.5263, time 125.35ms
iter 99730: loss 8.2337, time 127.68ms
iter 99740: loss 8.0602, time 124.57ms
step 99750: train loss 6.5147, val loss 6.5482
saving checkpoint to out-shakespeare-char
iter 99750: loss 7.3243, time 2873.40ms
iter 99760: loss 8.2627, time 125.10ms
iter 99770: loss 7.7761, time 124.81ms
iter 99780: loss 7.7188, time 125.16ms
iter 99790: loss 8.1236, time 127.78ms
iter 99800: loss 8.3368, time 124.91ms
iter 99810: loss 8.1559, time 125.04ms
iter 99820: loss 7.8692, time 124.94ms
iter 99830: loss 8.0056, time 125.01ms
iter 99840: loss 7.8925, time 125.01ms
iter 99850: loss 7.8974, time 124.93ms
iter 99860: loss 7.7520, time 125.14ms
iter 99870: loss 8.4986, time 125.00ms
iter 99880: loss 8.8182, time 126.21ms
iter 99890: loss 7.3811, time 126.02ms
iter 99900: loss 7.3474, time 128.77ms
iter 99910: loss 7.1132, time 125.85ms
iter 99920: loss 8.2030, time 125.93ms
iter 99930: loss 8.1592, time 125.81ms
iter 99940: loss 7.7549, time 124.84ms
iter 99950: loss 7.4854, time 125.94ms
iter 99960: loss 8.0144, time 125.45ms
iter 99970: loss 7.8437, time 126.10ms
iter 99980: loss 8.0884, time 125.01ms
iter 99990: loss 7.9532, time 124.96ms
step 100000: train loss 6.4482, val loss 6.4936
saving checkpoint to out-shakespeare-char
iter 100000: loss 7.1267, time 2877.61ms
iter 100010: loss 7.9305, time 125.62ms
iter 100020: loss 8.4115, time 125.51ms
iter 100030: loss 7.1102, time 125.46ms
iter 100040: loss 7.4962, time 125.32ms
iter 100050: loss 7.6829, time 125.29ms
iter 100060: loss 7.2921, time 126.33ms
iter 100070: loss 7.2103, time 121.90ms
iter 100080: loss 7.6030, time 121.83ms
iter 100090: loss 8.0270, time 121.15ms
iter 100100: loss 7.1897, time 122.23ms
iter 100110: loss 8.4356, time 122.12ms
iter 100120: loss 7.8274, time 121.62ms
iter 100130: loss 7.7742, time 122.17ms
iter 100140: loss 7.4027, time 121.56ms
iter 100150: loss 8.0789, time 121.97ms
iter 100160: loss 7.9307, time 121.67ms
iter 100170: loss 7.9065, time 121.65ms
iter 100180: loss 7.8024, time 121.82ms
iter 100190: loss 8.3873, time 120.99ms
iter 100200: loss 8.1005, time 122.17ms
iter 100210: loss 7.8349, time 121.28ms
iter 100220: loss 7.9676, time 121.23ms
iter 100230: loss 6.9843, time 121.47ms
iter 100240: loss 7.7027, time 122.33ms
step 100250: train loss 6.5216, val loss 6.5090
saving checkpoint to out-shakespeare-char
iter 100250: loss 7.5189, time 2910.87ms
iter 100260: loss 8.2058, time 126.00ms
iter 100270: loss 8.0808, time 125.56ms
iter 100280: loss 7.7666, time 126.39ms
iter 100290: loss 6.8894, time 125.06ms
iter 100300: loss 8.8118, time 125.91ms
iter 100310: loss 7.8535, time 124.93ms
iter 100320: loss 7.0296, time 125.65ms
iter 100330: loss 8.2666, time 126.37ms
iter 100340: loss 7.5827, time 126.00ms
iter 100350: loss 8.4141, time 127.81ms
iter 100360: loss 7.3465, time 125.71ms
iter 100370: loss 8.8115, time 126.62ms
iter 100380: loss 7.2240, time 125.29ms
iter 100390: loss 7.1532, time 125.63ms
iter 100400: loss 7.9744, time 126.59ms
iter 100410: loss 8.0638, time 125.29ms
iter 100420: loss 8.1798, time 125.22ms
iter 100430: loss 8.0423, time 125.04ms
iter 100440: loss 7.4360, time 126.46ms
iter 100450: loss 8.0899, time 124.91ms
iter 100460: loss 7.5260, time 127.82ms
iter 100470: loss 7.0764, time 125.04ms
iter 100480: loss 7.6619, time 124.91ms
iter 100490: loss 7.5655, time 124.98ms
step 100500: train loss 6.5742, val loss 6.4816
saving checkpoint to out-shakespeare-char
iter 100500: loss 7.1607, time 2869.66ms
iter 100510: loss 7.6733, time 125.36ms
iter 100520: loss 7.6930, time 128.17ms
iter 100530: loss 7.0760, time 125.32ms
iter 100540: loss 7.4995, time 124.34ms
iter 100550: loss 7.3713, time 125.31ms
iter 100560: loss 7.7676, time 125.15ms
iter 100570: loss 8.3630, time 125.16ms
iter 100580: loss 8.1807, time 127.04ms
iter 100590: loss 7.9522, time 125.60ms
iter 100600: loss 6.6346, time 125.45ms
iter 100610: loss 7.2577, time 125.70ms
iter 100620: loss 7.3532, time 125.51ms
iter 100630: loss 8.1742, time 128.55ms
iter 100640: loss 7.2711, time 125.79ms
iter 100650: loss 8.0281, time 125.68ms
iter 100660: loss 7.6782, time 126.14ms
iter 100670: loss 7.8935, time 126.08ms
iter 100680: loss 7.5999, time 126.09ms
iter 100690: loss 7.5880, time 125.73ms
iter 100700: loss 7.8128, time 125.71ms
iter 100710: loss 7.8490, time 125.90ms
iter 100720: loss 8.5950, time 126.42ms
iter 100730: loss 8.2274, time 126.38ms
iter 100740: loss 7.5565, time 128.43ms
step 100750: train loss 6.5131, val loss 6.5362
saving checkpoint to out-shakespeare-char
iter 100750: loss 7.5984, time 2882.24ms
iter 100760: loss 8.5803, time 122.81ms
iter 100770: loss 7.3837, time 122.02ms
iter 100780: loss 8.0561, time 122.83ms
iter 100790: loss 8.7219, time 122.11ms
iter 100800: loss 8.8470, time 122.43ms
iter 100810: loss 7.9785, time 121.94ms
iter 100820: loss 7.5224, time 122.34ms
iter 100830: loss 7.3288, time 121.87ms
iter 100840: loss 7.5891, time 122.28ms
iter 100850: loss 7.2189, time 121.78ms
iter 100860: loss 7.3078, time 122.54ms
iter 100870: loss 7.7721, time 121.82ms
iter 100880: loss 7.7421, time 122.36ms
iter 100890: loss 8.0227, time 121.89ms
iter 100900: loss 7.2488, time 122.18ms
iter 100910: loss 7.5976, time 121.91ms
iter 100920: loss 8.7632, time 122.32ms
iter 100930: loss 7.9364, time 120.87ms
iter 100940: loss 8.1648, time 122.60ms
iter 100950: loss 8.2858, time 122.50ms
iter 100960: loss 7.7714, time 123.22ms
iter 100970: loss 7.4346, time 122.17ms
iter 100980: loss 7.6160, time 122.64ms
iter 100990: loss 7.8389, time 122.39ms
step 101000: train loss 6.4992, val loss 6.5365
saving checkpoint to out-shakespeare-char
iter 101000: loss 7.1874, time 2887.50ms
iter 101010: loss 7.9109, time 122.74ms
iter 101020: loss 8.3486, time 121.79ms
iter 101030: loss 6.8642, time 121.86ms
iter 101040: loss 7.3026, time 122.67ms
iter 101050: loss 8.5383, time 121.52ms
iter 101060: loss 7.9009, time 122.53ms
iter 101070: loss 7.3544, time 121.66ms
iter 101080: loss 8.1315, time 121.68ms
iter 101090: loss 7.3607, time 121.15ms
iter 101100: loss 6.6803, time 121.76ms
iter 101110: loss 7.5772, time 121.93ms
iter 101120: loss 6.9832, time 121.35ms
iter 101130: loss 6.8492, time 121.70ms
iter 101140: loss 7.7573, time 121.48ms
iter 101150: loss 7.4365, time 121.93ms
iter 101160: loss 8.1657, time 122.08ms
iter 101170: loss 8.2877, time 121.93ms
iter 101180: loss 8.4594, time 121.57ms
iter 101190: loss 8.2577, time 121.65ms
iter 101200: loss 8.0738, time 121.71ms
iter 101210: loss 7.7866, time 122.07ms
iter 101220: loss 7.3093, time 121.79ms
iter 101230: loss 7.6880, time 121.73ms
iter 101240: loss 8.5950, time 121.60ms
step 101250: train loss 6.5472, val loss 6.4924
saving checkpoint to out-shakespeare-char
iter 101250: loss 8.0323, time 2916.36ms
iter 101260: loss 7.9396, time 121.62ms
iter 101270: loss 8.0636, time 122.26ms
iter 101280: loss 8.4389, time 121.46ms
iter 101290: loss 7.9177, time 122.30ms
iter 101300: loss 8.1104, time 121.57ms
iter 101310: loss 8.3695, time 122.30ms
iter 101320: loss 7.4117, time 121.58ms
iter 101330: loss 7.6588, time 122.57ms
iter 101340: loss 7.2380, time 120.60ms
iter 101350: loss 7.7446, time 122.46ms
iter 101360: loss 7.7077, time 121.72ms
iter 101370: loss 7.6030, time 122.37ms
iter 101380: loss 8.0191, time 121.30ms
iter 101390: loss 7.4569, time 122.53ms
iter 101400: loss 7.2242, time 121.45ms
iter 101410: loss 7.9379, time 122.33ms
iter 101420: loss 8.7700, time 121.47ms
iter 101430: loss 7.8693, time 122.07ms
iter 101440: loss 8.3487, time 121.63ms
iter 101450: loss 7.0327, time 122.04ms
iter 101460: loss 7.8922, time 121.68ms
iter 101470: loss 6.8275, time 122.34ms
iter 101480: loss 8.2239, time 121.63ms
iter 101490: loss 8.4581, time 122.69ms
step 101500: train loss 6.5051, val loss 6.4819
saving checkpoint to out-shakespeare-char
iter 101500: loss 8.0679, time 2901.49ms
iter 101510: loss 8.0302, time 120.90ms
iter 101520: loss 6.9943, time 121.44ms
iter 101530: loss 7.6694, time 121.51ms
iter 101540: loss 8.0128, time 121.39ms
iter 101550: loss 7.1001, time 121.68ms
iter 101560: loss 7.9184, time 121.62ms
iter 101570: loss 7.4271, time 121.47ms
iter 101580: loss 7.9349, time 120.43ms
iter 101590: loss 7.9506, time 121.44ms
iter 101600: loss 8.2550, time 121.68ms
iter 101610: loss 8.3262, time 121.51ms
iter 101620: loss 7.3396, time 121.13ms
iter 101630: loss 8.2495, time 121.70ms
iter 101640: loss 7.4317, time 121.40ms
iter 101650: loss 7.4774, time 121.45ms
iter 101660: loss 7.7291, time 121.32ms
iter 101670: loss 7.7683, time 121.73ms
iter 101680: loss 8.0346, time 121.94ms
iter 101690: loss 7.7549, time 121.39ms
iter 101700: loss 8.0773, time 122.63ms
iter 101710: loss 7.8515, time 121.50ms
iter 101720: loss 7.2551, time 121.33ms
iter 101730: loss 7.8359, time 122.00ms
iter 101740: loss 7.2586, time 121.49ms
step 101750: train loss 6.5148, val loss 6.4355
saving checkpoint to out-shakespeare-char
iter 101750: loss 8.0109, time 2902.31ms
iter 101760: loss 8.6381, time 122.25ms
iter 101770: loss 7.8210, time 121.61ms
iter 101780: loss 7.5273, time 121.75ms
iter 101790: loss 7.9076, time 122.24ms
iter 101800: loss 7.5190, time 122.95ms
iter 101810: loss 7.7357, time 121.59ms
iter 101820: loss 8.4510, time 122.83ms
iter 101830: loss 7.8237, time 121.88ms
iter 101840: loss 7.5236, time 123.02ms
iter 101850: loss 7.6089, time 121.93ms
iter 101860: loss 8.3832, time 123.15ms
iter 101870: loss 8.0349, time 121.44ms
iter 101880: loss 7.9320, time 123.29ms
iter 101890: loss 7.7915, time 121.47ms
iter 101900: loss 7.3323, time 123.08ms
iter 101910: loss 7.3743, time 121.55ms
iter 101920: loss 8.6991, time 122.76ms
iter 101930: loss 7.8169, time 121.43ms
iter 101940: loss 6.8615, time 122.71ms
iter 101950: loss 8.0498, time 121.53ms
iter 101960: loss 8.0011, time 122.74ms
iter 101970: loss 8.0442, time 121.70ms
iter 101980: loss 8.2736, time 122.84ms
iter 101990: loss 7.9375, time 121.38ms
step 102000: train loss 6.4602, val loss 6.4511
saving checkpoint to out-shakespeare-char
iter 102000: loss 7.4385, time 2901.17ms
iter 102010: loss 7.9370, time 126.50ms
iter 102020: loss 8.0743, time 125.44ms
iter 102030: loss 8.3666, time 124.94ms
iter 102040: loss 7.8831, time 125.23ms
iter 102050: loss 7.7003, time 125.32ms
iter 102060: loss 6.7410, time 125.80ms
iter 102070: loss 7.6872, time 125.63ms
iter 102080: loss 7.6450, time 126.00ms
iter 102090: loss 7.5967, time 125.99ms
iter 102100: loss 8.1026, time 128.59ms
iter 102110: loss 8.0110, time 125.56ms
iter 102120: loss 8.1161, time 125.66ms
iter 102130: loss 7.0461, time 126.67ms
iter 102140: loss 7.6995, time 125.08ms
iter 102150: loss 7.8387, time 125.71ms
iter 102160: loss 7.6372, time 125.70ms
iter 102170: loss 7.7539, time 126.01ms
iter 102180: loss 7.4234, time 125.72ms
iter 102190: loss 7.3981, time 123.57ms
iter 102200: loss 7.6884, time 121.86ms
iter 102210: loss 8.5439, time 121.87ms
iter 102220: loss 7.4559, time 121.74ms
iter 102230: loss 7.8586, time 121.84ms
iter 102240: loss 8.3453, time 122.96ms
step 102250: train loss 6.4236, val loss 6.4923
saving checkpoint to out-shakespeare-char
iter 102250: loss 7.8494, time 2901.12ms
iter 102260: loss 8.6055, time 121.21ms
iter 102270: loss 8.0202, time 122.13ms
iter 102280: loss 8.0707, time 122.11ms
iter 102290: loss 7.1584, time 121.88ms
iter 102300: loss 7.5672, time 122.10ms
iter 102310: loss 8.2457, time 121.84ms
iter 102320: loss 7.8355, time 122.23ms
iter 102330: loss 8.5066, time 122.36ms
iter 102340: loss 7.6804, time 122.08ms
iter 102350: loss 7.9114, time 121.97ms
iter 102360: loss 7.5231, time 122.12ms
iter 102370: loss 8.2578, time 121.91ms
iter 102380: loss 7.5164, time 122.12ms
iter 102390: loss 8.3172, time 122.10ms
iter 102400: loss 8.1733, time 122.07ms
iter 102410: loss 7.1831, time 121.74ms
iter 102420: loss 6.9228, time 122.16ms
iter 102430: loss 7.7270, time 121.88ms
iter 102440: loss 7.6913, time 122.11ms
iter 102450: loss 7.0924, time 122.62ms
iter 102460: loss 7.3681, time 121.29ms
iter 102470: loss 7.5972, time 122.13ms
iter 102480: loss 7.8753, time 121.99ms
iter 102490: loss 7.2802, time 121.91ms
step 102500: train loss 6.4665, val loss 6.5017
saving checkpoint to out-shakespeare-char
iter 102500: loss 7.7955, time 2897.94ms
iter 102510: loss 7.4642, time 124.80ms
iter 102520: loss 8.3910, time 121.78ms
iter 102530: loss 7.8095, time 125.29ms
iter 102540: loss 7.9726, time 122.08ms
iter 102550: loss 7.3106, time 124.94ms
iter 102560: loss 7.4224, time 121.93ms
iter 102570: loss 7.8368, time 124.66ms
iter 102580: loss 7.7059, time 121.04ms
iter 102590: loss 8.1401, time 124.84ms
iter 102600: loss 7.5101, time 122.08ms
iter 102610: loss 7.3349, time 124.79ms
iter 102620: loss 7.7468, time 121.85ms
iter 102630: loss 7.5176, time 124.63ms
iter 102640: loss 7.4340, time 121.08ms
iter 102650: loss 7.1651, time 124.70ms
iter 102660: loss 7.0199, time 121.91ms
iter 102670: loss 8.2986, time 124.84ms
iter 102680: loss 7.7499, time 121.81ms
iter 102690: loss 7.8978, time 124.66ms
iter 102700: loss 7.9261, time 121.67ms
iter 102710: loss 7.8335, time 125.02ms
iter 102720: loss 7.2278, time 121.50ms
iter 102730: loss 7.3482, time 124.91ms
iter 102740: loss 8.2120, time 121.94ms
step 102750: train loss 6.4560, val loss 6.4230
saving checkpoint to out-shakespeare-char
iter 102750: loss 8.5023, time 2898.62ms
iter 102760: loss 8.0503, time 125.11ms
iter 102770: loss 7.8995, time 125.75ms
iter 102780: loss 7.2706, time 125.76ms
iter 102790: loss 7.3728, time 121.57ms
iter 102800: loss 7.9157, time 121.59ms
iter 102810: loss 7.3579, time 120.87ms
iter 102820: loss 7.8512, time 121.67ms
iter 102830: loss 7.7111, time 121.66ms
iter 102840: loss 7.7965, time 122.08ms
iter 102850: loss 8.3207, time 121.69ms
iter 102860: loss 9.0758, time 121.86ms
iter 102870: loss 8.3850, time 122.01ms
iter 102880: loss 7.8363, time 122.13ms
iter 102890: loss 7.8123, time 120.82ms
iter 102900: loss 7.0421, time 122.03ms
iter 102910: loss 7.1288, time 122.23ms
iter 102920: loss 7.1889, time 122.25ms
iter 102930: loss 7.9563, time 122.16ms
iter 102940: loss 7.4272, time 121.99ms
iter 102950: loss 8.0364, time 121.90ms
iter 102960: loss 7.9120, time 122.22ms
iter 102970: loss 7.9470, time 121.85ms
iter 102980: loss 7.7507, time 122.17ms
iter 102990: loss 7.4234, time 121.33ms
step 103000: train loss 6.4217, val loss 6.4811
saving checkpoint to out-shakespeare-char
iter 103000: loss 7.6088, time 2913.32ms
iter 103010: loss 7.8062, time 126.18ms
iter 103020: loss 8.8798, time 125.79ms
iter 103030: loss 8.0467, time 127.20ms
iter 103040: loss 7.3894, time 125.96ms
iter 103050: loss 7.8192, time 125.80ms
iter 103060: loss 7.8207, time 128.77ms
iter 103070: loss 8.1762, time 125.64ms
iter 103080: loss 7.7955, time 125.89ms
iter 103090: loss 7.7330, time 126.79ms
iter 103100: loss 7.2971, time 125.74ms
iter 103110: loss 7.7621, time 125.78ms
iter 103120: loss 8.4650, time 126.04ms
iter 103130: loss 7.3753, time 126.02ms
iter 103140: loss 8.1197, time 126.45ms
iter 103150: loss 7.1144, time 125.14ms
iter 103160: loss 7.5683, time 125.76ms
iter 103170: loss 7.6859, time 128.50ms
iter 103180: loss 7.5072, time 125.57ms
iter 103190: loss 7.8385, time 125.84ms
iter 103200: loss 8.5853, time 125.70ms
iter 103210: loss 7.3028, time 125.65ms
iter 103220: loss 7.7846, time 125.79ms
iter 103230: loss 7.7082, time 125.69ms
iter 103240: loss 7.0561, time 125.63ms
step 103250: train loss 6.4903, val loss 6.4741
saving checkpoint to out-shakespeare-char
iter 103250: loss 8.1014, time 2880.40ms
iter 103260: loss 7.4914, time 126.46ms
iter 103270: loss 7.7095, time 124.99ms
iter 103280: loss 7.2989, time 125.49ms
iter 103290: loss 6.8650, time 125.12ms
iter 103300: loss 8.0508, time 128.04ms
iter 103310: loss 7.7259, time 125.48ms
iter 103320: loss 7.8469, time 125.57ms
iter 103330: loss 7.9077, time 125.53ms
iter 103340: loss 7.3815, time 125.04ms
iter 103350: loss 7.5004, time 125.41ms
iter 103360: loss 8.0489, time 125.34ms
iter 103370: loss 7.0775, time 124.98ms
iter 103380: loss 8.0142, time 125.16ms
iter 103390: loss 7.5366, time 125.14ms
iter 103400: loss 7.9274, time 125.41ms
iter 103410: loss 7.9430, time 128.56ms
iter 103420: loss 7.8544, time 125.63ms
iter 103430: loss 7.4701, time 124.19ms
iter 103440: loss 7.8246, time 125.66ms
iter 103450: loss 7.1966, time 125.08ms
iter 103460: loss 7.0768, time 125.31ms
iter 103470: loss 6.9614, time 124.49ms
iter 103480: loss 8.0819, time 125.37ms
iter 103490: loss 7.4091, time 125.36ms
step 103500: train loss 6.5023, val loss 6.4637
saving checkpoint to out-shakespeare-char
iter 103500: loss 7.9243, time 2884.04ms
iter 103510: loss 8.0541, time 128.07ms
iter 103520: loss 7.2283, time 125.66ms
iter 103530: loss 7.0813, time 125.19ms
iter 103540: loss 7.7429, time 125.14ms
iter 103550: loss 7.2353, time 125.30ms
iter 103560: loss 7.4506, time 125.20ms
iter 103570: loss 7.5774, time 125.41ms
iter 103580: loss 8.4659, time 128.41ms
iter 103590: loss 7.4550, time 125.46ms
iter 103600: loss 7.6001, time 125.26ms
iter 103610: loss 7.5332, time 125.44ms
iter 103620: loss 7.4677, time 125.35ms
iter 103630: loss 7.7633, time 125.77ms
iter 103640: loss 7.5751, time 125.53ms
iter 103650: loss 7.5958, time 124.93ms
iter 103660: loss 7.6688, time 125.25ms
iter 103670: loss 7.8180, time 125.66ms
iter 103680: loss 7.6918, time 125.75ms
iter 103690: loss 7.9328, time 128.52ms
iter 103700: loss 8.3838, time 126.18ms
iter 103710: loss 7.9945, time 125.97ms
iter 103720: loss 7.9877, time 125.91ms
iter 103730: loss 7.8722, time 125.69ms
iter 103740: loss 8.1379, time 125.75ms
step 103750: train loss 6.4819, val loss 6.4885
saving checkpoint to out-shakespeare-char
iter 103750: loss 7.8742, time 2898.86ms
iter 103760: loss 7.9431, time 124.73ms
iter 103770: loss 7.6523, time 125.60ms
iter 103780: loss 8.3091, time 125.56ms
iter 103790: loss 7.7666, time 125.87ms
iter 103800: loss 8.3130, time 125.69ms
iter 103810: loss 7.7435, time 125.89ms
iter 103820: loss 8.1683, time 125.62ms
iter 103830: loss 8.0279, time 125.47ms
iter 103840: loss 7.2532, time 125.63ms
iter 103850: loss 7.9869, time 124.79ms
iter 103860: loss 7.7681, time 128.28ms
iter 103870: loss 7.4559, time 125.69ms
iter 103880: loss 7.6083, time 125.81ms
iter 103890: loss 7.3376, time 125.66ms
iter 103900: loss 8.4462, time 126.19ms
iter 103910: loss 8.0165, time 125.76ms
iter 103920: loss 8.1420, time 125.76ms
iter 103930: loss 7.4688, time 125.57ms
iter 103940: loss 7.8318, time 125.71ms
iter 103950: loss 7.7873, time 125.74ms
iter 103960: loss 8.0851, time 125.73ms
iter 103970: loss 7.9020, time 125.56ms
iter 103980: loss 7.7537, time 125.76ms
iter 103990: loss 7.5318, time 126.02ms
step 104000: train loss 6.4587, val loss 6.5196
saving checkpoint to out-shakespeare-char
iter 104000: loss 7.1139, time 2883.79ms
iter 104010: loss 6.8900, time 125.42ms
iter 104020: loss 8.8062, time 125.50ms
iter 104030: loss 7.8238, time 124.45ms
iter 104040: loss 6.8871, time 129.90ms
iter 104050: loss 8.0353, time 123.76ms
iter 104060: loss 7.9753, time 123.12ms
iter 104070: loss 6.6457, time 124.13ms
iter 104080: loss 8.5753, time 124.20ms
iter 104090: loss 8.0177, time 124.68ms
iter 104100: loss 7.2175, time 127.93ms
iter 104110: loss 8.1999, time 124.92ms
iter 104120: loss 7.9326, time 123.81ms
iter 104130: loss 7.3498, time 124.86ms
iter 104140: loss 8.2940, time 124.98ms
iter 104150: loss 8.2720, time 125.11ms
iter 104160: loss 7.4847, time 124.15ms
iter 104170: loss 8.5234, time 127.91ms
iter 104180: loss 8.0240, time 125.00ms
iter 104190: loss 7.4915, time 125.10ms
iter 104200: loss 8.3531, time 124.39ms
iter 104210: loss 7.8825, time 125.06ms
iter 104220: loss 7.6999, time 125.10ms
iter 104230: loss 7.8687, time 124.92ms
iter 104240: loss 7.9200, time 124.25ms
step 104250: train loss 6.4339, val loss 6.4467
saving checkpoint to out-shakespeare-char
iter 104250: loss 7.3371, time 2889.07ms
iter 104260: loss 7.3258, time 124.72ms
iter 104270: loss 7.5822, time 124.61ms
iter 104280: loss 7.3458, time 124.43ms
iter 104290: loss 7.8481, time 125.84ms
iter 104300: loss 7.2310, time 125.32ms
iter 104310: loss 7.5415, time 125.68ms
iter 104320: loss 7.5443, time 124.91ms
iter 104330: loss 8.0862, time 125.39ms
iter 104340: loss 6.6902, time 125.31ms
iter 104350: loss 7.2342, time 127.64ms
iter 104360: loss 7.7935, time 124.81ms
iter 104370: loss 7.5264, time 125.14ms
iter 104380: loss 7.2114, time 125.22ms
iter 104390: loss 7.3182, time 125.08ms
iter 104400: loss 7.0354, time 125.05ms
iter 104410: loss 7.7194, time 125.07ms
iter 104420: loss 7.3605, time 125.15ms
iter 104430: loss 7.0939, time 124.84ms
iter 104440: loss 7.9276, time 131.57ms
iter 104450: loss 8.1530, time 125.15ms
iter 104460: loss 7.8991, time 125.27ms
iter 104470: loss 7.3002, time 128.42ms
iter 104480: loss 7.7211, time 125.64ms
iter 104490: loss 7.2414, time 125.69ms
step 104500: train loss 6.4655, val loss 6.4654
saving checkpoint to out-shakespeare-char
iter 104500: loss 7.6291, time 2896.53ms
iter 104510: loss 7.5628, time 125.81ms
iter 104520: loss 7.4577, time 125.50ms
iter 104530: loss 7.6489, time 128.62ms
iter 104540: loss 7.6693, time 125.88ms
iter 104550: loss 7.1836, time 125.63ms
iter 104560: loss 7.8362, time 125.96ms
iter 104570: loss 7.3868, time 124.82ms
iter 104580: loss 7.1921, time 125.46ms
iter 104590: loss 7.9756, time 125.55ms
iter 104600: loss 7.2426, time 126.14ms
iter 104610: loss 7.3225, time 125.87ms
iter 104620: loss 7.8877, time 126.04ms
iter 104630: loss 7.1832, time 125.79ms
iter 104640: loss 7.8667, time 127.59ms
iter 104650: loss 7.6207, time 125.59ms
iter 104660: loss 7.6913, time 125.88ms
iter 104670: loss 7.5298, time 125.63ms
iter 104680: loss 7.1913, time 125.41ms
iter 104690: loss 7.8649, time 124.98ms
iter 104700: loss 7.8398, time 125.62ms
iter 104710: loss 7.8859, time 126.05ms
iter 104720: loss 8.3851, time 125.53ms
iter 104730: loss 8.2307, time 125.62ms
iter 104740: loss 7.1405, time 125.63ms
step 104750: train loss 6.3883, val loss 6.4489
saving checkpoint to out-shakespeare-char
iter 104750: loss 8.0326, time 2892.75ms
iter 104760: loss 7.5300, time 121.92ms
iter 104770: loss 8.1846, time 123.72ms
iter 104780: loss 7.6722, time 121.88ms
iter 104790: loss 7.4015, time 124.92ms
iter 104800: loss 8.9298, time 121.50ms
iter 104810: loss 7.3480, time 125.46ms
iter 104820: loss 7.6967, time 121.84ms
iter 104830: loss 7.4734, time 124.93ms
iter 104840: loss 7.6165, time 121.96ms
iter 104850: loss 7.5725, time 124.76ms
iter 104860: loss 7.6927, time 121.82ms
iter 104870: loss 7.7116, time 124.67ms
iter 104880: loss 6.8047, time 122.31ms
iter 104890: loss 7.7969, time 125.11ms
iter 104900: loss 7.1750, time 121.83ms
iter 104910: loss 7.9869, time 123.86ms
iter 104920: loss 8.1494, time 122.11ms
iter 104930: loss 8.4458, time 124.37ms
iter 104940: loss 7.8537, time 121.93ms
iter 104950: loss 7.2422, time 124.67ms
iter 104960: loss 7.8564, time 121.60ms
iter 104970: loss 7.9007, time 124.77ms
iter 104980: loss 7.7746, time 121.49ms
iter 104990: loss 8.2801, time 124.72ms
step 105000: train loss 6.4327, val loss 6.3594
saving checkpoint to out-shakespeare-char
iter 105000: loss 7.8158, time 2895.18ms
iter 105010: loss 7.8540, time 121.88ms
iter 105020: loss 7.1832, time 122.48ms
iter 105030: loss 7.9382, time 121.85ms
iter 105040: loss 8.4068, time 124.30ms
iter 105050: loss 8.1463, time 121.86ms
iter 105060: loss 8.3774, time 122.89ms
iter 105070: loss 7.0969, time 122.33ms
iter 105080: loss 7.8380, time 122.81ms
iter 105090: loss 7.4611, time 121.85ms
iter 105100: loss 7.7695, time 123.10ms
iter 105110: loss 6.9435, time 121.90ms
iter 105120: loss 8.2885, time 123.49ms
iter 105130: loss 7.3594, time 122.18ms
iter 105140: loss 7.7766, time 123.30ms
iter 105150: loss 7.3443, time 122.32ms
iter 105160: loss 7.3041, time 123.22ms
iter 105170: loss 7.7693, time 121.85ms
iter 105180: loss 7.4417, time 122.43ms
iter 105190: loss 7.9836, time 121.88ms
iter 105200: loss 7.8947, time 122.95ms
iter 105210: loss 7.7981, time 121.79ms
iter 105220: loss 7.1558, time 123.25ms
iter 105230: loss 6.7729, time 121.91ms
iter 105240: loss 7.5578, time 122.90ms
step 105250: train loss 6.4773, val loss 6.4283
saving checkpoint to out-shakespeare-char
iter 105250: loss 7.2343, time 2901.30ms
iter 105260: loss 7.5409, time 125.70ms
iter 105270: loss 7.6164, time 125.20ms
iter 105280: loss 7.6417, time 125.24ms
iter 105290: loss 8.2919, time 125.15ms
iter 105300: loss 8.1136, time 124.57ms
iter 105310: loss 8.1703, time 125.19ms
iter 105320: loss 7.8536, time 125.33ms
iter 105330: loss 8.0360, time 128.13ms
iter 105340: loss 7.0899, time 125.62ms
iter 105350: loss 7.9072, time 125.92ms
iter 105360: loss 8.0247, time 125.63ms
iter 105370: loss 8.5357, time 125.21ms
iter 105380: loss 7.7548, time 125.65ms
iter 105390: loss 8.2657, time 125.36ms
iter 105400: loss 7.3077, time 125.53ms
iter 105410: loss 8.5951, time 125.66ms
iter 105420: loss 7.3259, time 125.82ms
iter 105430: loss 7.4859, time 125.89ms
iter 105440: loss 7.8214, time 128.65ms
iter 105450: loss 8.3658, time 124.45ms
iter 105460: loss 7.7140, time 124.77ms
iter 105470: loss 7.8178, time 125.32ms
iter 105480: loss 8.0204, time 124.98ms
iter 105490: loss 7.2413, time 125.48ms
step 105500: train loss 6.3983, val loss 6.4423
saving checkpoint to out-shakespeare-char
iter 105500: loss 7.9027, time 2900.21ms
iter 105510: loss 8.0522, time 125.30ms
iter 105520: loss 8.3132, time 124.47ms
iter 105530: loss 7.3192, time 125.53ms
iter 105540: loss 7.6726, time 125.74ms
iter 105550: loss 7.6428, time 125.65ms
iter 105560: loss 7.7100, time 125.36ms
iter 105570: loss 8.0295, time 125.48ms
iter 105580: loss 7.2099, time 125.15ms
iter 105590: loss 7.0337, time 125.45ms
iter 105600: loss 6.8456, time 126.37ms
iter 105610: loss 7.1176, time 125.69ms
iter 105620: loss 7.0782, time 125.34ms
iter 105630: loss 8.2489, time 125.22ms
iter 105640: loss 7.9855, time 125.28ms
iter 105650: loss 7.6026, time 128.17ms
iter 105660: loss 8.0103, time 125.14ms
iter 105670: loss 7.3503, time 125.14ms
iter 105680: loss 7.4115, time 125.42ms
iter 105690: loss 8.1754, time 125.28ms
iter 105700: loss 7.8787, time 125.15ms
iter 105710: loss 7.4073, time 125.41ms
iter 105720: loss 7.8559, time 125.17ms
iter 105730: loss 7.6369, time 125.44ms
iter 105740: loss 8.4234, time 125.05ms
step 105750: train loss 6.4281, val loss 6.4056
saving checkpoint to out-shakespeare-char
iter 105750: loss 7.4182, time 2898.93ms
iter 105760: loss 7.3751, time 125.19ms
iter 105770: loss 8.0684, time 125.65ms
iter 105780: loss 7.8571, time 127.91ms
iter 105790: loss 7.9093, time 125.07ms
iter 105800: loss 7.6127, time 125.23ms
iter 105810: loss 6.8300, time 125.14ms
iter 105820: loss 7.8100, time 125.34ms
iter 105830: loss 6.3793, time 124.56ms
iter 105840: loss 7.7946, time 125.15ms
iter 105850: loss 7.7603, time 124.13ms
iter 105860: loss 7.5925, time 125.29ms
iter 105870: loss 7.8780, time 124.91ms
iter 105880: loss 7.4018, time 125.02ms
iter 105890: loss 7.6043, time 128.13ms
iter 105900: loss 7.0832, time 125.13ms
iter 105910: loss 7.1719, time 125.12ms
iter 105920: loss 7.4183, time 125.28ms
iter 105930: loss 7.2648, time 125.23ms
iter 105940: loss 7.9426, time 125.05ms
iter 105950: loss 8.7312, time 127.48ms
iter 105960: loss 6.8848, time 125.86ms
iter 105970: loss 7.8530, time 125.62ms
iter 105980: loss 7.5255, time 125.39ms
iter 105990: loss 8.1847, time 126.16ms
step 106000: train loss 6.3506, val loss 6.4274
saving checkpoint to out-shakespeare-char
iter 106000: loss 7.8801, time 2906.40ms
iter 106010: loss 7.8791, time 125.79ms
iter 106020: loss 7.3488, time 125.27ms
iter 106030: loss 7.4692, time 128.64ms
iter 106040: loss 7.0949, time 126.11ms
iter 106050: loss 7.2324, time 124.51ms
iter 106060: loss 8.1990, time 125.51ms
iter 106070: loss 7.4582, time 125.11ms
iter 106080: loss 8.1466, time 125.46ms
iter 106090: loss 7.7660, time 125.23ms
iter 106100: loss 7.2709, time 125.23ms
iter 106110: loss 7.1759, time 125.13ms
iter 106120: loss 7.9560, time 125.29ms
iter 106130: loss 8.0114, time 125.28ms
iter 106140: loss 7.8312, time 125.41ms
iter 106150: loss 8.3897, time 125.28ms
iter 106160: loss 7.7548, time 125.28ms
iter 106170: loss 7.4623, time 125.20ms
iter 106180: loss 7.4582, time 128.14ms
iter 106190: loss 7.6443, time 125.35ms
iter 106200: loss 8.0400, time 125.29ms
iter 106210: loss 8.4609, time 125.68ms
iter 106220: loss 7.5078, time 125.84ms
iter 106230: loss 7.6066, time 126.39ms
iter 106240: loss 8.6311, time 125.56ms
step 106250: train loss 6.3759, val loss 6.4664
saving checkpoint to out-shakespeare-char
iter 106250: loss 7.8549, time 2762.69ms
iter 106260: loss 7.7646, time 124.58ms
iter 106270: loss 8.1638, time 125.99ms
iter 106280: loss 7.8076, time 125.78ms
iter 106290: loss 7.5727, time 126.27ms
iter 106300: loss 7.5314, time 128.47ms
iter 106310: loss 8.3362, time 125.89ms
iter 106320: loss 7.9753, time 125.98ms
iter 106330: loss 7.1944, time 126.37ms
iter 106340: loss 8.5705, time 129.16ms
iter 106350: loss 7.6684, time 125.94ms
iter 106360: loss 7.6782, time 125.91ms
iter 106370: loss 8.4552, time 126.27ms
iter 106380: loss 7.2368, time 124.98ms
iter 106390: loss 7.7935, time 125.90ms
iter 106400: loss 7.3405, time 124.50ms
iter 106410: loss 8.1413, time 125.22ms
iter 106420: loss 7.1166, time 125.68ms
iter 106430: loss 7.6807, time 125.31ms
iter 106440: loss 6.9780, time 126.58ms
iter 106450: loss 7.0510, time 126.02ms
iter 106460: loss 7.2172, time 126.72ms
iter 106470: loss 7.5530, time 130.17ms
iter 106480: loss 7.3212, time 127.53ms
iter 106490: loss 8.4646, time 125.87ms
step 106500: train loss 6.4490, val loss 6.4320
saving checkpoint to out-shakespeare-char
iter 106500: loss 7.0876, time 2911.41ms
iter 106510: loss 8.7693, time 126.42ms
iter 106520: loss 6.9236, time 125.12ms
iter 106530: loss 7.6629, time 128.64ms
iter 106540: loss 7.5246, time 124.77ms
iter 106550: loss 8.1005, time 125.53ms
iter 106560: loss 7.4206, time 124.78ms
iter 106570: loss 7.7214, time 125.50ms
iter 106580: loss 7.6407, time 125.34ms
iter 106590: loss 8.0026, time 125.07ms
iter 106600: loss 6.9908, time 123.81ms
iter 106610: loss 8.5156, time 125.70ms
iter 106620: loss 6.8302, time 125.95ms
iter 106630: loss 7.1093, time 125.55ms
iter 106640: loss 7.6185, time 126.06ms
iter 106650: loss 7.2784, time 125.72ms
iter 106660: loss 7.3487, time 125.61ms
iter 106670: loss 8.1228, time 128.59ms
iter 106680: loss 7.6598, time 123.47ms
iter 106690: loss 8.0549, time 125.63ms
iter 106700: loss 8.2369, time 126.55ms
iter 106710: loss 8.1320, time 124.24ms
iter 106720: loss 7.5519, time 125.01ms
iter 106730: loss 8.1605, time 123.73ms
iter 106740: loss 7.8100, time 125.34ms
step 106750: train loss 6.4081, val loss 6.3737
saving checkpoint to out-shakespeare-char
iter 106750: loss 6.8529, time 2918.18ms
iter 106760: loss 7.6553, time 125.84ms
iter 106770: loss 6.8925, time 126.42ms
iter 106780: loss 7.6012, time 126.14ms
iter 106790: loss 7.4081, time 125.95ms
iter 106800: loss 7.4547, time 128.98ms
iter 106810: loss 8.3478, time 124.75ms
iter 106820: loss 7.3281, time 126.51ms
iter 106830: loss 7.8309, time 125.66ms
iter 106840: loss 7.4066, time 125.52ms
iter 106850: loss 7.4488, time 125.80ms
iter 106860: loss 7.2383, time 125.46ms
iter 106870: loss 7.7064, time 125.53ms
iter 106880: loss 7.8887, time 125.68ms
iter 106890: loss 8.1713, time 125.85ms
iter 106900: loss 8.1905, time 125.80ms
iter 106910: loss 7.6053, time 128.27ms
iter 106920: loss 7.7441, time 125.68ms
iter 106930: loss 7.2885, time 125.69ms
iter 106940: loss 7.6826, time 124.73ms
iter 106950: loss 7.7697, time 125.60ms
iter 106960: loss 7.6933, time 125.21ms
iter 106970: loss 7.4730, time 125.53ms
iter 106980: loss 7.4638, time 125.44ms
iter 106990: loss 7.6951, time 125.50ms
step 107000: train loss 6.3525, val loss 6.4208
saving checkpoint to out-shakespeare-char
iter 107000: loss 8.0421, time 2885.95ms
iter 107010: loss 8.1401, time 125.60ms
iter 107020: loss 6.8837, time 125.76ms
iter 107030: loss 7.6765, time 124.74ms
iter 107040: loss 7.8768, time 128.47ms
iter 107050: loss 8.1850, time 125.39ms
iter 107060: loss 8.0197, time 124.80ms
iter 107070: loss 7.3972, time 125.72ms
iter 107080: loss 8.3512, time 125.84ms
iter 107090: loss 7.5675, time 125.04ms
iter 107100: loss 7.6991, time 124.78ms
iter 107110: loss 8.3236, time 126.17ms
iter 107120: loss 7.9875, time 125.82ms
iter 107130: loss 6.9611, time 124.99ms
iter 107140: loss 7.4256, time 125.91ms
iter 107150: loss 7.8724, time 128.65ms
iter 107160: loss 7.0486, time 125.72ms
iter 107170: loss 7.5840, time 124.82ms
iter 107180: loss 8.3842, time 126.05ms
iter 107190: loss 7.2340, time 124.42ms
iter 107200: loss 6.4366, time 124.43ms
iter 107210: loss 7.5395, time 125.23ms
iter 107220: loss 7.7748, time 128.82ms
iter 107230: loss 7.9315, time 125.88ms
iter 107240: loss 7.8516, time 125.58ms
step 107250: train loss 6.4421, val loss 6.4095
saving checkpoint to out-shakespeare-char
iter 107250: loss 6.6586, time 2892.08ms
iter 107260: loss 7.1194, time 125.47ms
iter 107270: loss 7.4170, time 125.44ms
iter 107280: loss 7.7660, time 125.23ms
iter 107290: loss 7.5045, time 125.19ms
iter 107300: loss 7.9742, time 125.88ms
iter 107310: loss 7.3773, time 124.86ms
iter 107320: loss 8.5731, time 125.51ms
iter 107330: loss 7.2493, time 125.70ms
iter 107340: loss 7.4304, time 125.39ms
iter 107350: loss 7.5793, time 128.30ms
iter 107360: loss 7.5229, time 124.62ms
iter 107370: loss 7.3773, time 125.27ms
iter 107380: loss 7.5524, time 125.10ms
iter 107390: loss 7.9150, time 125.21ms
iter 107400: loss 7.8101, time 125.23ms
iter 107410: loss 7.4085, time 124.96ms
iter 107420: loss 7.5611, time 127.80ms
iter 107430: loss 7.6677, time 125.42ms
iter 107440: loss 7.9079, time 124.86ms
iter 107450: loss 7.3553, time 125.12ms
iter 107460: loss 7.1450, time 124.27ms
iter 107470: loss 6.3647, time 123.70ms
iter 107480: loss 8.2904, time 125.17ms
iter 107490: loss 7.9756, time 125.52ms
step 107500: train loss 6.4219, val loss 6.4138
saving checkpoint to out-shakespeare-char
iter 107500: loss 7.7863, time 2879.11ms
iter 107510: loss 7.4948, time 125.34ms
iter 107520: loss 7.4869, time 128.13ms
iter 107530: loss 8.4945, time 125.53ms
iter 107540: loss 7.6772, time 125.48ms
iter 107550: loss 7.6713, time 124.49ms
iter 107560: loss 7.2775, time 125.13ms
iter 107570: loss 7.9513, time 125.78ms
iter 107580: loss 7.6091, time 124.81ms
iter 107590: loss 7.5828, time 124.92ms
iter 107600: loss 8.4004, time 125.49ms
iter 107610: loss 7.0002, time 124.53ms
iter 107620: loss 7.7639, time 125.81ms
iter 107630: loss 7.6446, time 128.83ms
iter 107640: loss 7.5457, time 125.51ms
iter 107650: loss 7.6097, time 125.20ms
iter 107660: loss 6.9224, time 125.22ms
iter 107670: loss 8.1512, time 125.22ms
iter 107680: loss 8.1617, time 125.12ms
iter 107690: loss 7.6682, time 125.28ms
iter 107700: loss 7.8404, time 128.23ms
iter 107710: loss 6.8794, time 125.09ms
iter 107720: loss 7.3026, time 125.82ms
iter 107730: loss 7.6732, time 125.51ms
iter 107740: loss 8.1662, time 125.19ms
step 107750: train loss 6.3914, val loss 6.4081
saving checkpoint to out-shakespeare-char
iter 107750: loss 8.1082, time 2877.60ms
iter 107760: loss 6.9833, time 125.35ms
iter 107770: loss 8.0752, time 125.43ms
iter 107780: loss 7.6897, time 125.65ms
iter 107790: loss 8.1287, time 125.72ms
iter 107800: loss 7.2390, time 125.24ms
iter 107810: loss 6.9624, time 124.66ms
iter 107820: loss 7.5041, time 126.20ms
iter 107830: loss 7.6508, time 125.00ms
iter 107840: loss 7.8219, time 125.44ms
iter 107850: loss 7.9072, time 125.47ms
iter 107860: loss 8.6821, time 125.77ms
iter 107870: loss 7.8549, time 125.55ms
iter 107880: loss 7.3671, time 125.73ms
iter 107890: loss 8.1743, time 125.51ms
iter 107900: loss 7.9510, time 128.60ms
iter 107910: loss 7.6972, time 125.23ms
iter 107920: loss 7.7764, time 124.92ms
iter 107930: loss 7.2987, time 125.80ms
iter 107940: loss 8.4470, time 127.40ms
iter 107950: loss 7.9458, time 124.79ms
iter 107960: loss 7.5084, time 125.47ms
iter 107970: loss 7.8009, time 125.34ms
iter 107980: loss 7.1215, time 124.75ms
iter 107990: loss 7.1495, time 125.00ms
step 108000: train loss 6.4022, val loss 6.4281
saving checkpoint to out-shakespeare-char
iter 108000: loss 7.7481, time 2887.55ms
iter 108010: loss 7.2376, time 125.62ms
iter 108020: loss 7.4642, time 125.54ms
iter 108030: loss 8.3073, time 125.69ms
iter 108040: loss 8.1318, time 127.87ms
iter 108050: loss 7.4756, time 125.11ms
iter 108060: loss 7.2345, time 125.99ms
iter 108070: loss 7.8077, time 125.57ms
iter 108080: loss 7.2221, time 125.65ms
iter 108090: loss 7.2635, time 126.11ms
iter 108100: loss 8.5114, time 124.95ms
iter 108110: loss 7.3508, time 128.85ms
iter 108120: loss 7.1911, time 124.70ms
iter 108130: loss 7.6755, time 125.95ms
iter 108140: loss 7.6045, time 126.34ms
iter 108150: loss 7.1129, time 124.83ms
iter 108160: loss 7.7255, time 125.76ms
iter 108170: loss 7.9111, time 125.82ms
iter 108180: loss 7.9187, time 125.67ms
iter 108190: loss 7.6293, time 125.59ms
iter 108200: loss 7.5992, time 125.44ms
iter 108210: loss 6.7098, time 125.77ms
iter 108220: loss 7.4593, time 128.83ms
iter 108230: loss 7.5523, time 125.61ms
iter 108240: loss 7.4557, time 126.14ms
step 108250: train loss 6.3571, val loss 6.4154
saving checkpoint to out-shakespeare-char
iter 108250: loss 7.6055, time 2903.21ms
iter 108260: loss 8.2246, time 125.90ms
iter 108270: loss 7.2715, time 125.70ms
iter 108280: loss 7.5863, time 127.45ms
iter 108290: loss 7.7821, time 125.81ms
iter 108300: loss 7.0298, time 125.47ms
iter 108310: loss 7.4044, time 126.02ms
iter 108320: loss 7.7149, time 124.65ms
iter 108330: loss 8.1668, time 125.60ms
iter 108340: loss 6.9858, time 125.04ms
iter 108350: loss 7.9486, time 125.77ms
iter 108360: loss 7.8456, time 125.63ms
iter 108370: loss 7.8415, time 125.84ms
iter 108380: loss 7.8517, time 125.89ms
iter 108390: loss 7.1171, time 125.76ms
iter 108400: loss 7.8579, time 124.42ms
iter 108410: loss 7.4245, time 125.61ms
iter 108420: loss 7.6931, time 125.82ms
iter 108430: loss 7.7652, time 125.96ms
iter 108440: loss 7.5098, time 128.66ms
iter 108450: loss 7.1848, time 125.55ms
iter 108460: loss 6.5244, time 125.89ms
iter 108470: loss 8.4945, time 125.69ms
iter 108480: loss 6.9471, time 126.12ms
iter 108490: loss 7.9776, time 125.96ms
step 108500: train loss 6.3186, val loss 6.3879
saving checkpoint to out-shakespeare-char
iter 108500: loss 8.5087, time 2894.39ms
iter 108510: loss 8.4338, time 125.32ms
iter 108520: loss 7.2010, time 125.26ms
iter 108530: loss 6.3000, time 125.25ms
iter 108540: loss 7.7799, time 125.95ms
iter 108550: loss 7.1141, time 124.96ms
iter 108560: loss 8.2862, time 125.25ms
iter 108570: loss 7.9423, time 124.84ms
iter 108580: loss 7.4516, time 124.56ms
iter 108590: loss 7.8523, time 125.23ms
iter 108600: loss 7.6094, time 125.65ms
iter 108610: loss 7.5075, time 125.43ms
iter 108620: loss 6.9830, time 124.06ms
iter 108630: loss 6.9289, time 125.38ms
iter 108640: loss 7.2055, time 126.41ms
iter 108650: loss 7.2807, time 125.56ms
iter 108660: loss 8.6190, time 124.51ms
iter 108670: loss 7.7732, time 125.12ms
iter 108680: loss 7.7130, time 124.72ms
iter 108690: loss 7.6116, time 125.42ms
iter 108700: loss 7.6361, time 124.83ms
iter 108710: loss 6.6861, time 127.36ms
iter 108720: loss 7.7265, time 125.18ms
iter 108730: loss 8.3666, time 125.48ms
iter 108740: loss 8.2566, time 125.60ms
step 108750: train loss 6.4103, val loss 6.3974
saving checkpoint to out-shakespeare-char
iter 108750: loss 7.9535, time 2870.88ms
iter 108760: loss 6.5793, time 125.52ms
iter 108770: loss 8.0950, time 125.28ms
iter 108780: loss 7.7055, time 125.54ms
iter 108790: loss 8.0482, time 126.05ms
iter 108800: loss 7.6404, time 125.55ms
iter 108810: loss 7.5246, time 125.02ms
iter 108820: loss 7.2645, time 125.40ms
iter 108830: loss 7.4861, time 125.49ms
iter 108840: loss 7.7116, time 125.40ms
iter 108850: loss 7.2712, time 129.41ms
iter 108860: loss 7.4607, time 125.86ms
iter 108870: loss 8.1128, time 125.98ms
iter 108880: loss 6.7668, time 125.63ms
iter 108890: loss 7.3285, time 126.20ms
iter 108900: loss 8.2870, time 125.59ms
iter 108910: loss 8.1819, time 125.69ms
iter 108920: loss 7.6417, time 126.36ms
iter 108930: loss 7.6549, time 125.62ms
iter 108940: loss 7.7101, time 125.55ms
iter 108950: loss 8.5030, time 125.55ms
iter 108960: loss 8.4705, time 125.62ms
iter 108970: loss 7.6398, time 125.55ms
iter 108980: loss 7.5515, time 125.67ms
iter 108990: loss 7.7938, time 125.84ms
step 109000: train loss 6.3847, val loss 6.4300
saving checkpoint to out-shakespeare-char
iter 109000: loss 8.1915, time 2888.51ms
iter 109010: loss 7.4484, time 125.78ms
iter 109020: loss 7.5992, time 126.16ms
iter 109030: loss 8.0078, time 125.63ms
iter 109040: loss 7.9453, time 125.59ms
iter 109050: loss 7.5827, time 124.92ms
iter 109060: loss 7.9122, time 128.75ms
iter 109070: loss 8.0057, time 125.76ms
iter 109080: loss 8.1004, time 125.54ms
iter 109090: loss 7.1077, time 125.40ms
iter 109100: loss 7.0476, time 125.69ms
iter 109110: loss 7.7133, time 125.55ms
iter 109120: loss 7.6630, time 125.67ms
iter 109130: loss 7.7666, time 125.62ms
iter 109140: loss 7.7029, time 125.57ms
iter 109150: loss 8.3165, time 124.98ms
iter 109160: loss 7.2204, time 125.86ms
iter 109170: loss 7.8714, time 128.72ms
iter 109180: loss 7.2713, time 125.38ms
iter 109190: loss 8.5133, time 125.81ms
iter 109200: loss 7.4055, time 125.73ms
iter 109210: loss 7.1341, time 125.52ms
iter 109220: loss 7.0406, time 125.66ms
iter 109230: loss 7.0992, time 126.62ms
iter 109240: loss 7.3558, time 125.71ms
step 109250: train loss 6.3524, val loss 6.3497
saving checkpoint to out-shakespeare-char
iter 109250: loss 8.0447, time 2881.64ms
iter 109260: loss 7.9204, time 126.06ms
iter 109270: loss 7.4419, time 124.16ms
iter 109280: loss 7.9082, time 125.91ms
iter 109290: loss 7.6080, time 125.02ms
iter 109300: loss 6.8579, time 125.54ms
iter 109310: loss 7.1397, time 128.36ms
iter 109320: loss 7.9030, time 124.99ms
iter 109330: loss 7.7285, time 126.50ms
iter 109340: loss 7.6942, time 124.44ms
iter 109350: loss 8.0983, time 123.86ms
iter 109360: loss 7.9778, time 123.63ms
iter 109370: loss 7.4622, time 123.51ms
iter 109380: loss 7.5541, time 123.77ms
iter 109390: loss 7.2843, time 123.55ms
iter 109400: loss 8.8602, time 125.25ms
iter 109410: loss 7.5711, time 125.09ms
iter 109420: loss 7.2653, time 125.79ms
iter 109430: loss 7.3561, time 125.51ms
iter 109440: loss 6.9806, time 125.37ms
iter 109450: loss 8.0558, time 124.76ms
iter 109460: loss 7.4417, time 125.19ms
iter 109470: loss 7.5986, time 125.05ms
iter 109480: loss 8.1333, time 125.94ms
iter 109490: loss 6.9841, time 125.22ms
step 109500: train loss 6.3709, val loss 6.3949
saving checkpoint to out-shakespeare-char
iter 109500: loss 7.3395, time 2879.84ms
iter 109510: loss 7.9905, time 126.27ms
iter 109520: loss 7.4606, time 125.40ms
iter 109530: loss 7.5195, time 125.13ms
iter 109540: loss 8.2332, time 125.13ms
iter 109550: loss 7.6482, time 124.93ms
iter 109560: loss 7.8520, time 125.23ms
iter 109570: loss 6.9075, time 124.79ms
iter 109580: loss 8.1228, time 125.48ms
iter 109590: loss 7.6487, time 128.09ms
iter 109600: loss 6.9358, time 125.49ms
iter 109610: loss 7.2058, time 126.75ms
iter 109620: loss 7.6228, time 125.68ms
iter 109630: loss 7.2939, time 125.27ms
iter 109640: loss 6.3886, time 125.37ms
iter 109650: loss 7.5610, time 124.72ms
iter 109660: loss 7.7947, time 125.34ms
iter 109670: loss 7.3083, time 125.77ms
iter 109680: loss 6.8473, time 125.20ms
iter 109690: loss 7.3480, time 125.48ms
iter 109700: loss 7.0037, time 125.32ms
iter 109710: loss 8.6704, time 125.36ms
iter 109720: loss 8.1925, time 125.40ms
iter 109730: loss 7.7001, time 125.01ms
iter 109740: loss 7.4123, time 125.22ms
step 109750: train loss 6.3314, val loss 6.3857
saving checkpoint to out-shakespeare-char
iter 109750: loss 7.7316, time 2893.86ms
iter 109760: loss 7.6032, time 126.14ms
iter 109770: loss 7.5183, time 125.89ms
iter 109780: loss 7.7862, time 126.02ms
iter 109790: loss 7.1150, time 126.22ms
iter 109800: loss 6.8780, time 128.51ms
iter 109810: loss 8.0568, time 125.52ms
iter 109820: loss 6.7821, time 125.62ms
iter 109830: loss 8.0491, time 126.09ms
iter 109840: loss 7.2151, time 124.38ms
iter 109850: loss 7.0355, time 126.16ms
iter 109860: loss 7.4880, time 126.07ms
iter 109870: loss 7.8876, time 125.83ms
iter 109880: loss 7.3392, time 125.90ms
iter 109890: loss 8.9812, time 124.71ms
iter 109900: loss 7.7370, time 126.08ms
iter 109910: loss 7.1516, time 129.16ms
iter 109920: loss 7.1086, time 125.64ms
iter 109930: loss 7.2009, time 125.54ms
iter 109940: loss 7.6080, time 126.32ms
iter 109950: loss 7.6410, time 125.21ms
iter 109960: loss 6.8485, time 125.88ms
iter 109970: loss 8.3631, time 126.08ms
iter 109980: loss 7.1926, time 126.45ms
iter 109990: loss 7.4117, time 125.72ms
step 110000: train loss 6.4007, val loss 6.3916
saving checkpoint to out-shakespeare-char
iter 110000: loss 7.3625, time 2882.07ms
iter 110010: loss 7.4266, time 125.67ms
iter 110020: loss 7.0517, time 125.58ms
iter 110030: loss 8.1211, time 125.50ms
iter 110040: loss 7.1296, time 125.47ms
iter 110050: loss 7.7408, time 125.37ms
iter 110060: loss 6.7792, time 125.75ms
iter 110070: loss 7.8528, time 126.17ms
iter 110080: loss 8.0034, time 128.52ms
iter 110090: loss 7.5019, time 125.66ms
iter 110100: loss 7.2066, time 126.17ms
iter 110110: loss 8.0280, time 125.58ms
iter 110120: loss 7.8028, time 125.52ms
iter 110130: loss 7.5021, time 125.57ms
iter 110140: loss 7.8516, time 125.81ms
iter 110150: loss 7.3247, time 125.27ms
iter 110160: loss 6.8575, time 125.68ms
iter 110170: loss 8.0037, time 125.80ms
iter 110180: loss 7.2492, time 125.96ms
iter 110190: loss 7.7048, time 120.89ms
iter 110200: loss 7.6595, time 121.46ms
iter 110210: loss 7.4111, time 121.42ms
iter 110220: loss 7.6579, time 121.60ms
iter 110230: loss 8.0482, time 122.07ms
iter 110240: loss 7.5895, time 121.47ms
step 110250: train loss 6.3709, val loss 6.3473
saving checkpoint to out-shakespeare-char
iter 110250: loss 7.9203, time 2874.92ms
iter 110260: loss 7.2356, time 122.36ms
iter 110270: loss 7.6247, time 120.77ms
iter 110280: loss 6.7001, time 123.49ms
iter 110290: loss 7.5827, time 121.07ms
iter 110300: loss 7.4802, time 122.49ms
iter 110310: loss 8.0769, time 120.84ms
iter 110320: loss 7.8997, time 122.27ms
iter 110330: loss 7.3295, time 122.46ms
iter 110340: loss 7.9440, time 122.64ms
iter 110350: loss 7.1584, time 120.54ms
iter 110360: loss 7.7633, time 122.73ms
iter 110370: loss 7.8034, time 121.41ms
iter 110380: loss 6.9226, time 123.19ms
iter 110390: loss 7.2592, time 121.55ms
iter 110400: loss 7.6098, time 123.30ms
iter 110410: loss 7.0870, time 121.54ms
iter 110420: loss 7.1809, time 122.66ms
iter 110430: loss 7.8686, time 121.73ms
iter 110440: loss 7.6352, time 123.06ms
iter 110450: loss 6.5978, time 121.33ms
iter 110460: loss 7.9315, time 122.86ms
iter 110470: loss 7.0400, time 124.71ms
iter 110480: loss 7.6732, time 123.19ms
iter 110490: loss 7.9116, time 121.33ms
step 110500: train loss 6.4109, val loss 6.3763
saving checkpoint to out-shakespeare-char
iter 110500: loss 8.7400, time 2884.00ms
iter 110510: loss 7.8786, time 121.45ms
iter 110520: loss 7.4272, time 120.97ms
iter 110530: loss 7.7100, time 122.58ms
iter 110540: loss 7.2306, time 121.44ms
iter 110550: loss 7.3654, time 121.42ms
iter 110560: loss 7.3930, time 121.34ms
iter 110570: loss 7.3809, time 121.33ms
iter 110580: loss 8.1603, time 121.34ms
iter 110590: loss 7.6546, time 121.13ms
iter 110600: loss 7.5061, time 121.48ms
iter 110610: loss 7.8722, time 121.59ms
iter 110620: loss 7.6002, time 121.42ms
iter 110630: loss 7.1400, time 121.51ms
iter 110640: loss 7.4545, time 121.55ms
iter 110650: loss 7.8151, time 121.21ms
iter 110660: loss 7.5102, time 121.47ms
iter 110670: loss 7.3769, time 121.40ms
iter 110680: loss 7.9143, time 120.65ms
iter 110690: loss 7.4791, time 121.39ms
iter 110700: loss 7.9390, time 121.49ms
iter 110710: loss 7.5069, time 121.93ms
iter 110720: loss 7.1780, time 121.36ms
iter 110730: loss 7.7560, time 121.70ms
iter 110740: loss 6.8620, time 121.39ms
step 110750: train loss 6.3289, val loss 6.3740
saving checkpoint to out-shakespeare-char
iter 110750: loss 7.5967, time 2891.01ms
iter 110760: loss 7.6820, time 121.49ms
iter 110770: loss 7.9224, time 121.52ms
iter 110780: loss 7.8632, time 121.43ms
iter 110790: loss 8.1033, time 121.49ms
iter 110800: loss 7.7753, time 121.55ms
iter 110810: loss 7.8460, time 121.69ms
iter 110820: loss 7.5033, time 121.55ms
iter 110830: loss 7.8104, time 121.70ms
iter 110840: loss 7.3705, time 120.67ms
iter 110850: loss 7.4572, time 121.52ms
iter 110860: loss 7.6939, time 121.57ms
iter 110870: loss 7.7089, time 121.62ms
iter 110880: loss 6.7439, time 121.41ms
iter 110890: loss 7.4877, time 121.40ms
iter 110900: loss 7.1871, time 121.41ms
iter 110910: loss 7.9547, time 120.80ms
iter 110920: loss 7.2993, time 121.42ms
iter 110930: loss 7.6838, time 121.58ms
iter 110940: loss 8.1300, time 121.45ms
iter 110950: loss 7.8044, time 121.70ms
iter 110960: loss 7.5354, time 121.44ms
iter 110970: loss 7.6939, time 121.64ms
iter 110980: loss 7.7409, time 121.58ms
iter 110990: loss 6.8619, time 121.61ms
step 111000: train loss 6.3785, val loss 6.3488
saving checkpoint to out-shakespeare-char
iter 111000: loss 7.1996, time 2898.67ms
iter 111010: loss 7.0853, time 121.80ms
iter 111020: loss 7.2383, time 124.45ms
iter 111030: loss 7.9947, time 120.72ms
iter 111040: loss 7.5193, time 124.43ms
iter 111050: loss 8.0475, time 122.10ms
iter 111060: loss 7.5690, time 124.40ms
iter 111070: loss 7.5644, time 121.46ms
iter 111080: loss 7.1013, time 124.61ms
iter 111090: loss 7.2107, time 121.71ms
iter 111100: loss 7.6320, time 123.81ms
iter 111110: loss 7.8527, time 121.54ms
iter 111120: loss 7.7524, time 124.17ms
iter 111130: loss 6.9983, time 121.52ms
iter 111140: loss 7.4540, time 124.37ms
iter 111150: loss 6.9462, time 121.57ms
iter 111160: loss 7.6497, time 123.81ms
iter 111170: loss 7.7906, time 121.65ms
iter 111180: loss 7.9769, time 124.55ms
iter 111190: loss 7.5980, time 121.85ms
iter 111200: loss 8.0202, time 124.50ms
iter 111210: loss 7.7522, time 120.83ms
iter 111220: loss 7.5094, time 124.05ms
iter 111230: loss 7.9957, time 121.52ms
iter 111240: loss 7.5046, time 124.47ms
step 111250: train loss 6.3557, val loss 6.3675
saving checkpoint to out-shakespeare-char
iter 111250: loss 7.1811, time 2901.63ms
iter 111260: loss 7.2456, time 121.96ms
iter 111270: loss 7.2256, time 123.41ms
iter 111280: loss 7.3430, time 122.48ms
iter 111290: loss 7.8569, time 123.02ms
iter 111300: loss 8.3879, time 121.91ms
iter 111310: loss 7.4964, time 123.19ms
iter 111320: loss 7.5708, time 121.76ms
iter 111330: loss 6.4955, time 122.98ms
iter 111340: loss 8.0887, time 121.95ms
iter 111350: loss 7.7626, time 122.61ms
iter 111360: loss 8.3183, time 121.94ms
iter 111370: loss 7.5588, time 122.93ms
iter 111380: loss 7.7293, time 122.01ms
iter 111390: loss 7.2003, time 124.09ms
iter 111400: loss 6.5760, time 121.80ms
iter 111410: loss 7.0781, time 122.94ms
iter 111420: loss 8.0406, time 121.81ms
iter 111430: loss 8.1708, time 122.79ms
iter 111440: loss 7.3068, time 121.43ms
iter 111450: loss 7.8632, time 123.52ms
iter 111460: loss 7.1964, time 121.88ms
iter 111470: loss 7.9970, time 123.60ms
iter 111480: loss 7.5012, time 121.84ms
iter 111490: loss 7.8428, time 122.98ms
step 111500: train loss 6.3389, val loss 6.3453
saving checkpoint to out-shakespeare-char
iter 111500: loss 6.7588, time 2884.50ms
iter 111510: loss 7.3536, time 126.38ms
iter 111520: loss 6.6065, time 125.70ms
iter 111530: loss 7.1814, time 125.49ms
iter 111540: loss 7.7788, time 125.97ms
iter 111550: loss 7.3734, time 125.31ms
iter 111560: loss 7.4323, time 125.55ms
iter 111570: loss 7.1548, time 125.96ms
iter 111580: loss 7.8297, time 126.51ms
iter 111590: loss 7.0154, time 128.63ms
iter 111600: loss 7.4701, time 126.00ms
iter 111610: loss 7.3251, time 125.92ms
iter 111620: loss 7.0997, time 125.42ms
iter 111630: loss 7.4593, time 125.21ms
iter 111640: loss 7.3413, time 124.93ms
iter 111650: loss 7.4613, time 125.34ms
iter 111660: loss 7.3231, time 125.55ms
iter 111670: loss 7.3002, time 125.72ms
iter 111680: loss 8.3747, time 125.92ms
iter 111690: loss 8.3975, time 125.92ms
iter 111700: loss 8.2969, time 128.91ms
iter 111710: loss 7.9347, time 125.78ms
iter 111720: loss 7.2989, time 125.81ms
iter 111730: loss 8.2335, time 125.62ms
iter 111740: loss 6.9647, time 125.86ms
step 111750: train loss 6.3175, val loss 6.3248
saving checkpoint to out-shakespeare-char
iter 111750: loss 7.3200, time 2909.81ms
iter 111760: loss 8.0941, time 125.78ms
iter 111770: loss 7.2152, time 125.90ms
iter 111780: loss 7.2946, time 125.35ms
iter 111790: loss 6.8586, time 125.39ms
iter 111800: loss 7.1421, time 128.52ms
iter 111810: loss 7.4662, time 125.48ms
iter 111820: loss 7.6745, time 126.07ms
iter 111830: loss 7.3389, time 126.24ms
iter 111840: loss 7.5951, time 125.76ms
iter 111850: loss 7.6899, time 125.61ms
iter 111860: loss 7.2947, time 125.45ms
iter 111870: loss 8.2587, time 125.54ms
iter 111880: loss 8.3823, time 125.47ms
iter 111890: loss 7.7841, time 125.36ms
iter 111900: loss 7.1763, time 125.62ms
iter 111910: loss 7.5353, time 128.69ms
iter 111920: loss 7.3896, time 125.37ms
iter 111930: loss 7.4538, time 125.68ms
iter 111940: loss 8.3219, time 125.55ms
iter 111950: loss 7.5712, time 125.63ms
iter 111960: loss 7.1357, time 125.60ms
iter 111970: loss 7.6888, time 125.65ms
iter 111980: loss 7.5400, time 125.45ms
iter 111990: loss 7.6870, time 125.71ms
step 112000: train loss 6.3092, val loss 6.3139
saving checkpoint to out-shakespeare-char
iter 112000: loss 8.2957, time 2894.22ms
iter 112010: loss 6.6384, time 128.57ms
iter 112020: loss 7.7763, time 126.41ms
iter 112030: loss 8.1473, time 125.59ms
iter 112040: loss 7.4068, time 125.59ms
iter 112050: loss 7.2001, time 125.26ms
iter 112060: loss 7.6983, time 125.09ms
iter 112070: loss 7.6968, time 125.38ms
iter 112080: loss 7.0130, time 125.29ms
iter 112090: loss 7.2770, time 125.54ms
iter 112100: loss 6.9566, time 125.23ms
iter 112110: loss 7.1787, time 125.38ms
iter 112120: loss 7.9334, time 128.19ms
iter 112130: loss 7.3047, time 125.34ms
iter 112140: loss 7.6219, time 125.23ms
iter 112150: loss 8.1524, time 126.11ms
iter 112160: loss 7.1652, time 125.09ms
iter 112170: loss 7.8412, time 125.30ms
iter 112180: loss 6.8769, time 125.74ms
iter 112190: loss 7.5366, time 124.53ms
iter 112200: loss 8.0389, time 125.31ms
iter 112210: loss 7.5752, time 125.25ms
iter 112220: loss 7.3710, time 125.26ms
iter 112230: loss 7.4791, time 128.30ms
iter 112240: loss 8.1883, time 125.17ms
step 112250: train loss 6.3389, val loss 6.3775
saving checkpoint to out-shakespeare-char
iter 112250: loss 7.5292, time 2909.37ms
iter 112260: loss 7.0635, time 125.28ms
iter 112270: loss 7.2267, time 125.17ms
iter 112280: loss 7.5695, time 125.12ms
iter 112290: loss 7.3935, time 125.18ms
iter 112300: loss 7.7783, time 126.25ms
iter 112310: loss 7.2999, time 125.40ms
iter 112320: loss 7.9141, time 128.44ms
iter 112330: loss 7.5443, time 124.42ms
iter 112340: loss 6.7018, time 125.08ms
iter 112350: loss 8.1644, time 124.08ms
iter 112360: loss 7.8029, time 124.55ms
iter 112370: loss 7.0784, time 125.29ms
iter 112380: loss 8.3925, time 125.12ms
iter 112390: loss 6.6466, time 123.52ms
iter 112400: loss 7.0672, time 125.23ms
iter 112410: loss 7.3534, time 124.98ms
iter 112420: loss 7.6436, time 125.29ms
iter 112430: loss 7.4390, time 125.41ms
iter 112440: loss 7.5945, time 125.21ms
iter 112450: loss 7.8837, time 125.29ms
iter 112460: loss 7.0531, time 123.67ms
iter 112470: loss 7.6524, time 125.18ms
iter 112480: loss 8.2338, time 125.04ms
iter 112490: loss 7.4346, time 125.14ms
step 112500: train loss 6.3464, val loss 6.3460
saving checkpoint to out-shakespeare-char
iter 112500: loss 7.3656, time 2881.46ms
iter 112510: loss 8.4431, time 125.96ms
iter 112520: loss 7.0372, time 125.98ms
iter 112530: loss 8.2080, time 125.79ms
iter 112540: loss 7.5130, time 125.65ms
iter 112550: loss 7.7298, time 126.47ms
iter 112560: loss 6.9924, time 128.72ms
iter 112570: loss 7.7452, time 125.59ms
iter 112580: loss 8.1901, time 125.53ms
iter 112590: loss 7.7229, time 125.97ms
iter 112600: loss 7.7078, time 125.71ms
iter 112610: loss 7.6251, time 124.95ms
iter 112620: loss 7.9020, time 124.95ms
iter 112630: loss 6.7106, time 125.66ms
iter 112640: loss 7.6780, time 125.31ms
iter 112650: loss 7.8527, time 125.04ms
iter 112660: loss 7.5042, time 125.26ms
iter 112670: loss 8.3218, time 128.06ms
iter 112680: loss 7.7242, time 125.28ms
iter 112690: loss 8.1855, time 124.97ms
iter 112700: loss 7.3397, time 125.26ms
iter 112710: loss 7.9347, time 125.21ms
iter 112720: loss 7.4215, time 125.02ms
iter 112730: loss 7.7081, time 125.31ms
iter 112740: loss 7.9501, time 124.78ms
step 112750: train loss 6.3814, val loss 6.4100
saving checkpoint to out-shakespeare-char
iter 112750: loss 6.5816, time 2885.16ms
iter 112760: loss 7.4071, time 125.09ms
iter 112770: loss 7.7099, time 125.43ms
iter 112780: loss 7.8306, time 125.35ms
iter 112790: loss 8.4149, time 125.63ms
iter 112800: loss 7.4497, time 128.36ms
iter 112810: loss 7.7928, time 125.13ms
iter 112820: loss 7.7752, time 125.22ms
iter 112830: loss 7.2016, time 125.74ms
iter 112840: loss 7.7683, time 125.32ms
iter 112850: loss 8.1526, time 125.27ms
iter 112860: loss 8.0063, time 125.28ms
iter 112870: loss 7.9496, time 125.25ms
iter 112880: loss 7.6984, time 125.26ms
iter 112890: loss 7.2864, time 124.99ms
iter 112900: loss 7.8764, time 125.19ms
iter 112910: loss 7.3171, time 128.15ms
iter 112920: loss 8.4754, time 125.27ms
iter 112930: loss 7.6746, time 125.09ms
iter 112940: loss 7.6102, time 125.31ms
iter 112950: loss 7.2793, time 125.52ms
iter 112960: loss 7.6074, time 125.03ms
iter 112970: loss 6.5593, time 125.73ms
iter 112980: loss 8.1977, time 124.32ms
iter 112990: loss 7.1356, time 125.25ms
step 113000: train loss 6.3416, val loss 6.3021
saving checkpoint to out-shakespeare-char
iter 113000: loss 7.1014, time 2877.68ms
iter 113010: loss 7.9104, time 125.05ms
iter 113020: loss 7.4575, time 125.11ms
iter 113030: loss 8.4592, time 125.34ms
iter 113040: loss 7.3064, time 128.21ms
iter 113050: loss 7.8933, time 125.52ms
iter 113060: loss 7.1079, time 126.54ms
iter 113070: loss 7.5659, time 125.29ms
iter 113080: loss 7.4758, time 125.25ms
iter 113090: loss 7.6968, time 125.14ms
iter 113100: loss 6.8593, time 125.28ms
iter 113110: loss 7.8940, time 127.87ms
iter 113120: loss 7.2980, time 125.28ms
iter 113130: loss 7.9938, time 125.38ms
iter 113140: loss 7.7985, time 125.22ms
iter 113150: loss 7.6945, time 125.10ms
iter 113160: loss 7.6133, time 124.29ms
iter 113170: loss 7.5193, time 125.39ms
iter 113180: loss 7.9086, time 124.97ms
iter 113190: loss 7.2598, time 125.13ms
iter 113200: loss 6.7597, time 125.12ms
iter 113210: loss 7.4777, time 125.37ms
iter 113220: loss 7.5226, time 127.99ms
iter 113230: loss 7.0986, time 125.16ms
iter 113240: loss 6.8352, time 125.10ms
step 113250: train loss 6.3059, val loss 6.3018
saving checkpoint to out-shakespeare-char
iter 113250: loss 6.7192, time 2903.76ms
iter 113260: loss 7.8153, time 125.80ms
iter 113270: loss 7.5469, time 125.47ms
iter 113280: loss 8.0894, time 124.52ms
iter 113290: loss 7.6154, time 124.55ms
iter 113300: loss 7.2664, time 124.80ms
iter 113310: loss 7.0859, time 125.02ms
iter 113320: loss 7.9903, time 128.58ms
iter 113330: loss 7.0872, time 124.78ms
iter 113340: loss 7.4478, time 124.98ms
iter 113350: loss 7.1898, time 124.86ms
iter 113360: loss 7.2495, time 124.51ms
iter 113370: loss 7.3737, time 125.02ms
iter 113380: loss 7.3593, time 125.31ms
iter 113390: loss 6.8882, time 124.05ms
iter 113400: loss 7.4176, time 125.01ms
iter 113410: loss 7.0391, time 124.75ms
iter 113420: loss 7.6837, time 124.72ms
iter 113430: loss 7.4061, time 127.36ms
iter 113440: loss 7.7799, time 125.03ms
iter 113450: loss 8.0026, time 125.75ms
iter 113460: loss 8.3332, time 125.96ms
iter 113470: loss 7.1247, time 128.19ms
iter 113480: loss 7.2103, time 125.53ms
iter 113490: loss 8.0362, time 125.46ms
step 113500: train loss 6.3418, val loss 6.3246
saving checkpoint to out-shakespeare-char
iter 113500: loss 7.1785, time 2885.83ms
iter 113510: loss 7.7002, time 121.04ms
iter 113520: loss 7.1125, time 119.54ms
iter 113530: loss 8.1821, time 120.85ms
iter 113540: loss 8.3296, time 119.50ms
iter 113550: loss 7.1954, time 119.72ms
iter 113560: loss 7.4913, time 119.77ms
iter 113570: loss 6.9343, time 121.62ms
iter 113580: loss 7.3743, time 121.76ms
iter 113590: loss 8.4509, time 121.61ms
iter 113600: loss 7.4563, time 121.72ms
iter 113610: loss 7.9282, time 125.87ms
iter 113620: loss 8.3345, time 125.91ms
iter 113630: loss 7.3828, time 125.71ms
iter 113640: loss 7.9163, time 125.74ms
iter 113650: loss 8.6490, time 125.91ms
iter 113660: loss 7.6124, time 126.67ms
iter 113670: loss 7.0290, time 128.48ms
iter 113680: loss 7.4356, time 126.10ms
iter 113690: loss 7.0482, time 125.73ms
iter 113700: loss 7.3113, time 125.96ms
iter 113710: loss 6.9910, time 125.97ms
iter 113720: loss 6.6343, time 125.95ms
iter 113730: loss 7.5912, time 125.85ms
iter 113740: loss 7.5938, time 125.75ms
step 113750: train loss 6.2983, val loss 6.3194
saving checkpoint to out-shakespeare-char
iter 113750: loss 7.0002, time 2896.52ms
iter 113760: loss 7.0828, time 126.59ms
iter 113770: loss 7.8004, time 125.82ms
iter 113780: loss 7.2188, time 125.76ms
iter 113790: loss 7.5126, time 126.01ms
iter 113800: loss 7.1606, time 126.05ms
iter 113810: loss 7.3676, time 128.98ms
iter 113820: loss 7.7099, time 126.47ms
iter 113830: loss 7.4399, time 125.98ms
iter 113840: loss 7.3744, time 126.44ms
iter 113850: loss 7.8628, time 127.39ms
iter 113860: loss 7.3776, time 125.56ms
iter 113870: loss 7.7648, time 125.69ms
iter 113880: loss 7.5009, time 125.74ms
iter 113890: loss 6.2331, time 125.48ms
iter 113900: loss 7.5327, time 125.67ms
iter 113910: loss 6.8269, time 126.18ms
iter 113920: loss 7.4592, time 128.57ms
iter 113930: loss 7.2684, time 126.43ms
iter 113940: loss 7.6319, time 125.89ms
iter 113950: loss 7.1436, time 126.10ms
iter 113960: loss 7.1437, time 128.66ms
iter 113970: loss 7.2625, time 125.97ms
iter 113980: loss 7.5338, time 126.24ms
iter 113990: loss 7.9014, time 125.78ms
step 114000: train loss 6.3372, val loss 6.3570
saving checkpoint to out-shakespeare-char
iter 114000: loss 6.8713, time 2879.17ms
iter 114010: loss 7.4056, time 125.86ms
iter 114020: loss 7.5675, time 125.90ms
iter 114030: loss 7.0683, time 125.63ms
iter 114040: loss 7.5427, time 125.69ms
iter 114050: loss 7.1616, time 126.02ms
iter 114060: loss 7.6116, time 128.35ms
iter 114070: loss 8.2202, time 125.24ms
iter 114080: loss 7.8530, time 125.51ms
iter 114090: loss 7.5044, time 125.67ms
iter 114100: loss 6.7836, time 125.61ms
iter 114110: loss 7.9068, time 125.86ms
iter 114120: loss 7.2767, time 126.93ms
iter 114130: loss 7.5427, time 125.51ms
iter 114140: loss 7.1182, time 125.60ms
iter 114150: loss 7.4065, time 125.70ms
iter 114160: loss 7.6485, time 125.66ms
iter 114170: loss 7.2640, time 128.57ms
iter 114180: loss 8.3180, time 125.49ms
iter 114190: loss 7.1896, time 125.46ms
iter 114200: loss 8.3563, time 125.71ms
iter 114210: loss 7.1251, time 128.61ms
iter 114220: loss 7.3488, time 125.32ms
iter 114230: loss 8.1490, time 126.03ms
iter 114240: loss 7.2754, time 125.44ms
step 114250: train loss 6.2880, val loss 6.3425
saving checkpoint to out-shakespeare-char
iter 114250: loss 7.3778, time 2893.78ms
iter 114260: loss 7.6511, time 125.98ms
iter 114270: loss 7.8891, time 126.03ms
iter 114280: loss 7.2385, time 125.96ms
iter 114290: loss 7.3772, time 125.88ms
iter 114300: loss 8.3736, time 126.32ms
iter 114310: loss 7.6384, time 126.25ms
iter 114320: loss 6.8717, time 125.92ms
iter 114330: loss 6.5635, time 126.18ms
iter 114340: loss 8.4182, time 125.22ms
iter 114350: loss 7.7219, time 128.89ms
iter 114360: loss 7.3445, time 126.07ms
iter 114370: loss 7.4690, time 126.39ms
iter 114380: loss 7.7817, time 125.94ms
iter 114390: loss 7.3205, time 125.91ms
iter 114400: loss 7.0398, time 125.94ms
iter 114410: loss 8.3067, time 125.89ms
iter 114420: loss 8.1832, time 124.99ms
iter 114430: loss 7.6547, time 125.39ms
iter 114440: loss 7.1912, time 125.52ms
iter 114450: loss 7.5025, time 125.60ms
iter 114460: loss 7.4321, time 128.10ms
iter 114470: loss 8.1993, time 125.53ms
iter 114480: loss 7.8417, time 125.72ms
iter 114490: loss 7.7336, time 125.75ms
step 114500: train loss 6.3035, val loss 6.3415
saving checkpoint to out-shakespeare-char
iter 114500: loss 7.9867, time 2890.15ms
iter 114510: loss 8.0863, time 126.23ms
iter 114520: loss 7.8160, time 128.44ms
iter 114530: loss 7.3385, time 126.07ms
iter 114540: loss 6.9050, time 124.45ms
iter 114550: loss 7.3725, time 125.53ms
iter 114560: loss 6.7874, time 125.53ms
iter 114570: loss 7.3647, time 125.24ms
iter 114580: loss 7.4497, time 125.73ms
iter 114590: loss 7.4395, time 128.30ms
iter 114600: loss 7.9707, time 125.34ms
iter 114610: loss 7.0413, time 125.02ms
iter 114620: loss 7.3556, time 125.18ms
iter 114630: loss 7.0222, time 125.15ms
iter 114640: loss 7.8461, time 125.83ms
iter 114650: loss 8.1372, time 125.93ms
iter 114660: loss 8.9330, time 125.66ms
iter 114670: loss 6.4556, time 125.72ms
iter 114680: loss 8.7233, time 125.78ms
iter 114690: loss 7.1013, time 125.77ms
iter 114700: loss 7.4221, time 127.89ms
iter 114710: loss 7.3197, time 125.18ms
iter 114720: loss 7.7159, time 125.35ms
iter 114730: loss 7.0283, time 125.32ms
iter 114740: loss 8.0124, time 125.39ms
step 114750: train loss 6.3378, val loss 6.2985
saving checkpoint to out-shakespeare-char
iter 114750: loss 7.1689, time 2870.64ms
iter 114760: loss 7.2029, time 126.17ms
iter 114770: loss 7.2749, time 128.53ms
iter 114780: loss 7.0417, time 125.32ms
iter 114790: loss 6.9941, time 125.24ms
iter 114800: loss 8.0310, time 126.99ms
iter 114810: loss 6.7697, time 125.70ms
iter 114820: loss 7.0393, time 126.20ms
iter 114830: loss 7.3413, time 125.97ms
iter 114840: loss 7.2399, time 127.17ms
iter 114850: loss 7.7045, time 126.35ms
iter 114860: loss 7.2805, time 125.27ms
iter 114870: loss 7.3277, time 125.44ms
iter 114880: loss 7.1846, time 125.91ms
iter 114890: loss 6.9594, time 125.40ms
iter 114900: loss 7.0331, time 125.32ms
iter 114910: loss 7.1251, time 125.63ms
iter 114920: loss 7.4313, time 128.58ms
iter 114930: loss 7.3608, time 125.64ms
iter 114940: loss 7.1649, time 125.31ms
iter 114950: loss 8.0501, time 125.46ms
iter 114960: loss 8.2949, time 125.83ms
iter 114970: loss 7.3872, time 125.76ms
iter 114980: loss 7.7056, time 125.40ms
iter 114990: loss 7.0530, time 124.92ms
step 115000: train loss 6.2842, val loss 6.3372
saving checkpoint to out-shakespeare-char
iter 115000: loss 7.3817, time 2891.19ms
iter 115010: loss 7.0780, time 125.36ms
iter 115020: loss 6.9186, time 122.39ms
iter 115030: loss 7.1052, time 124.73ms
iter 115040: loss 7.9260, time 122.35ms
iter 115050: loss 7.7314, time 125.18ms
iter 115060: loss 7.9089, time 121.26ms
iter 115070: loss 7.8214, time 124.81ms
iter 115080: loss 8.0243, time 121.76ms
iter 115090: loss 7.1954, time 123.81ms
iter 115100: loss 7.9731, time 121.97ms
iter 115110: loss 7.2608, time 124.33ms
iter 115120: loss 7.9429, time 121.73ms
iter 115130: loss 7.4755, time 124.10ms
iter 115140: loss 6.6465, time 120.88ms
iter 115150: loss 7.2383, time 124.63ms
iter 115160: loss 7.7663, time 121.49ms
iter 115170: loss 7.5857, time 124.34ms
iter 115180: loss 7.3411, time 121.60ms
iter 115190: loss 7.3107, time 124.61ms
iter 115200: loss 7.2381, time 121.39ms
iter 115210: loss 7.2930, time 124.79ms
iter 115220: loss 8.1939, time 121.96ms
iter 115230: loss 7.5645, time 123.72ms
iter 115240: loss 7.8435, time 120.85ms
step 115250: train loss 6.2821, val loss 6.3560
saving checkpoint to out-shakespeare-char
iter 115250: loss 7.5871, time 2877.25ms
iter 115260: loss 7.7339, time 123.86ms
iter 115270: loss 8.1352, time 121.88ms
iter 115280: loss 7.9857, time 122.78ms
iter 115290: loss 6.9675, time 121.68ms
iter 115300: loss 7.0853, time 121.65ms
iter 115310: loss 7.7134, time 121.17ms
iter 115320: loss 7.4028, time 121.71ms
iter 115330: loss 7.0726, time 121.17ms
iter 115340: loss 7.2455, time 121.67ms
iter 115350: loss 7.4837, time 121.63ms
iter 115360: loss 7.1118, time 122.21ms
iter 115370: loss 7.6611, time 121.57ms
iter 115380: loss 7.9241, time 121.59ms
iter 115390: loss 8.0797, time 121.71ms
iter 115400: loss 7.2286, time 121.63ms
iter 115410: loss 7.3084, time 121.80ms
iter 115420: loss 7.6944, time 121.64ms
iter 115430: loss 7.3886, time 121.66ms
iter 115440: loss 7.4376, time 120.70ms
iter 115450: loss 8.0454, time 121.53ms
iter 115460: loss 7.9554, time 122.27ms
iter 115470: loss 6.6325, time 121.92ms
iter 115480: loss 7.2014, time 121.65ms
iter 115490: loss 7.3354, time 121.87ms
step 115500: train loss 6.3063, val loss 6.3863
saving checkpoint to out-shakespeare-char
iter 115500: loss 6.9748, time 2893.89ms
iter 115510: loss 7.6603, time 121.17ms
iter 115520: loss 7.7436, time 121.08ms
iter 115530: loss 6.9330, time 122.54ms
iter 115540: loss 6.9573, time 122.07ms
iter 115550: loss 7.6758, time 121.45ms
iter 115560: loss 6.8421, time 121.84ms
iter 115570: loss 6.8942, time 121.93ms
iter 115580: loss 7.6575, time 121.84ms
iter 115590: loss 7.4893, time 122.27ms
iter 115600: loss 6.9424, time 121.97ms
iter 115610: loss 7.9886, time 122.04ms
iter 115620: loss 7.6166, time 121.95ms
iter 115630: loss 7.5348, time 121.92ms
iter 115640: loss 7.8980, time 121.83ms
iter 115650: loss 7.6858, time 122.01ms
iter 115660: loss 7.3239, time 121.85ms
iter 115670: loss 7.1634, time 122.34ms
iter 115680: loss 7.1265, time 121.97ms
iter 115690: loss 6.7100, time 121.86ms
iter 115700: loss 7.0303, time 121.67ms
iter 115710: loss 7.4567, time 121.93ms
iter 115720: loss 7.3319, time 122.01ms
iter 115730: loss 6.2833, time 121.93ms
iter 115740: loss 7.6811, time 122.02ms
step 115750: train loss 6.3257, val loss 6.3230
saving checkpoint to out-shakespeare-char
iter 115750: loss 7.5759, time 2912.41ms
iter 115760: loss 7.3414, time 125.62ms
iter 115770: loss 6.9056, time 127.73ms
iter 115780: loss 8.0005, time 125.16ms
iter 115790: loss 7.4688, time 124.93ms
iter 115800: loss 7.6837, time 124.13ms
iter 115810: loss 7.1739, time 125.02ms
iter 115820: loss 8.0807, time 124.91ms
iter 115830: loss 7.3747, time 126.38ms
iter 115840: loss 8.2518, time 124.99ms
iter 115850: loss 7.2611, time 125.13ms
iter 115860: loss 7.5202, time 125.52ms
iter 115870: loss 8.3539, time 124.83ms
iter 115880: loss 8.1342, time 128.33ms
iter 115890: loss 7.4964, time 124.99ms
iter 115900: loss 7.8162, time 125.31ms
iter 115910: loss 7.6834, time 125.31ms
iter 115920: loss 7.6108, time 125.39ms
iter 115930: loss 7.3219, time 125.38ms
iter 115940: loss 7.1202, time 124.87ms
iter 115950: loss 7.5288, time 124.79ms
iter 115960: loss 7.4244, time 125.34ms
iter 115970: loss 7.1342, time 125.54ms
iter 115980: loss 7.9733, time 125.74ms
iter 115990: loss 7.7960, time 128.25ms
step 116000: train loss 6.3275, val loss 6.2753
saving checkpoint to out-shakespeare-char
iter 116000: loss 7.0547, time 2910.44ms
iter 116010: loss 7.8544, time 125.70ms
iter 116020: loss 7.0303, time 125.10ms
iter 116030: loss 7.2336, time 124.90ms
iter 116040: loss 8.1915, time 124.16ms
iter 116050: loss 7.7268, time 124.53ms
iter 116060: loss 8.0240, time 125.18ms
iter 116070: loss 6.9223, time 125.41ms
iter 116080: loss 7.3421, time 124.43ms
iter 116090: loss 7.7178, time 125.58ms
iter 116100: loss 7.2097, time 127.51ms
iter 116110: loss 7.0311, time 125.39ms
iter 116120: loss 7.4404, time 125.28ms
iter 116130: loss 8.1951, time 125.14ms
iter 116140: loss 7.9213, time 125.51ms
iter 116150: loss 7.2762, time 125.73ms
iter 116160: loss 7.3179, time 124.48ms
iter 116170: loss 7.8802, time 128.14ms
iter 116180: loss 7.0837, time 124.88ms
iter 116190: loss 8.0524, time 125.25ms
iter 116200: loss 8.4663, time 125.02ms
iter 116210: loss 8.2976, time 125.65ms
iter 116220: loss 7.4625, time 125.82ms
iter 116230: loss 7.1735, time 125.76ms
iter 116240: loss 6.4589, time 128.13ms
step 116250: train loss 6.3007, val loss 6.2806
saving checkpoint to out-shakespeare-char
iter 116250: loss 7.2760, time 2884.79ms
iter 116260: loss 6.7374, time 125.62ms
iter 116270: loss 7.3265, time 125.33ms
iter 116280: loss 7.5577, time 125.52ms
iter 116290: loss 7.8222, time 125.22ms
iter 116300: loss 7.8362, time 128.48ms
iter 116310: loss 7.7370, time 124.74ms
iter 116320: loss 6.8911, time 125.69ms
iter 116330: loss 7.1854, time 124.31ms
iter 116340: loss 7.8630, time 125.01ms
iter 116350: loss 7.5198, time 124.58ms
iter 116360: loss 7.5390, time 125.20ms
iter 116370: loss 6.9283, time 124.29ms
iter 116380: loss 7.5900, time 125.63ms
iter 116390: loss 7.7497, time 126.31ms
iter 116400: loss 6.5829, time 125.52ms
iter 116410: loss 7.2802, time 128.48ms
iter 116420: loss 7.8637, time 126.04ms
iter 116430: loss 7.7418, time 125.92ms
iter 116440: loss 7.2444, time 125.01ms
iter 116450: loss 8.2621, time 127.79ms
iter 116460: loss 7.5677, time 124.98ms
iter 116470: loss 7.3834, time 124.89ms
iter 116480: loss 7.5684, time 125.26ms
iter 116490: loss 7.1794, time 124.73ms
step 116500: train loss 6.2388, val loss 6.3276
saving checkpoint to out-shakespeare-char
iter 116500: loss 7.6635, time 2905.72ms
iter 116510: loss 7.6428, time 121.90ms
iter 116520: loss 7.7176, time 121.59ms
iter 116530: loss 7.8146, time 120.83ms
iter 116540: loss 7.0889, time 121.52ms
iter 116550: loss 8.0658, time 121.59ms
iter 116560: loss 7.3770, time 121.96ms
iter 116570: loss 8.3086, time 122.49ms
iter 116580: loss 6.9109, time 121.21ms
iter 116590: loss 7.3279, time 121.87ms
iter 116600: loss 6.8350, time 121.97ms
iter 116610: loss 6.6330, time 121.59ms
iter 116620: loss 7.4100, time 121.56ms
iter 116630: loss 7.6050, time 121.69ms
iter 116640: loss 7.6201, time 122.17ms
iter 116650: loss 8.3772, time 121.90ms
iter 116660: loss 7.2919, time 121.56ms
iter 116670: loss 7.3038, time 121.53ms
iter 116680: loss 7.8575, time 122.23ms
iter 116690: loss 7.4225, time 121.54ms
iter 116700: loss 7.1569, time 121.45ms
iter 116710: loss 7.0412, time 121.75ms
iter 116720: loss 7.0967, time 121.46ms
iter 116730: loss 7.1007, time 121.87ms
iter 116740: loss 7.7557, time 121.59ms
step 116750: train loss 6.2504, val loss 6.3392
saving checkpoint to out-shakespeare-char
iter 116750: loss 7.5929, time 2909.77ms
iter 116760: loss 7.8942, time 125.81ms
iter 116770: loss 7.3404, time 125.46ms
iter 116780: loss 7.1439, time 126.12ms
iter 116790: loss 8.0481, time 125.52ms
iter 116800: loss 8.0272, time 125.81ms
iter 116810: loss 7.0923, time 125.59ms
iter 116820: loss 7.7466, time 126.65ms
iter 116830: loss 7.6865, time 124.90ms
iter 116840: loss 7.5943, time 126.35ms
iter 116850: loss 7.7944, time 125.36ms
iter 116860: loss 8.0362, time 125.32ms
iter 116870: loss 7.6736, time 125.40ms
iter 116880: loss 6.8077, time 126.26ms
iter 116890: loss 7.7468, time 125.78ms
iter 116900: loss 6.9222, time 125.58ms
iter 116910: loss 6.6022, time 125.35ms
iter 116920: loss 7.3900, time 128.31ms
iter 116930: loss 7.5295, time 124.63ms
iter 116940: loss 7.3207, time 125.79ms
iter 116950: loss 7.3457, time 125.42ms
iter 116960: loss 7.9261, time 125.30ms
iter 116970: loss 8.5147, time 126.03ms
iter 116980: loss 7.8481, time 125.11ms
iter 116990: loss 7.0392, time 124.67ms
step 117000: train loss 6.3055, val loss 6.3053
saving checkpoint to out-shakespeare-char
iter 117000: loss 8.1713, time 2876.86ms
iter 117010: loss 6.7398, time 127.87ms
iter 117020: loss 6.7567, time 124.71ms
iter 117030: loss 6.7704, time 125.94ms
iter 117040: loss 7.4898, time 125.56ms
iter 117050: loss 7.7359, time 126.31ms
iter 117060: loss 7.3720, time 121.55ms
iter 117070: loss 7.8323, time 121.99ms
iter 117080: loss 6.4506, time 121.37ms
iter 117090: loss 6.7814, time 121.73ms
iter 117100: loss 7.8709, time 121.32ms
iter 117110: loss 7.6800, time 121.32ms
iter 117120: loss 7.5205, time 121.60ms
iter 117130: loss 7.3400, time 121.34ms
iter 117140: loss 6.8822, time 121.43ms
iter 117150: loss 7.3144, time 121.51ms
iter 117160: loss 7.0812, time 121.74ms
iter 117170: loss 8.0929, time 121.07ms
iter 117180: loss 7.9358, time 121.51ms
iter 117190: loss 7.0701, time 120.83ms
iter 117200: loss 7.8104, time 121.58ms
iter 117210: loss 7.3925, time 121.87ms
iter 117220: loss 6.5007, time 121.06ms
iter 117230: loss 7.0933, time 121.82ms
iter 117240: loss 7.2035, time 121.69ms
step 117250: train loss 6.3136, val loss 6.2493
saving checkpoint to out-shakespeare-char
iter 117250: loss 7.4068, time 2896.61ms
iter 117260: loss 6.6773, time 121.03ms
iter 117270: loss 7.4884, time 121.98ms
iter 117280: loss 7.3061, time 121.68ms
iter 117290: loss 7.7967, time 120.95ms
iter 117300: loss 7.8088, time 125.90ms
iter 117310: loss 7.7855, time 126.24ms
iter 117320: loss 7.9437, time 128.46ms
iter 117330: loss 7.3666, time 125.30ms
iter 117340: loss 7.6224, time 125.43ms
iter 117350: loss 7.5790, time 126.12ms
iter 117360: loss 7.7528, time 125.50ms
iter 117370: loss 7.4566, time 125.98ms
iter 117380: loss 7.1534, time 125.75ms
iter 117390: loss 7.7053, time 125.24ms
iter 117400: loss 7.1635, time 125.54ms
iter 117410: loss 7.1851, time 125.88ms
iter 117420: loss 7.3007, time 124.92ms
iter 117430: loss 6.8528, time 128.68ms
iter 117440: loss 7.4461, time 125.46ms
iter 117450: loss 7.2551, time 125.41ms
iter 117460: loss 7.8451, time 126.80ms
iter 117470: loss 7.3020, time 125.56ms
iter 117480: loss 7.7396, time 126.03ms
iter 117490: loss 7.6060, time 125.81ms
step 117500: train loss 6.2522, val loss 6.3368
saving checkpoint to out-shakespeare-char
iter 117500: loss 7.4368, time 2902.89ms
iter 117510: loss 7.8857, time 125.53ms
iter 117520: loss 6.9189, time 125.59ms
iter 117530: loss 8.0474, time 125.44ms
iter 117540: loss 6.9309, time 125.36ms
iter 117550: loss 7.1887, time 125.40ms
iter 117560: loss 7.9327, time 125.35ms
iter 117570: loss 8.0074, time 125.94ms
iter 117580: loss 6.5496, time 125.67ms
iter 117590: loss 7.3151, time 124.56ms
iter 117600: loss 7.3444, time 128.52ms
iter 117610: loss 7.6603, time 125.04ms
iter 117620: loss 8.0100, time 125.92ms
iter 117630: loss 7.8125, time 127.49ms
iter 117640: loss 7.1413, time 126.01ms
iter 117650: loss 7.3713, time 125.78ms
iter 117660: loss 8.2635, time 125.74ms
iter 117670: loss 7.4414, time 125.82ms
iter 117680: loss 8.2022, time 125.93ms
iter 117690: loss 7.4287, time 126.06ms
iter 117700: loss 6.7299, time 124.67ms
iter 117710: loss 7.8293, time 128.64ms
iter 117720: loss 7.1569, time 125.78ms
iter 117730: loss 7.3328, time 125.54ms
iter 117740: loss 6.8443, time 126.08ms
step 117750: train loss 6.2647, val loss 6.2515
saving checkpoint to out-shakespeare-char
iter 117750: loss 7.2329, time 2879.55ms
iter 117760: loss 7.8721, time 125.20ms
iter 117770: loss 7.6223, time 125.01ms
iter 117780: loss 7.8363, time 125.46ms
iter 117790: loss 7.4110, time 124.91ms
iter 117800: loss 7.3334, time 124.83ms
iter 117810: loss 7.8378, time 126.94ms
iter 117820: loss 7.0681, time 124.92ms
iter 117830: loss 7.0874, time 125.20ms
iter 117840: loss 7.5937, time 125.87ms
iter 117850: loss 7.1226, time 125.66ms
iter 117860: loss 7.1946, time 125.64ms
iter 117870: loss 7.5570, time 125.76ms
iter 117880: loss 6.8414, time 125.38ms
iter 117890: loss 6.7756, time 125.68ms
iter 117900: loss 7.6218, time 124.83ms
iter 117910: loss 8.2421, time 125.50ms
iter 117920: loss 6.8739, time 128.11ms
iter 117930: loss 7.7917, time 126.02ms
iter 117940: loss 7.3058, time 124.21ms
iter 117950: loss 7.9606, time 126.00ms
iter 117960: loss 7.6138, time 125.74ms
iter 117970: loss 7.4058, time 126.13ms
iter 117980: loss 7.3183, time 126.20ms
iter 117990: loss 7.1529, time 124.80ms
step 118000: train loss 6.2776, val loss 6.2658
saving checkpoint to out-shakespeare-char
iter 118000: loss 7.6651, time 2861.03ms
iter 118010: loss 6.4517, time 126.19ms
iter 118020: loss 7.7949, time 125.87ms
iter 118030: loss 7.3745, time 126.12ms
iter 118040: loss 7.4566, time 125.55ms
iter 118050: loss 7.1405, time 125.97ms
iter 118060: loss 7.2010, time 125.80ms
iter 118070: loss 7.1243, time 126.08ms
iter 118080: loss 8.0696, time 125.89ms
iter 118090: loss 7.8496, time 128.94ms
iter 118100: loss 7.4975, time 125.82ms
iter 118110: loss 6.4696, time 126.05ms
iter 118120: loss 7.9001, time 126.53ms
iter 118130: loss 7.3566, time 126.05ms
iter 118140: loss 7.5150, time 125.94ms
iter 118150: loss 6.9410, time 126.24ms
iter 118160: loss 7.3819, time 125.59ms
iter 118170: loss 7.6056, time 125.81ms
iter 118180: loss 7.7804, time 125.37ms
iter 118190: loss 7.7352, time 125.16ms
iter 118200: loss 7.4296, time 127.93ms
iter 118210: loss 7.3222, time 125.92ms
iter 118220: loss 7.3472, time 125.59ms
iter 118230: loss 6.4842, time 125.66ms
iter 118240: loss 7.6629, time 125.31ms
step 118250: train loss 6.2228, val loss 6.1943
saving checkpoint to out-shakespeare-char
iter 118250: loss 7.1930, time 2885.63ms
iter 118260: loss 6.9352, time 125.52ms
iter 118270: loss 7.6209, time 125.52ms
iter 118280: loss 7.6102, time 125.52ms
iter 118290: loss 8.0492, time 125.84ms
iter 118300: loss 7.5340, time 128.41ms
iter 118310: loss 6.7541, time 125.64ms
iter 118320: loss 6.9407, time 125.43ms
iter 118330: loss 7.4743, time 125.43ms
iter 118340: loss 7.2373, time 125.62ms
iter 118350: loss 7.5418, time 125.60ms
iter 118360: loss 6.9408, time 123.29ms
iter 118370: loss 7.5538, time 121.75ms
iter 118380: loss 7.8369, time 122.05ms
iter 118390: loss 6.9944, time 122.18ms
iter 118400: loss 7.4003, time 121.84ms
iter 118410: loss 7.4173, time 121.63ms
iter 118420: loss 7.5141, time 122.14ms
iter 118430: loss 6.9389, time 122.16ms
iter 118440: loss 7.4076, time 122.14ms
iter 118450: loss 7.3379, time 121.81ms
iter 118460: loss 7.5240, time 121.49ms
iter 118470: loss 7.9511, time 121.51ms
iter 118480: loss 7.2147, time 121.77ms
iter 118490: loss 7.4320, time 121.57ms
step 118500: train loss 6.3336, val loss 6.3397
saving checkpoint to out-shakespeare-char
iter 118500: loss 7.5331, time 2899.60ms
iter 118510: loss 8.4459, time 121.35ms
iter 118520: loss 7.6661, time 121.54ms
iter 118530: loss 7.6921, time 121.48ms
iter 118540: loss 7.6679, time 121.38ms
iter 118550: loss 7.5151, time 121.35ms
iter 118560: loss 8.2924, time 121.60ms
iter 118570: loss 7.4037, time 121.37ms
iter 118580: loss 7.4427, time 121.60ms
iter 118590: loss 8.0652, time 121.60ms
iter 118600: loss 7.7165, time 121.65ms
iter 118610: loss 7.5411, time 121.12ms
iter 118620: loss 7.7261, time 121.66ms
iter 118630: loss 7.8409, time 121.52ms
iter 118640: loss 7.3451, time 121.16ms
iter 118650: loss 8.2607, time 120.45ms
iter 118660: loss 7.6485, time 121.64ms
iter 118670: loss 8.0311, time 121.52ms
iter 118680: loss 7.1704, time 120.89ms
iter 118690: loss 6.7132, time 121.48ms
iter 118700: loss 7.4095, time 121.54ms
iter 118710: loss 7.3634, time 122.45ms
iter 118720: loss 7.7288, time 121.79ms
iter 118730: loss 6.9234, time 121.70ms
iter 118740: loss 7.1871, time 121.65ms
step 118750: train loss 6.3134, val loss 6.2735
saving checkpoint to out-shakespeare-char
iter 118750: loss 7.2183, time 2897.69ms
iter 118760: loss 7.3983, time 121.47ms
iter 118770: loss 7.5515, time 121.79ms
iter 118780: loss 7.1118, time 121.46ms
iter 118790: loss 7.5388, time 121.47ms
iter 118800: loss 7.6033, time 121.47ms
iter 118810: loss 7.7122, time 121.38ms
iter 118820: loss 8.0397, time 121.53ms
iter 118830: loss 7.1754, time 121.38ms
iter 118840: loss 7.8780, time 122.69ms
iter 118850: loss 6.2473, time 121.52ms
iter 118860: loss 7.0173, time 121.30ms
iter 118870: loss 7.2686, time 121.57ms
iter 118880: loss 7.2301, time 121.36ms
iter 118890: loss 7.3480, time 121.87ms
iter 118900: loss 6.9872, time 121.42ms
iter 118910: loss 7.4186, time 121.53ms
iter 118920: loss 7.5613, time 121.54ms
iter 118930: loss 6.8784, time 121.67ms
iter 118940: loss 7.3890, time 121.50ms
iter 118950: loss 6.9685, time 121.95ms
iter 118960: loss 7.7780, time 121.29ms
iter 118970: loss 7.8062, time 121.55ms
iter 118980: loss 7.9428, time 121.38ms
iter 118990: loss 7.2680, time 121.50ms
step 119000: train loss 6.2135, val loss 6.2775
saving checkpoint to out-shakespeare-char
iter 119000: loss 7.0717, time 2901.79ms
iter 119010: loss 7.6998, time 121.46ms
iter 119020: loss 6.9723, time 121.58ms
iter 119030: loss 6.7640, time 121.39ms
iter 119040: loss 6.9233, time 121.51ms
iter 119050: loss 7.3417, time 121.45ms
iter 119060: loss 6.7298, time 121.61ms
iter 119070: loss 6.8063, time 121.46ms
iter 119080: loss 7.8828, time 121.79ms
iter 119090: loss 7.4152, time 121.39ms
iter 119100: loss 7.3526, time 121.79ms
iter 119110: loss 7.4126, time 121.30ms
iter 119120: loss 7.4122, time 121.42ms
iter 119130: loss 7.8745, time 121.66ms
iter 119140: loss 7.9119, time 121.46ms
iter 119150: loss 8.1815, time 121.17ms
iter 119160: loss 7.6412, time 122.03ms
iter 119170: loss 7.5195, time 121.05ms
iter 119180: loss 7.0195, time 121.89ms
iter 119190: loss 7.6234, time 121.76ms
iter 119200: loss 7.5219, time 121.52ms
iter 119210: loss 6.8814, time 121.61ms
iter 119220: loss 7.9176, time 121.66ms
iter 119230: loss 7.8634, time 121.83ms
iter 119240: loss 7.3809, time 121.67ms
step 119250: train loss 6.2260, val loss 6.2748
saving checkpoint to out-shakespeare-char
iter 119250: loss 8.1616, time 2903.68ms
iter 119260: loss 7.5301, time 121.38ms
iter 119270: loss 7.3619, time 121.34ms
iter 119280: loss 7.3378, time 120.33ms
iter 119290: loss 7.7348, time 121.52ms
iter 119300: loss 7.3211, time 121.57ms
iter 119310: loss 7.5987, time 121.40ms
iter 119320: loss 7.0493, time 121.50ms
iter 119330: loss 6.6816, time 121.75ms
iter 119340: loss 7.5010, time 121.52ms
iter 119350: loss 7.3405, time 121.64ms
iter 119360: loss 7.2225, time 121.49ms
iter 119370: loss 7.8070, time 122.09ms
iter 119380: loss 6.8544, time 121.88ms
iter 119390: loss 8.3399, time 121.72ms
iter 119400: loss 7.9212, time 121.32ms
iter 119410: loss 7.9768, time 121.31ms
iter 119420: loss 7.4087, time 121.64ms
iter 119430: loss 6.6435, time 121.55ms
iter 119440: loss 7.2188, time 121.60ms
iter 119450: loss 7.6259, time 120.65ms
iter 119460: loss 7.4132, time 121.54ms
iter 119470: loss 6.7962, time 121.71ms
iter 119480: loss 6.8185, time 121.50ms
iter 119490: loss 7.1946, time 121.47ms
step 119500: train loss 6.2256, val loss 6.2729
saving checkpoint to out-shakespeare-char
iter 119500: loss 6.8365, time 2893.48ms
iter 119510: loss 7.6509, time 121.48ms
iter 119520: loss 6.9156, time 121.43ms
iter 119530: loss 7.9080, time 121.34ms
iter 119540: loss 7.1939, time 121.67ms
iter 119550: loss 7.5071, time 121.49ms
iter 119560: loss 6.8346, time 121.44ms
iter 119570: loss 7.5913, time 121.43ms
iter 119580: loss 6.6769, time 121.54ms
iter 119590: loss 7.3557, time 121.30ms
iter 119600: loss 7.7940, time 121.53ms
iter 119610: loss 7.7427, time 121.36ms
iter 119620: loss 7.2554, time 122.65ms
iter 119630: loss 6.6774, time 121.46ms
iter 119640: loss 7.1515, time 121.49ms
iter 119650: loss 7.0886, time 121.36ms
iter 119660: loss 8.0996, time 121.93ms
iter 119670: loss 6.5635, time 121.47ms
iter 119680: loss 7.5897, time 121.71ms
iter 119690: loss 7.6626, time 121.52ms
iter 119700: loss 7.1589, time 121.53ms
iter 119710: loss 6.9649, time 121.68ms
iter 119720: loss 7.1269, time 121.64ms
iter 119730: loss 8.2615, time 121.52ms
iter 119740: loss 7.8710, time 121.30ms
step 119750: train loss 6.2562, val loss 6.2533
saving checkpoint to out-shakespeare-char
iter 119750: loss 8.2398, time 2906.13ms
iter 119760: loss 6.8641, time 121.55ms
iter 119770: loss 7.9431, time 122.88ms
iter 119780: loss 7.0480, time 121.46ms
iter 119790: loss 7.3244, time 122.84ms
iter 119800: loss 6.9639, time 121.40ms
iter 119810: loss 7.1764, time 122.19ms
iter 119820: loss 7.5959, time 122.09ms
iter 119830: loss 7.0089, time 122.63ms
iter 119840: loss 7.0483, time 121.63ms
iter 119850: loss 8.0200, time 122.74ms
iter 119860: loss 7.5588, time 121.72ms
iter 119870: loss 7.6520, time 123.21ms
iter 119880: loss 7.6378, time 121.72ms
iter 119890: loss 6.8122, time 122.39ms
iter 119900: loss 7.3129, time 121.59ms
iter 119910: loss 7.4105, time 122.86ms
iter 119920: loss 7.3387, time 121.17ms
iter 119930: loss 7.1123, time 122.91ms
iter 119940: loss 7.0872, time 121.78ms
iter 119950: loss 7.1084, time 123.06ms
iter 119960: loss 7.5868, time 121.40ms
iter 119970: loss 6.9867, time 122.90ms
iter 119980: loss 7.9578, time 121.71ms
iter 119990: loss 7.2556, time 122.65ms
step 120000: train loss 6.2893, val loss 6.2549
saving checkpoint to out-shakespeare-char
iter 120000: loss 6.7918, time 2909.67ms
iter 120010: loss 7.5431, time 121.53ms
iter 120020: loss 8.1245, time 121.73ms
iter 120030: loss 7.4470, time 121.58ms
iter 120040: loss 7.5672, time 121.49ms
iter 120050: loss 7.9582, time 122.24ms
iter 120060: loss 7.5250, time 121.42ms
iter 120070: loss 7.9844, time 121.63ms
iter 120080: loss 8.3786, time 120.74ms
iter 120090: loss 8.3552, time 121.58ms
iter 120100: loss 7.5316, time 121.44ms
iter 120110: loss 7.0215, time 121.73ms
iter 120120: loss 7.1225, time 120.99ms
iter 120130: loss 6.7826, time 121.69ms
iter 120140: loss 7.7012, time 121.43ms
iter 120150: loss 7.7253, time 121.56ms
iter 120160: loss 7.4519, time 121.50ms
iter 120170: loss 7.2227, time 121.96ms
iter 120180: loss 7.4120, time 121.87ms
iter 120190: loss 7.5284, time 121.52ms
iter 120200: loss 7.6462, time 122.05ms
iter 120210: loss 7.6667, time 121.71ms
iter 120220: loss 7.8610, time 122.10ms
iter 120230: loss 7.5818, time 121.09ms
iter 120240: loss 7.6260, time 121.57ms
step 120250: train loss 6.2811, val loss 6.2615
saving checkpoint to out-shakespeare-char
iter 120250: loss 6.3292, time 2904.78ms
iter 120260: loss 7.5343, time 121.77ms
iter 120270: loss 6.9437, time 121.56ms
iter 120280: loss 7.2014, time 121.52ms
iter 120290: loss 7.6243, time 121.87ms
iter 120300: loss 7.2850, time 120.75ms
iter 120310: loss 7.7977, time 121.89ms
iter 120320: loss 7.6502, time 121.46ms
iter 120330: loss 7.7126, time 121.74ms
iter 120340: loss 8.2095, time 121.52ms
iter 120350: loss 7.4490, time 121.59ms
iter 120360: loss 6.8799, time 121.76ms
iter 120370: loss 8.3718, time 121.74ms
iter 120380: loss 7.4439, time 121.40ms
iter 120390: loss 7.1631, time 120.59ms
iter 120400: loss 6.9581, time 121.36ms
iter 120410: loss 7.7432, time 121.29ms
iter 120420: loss 7.9157, time 121.71ms
iter 120430: loss 7.2298, time 121.70ms
iter 120440: loss 7.1295, time 121.77ms
iter 120450: loss 7.1662, time 121.72ms
iter 120460: loss 7.8373, time 121.32ms
iter 120470: loss 6.6648, time 121.63ms
iter 120480: loss 7.8879, time 121.36ms
iter 120490: loss 7.3904, time 121.37ms
step 120500: train loss 6.3058, val loss 6.2026
saving checkpoint to out-shakespeare-char
iter 120500: loss 7.7747, time 2890.79ms
iter 120510: loss 8.4383, time 121.81ms
iter 120520: loss 7.9477, time 124.03ms
iter 120530: loss 7.3855, time 122.58ms
iter 120540: loss 7.7755, time 124.06ms
iter 120550: loss 6.9398, time 121.32ms
iter 120560: loss 7.5329, time 124.59ms
iter 120570: loss 7.8058, time 121.67ms
iter 120580: loss 7.2831, time 124.54ms
iter 120590: loss 6.8802, time 121.69ms
iter 120600: loss 8.0276, time 123.72ms
iter 120610: loss 7.5827, time 121.81ms
iter 120620: loss 7.3987, time 125.28ms
iter 120630: loss 7.8100, time 121.92ms
iter 120640: loss 7.6132, time 124.53ms
iter 120650: loss 7.7823, time 121.55ms
iter 120660: loss 6.7134, time 124.12ms
iter 120670: loss 8.2053, time 120.85ms
iter 120680: loss 8.1075, time 125.17ms
iter 120690: loss 7.3829, time 121.82ms
iter 120700: loss 7.2609, time 124.54ms
iter 120710: loss 8.0216, time 122.20ms
iter 120720: loss 7.3626, time 124.87ms
iter 120730: loss 7.3496, time 122.82ms
iter 120740: loss 7.6301, time 124.58ms
step 120750: train loss 6.2308, val loss 6.2449
saving checkpoint to out-shakespeare-char
iter 120750: loss 7.0514, time 2908.83ms
iter 120760: loss 7.1462, time 126.00ms
iter 120770: loss 6.7024, time 125.23ms
iter 120780: loss 7.3405, time 125.31ms
iter 120790: loss 7.0248, time 125.23ms
iter 120800: loss 8.1090, time 125.47ms
iter 120810: loss 6.6205, time 125.22ms
iter 120820: loss 7.2364, time 125.37ms
iter 120830: loss 7.3973, time 125.29ms
iter 120840: loss 7.0712, time 125.37ms
iter 120850: loss 7.6732, time 125.43ms
iter 120860: loss 7.9802, time 128.08ms
iter 120870: loss 6.8140, time 125.18ms
iter 120880: loss 7.1130, time 125.60ms
iter 120890: loss 6.3519, time 125.21ms
iter 120900: loss 7.8886, time 124.47ms
iter 120910: loss 7.6854, time 124.93ms
iter 120920: loss 7.4076, time 126.02ms
iter 120930: loss 8.0126, time 125.44ms
iter 120940: loss 7.0130, time 125.59ms
iter 120950: loss 6.9644, time 125.29ms
iter 120960: loss 7.7264, time 125.58ms
iter 120970: loss 7.3093, time 127.58ms
iter 120980: loss 7.3682, time 121.80ms
iter 120990: loss 7.6964, time 124.66ms
step 121000: train loss 6.2554, val loss 6.2567
saving checkpoint to out-shakespeare-char
iter 121000: loss 7.1666, time 2907.51ms
iter 121010: loss 7.5251, time 125.40ms
iter 121020: loss 7.3119, time 125.76ms
iter 121030: loss 6.9889, time 129.33ms
iter 121040: loss 7.3767, time 126.08ms
iter 121050: loss 6.7032, time 126.20ms
iter 121060: loss 7.1214, time 127.55ms
iter 121070: loss 7.2441, time 126.04ms
iter 121080: loss 7.0116, time 126.09ms
iter 121090: loss 7.6406, time 125.73ms
iter 121100: loss 7.5096, time 126.31ms
iter 121110: loss 7.7359, time 125.93ms
iter 121120: loss 7.7128, time 126.14ms
iter 121130: loss 7.1649, time 126.23ms
iter 121140: loss 6.8717, time 128.89ms
iter 121150: loss 7.5455, time 125.83ms
iter 121160: loss 7.3230, time 124.76ms
iter 121170: loss 7.2583, time 125.85ms
iter 121180: loss 8.0729, time 124.86ms
iter 121190: loss 7.2737, time 125.92ms
iter 121200: loss 7.7091, time 124.77ms
iter 121210: loss 6.6399, time 125.71ms
iter 121220: loss 6.8174, time 125.26ms
iter 121230: loss 7.4705, time 125.77ms
iter 121240: loss 8.6049, time 125.97ms
step 121250: train loss 6.2811, val loss 6.2443
saving checkpoint to out-shakespeare-char
iter 121250: loss 7.7560, time 2877.66ms
iter 121260: loss 7.8014, time 125.83ms
iter 121270: loss 7.3264, time 128.86ms
iter 121280: loss 7.6622, time 125.98ms
iter 121290: loss 7.3781, time 125.72ms
iter 121300: loss 7.3241, time 125.85ms
iter 121310: loss 7.2937, time 125.52ms
iter 121320: loss 7.6117, time 125.29ms
iter 121330: loss 7.5755, time 125.33ms
iter 121340: loss 7.3805, time 123.59ms
iter 121350: loss 7.1769, time 125.36ms
iter 121360: loss 7.3856, time 125.22ms
iter 121370: loss 7.1532, time 125.34ms
iter 121380: loss 6.9945, time 128.41ms
iter 121390: loss 7.3600, time 124.67ms
iter 121400: loss 7.5773, time 125.16ms
iter 121410: loss 7.2977, time 125.70ms
iter 121420: loss 7.9744, time 125.47ms
iter 121430: loss 6.9230, time 126.33ms
iter 121440: loss 7.4461, time 119.62ms
iter 121450: loss 7.4438, time 120.21ms
iter 121460: loss 7.3086, time 119.98ms
iter 121470: loss 7.1799, time 120.06ms
iter 121480: loss 7.5151, time 119.72ms
iter 121490: loss 7.4662, time 120.10ms
step 121500: train loss 6.2888, val loss 6.2370
saving checkpoint to out-shakespeare-char
iter 121500: loss 6.8297, time 2894.63ms
iter 121510: loss 7.9154, time 120.40ms
iter 121520: loss 7.3478, time 119.96ms
iter 121530: loss 6.5414, time 119.94ms
iter 121540: loss 7.7475, time 121.94ms
iter 121550: loss 7.3442, time 121.84ms
iter 121560: loss 7.2483, time 121.64ms
iter 121570: loss 6.9814, time 121.76ms
iter 121580: loss 6.3128, time 121.84ms
iter 121590: loss 7.3594, time 122.12ms
iter 121600: loss 7.5557, time 121.92ms
iter 121610: loss 7.0870, time 121.17ms
iter 121620: loss 8.4915, time 121.78ms
iter 121630: loss 7.2424, time 121.68ms
iter 121640: loss 7.9142, time 121.82ms
iter 121650: loss 7.5327, time 121.81ms
iter 121660: loss 7.4262, time 121.73ms
iter 121670: loss 7.1926, time 121.80ms
iter 121680: loss 6.1620, time 121.92ms
iter 121690: loss 7.9086, time 121.66ms
iter 121700: loss 7.4688, time 121.79ms
iter 121710: loss 7.5359, time 121.76ms
iter 121720: loss 6.8384, time 122.04ms
iter 121730: loss 6.9332, time 121.82ms
iter 121740: loss 7.2919, time 121.90ms
step 121750: train loss 6.3145, val loss 6.2668
saving checkpoint to out-shakespeare-char
iter 121750: loss 7.7253, time 2896.44ms
iter 121760: loss 6.7523, time 121.87ms
iter 121770: loss 7.7544, time 122.66ms
iter 121780: loss 7.4789, time 122.35ms
iter 121790: loss 7.5967, time 121.83ms
iter 121800: loss 7.2932, time 121.69ms
iter 121810: loss 7.7097, time 121.78ms
iter 121820: loss 7.3425, time 122.78ms
iter 121830: loss 6.7071, time 121.77ms
iter 121840: loss 7.3806, time 121.95ms
iter 121850: loss 7.3110, time 121.90ms
iter 121860: loss 6.3669, time 124.89ms
iter 121870: loss 7.4971, time 122.35ms
iter 121880: loss 8.0241, time 124.60ms
iter 121890: loss 7.4577, time 120.90ms
iter 121900: loss 5.9302, time 124.75ms
iter 121910: loss 7.8082, time 121.87ms
iter 121920: loss 7.3061, time 125.16ms
iter 121930: loss 6.4347, time 121.67ms
iter 121940: loss 7.3998, time 124.66ms
iter 121950: loss 7.2563, time 124.86ms
iter 121960: loss 6.5467, time 128.40ms
iter 121970: loss 6.9423, time 125.61ms
iter 121980: loss 7.2081, time 125.03ms
iter 121990: loss 7.2407, time 125.54ms
step 122000: train loss 6.2510, val loss 6.2255
saving checkpoint to out-shakespeare-char
iter 122000: loss 7.0219, time 2869.28ms
iter 122010: loss 7.0915, time 126.18ms
iter 122020: loss 7.6034, time 125.81ms
iter 122030: loss 7.0930, time 125.82ms
iter 122040: loss 7.6320, time 124.88ms
iter 122050: loss 6.8775, time 125.34ms
iter 122060: loss 7.3282, time 125.25ms
iter 122070: loss 7.1618, time 125.34ms
iter 122080: loss 7.0602, time 124.80ms
iter 122090: loss 6.5227, time 124.96ms
iter 122100: loss 7.4893, time 125.34ms
iter 122110: loss 7.0470, time 125.45ms
iter 122120: loss 7.4413, time 128.21ms
iter 122130: loss 7.0264, time 124.20ms
iter 122140: loss 7.8085, time 125.04ms
iter 122150: loss 7.4488, time 124.76ms
iter 122160: loss 7.7041, time 124.63ms
iter 122170: loss 7.7896, time 125.17ms
iter 122180: loss 8.0075, time 125.66ms
iter 122190: loss 6.8402, time 125.26ms
iter 122200: loss 7.3248, time 125.13ms
iter 122210: loss 6.7347, time 126.43ms
iter 122220: loss 7.4860, time 126.34ms
iter 122230: loss 7.7380, time 127.71ms
iter 122240: loss 7.4967, time 124.81ms
step 122250: train loss 6.2337, val loss 6.2533
saving checkpoint to out-shakespeare-char
iter 122250: loss 7.4158, time 2900.66ms
iter 122260: loss 6.5672, time 121.46ms
iter 122270: loss 7.8432, time 121.95ms
iter 122280: loss 6.5004, time 121.08ms
iter 122290: loss 6.8167, time 126.24ms
iter 122300: loss 7.1463, time 125.15ms
iter 122310: loss 7.7447, time 125.32ms
iter 122320: loss 8.0703, time 125.37ms
iter 122330: loss 6.6879, time 125.40ms
iter 122340: loss 7.7427, time 128.39ms
iter 122350: loss 7.7805, time 125.10ms
iter 122360: loss 7.1851, time 125.82ms
iter 122370: loss 7.3185, time 126.20ms
iter 122380: loss 7.4824, time 125.64ms
iter 122390: loss 7.1099, time 126.04ms
iter 122400: loss 7.6850, time 125.45ms
iter 122410: loss 7.2111, time 125.49ms
iter 122420: loss 7.4469, time 124.50ms
iter 122430: loss 6.9819, time 125.17ms
iter 122440: loss 7.6652, time 125.29ms
iter 122450: loss 7.0837, time 128.76ms
iter 122460: loss 7.6124, time 124.87ms
iter 122470: loss 7.9395, time 124.88ms
iter 122480: loss 7.7551, time 125.41ms
iter 122490: loss 7.2272, time 125.44ms
step 122500: train loss 6.2962, val loss 6.2096
saving checkpoint to out-shakespeare-char
iter 122500: loss 6.8543, time 2873.36ms
iter 122510: loss 7.7563, time 125.47ms
iter 122520: loss 7.6834, time 125.24ms
iter 122530: loss 7.0033, time 125.11ms
iter 122540: loss 7.7469, time 125.09ms
iter 122550: loss 7.5326, time 125.25ms
iter 122560: loss 7.0370, time 128.34ms
iter 122570: loss 7.6806, time 124.51ms
iter 122580: loss 7.3881, time 125.17ms
iter 122590: loss 7.5950, time 125.07ms
iter 122600: loss 7.0694, time 125.12ms
iter 122610: loss 6.9723, time 125.08ms
iter 122620: loss 7.4338, time 125.23ms
iter 122630: loss 7.4544, time 125.13ms
iter 122640: loss 6.9628, time 125.19ms
iter 122650: loss 6.8960, time 125.11ms
iter 122660: loss 7.1744, time 125.08ms
iter 122670: loss 7.6564, time 128.37ms
iter 122680: loss 7.3699, time 125.16ms
iter 122690: loss 6.9823, time 125.42ms
iter 122700: loss 6.6431, time 124.91ms
iter 122710: loss 7.1463, time 125.32ms
iter 122720: loss 7.4274, time 125.39ms
iter 122730: loss 7.0473, time 125.67ms
iter 122740: loss 7.8763, time 124.74ms
step 122750: train loss 6.2232, val loss 6.1924
saving checkpoint to out-shakespeare-char
iter 122750: loss 7.0611, time 2899.04ms
iter 122760: loss 7.7179, time 125.27ms
iter 122770: loss 7.8765, time 128.35ms
iter 122780: loss 7.8583, time 125.02ms
iter 122790: loss 7.0897, time 125.46ms
iter 122800: loss 7.8918, time 125.58ms
iter 122810: loss 6.9326, time 125.48ms
iter 122820: loss 6.0603, time 125.21ms
iter 122830: loss 7.3346, time 125.45ms
iter 122840: loss 6.2327, time 125.57ms
iter 122850: loss 7.8171, time 129.46ms
iter 122860: loss 7.0481, time 125.72ms
iter 122870: loss 6.9001, time 125.67ms
iter 122880: loss 7.5812, time 125.92ms
iter 122890: loss 7.5945, time 126.17ms
iter 122900: loss 7.5397, time 125.77ms
iter 122910: loss 6.9845, time 125.83ms
iter 122920: loss 6.5892, time 125.34ms
iter 122930: loss 7.4353, time 125.91ms
iter 122940: loss 7.1909, time 125.93ms
iter 122950: loss 6.9519, time 125.62ms
iter 122960: loss 7.4452, time 128.41ms
iter 122970: loss 7.2084, time 125.43ms
iter 122980: loss 7.5343, time 125.85ms
iter 122990: loss 7.5234, time 126.27ms
step 123000: train loss 6.2591, val loss 6.2276
saving checkpoint to out-shakespeare-char
iter 123000: loss 7.7206, time 2899.49ms
iter 123010: loss 7.7644, time 125.96ms
iter 123020: loss 7.0596, time 125.85ms
iter 123030: loss 7.6499, time 125.54ms
iter 123040: loss 7.8750, time 125.28ms
iter 123050: loss 7.1306, time 125.56ms
iter 123060: loss 7.6495, time 125.42ms
iter 123070: loss 6.9124, time 128.18ms
iter 123080: loss 7.6812, time 126.64ms
iter 123090: loss 7.7660, time 125.41ms
iter 123100: loss 7.9076, time 126.03ms
iter 123110: loss 6.9991, time 125.90ms
iter 123120: loss 7.7531, time 125.58ms
iter 123130: loss 8.0080, time 125.37ms
iter 123140: loss 7.0785, time 125.34ms
iter 123150: loss 7.4959, time 125.61ms
iter 123160: loss 7.4782, time 125.47ms
iter 123170: loss 7.3680, time 125.25ms
iter 123180: loss 6.9148, time 128.14ms
iter 123190: loss 7.0953, time 125.19ms
iter 123200: loss 7.4507, time 125.65ms
iter 123210: loss 7.0706, time 125.56ms
iter 123220: loss 7.7571, time 125.07ms
iter 123230: loss 6.4747, time 124.99ms
iter 123240: loss 7.6037, time 125.36ms
step 123250: train loss 6.2124, val loss 6.2334
saving checkpoint to out-shakespeare-char
iter 123250: loss 7.4255, time 2898.54ms
iter 123260: loss 7.7026, time 125.67ms
iter 123270: loss 6.6187, time 125.54ms
iter 123280: loss 7.6558, time 125.46ms
iter 123290: loss 7.7756, time 125.30ms
iter 123300: loss 7.3473, time 124.54ms
iter 123310: loss 7.5638, time 125.43ms
iter 123320: loss 7.5864, time 128.22ms
iter 123330: loss 7.1819, time 125.45ms
iter 123340: loss 7.0134, time 125.34ms
iter 123350: loss 6.9345, time 125.60ms
iter 123360: loss 7.5866, time 125.47ms
iter 123370: loss 7.5498, time 125.73ms
iter 123380: loss 7.4847, time 125.79ms
iter 123390: loss 7.0073, time 125.48ms
iter 123400: loss 7.4473, time 124.48ms
iter 123410: loss 7.3993, time 125.69ms
iter 123420: loss 8.1912, time 125.79ms
iter 123430: loss 8.4178, time 128.16ms
iter 123440: loss 8.1958, time 125.69ms
iter 123450: loss 6.9255, time 125.53ms
iter 123460: loss 7.5649, time 126.03ms
iter 123470: loss 6.9719, time 129.47ms
iter 123480: loss 7.1206, time 125.48ms
iter 123490: loss 7.6855, time 125.82ms
step 123500: train loss 6.2589, val loss 6.2469
saving checkpoint to out-shakespeare-char
iter 123500: loss 7.3228, time 2883.12ms
iter 123510: loss 7.2709, time 125.87ms
iter 123520: loss 7.1833, time 125.27ms
iter 123530: loss 6.5443, time 128.06ms
iter 123540: loss 6.7600, time 121.39ms
iter 123550: loss 7.9351, time 122.55ms
iter 123560: loss 7.0285, time 120.84ms
iter 123570: loss 6.9056, time 122.43ms
iter 123580: loss 8.3263, time 121.75ms
iter 123590: loss 6.7343, time 122.64ms
iter 123600: loss 7.6111, time 121.60ms
iter 123610: loss 8.0365, time 123.02ms
iter 123620: loss 7.2975, time 121.71ms
iter 123630: loss 7.8814, time 123.67ms
iter 123640: loss 7.1729, time 121.56ms
iter 123650: loss 7.0107, time 123.04ms
iter 123660: loss 7.1276, time 121.44ms
iter 123670: loss 7.2238, time 122.45ms
iter 123680: loss 7.0338, time 121.61ms
iter 123690: loss 7.3999, time 122.60ms
iter 123700: loss 8.0735, time 121.58ms
iter 123710: loss 7.3309, time 122.60ms
iter 123720: loss 6.8980, time 121.56ms
iter 123730: loss 7.0790, time 122.86ms
iter 123740: loss 7.3853, time 121.50ms
step 123750: train loss 6.1928, val loss 6.2630
saving checkpoint to out-shakespeare-char
iter 123750: loss 6.9770, time 2882.14ms
iter 123760: loss 7.2349, time 121.45ms
iter 123770: loss 7.3314, time 121.84ms
iter 123780: loss 6.6021, time 121.27ms
iter 123790: loss 6.9448, time 121.99ms
iter 123800: loss 8.0924, time 121.95ms
iter 123810: loss 6.9156, time 122.07ms
iter 123820: loss 7.6336, time 121.77ms
iter 123830: loss 7.4247, time 122.15ms
iter 123840: loss 6.6543, time 121.82ms
iter 123850: loss 6.9470, time 122.01ms
iter 123860: loss 6.8476, time 121.93ms
iter 123870: loss 7.5358, time 122.28ms
iter 123880: loss 7.0217, time 121.86ms
iter 123890: loss 7.8663, time 121.65ms
iter 123900: loss 7.7955, time 121.69ms
iter 123910: loss 7.4027, time 121.63ms
iter 123920: loss 6.9862, time 121.58ms
iter 123930: loss 7.0659, time 121.60ms
iter 123940: loss 6.8793, time 121.47ms
iter 123950: loss 7.5237, time 121.58ms
iter 123960: loss 7.7517, time 121.48ms
iter 123970: loss 7.3924, time 121.64ms
iter 123980: loss 7.2075, time 121.58ms
iter 123990: loss 7.9776, time 122.35ms
step 124000: train loss 6.2312, val loss 6.2198
saving checkpoint to out-shakespeare-char
iter 124000: loss 7.3075, time 2874.51ms
iter 124010: loss 6.6060, time 121.87ms
iter 124020: loss 6.5825, time 124.47ms
iter 124030: loss 7.6116, time 121.69ms
iter 124040: loss 7.4844, time 124.78ms
iter 124050: loss 7.1416, time 120.97ms
iter 124060: loss 7.5804, time 124.81ms
iter 124070: loss 7.4637, time 120.81ms
iter 124080: loss 7.7610, time 124.85ms
iter 124090: loss 7.5995, time 121.75ms
iter 124100: loss 8.0636, time 124.55ms
iter 124110: loss 8.2173, time 121.77ms
iter 124120: loss 7.7409, time 124.49ms
iter 124130: loss 6.7793, time 121.80ms
iter 124140: loss 7.8234, time 124.62ms
iter 124150: loss 7.4350, time 121.38ms
iter 124160: loss 6.7527, time 124.70ms
iter 124170: loss 7.6332, time 121.84ms
iter 124180: loss 7.8567, time 124.70ms
iter 124190: loss 6.7404, time 121.58ms
iter 124200: loss 7.2559, time 124.62ms
iter 124210: loss 8.0677, time 121.95ms
iter 124220: loss 6.5564, time 124.91ms
iter 124230: loss 8.0293, time 121.99ms
iter 124240: loss 7.4020, time 124.99ms
step 124250: train loss 6.1803, val loss 6.1817
saving checkpoint to out-shakespeare-char
iter 124250: loss 7.1162, time 2891.06ms
iter 124260: loss 6.7314, time 121.81ms
iter 124270: loss 7.4839, time 121.54ms
iter 124280: loss 7.1348, time 121.95ms
iter 124290: loss 6.9955, time 121.74ms
iter 124300: loss 7.3705, time 121.76ms
iter 124310: loss 7.4634, time 121.83ms
iter 124320: loss 7.6052, time 121.68ms
iter 124330: loss 7.5362, time 121.89ms
iter 124340: loss 8.0780, time 121.10ms
iter 124350: loss 7.6549, time 121.65ms
iter 124360: loss 7.3387, time 121.64ms
iter 124370: loss 8.1262, time 122.75ms
iter 124380: loss 6.9745, time 121.61ms
iter 124390: loss 7.9436, time 122.11ms
iter 124400: loss 7.0143, time 121.76ms
iter 124410: loss 7.6179, time 122.01ms
iter 124420: loss 7.5340, time 121.69ms
iter 124430: loss 8.0109, time 121.41ms
iter 124440: loss 8.0440, time 121.70ms
iter 124450: loss 7.0692, time 121.63ms
iter 124460: loss 7.5032, time 121.75ms
iter 124470: loss 7.1062, time 121.77ms
iter 124480: loss 7.0391, time 121.70ms
iter 124490: loss 6.8543, time 121.69ms
step 124500: train loss 6.2269, val loss 6.2368
saving checkpoint to out-shakespeare-char
iter 124500: loss 7.7804, time 2883.60ms
iter 124510: loss 7.0917, time 123.43ms
iter 124520: loss 7.0335, time 121.90ms
iter 124530: loss 6.6374, time 123.32ms
iter 124540: loss 6.5918, time 121.69ms
iter 124550: loss 7.5991, time 121.95ms
iter 124560: loss 8.1047, time 120.86ms
iter 124570: loss 7.6183, time 122.75ms
iter 124580: loss 6.8956, time 121.47ms
iter 124590: loss 6.9373, time 123.04ms
iter 124600: loss 7.5548, time 121.78ms
iter 124610: loss 6.5764, time 122.52ms
iter 124620: loss 6.7333, time 121.65ms
iter 124630: loss 7.4262, time 122.09ms
iter 124640: loss 7.0127, time 121.06ms
iter 124650: loss 7.3443, time 122.49ms
iter 124660: loss 7.2426, time 121.71ms
iter 124670: loss 7.1550, time 122.79ms
iter 124680: loss 7.3905, time 121.62ms
iter 124690: loss 7.0268, time 122.15ms
iter 124700: loss 6.9897, time 121.83ms
iter 124710: loss 7.6751, time 123.01ms
iter 124720: loss 7.2757, time 121.71ms
iter 124730: loss 7.6881, time 123.25ms
iter 124740: loss 7.5546, time 121.78ms
step 124750: train loss 6.2180, val loss 6.2286
saving checkpoint to out-shakespeare-char
iter 124750: loss 7.3583, time 2895.93ms
iter 124760: loss 7.7297, time 121.72ms
iter 124770: loss 7.8274, time 121.63ms
iter 124780: loss 6.8619, time 121.78ms
iter 124790: loss 7.8457, time 121.61ms
iter 124800: loss 7.0019, time 121.72ms
iter 124810: loss 7.4348, time 121.64ms
iter 124820: loss 6.9404, time 121.99ms
iter 124830: loss 7.7489, time 121.09ms
iter 124840: loss 7.6868, time 121.58ms
iter 124850: loss 7.4803, time 121.55ms
iter 124860: loss 7.4090, time 121.56ms
iter 124870: loss 7.2179, time 121.55ms
iter 124880: loss 6.8499, time 121.44ms
iter 124890: loss 7.9310, time 121.67ms
iter 124900: loss 7.4487, time 119.55ms
iter 124910: loss 7.0270, time 119.73ms
iter 124920: loss 6.8027, time 119.73ms
iter 124930: loss 7.3197, time 119.75ms
iter 124940: loss 7.2960, time 119.79ms
iter 124950: loss 7.5492, time 119.81ms
iter 124960: loss 7.3425, time 120.96ms
iter 124970: loss 7.7715, time 119.69ms
iter 124980: loss 6.6204, time 119.60ms
iter 124990: loss 8.4988, time 119.74ms
step 125000: train loss 6.2957, val loss 6.2285
saving checkpoint to out-shakespeare-char
iter 125000: loss 6.4815, time 2914.01ms
iter 125010: loss 6.9301, time 126.25ms
iter 125020: loss 7.2138, time 126.13ms
iter 125030: loss 7.3390, time 126.72ms
iter 125040: loss 7.4047, time 126.42ms
iter 125050: loss 7.3041, time 126.20ms
iter 125060: loss 6.8792, time 125.97ms
iter 125070: loss 7.1198, time 125.86ms
iter 125080: loss 7.4115, time 125.55ms
iter 125090: loss 6.8371, time 126.28ms
iter 125100: loss 7.7910, time 125.78ms
iter 125110: loss 7.8757, time 128.41ms
iter 125120: loss 7.1921, time 125.74ms
iter 125130: loss 7.4311, time 125.42ms
iter 125140: loss 7.2581, time 125.32ms
iter 125150: loss 6.4594, time 125.63ms
iter 125160: loss 7.8135, time 125.09ms
iter 125170: loss 7.9545, time 125.65ms
iter 125180: loss 7.5663, time 125.48ms
iter 125190: loss 7.2009, time 125.76ms
iter 125200: loss 7.1782, time 125.68ms
iter 125210: loss 7.1093, time 126.13ms
iter 125220: loss 7.5181, time 128.84ms
iter 125230: loss 7.2872, time 125.64ms
iter 125240: loss 7.2368, time 125.83ms
step 125250: train loss 6.1845, val loss 6.2081
saving checkpoint to out-shakespeare-char
iter 125250: loss 7.1807, time 2875.74ms
iter 125260: loss 7.1383, time 125.23ms
iter 125270: loss 7.1286, time 124.96ms
iter 125280: loss 7.4996, time 125.22ms
iter 125290: loss 7.0400, time 125.36ms
iter 125300: loss 7.8381, time 125.89ms
iter 125310: loss 8.0606, time 125.36ms
iter 125320: loss 7.3286, time 125.35ms
iter 125330: loss 6.9826, time 124.42ms
iter 125340: loss 7.6176, time 125.85ms
iter 125350: loss 6.7954, time 123.88ms
iter 125360: loss 7.6835, time 125.06ms
iter 125370: loss 7.1132, time 124.79ms
iter 125380: loss 7.4802, time 126.21ms
iter 125390: loss 7.2642, time 125.99ms
iter 125400: loss 7.3203, time 128.85ms
iter 125410: loss 7.7856, time 125.55ms
iter 125420: loss 6.9342, time 125.66ms
iter 125430: loss 7.6460, time 125.86ms
iter 125440: loss 6.8433, time 128.98ms
iter 125450: loss 7.4927, time 125.57ms
iter 125460: loss 6.5361, time 126.22ms
iter 125470: loss 6.8601, time 126.60ms
iter 125480: loss 7.7578, time 129.08ms
iter 125490: loss 7.3647, time 126.68ms
step 125500: train loss 6.2464, val loss 6.2439
saving checkpoint to out-shakespeare-char
iter 125500: loss 6.8442, time 2895.88ms
iter 125510: loss 6.9348, time 125.54ms
iter 125520: loss 7.2004, time 125.70ms
iter 125530: loss 7.4138, time 125.77ms
iter 125540: loss 6.7379, time 128.55ms
iter 125550: loss 7.6666, time 125.52ms
iter 125560: loss 7.0996, time 125.64ms
iter 125570: loss 6.7485, time 124.00ms
iter 125580: loss 7.1806, time 125.30ms
iter 125590: loss 7.6201, time 124.94ms
iter 125600: loss 7.6407, time 125.86ms
iter 125610: loss 7.6186, time 124.56ms
iter 125620: loss 7.7279, time 125.39ms
iter 125630: loss 6.3267, time 124.52ms
iter 125640: loss 8.0935, time 125.40ms
iter 125650: loss 7.2547, time 128.01ms
iter 125660: loss 7.0891, time 125.17ms
iter 125670: loss 7.3187, time 125.08ms
iter 125680: loss 6.9814, time 124.29ms
iter 125690: loss 6.7915, time 125.08ms
iter 125700: loss 7.8411, time 125.07ms
iter 125710: loss 7.3010, time 124.84ms
iter 125720: loss 7.7103, time 124.01ms
iter 125730: loss 6.8115, time 126.33ms
iter 125740: loss 7.5347, time 125.61ms
step 125750: train loss 6.1837, val loss 6.2008
saving checkpoint to out-shakespeare-char
iter 125750: loss 7.6143, time 2891.30ms
iter 125760: loss 6.8373, time 126.76ms
iter 125770: loss 7.3825, time 124.26ms
iter 125780: loss 7.2294, time 126.81ms
iter 125790: loss 6.8833, time 124.97ms
iter 125800: loss 7.9501, time 125.07ms
iter 125810: loss 6.8398, time 124.27ms
iter 125820: loss 7.9089, time 122.25ms
iter 125830: loss 6.3285, time 121.47ms
iter 125840: loss 6.9505, time 121.48ms
iter 125850: loss 7.0755, time 121.42ms
iter 125860: loss 7.2794, time 121.46ms
iter 125870: loss 7.1702, time 121.40ms
iter 125880: loss 6.8540, time 121.16ms
iter 125890: loss 7.3116, time 121.15ms
iter 125900: loss 7.4155, time 121.27ms
iter 125910: loss 6.9361, time 121.62ms
iter 125920: loss 8.7299, time 121.33ms
iter 125930: loss 6.9585, time 121.46ms
iter 125940: loss 7.5155, time 121.51ms
iter 125950: loss 7.3579, time 121.61ms
iter 125960: loss 6.8141, time 121.43ms
iter 125970: loss 7.6790, time 120.95ms
iter 125980: loss 7.4649, time 121.34ms
iter 125990: loss 5.8569, time 121.10ms
step 126000: train loss 6.1867, val loss 6.2218
saving checkpoint to out-shakespeare-char
iter 126000: loss 7.5300, time 2894.69ms
iter 126010: loss 8.0366, time 128.26ms
iter 126020: loss 8.0100, time 125.35ms
iter 126030: loss 7.0694, time 128.35ms
iter 126040: loss 7.3947, time 125.32ms
iter 126050: loss 6.5553, time 127.94ms
iter 126060: loss 7.5123, time 125.46ms
iter 126070: loss 6.9364, time 127.16ms
iter 126080: loss 7.8548, time 125.12ms
iter 126090: loss 8.2316, time 125.23ms
iter 126100: loss 6.9368, time 125.48ms
iter 126110: loss 7.6955, time 125.59ms
iter 126120: loss 7.7379, time 125.55ms
iter 126130: loss 6.9008, time 125.33ms
iter 126140: loss 7.7987, time 125.47ms
iter 126150: loss 6.9096, time 125.69ms
iter 126160: loss 7.0696, time 128.43ms
iter 126170: loss 8.1189, time 125.77ms
iter 126180: loss 7.4961, time 125.68ms
iter 126190: loss 7.5421, time 125.69ms
iter 126200: loss 8.1525, time 125.65ms
iter 126210: loss 7.2483, time 125.61ms
iter 126220: loss 6.9322, time 125.27ms
iter 126230: loss 7.4516, time 124.92ms
iter 126240: loss 7.8296, time 125.39ms
step 126250: train loss 6.1713, val loss 6.2085
saving checkpoint to out-shakespeare-char
iter 126250: loss 6.8166, time 2873.72ms
iter 126260: loss 6.4704, time 121.48ms
iter 126270: loss 7.5400, time 121.02ms
iter 126280: loss 6.5737, time 121.80ms
iter 126290: loss 7.7277, time 123.99ms
iter 126300: loss 7.5380, time 121.88ms
iter 126310: loss 7.9679, time 122.82ms
iter 126320: loss 6.1484, time 121.73ms
iter 126330: loss 7.6496, time 122.83ms
iter 126340: loss 6.9395, time 121.72ms
iter 126350: loss 7.2173, time 122.86ms
iter 126360: loss 7.3019, time 121.81ms
iter 126370: loss 7.2016, time 122.66ms
iter 126380: loss 7.2813, time 121.72ms
iter 126390: loss 7.6571, time 123.22ms
iter 126400: loss 7.6504, time 121.76ms
iter 126410: loss 7.0442, time 123.12ms
iter 126420: loss 7.2264, time 120.57ms
iter 126430: loss 7.7959, time 123.18ms
iter 126440: loss 6.9088, time 121.83ms
iter 126450: loss 7.1920, time 122.78ms
iter 126460: loss 6.4522, time 121.78ms
iter 126470: loss 7.6809, time 122.70ms
iter 126480: loss 7.4146, time 121.67ms
iter 126490: loss 7.2873, time 122.74ms
step 126500: train loss 6.2301, val loss 6.2228
saving checkpoint to out-shakespeare-char
iter 126500: loss 6.8823, time 2900.34ms
iter 126510: loss 6.9080, time 121.89ms
iter 126520: loss 7.2495, time 122.11ms
iter 126530: loss 7.3463, time 121.99ms
iter 126540: loss 7.5643, time 122.16ms
iter 126550: loss 7.1321, time 122.03ms
iter 126560: loss 7.0439, time 122.09ms
iter 126570: loss 7.1735, time 121.81ms
iter 126580: loss 6.8223, time 120.90ms
iter 126590: loss 6.6426, time 121.18ms
iter 126600: loss 7.1459, time 121.12ms
iter 126610: loss 6.7608, time 121.81ms
iter 126620: loss 6.4956, time 121.87ms
iter 126630: loss 7.3939, time 121.93ms
iter 126640: loss 6.8558, time 121.98ms
iter 126650: loss 7.5743, time 122.00ms
iter 126660: loss 7.8104, time 121.96ms
iter 126670: loss 7.3195, time 121.90ms
iter 126680: loss 6.5644, time 121.82ms
iter 126690: loss 7.2556, time 121.28ms
iter 126700: loss 7.3152, time 121.29ms
iter 126710: loss 7.0557, time 122.06ms
iter 126720: loss 7.6871, time 121.85ms
iter 126730: loss 7.9777, time 121.94ms
iter 126740: loss 7.4702, time 121.20ms
step 126750: train loss 6.2073, val loss 6.1832
saving checkpoint to out-shakespeare-char
iter 126750: loss 6.5706, time 2896.66ms
iter 126760: loss 7.5326, time 121.85ms
iter 126770: loss 7.9751, time 121.78ms
iter 126780: loss 7.9417, time 121.67ms
iter 126790: loss 6.7616, time 121.16ms
iter 126800: loss 7.5198, time 120.77ms
iter 126810: loss 7.0341, time 121.67ms
iter 126820: loss 7.2700, time 121.65ms
iter 126830: loss 7.7064, time 122.10ms
iter 126840: loss 8.2385, time 122.13ms
iter 126850: loss 7.0998, time 122.06ms
iter 126860: loss 7.1632, time 121.68ms
iter 126870: loss 7.2947, time 121.57ms
iter 126880: loss 7.5146, time 121.62ms
iter 126890: loss 6.5374, time 121.63ms
iter 126900: loss 6.4069, time 121.72ms
iter 126910: loss 7.6133, time 125.56ms
iter 126920: loss 7.4813, time 124.77ms
iter 126930: loss 7.9500, time 128.46ms
iter 126940: loss 7.7779, time 125.44ms
iter 126950: loss 7.1958, time 125.50ms
iter 126960: loss 8.0279, time 125.37ms
iter 126970: loss 7.5521, time 124.35ms
iter 126980: loss 7.6442, time 125.14ms
iter 126990: loss 8.0606, time 125.43ms
step 127000: train loss 6.2341, val loss 6.2176
saving checkpoint to out-shakespeare-char
iter 127000: loss 8.2441, time 2902.05ms
iter 127010: loss 7.5097, time 124.75ms
iter 127020: loss 6.8213, time 125.64ms
iter 127030: loss 7.5486, time 125.67ms
iter 127040: loss 8.0389, time 125.49ms
iter 127050: loss 6.2809, time 125.91ms
iter 127060: loss 8.0092, time 128.47ms
iter 127070: loss 7.4926, time 125.91ms
iter 127080: loss 7.2080, time 125.63ms
iter 127090: loss 7.6554, time 125.85ms
iter 127100: loss 7.6576, time 126.01ms
iter 127110: loss 7.5587, time 125.60ms
iter 127120: loss 6.6356, time 125.98ms
iter 127130: loss 7.4456, time 125.19ms
iter 127140: loss 7.0681, time 125.23ms
iter 127150: loss 7.1165, time 125.42ms
iter 127160: loss 7.4012, time 125.50ms
iter 127170: loss 6.9297, time 127.32ms
iter 127180: loss 6.9475, time 125.29ms
iter 127190: loss 7.6299, time 125.22ms
iter 127200: loss 7.5071, time 125.57ms
iter 127210: loss 6.9011, time 124.69ms
iter 127220: loss 8.1054, time 125.30ms
iter 127230: loss 6.3316, time 124.52ms
iter 127240: loss 7.1508, time 125.30ms
step 127250: train loss 6.1952, val loss 6.2111
saving checkpoint to out-shakespeare-char
iter 127250: loss 7.7989, time 2907.85ms
iter 127260: loss 7.1320, time 125.81ms
iter 127270: loss 7.0567, time 125.67ms
iter 127280: loss 7.7039, time 125.45ms
iter 127290: loss 8.4013, time 125.48ms
iter 127300: loss 7.1369, time 124.99ms
iter 127310: loss 7.5424, time 125.20ms
iter 127320: loss 8.0797, time 123.90ms
iter 127330: loss 7.2298, time 125.24ms
iter 127340: loss 6.7766, time 125.20ms
iter 127350: loss 7.7597, time 125.35ms
iter 127360: loss 6.6170, time 125.30ms
iter 127370: loss 7.4567, time 124.21ms
iter 127380: loss 7.3940, time 128.09ms
iter 127390: loss 6.6488, time 125.35ms
iter 127400: loss 6.9542, time 125.41ms
iter 127410: loss 8.0743, time 125.37ms
iter 127420: loss 7.4665, time 125.87ms
iter 127430: loss 7.9272, time 125.02ms
iter 127440: loss 7.1667, time 125.38ms
iter 127450: loss 7.5424, time 125.17ms
iter 127460: loss 7.9676, time 125.34ms
iter 127470: loss 6.9371, time 125.39ms
iter 127480: loss 7.2275, time 124.79ms
iter 127490: loss 7.1455, time 128.72ms
step 127500: train loss 6.1582, val loss 6.2257
saving checkpoint to out-shakespeare-char
iter 127500: loss 7.1970, time 2891.47ms
iter 127510: loss 6.4982, time 125.90ms
iter 127520: loss 7.9771, time 125.95ms
iter 127530: loss 7.8718, time 125.56ms
iter 127540: loss 6.6607, time 125.24ms
iter 127550: loss 7.3592, time 125.61ms
iter 127560: loss 7.7795, time 124.25ms
iter 127570: loss 7.6031, time 125.04ms
iter 127580: loss 7.1300, time 125.78ms
iter 127590: loss 7.0033, time 126.00ms
iter 127600: loss 6.9426, time 128.32ms
iter 127610: loss 7.1761, time 125.49ms
iter 127620: loss 7.4968, time 125.53ms
iter 127630: loss 6.6078, time 123.99ms
iter 127640: loss 7.8816, time 125.63ms
iter 127650: loss 7.8042, time 125.29ms
iter 127660: loss 6.7618, time 125.57ms
iter 127670: loss 7.1016, time 123.90ms
iter 127680: loss 6.8799, time 125.70ms
iter 127690: loss 6.9131, time 125.46ms
iter 127700: loss 7.4764, time 125.17ms
iter 127710: loss 7.2932, time 123.07ms
iter 127720: loss 6.7274, time 125.34ms
iter 127730: loss 7.6420, time 124.54ms
iter 127740: loss 7.1140, time 124.78ms
step 127750: train loss 6.2049, val loss 6.1587
saving checkpoint to out-shakespeare-char
iter 127750: loss 7.8074, time 2895.89ms
iter 127760: loss 7.6566, time 125.86ms
iter 127770: loss 6.8686, time 125.13ms
iter 127780: loss 7.4298, time 124.69ms
iter 127790: loss 7.1860, time 125.70ms
iter 127800: loss 6.7224, time 125.66ms
iter 127810: loss 7.0930, time 124.38ms
iter 127820: loss 7.4660, time 125.42ms
iter 127830: loss 7.2004, time 125.51ms
iter 127840: loss 7.5930, time 125.14ms
iter 127850: loss 6.9732, time 125.46ms
iter 127860: loss 6.5647, time 125.61ms
iter 127870: loss 7.5595, time 125.96ms
iter 127880: loss 7.1184, time 128.44ms
iter 127890: loss 7.3428, time 125.02ms
iter 127900: loss 6.8034, time 125.38ms
iter 127910: loss 7.6458, time 125.41ms
iter 127920: loss 6.3545, time 128.61ms
iter 127930: loss 7.7031, time 125.29ms
iter 127940: loss 7.4156, time 125.34ms
iter 127950: loss 7.4896, time 125.56ms
iter 127960: loss 7.7053, time 125.40ms
iter 127970: loss 6.9665, time 125.22ms
iter 127980: loss 7.2097, time 125.23ms
iter 127990: loss 6.9316, time 125.21ms
step 128000: train loss 6.1774, val loss 6.1791
saving checkpoint to out-shakespeare-char
iter 128000: loss 7.0299, time 2870.36ms
iter 128010: loss 7.3530, time 126.38ms
iter 128020: loss 7.0866, time 125.23ms
iter 128030: loss 7.6273, time 125.10ms
iter 128040: loss 7.4683, time 125.24ms
iter 128050: loss 7.5125, time 124.93ms
iter 128060: loss 6.8718, time 127.98ms
iter 128070: loss 7.5098, time 125.06ms
iter 128080: loss 6.7818, time 127.10ms
iter 128090: loss 6.5076, time 125.83ms
iter 128100: loss 7.5480, time 125.09ms
iter 128110: loss 7.7290, time 125.19ms
iter 128120: loss 7.2380, time 125.12ms
iter 128130: loss 7.4432, time 125.02ms
iter 128140: loss 7.0455, time 125.08ms
iter 128150: loss 7.7168, time 125.33ms
iter 128160: loss 7.6228, time 125.68ms
iter 128170: loss 7.2261, time 128.36ms
iter 128180: loss 7.5469, time 125.13ms
iter 128190: loss 7.5621, time 125.44ms
iter 128200: loss 7.3852, time 125.38ms
iter 128210: loss 7.4456, time 125.04ms
iter 128220: loss 7.3240, time 125.04ms
iter 128230: loss 7.8197, time 125.28ms
iter 128240: loss 7.5245, time 125.33ms
step 128250: train loss 6.1987, val loss 6.1865
saving checkpoint to out-shakespeare-char
iter 128250: loss 7.8209, time 2882.01ms
iter 128260: loss 7.8815, time 124.95ms
iter 128270: loss 7.0013, time 121.89ms
iter 128280: loss 6.9513, time 124.82ms
iter 128290: loss 7.3390, time 121.99ms
iter 128300: loss 6.8556, time 124.86ms
iter 128310: loss 7.8657, time 121.89ms
iter 128320: loss 7.1681, time 125.07ms
iter 128330: loss 8.0480, time 121.95ms
iter 128340: loss 6.6053, time 124.66ms
iter 128350: loss 7.1408, time 121.90ms
iter 128360: loss 7.8675, time 124.73ms
iter 128370: loss 6.5924, time 121.85ms
iter 128380: loss 7.1664, time 124.95ms
iter 128390: loss 7.8194, time 122.03ms
iter 128400: loss 6.8640, time 124.50ms
iter 128410: loss 7.0845, time 121.91ms
iter 128420: loss 7.1564, time 124.96ms
iter 128430: loss 7.3961, time 121.84ms
iter 128440: loss 7.0351, time 124.61ms
iter 128450: loss 7.5227, time 121.93ms
iter 128460: loss 7.5544, time 124.58ms
iter 128470: loss 8.0860, time 121.80ms
iter 128480: loss 7.5540, time 124.99ms
iter 128490: loss 8.3391, time 121.96ms
step 128500: train loss 6.1489, val loss 6.1497
saving checkpoint to out-shakespeare-char
iter 128500: loss 7.4212, time 2896.35ms
iter 128510: loss 7.7115, time 122.08ms
iter 128520: loss 7.5096, time 121.75ms
iter 128530: loss 6.4758, time 121.70ms
iter 128540: loss 7.2522, time 121.84ms
iter 128550: loss 6.8275, time 121.73ms
iter 128560: loss 6.8327, time 121.73ms
iter 128570: loss 7.1116, time 121.86ms
iter 128580: loss 6.7030, time 121.82ms
iter 128590: loss 6.7515, time 121.85ms
iter 128600: loss 7.6398, time 121.11ms
iter 128610: loss 7.4166, time 121.89ms
iter 128620: loss 7.8499, time 121.88ms
iter 128630: loss 7.5736, time 121.01ms
iter 128640: loss 7.0415, time 121.74ms
iter 128650: loss 7.1396, time 121.63ms
iter 128660: loss 7.1297, time 121.82ms
iter 128670: loss 6.8647, time 121.89ms
iter 128680: loss 6.5611, time 121.20ms
iter 128690: loss 7.9627, time 121.45ms
iter 128700: loss 7.0802, time 121.76ms
iter 128710: loss 6.5721, time 122.47ms
iter 128720: loss 7.1995, time 121.87ms
iter 128730: loss 8.0661, time 121.91ms
iter 128740: loss 7.2026, time 121.50ms
step 128750: train loss 6.1546, val loss 6.2008
saving checkpoint to out-shakespeare-char
iter 128750: loss 7.4417, time 2897.90ms
iter 128760: loss 7.1506, time 121.93ms
iter 128770: loss 7.6324, time 121.95ms
iter 128780: loss 6.7784, time 121.97ms
iter 128790: loss 7.3089, time 122.23ms
iter 128800: loss 6.1841, time 121.77ms
iter 128810: loss 6.7337, time 122.03ms
iter 128820: loss 7.9383, time 122.12ms
iter 128830: loss 7.7144, time 121.98ms
iter 128840: loss 7.2333, time 122.19ms
iter 128850: loss 7.1756, time 121.99ms
iter 128860: loss 7.4458, time 122.08ms
iter 128870: loss 7.7201, time 122.09ms
iter 128880: loss 7.2810, time 121.44ms
iter 128890: loss 7.1225, time 122.15ms
iter 128900: loss 7.3338, time 121.88ms
iter 128910: loss 6.7859, time 122.06ms
iter 128920: loss 7.1197, time 121.95ms
iter 128930: loss 7.3452, time 122.13ms
iter 128940: loss 7.2706, time 121.93ms
iter 128950: loss 7.5055, time 121.25ms
iter 128960: loss 7.7747, time 122.00ms
iter 128970: loss 7.3780, time 122.16ms
iter 128980: loss 6.8928, time 122.01ms
iter 128990: loss 8.3049, time 122.13ms
step 129000: train loss 6.1472, val loss 6.1922
saving checkpoint to out-shakespeare-char
iter 129000: loss 6.8969, time 2900.24ms
iter 129010: loss 6.8151, time 122.39ms
iter 129020: loss 7.2239, time 122.17ms
iter 129030: loss 7.1524, time 122.06ms
iter 129040: loss 6.5488, time 122.05ms
iter 129050: loss 6.9986, time 122.31ms
iter 129060: loss 7.3468, time 121.96ms
iter 129070: loss 7.4542, time 122.00ms
iter 129080: loss 7.4781, time 122.26ms
iter 129090: loss 6.5794, time 121.64ms
iter 129100: loss 7.6231, time 121.77ms
iter 129110: loss 7.5412, time 121.27ms
iter 129120: loss 7.6255, time 121.99ms
iter 129130: loss 7.0307, time 121.67ms
iter 129140: loss 6.8478, time 121.29ms
iter 129150: loss 7.0531, time 121.77ms
iter 129160: loss 7.4318, time 121.90ms
iter 129170: loss 6.7538, time 121.95ms
iter 129180: loss 7.7700, time 121.65ms
iter 129190: loss 7.1147, time 121.89ms
iter 129200: loss 7.6735, time 126.67ms
iter 129210: loss 6.9544, time 125.97ms
iter 129220: loss 6.6625, time 125.63ms
iter 129230: loss 7.7876, time 125.41ms
iter 129240: loss 6.7674, time 125.49ms
step 129250: train loss 6.1947, val loss 6.1997
saving checkpoint to out-shakespeare-char
iter 129250: loss 6.8863, time 2888.20ms
iter 129260: loss 7.2418, time 125.90ms
iter 129270: loss 7.5867, time 125.55ms
iter 129280: loss 6.7847, time 125.66ms
iter 129290: loss 7.1801, time 125.70ms
iter 129300: loss 7.2419, time 125.68ms
iter 129310: loss 7.3328, time 128.40ms
iter 129320: loss 7.3221, time 126.10ms
iter 129330: loss 6.9131, time 125.99ms
iter 129340: loss 7.0831, time 125.71ms
iter 129350: loss 8.0977, time 128.71ms
iter 129360: loss 6.5660, time 125.78ms
iter 129370: loss 7.2374, time 125.58ms
iter 129380: loss 7.4210, time 125.82ms
iter 129390: loss 7.1698, time 124.70ms
iter 129400: loss 7.5487, time 125.49ms
iter 129410: loss 7.3143, time 125.88ms
iter 129420: loss 6.1713, time 125.90ms
iter 129430: loss 6.7726, time 125.93ms
iter 129440: loss 7.2875, time 125.61ms
iter 129450: loss 6.9869, time 126.39ms
iter 129460: loss 7.0762, time 128.74ms
iter 129470: loss 6.9269, time 126.31ms
iter 129480: loss 6.9914, time 126.34ms
iter 129490: loss 8.2295, time 126.14ms
step 129500: train loss 6.1947, val loss 6.1987
saving checkpoint to out-shakespeare-char
iter 129500: loss 7.1784, time 2893.26ms
iter 129510: loss 7.6498, time 125.86ms
iter 129520: loss 7.1702, time 126.19ms
iter 129530: loss 7.1390, time 125.48ms
iter 129540: loss 7.1104, time 125.79ms
iter 129550: loss 6.7432, time 125.53ms
iter 129560: loss 6.9724, time 125.44ms
iter 129570: loss 7.0691, time 125.49ms
iter 129580: loss 7.3331, time 125.38ms
iter 129590: loss 7.2945, time 125.81ms
iter 129600: loss 7.5137, time 128.36ms
iter 129610: loss 7.3690, time 125.39ms
iter 129620: loss 7.0957, time 125.61ms
iter 129630: loss 7.3757, time 126.39ms
iter 129640: loss 7.2015, time 125.48ms
iter 129650: loss 7.3587, time 125.93ms
iter 129660: loss 7.1825, time 125.53ms
iter 129670: loss 6.7619, time 125.27ms
iter 129680: loss 7.5853, time 125.55ms
iter 129690: loss 6.6448, time 125.68ms
iter 129700: loss 6.7982, time 126.01ms
iter 129710: loss 7.3819, time 128.50ms
iter 129720: loss 6.3725, time 125.50ms
iter 129730: loss 6.6864, time 125.75ms
iter 129740: loss 7.2781, time 125.77ms
step 129750: train loss 6.1852, val loss 6.1900
saving checkpoint to out-shakespeare-char
iter 129750: loss 7.7124, time 2893.93ms
iter 129760: loss 6.9070, time 125.78ms
iter 129770: loss 8.1225, time 125.03ms
iter 129780: loss 6.9829, time 125.75ms
iter 129790: loss 7.6635, time 125.66ms
iter 129800: loss 7.5662, time 126.34ms
iter 129810: loss 7.7051, time 125.71ms
iter 129820: loss 7.4910, time 125.78ms
iter 129830: loss 6.8158, time 125.79ms
iter 129840: loss 6.8704, time 126.03ms
iter 129850: loss 7.0824, time 128.78ms
iter 129860: loss 7.1607, time 125.99ms
iter 129870: loss 7.1791, time 125.96ms
iter 129880: loss 7.2005, time 126.42ms
iter 129890: loss 7.7542, time 125.86ms
iter 129900: loss 6.8931, time 125.86ms
iter 129910: loss 7.0552, time 125.81ms
iter 129920: loss 6.7357, time 125.96ms
iter 129930: loss 7.0489, time 127.08ms
iter 129940: loss 7.6288, time 125.71ms
iter 129950: loss 6.6761, time 126.28ms
iter 129960: loss 7.3124, time 125.88ms
iter 129970: loss 7.7109, time 125.86ms
iter 129980: loss 7.6936, time 126.89ms
iter 129990: loss 7.4372, time 126.12ms
step 130000: train loss 6.1973, val loss 6.2281
saving checkpoint to out-shakespeare-char
iter 130000: loss 7.5380, time 2884.08ms
iter 130010: loss 7.1370, time 123.06ms
iter 130020: loss 6.7811, time 121.72ms
iter 130030: loss 7.0657, time 122.97ms
iter 130040: loss 7.5365, time 121.93ms
iter 130050: loss 7.1383, time 123.08ms
iter 130060: loss 7.6236, time 121.86ms
iter 130070: loss 6.4332, time 123.02ms
iter 130080: loss 7.4733, time 121.40ms
iter 130090: loss 8.3118, time 123.24ms
iter 130100: loss 7.1085, time 122.01ms
iter 130110: loss 7.0661, time 123.29ms
iter 130120: loss 7.2304, time 121.91ms
iter 130130: loss 7.0740, time 122.97ms
iter 130140: loss 7.3801, time 122.04ms
iter 130150: loss 7.8393, time 123.06ms
iter 130160: loss 7.2882, time 121.97ms
iter 130170: loss 7.2350, time 122.22ms
iter 130180: loss 7.0595, time 121.82ms
iter 130190: loss 6.8154, time 123.21ms
iter 130200: loss 7.1427, time 122.04ms
iter 130210: loss 7.3647, time 123.29ms
iter 130220: loss 7.1200, time 121.82ms
iter 130230: loss 7.1508, time 122.71ms
iter 130240: loss 7.1140, time 121.79ms
step 130250: train loss 6.1359, val loss 6.1067
saving checkpoint to out-shakespeare-char
iter 130250: loss 7.4164, time 2920.64ms
iter 130260: loss 6.9977, time 121.86ms
iter 130270: loss 7.2265, time 124.24ms
iter 130280: loss 6.4427, time 121.87ms
iter 130290: loss 7.6395, time 125.22ms
iter 130300: loss 7.3756, time 121.78ms
iter 130310: loss 7.2900, time 124.74ms
iter 130320: loss 7.1225, time 122.04ms
iter 130330: loss 7.3619, time 124.75ms
iter 130340: loss 8.1672, time 121.86ms
iter 130350: loss 7.0443, time 124.78ms
iter 130360: loss 6.4894, time 121.99ms
iter 130370: loss 7.5636, time 124.70ms
iter 130380: loss 6.8065, time 121.84ms
iter 130390: loss 7.2060, time 124.70ms
iter 130400: loss 7.7926, time 122.04ms
iter 130410: loss 7.5027, time 125.21ms
iter 130420: loss 7.6757, time 121.87ms
iter 130430: loss 7.6970, time 124.61ms
iter 130440: loss 7.1579, time 121.88ms
iter 130450: loss 7.5947, time 125.20ms
iter 130460: loss 7.0964, time 121.94ms
iter 130470: loss 7.0277, time 124.17ms
iter 130480: loss 7.7106, time 121.53ms
iter 130490: loss 6.9090, time 124.14ms
step 130500: train loss 6.1619, val loss 6.1602
saving checkpoint to out-shakespeare-char
iter 130500: loss 6.7761, time 2890.01ms
iter 130510: loss 6.9406, time 121.94ms
iter 130520: loss 7.5698, time 122.91ms
iter 130530: loss 6.6622, time 121.75ms
iter 130540: loss 7.0633, time 123.08ms
iter 130550: loss 7.8146, time 121.85ms
iter 130560: loss 7.9466, time 123.22ms
iter 130570: loss 7.4321, time 122.07ms
iter 130580: loss 7.1363, time 123.01ms
iter 130590: loss 6.9091, time 121.16ms
iter 130600: loss 6.8788, time 122.94ms
iter 130610: loss 7.0425, time 121.09ms
iter 130620: loss 7.0064, time 123.33ms
iter 130630: loss 7.4643, time 121.80ms
iter 130640: loss 7.5071, time 123.28ms
iter 130650: loss 7.9408, time 121.89ms
iter 130660: loss 7.4590, time 123.07ms
iter 130670: loss 7.2826, time 121.98ms
iter 130680: loss 7.1288, time 122.24ms
iter 130690: loss 7.5294, time 121.78ms
iter 130700: loss 6.7648, time 122.50ms
iter 130710: loss 7.3000, time 121.94ms
iter 130720: loss 6.6130, time 123.09ms
iter 130730: loss 7.2913, time 120.94ms
iter 130740: loss 7.0433, time 123.17ms
step 130750: train loss 6.1626, val loss 6.1867
saving checkpoint to out-shakespeare-char
iter 130750: loss 7.5652, time 2896.44ms
iter 130760: loss 6.6156, time 121.62ms
iter 130770: loss 7.6653, time 122.71ms
iter 130780: loss 7.2799, time 121.41ms
iter 130790: loss 7.3795, time 121.25ms
iter 130800: loss 7.7679, time 121.62ms
iter 130810: loss 7.7489, time 121.33ms
iter 130820: loss 6.7541, time 121.50ms
iter 130830: loss 7.7037, time 121.45ms
iter 130840: loss 7.3037, time 121.45ms
iter 130850: loss 6.5626, time 120.71ms
iter 130860: loss 7.0264, time 121.60ms
iter 130870: loss 7.0466, time 121.29ms
iter 130880: loss 6.6379, time 124.54ms
iter 130890: loss 8.1201, time 121.51ms
iter 130900: loss 6.8982, time 124.88ms
iter 130910: loss 6.4396, time 121.57ms
iter 130920: loss 8.1119, time 125.09ms
iter 130930: loss 7.2215, time 121.49ms
iter 130940: loss 7.3781, time 124.10ms
iter 130950: loss 7.4379, time 121.47ms
iter 130960: loss 8.0172, time 124.93ms
iter 130970: loss 7.2635, time 121.51ms
iter 130980: loss 7.2495, time 124.66ms
iter 130990: loss 6.9040, time 121.56ms
step 131000: train loss 6.1937, val loss 6.1632
saving checkpoint to out-shakespeare-char
iter 131000: loss 7.1126, time 2907.09ms
iter 131010: loss 7.3994, time 122.00ms
iter 131020: loss 7.6017, time 121.80ms
iter 131030: loss 7.3919, time 121.91ms
iter 131040: loss 7.0042, time 121.78ms
iter 131050: loss 7.0156, time 121.89ms
iter 131060: loss 7.3450, time 121.75ms
iter 131070: loss 7.6666, time 121.96ms
iter 131080: loss 7.6818, time 121.84ms
iter 131090: loss 7.1777, time 122.54ms
iter 131100: loss 7.5373, time 122.36ms
iter 131110: loss 7.3222, time 122.96ms
iter 131120: loss 6.5964, time 122.36ms
iter 131130: loss 7.5692, time 122.13ms
iter 131140: loss 6.2948, time 121.87ms
iter 131150: loss 7.2791, time 122.00ms
iter 131160: loss 6.7214, time 121.85ms
iter 131170: loss 7.5871, time 121.38ms
iter 131180: loss 6.8064, time 122.21ms
iter 131190: loss 6.6091, time 121.87ms
iter 131200: loss 6.8271, time 121.68ms
iter 131210: loss 7.0378, time 122.40ms
iter 131220: loss 7.2411, time 122.20ms
iter 131230: loss 7.2395, time 121.93ms
iter 131240: loss 7.5541, time 121.84ms
step 131250: train loss 6.1637, val loss 6.1498
saving checkpoint to out-shakespeare-char
iter 131250: loss 7.6015, time 2890.54ms
iter 131260: loss 8.2802, time 121.23ms
iter 131270: loss 7.2941, time 120.66ms
iter 131280: loss 7.5362, time 121.38ms
iter 131290: loss 6.6791, time 121.33ms
iter 131300: loss 6.7131, time 121.10ms
iter 131310: loss 7.2546, time 121.40ms
iter 131320: loss 6.9471, time 121.20ms
iter 131330: loss 7.9532, time 121.22ms
iter 131340: loss 7.3009, time 121.12ms
iter 131350: loss 6.9277, time 121.88ms
iter 131360: loss 6.9853, time 121.19ms
iter 131370: loss 7.4153, time 121.34ms
iter 131380: loss 7.4675, time 121.13ms
iter 131390: loss 8.5774, time 122.41ms
iter 131400: loss 7.9213, time 121.21ms
iter 131410: loss 7.6054, time 120.75ms
iter 131420: loss 6.9695, time 121.28ms
iter 131430: loss 7.0597, time 121.39ms
iter 131440: loss 7.1522, time 121.14ms
iter 131450: loss 7.4976, time 121.56ms
iter 131460: loss 7.5623, time 121.33ms
iter 131470: loss 7.7066, time 121.53ms
iter 131480: loss 7.0598, time 120.20ms
iter 131490: loss 7.0237, time 121.27ms
step 131500: train loss 6.1673, val loss 6.1621
saving checkpoint to out-shakespeare-char
iter 131500: loss 7.4392, time 2897.38ms
iter 131510: loss 7.1867, time 125.62ms
iter 131520: loss 7.5294, time 125.53ms
iter 131530: loss 7.0919, time 125.65ms
iter 131540: loss 8.2614, time 125.76ms
iter 131550: loss 7.0774, time 124.88ms
iter 131560: loss 6.2344, time 125.76ms
iter 131570: loss 6.5877, time 125.82ms
iter 131580: loss 6.7250, time 126.45ms
iter 131590: loss 7.6942, time 125.50ms
iter 131600: loss 6.9810, time 125.74ms
iter 131610: loss 7.3175, time 125.55ms
iter 131620: loss 6.8313, time 125.79ms
iter 131630: loss 7.1988, time 125.81ms
iter 131640: loss 7.1679, time 125.65ms
iter 131650: loss 7.0028, time 129.86ms
iter 131660: loss 6.6521, time 125.65ms
iter 131670: loss 6.6331, time 125.61ms
iter 131680: loss 7.0744, time 125.61ms
iter 131690: loss 7.1848, time 125.67ms
iter 131700: loss 7.7109, time 126.47ms
iter 131710: loss 6.6424, time 125.83ms
iter 131720: loss 6.9359, time 125.75ms
iter 131730: loss 7.4256, time 125.68ms
iter 131740: loss 6.7729, time 125.62ms
step 131750: train loss 6.1685, val loss 6.1353
saving checkpoint to out-shakespeare-char
iter 131750: loss 6.5308, time 2875.92ms
iter 131760: loss 6.8660, time 125.10ms
iter 131770: loss 7.6581, time 124.75ms
iter 131780: loss 7.7773, time 124.87ms
iter 131790: loss 7.4514, time 124.29ms
iter 131800: loss 7.0710, time 124.92ms
iter 131810: loss 7.4573, time 124.51ms
iter 131820: loss 7.0259, time 128.09ms
iter 131830: loss 7.2923, time 122.13ms
iter 131840: loss 7.0354, time 125.36ms
iter 131850: loss 7.3141, time 125.36ms
iter 131860: loss 7.1396, time 125.24ms
iter 131870: loss 5.9774, time 125.26ms
iter 131880: loss 6.3067, time 125.14ms
iter 131890: loss 7.5371, time 125.15ms
iter 131900: loss 6.7412, time 124.94ms
iter 131910: loss 6.0881, time 125.53ms
iter 131920: loss 7.3880, time 125.27ms
iter 131930: loss 6.8831, time 128.09ms
iter 131940: loss 6.8076, time 125.42ms
iter 131950: loss 6.8654, time 126.68ms
iter 131960: loss 7.6109, time 125.68ms
iter 131970: loss 6.4329, time 125.69ms
iter 131980: loss 6.8536, time 125.82ms
iter 131990: loss 7.1394, time 125.52ms
step 132000: train loss 6.1583, val loss 6.1474
saving checkpoint to out-shakespeare-char
iter 132000: loss 8.1714, time 2874.04ms
iter 132010: loss 6.3431, time 125.93ms
iter 132020: loss 7.9177, time 126.02ms
iter 132030: loss 7.3616, time 125.36ms
iter 132040: loss 6.4856, time 124.89ms
iter 132050: loss 7.4735, time 125.82ms
iter 132060: loss 7.4657, time 126.58ms
iter 132070: loss 6.7832, time 125.83ms
iter 132080: loss 7.3899, time 125.85ms
iter 132090: loss 6.7200, time 125.57ms
iter 132100: loss 6.3035, time 125.57ms
iter 132110: loss 7.1750, time 125.61ms
iter 132120: loss 7.5483, time 125.49ms
iter 132130: loss 6.8008, time 126.00ms
iter 132140: loss 7.0938, time 124.52ms
iter 132150: loss 7.5660, time 126.12ms
iter 132160: loss 7.1009, time 125.96ms
iter 132170: loss 7.2854, time 125.10ms
iter 132180: loss 7.1600, time 124.92ms
iter 132190: loss 7.6564, time 124.83ms
iter 132200: loss 7.9001, time 125.61ms
iter 132210: loss 6.8838, time 125.16ms
iter 132220: loss 6.9877, time 128.86ms
iter 132230: loss 8.1643, time 125.83ms
iter 132240: loss 7.6099, time 125.84ms
step 132250: train loss 6.1579, val loss 6.1169
saving checkpoint to out-shakespeare-char
iter 132250: loss 7.6510, time 2884.25ms
iter 132260: loss 7.8142, time 126.05ms
iter 132270: loss 7.5703, time 125.86ms
iter 132280: loss 7.2333, time 124.97ms
iter 132290: loss 7.2046, time 125.57ms
iter 132300: loss 7.4790, time 125.42ms
iter 132310: loss 7.5422, time 125.48ms
iter 132320: loss 6.7284, time 128.20ms
iter 132330: loss 7.4240, time 125.38ms
iter 132340: loss 7.1360, time 125.62ms
iter 132350: loss 6.7752, time 125.22ms
iter 132360: loss 7.1822, time 125.38ms
iter 132370: loss 6.6873, time 125.00ms
iter 132380: loss 6.8030, time 125.22ms
iter 132390: loss 7.4400, time 125.16ms
iter 132400: loss 7.7597, time 125.04ms
iter 132410: loss 7.4425, time 124.99ms
iter 132420: loss 6.6817, time 125.26ms
iter 132430: loss 7.3012, time 127.95ms
iter 132440: loss 6.8045, time 125.67ms
iter 132450: loss 7.3403, time 125.08ms
iter 132460: loss 7.1094, time 125.26ms
iter 132470: loss 7.0968, time 125.09ms
iter 132480: loss 7.5520, time 125.16ms
iter 132490: loss 7.7973, time 125.73ms
step 132500: train loss 6.1550, val loss 6.1610
saving checkpoint to out-shakespeare-char
iter 132500: loss 7.7794, time 2877.58ms
iter 132510: loss 7.1730, time 125.58ms
iter 132520: loss 7.6462, time 125.29ms
iter 132530: loss 7.8374, time 125.26ms
iter 132540: loss 7.1708, time 125.06ms
iter 132550: loss 7.2363, time 124.97ms
iter 132560: loss 6.9569, time 124.80ms
iter 132570: loss 6.2228, time 125.35ms
iter 132580: loss 7.2317, time 125.67ms
iter 132590: loss 6.7747, time 125.92ms
iter 132600: loss 7.0430, time 127.81ms
iter 132610: loss 6.7210, time 125.42ms
iter 132620: loss 7.1981, time 126.18ms
iter 132630: loss 6.8516, time 125.65ms
iter 132640: loss 7.3242, time 125.56ms
iter 132650: loss 6.9000, time 125.83ms
iter 132660: loss 7.3778, time 125.62ms
iter 132670: loss 6.9941, time 125.76ms
iter 132680: loss 6.8720, time 125.85ms
iter 132690: loss 7.0864, time 125.79ms
iter 132700: loss 7.5472, time 125.79ms
iter 132710: loss 7.9887, time 128.65ms
iter 132720: loss 7.2287, time 125.90ms
iter 132730: loss 7.0552, time 125.91ms
iter 132740: loss 6.9811, time 125.49ms
step 132750: train loss 6.0865, val loss 6.0581
saving checkpoint to out-shakespeare-char
iter 132750: loss 6.9796, time 2875.11ms
iter 132760: loss 7.0062, time 126.46ms
iter 132770: loss 7.4949, time 128.86ms
iter 132780: loss 7.5890, time 125.97ms
iter 132790: loss 7.2163, time 125.98ms
iter 132800: loss 6.7529, time 126.87ms
iter 132810: loss 7.1896, time 126.99ms
iter 132820: loss 7.2351, time 126.03ms
iter 132830: loss 7.3102, time 125.92ms
iter 132840: loss 6.7932, time 126.15ms
iter 132850: loss 7.9622, time 125.84ms
iter 132860: loss 6.7572, time 125.88ms
iter 132870: loss 6.7471, time 125.96ms
iter 132880: loss 6.5350, time 128.19ms
iter 132890: loss 7.7411, time 125.23ms
iter 132900: loss 7.2077, time 125.59ms
iter 132910: loss 6.3630, time 126.55ms
iter 132920: loss 6.5600, time 126.26ms
iter 132930: loss 7.0081, time 125.66ms
iter 132940: loss 7.7265, time 125.96ms
iter 132950: loss 7.0166, time 125.36ms
iter 132960: loss 7.6436, time 125.52ms
iter 132970: loss 7.0598, time 125.42ms
iter 132980: loss 6.4895, time 125.89ms
iter 132990: loss 7.4315, time 128.13ms
step 133000: train loss 6.0926, val loss 6.1286
saving checkpoint to out-shakespeare-char
iter 133000: loss 8.1275, time 2884.08ms
iter 133010: loss 7.4428, time 126.87ms
iter 133020: loss 8.0820, time 125.52ms
iter 133030: loss 7.1663, time 125.27ms
iter 133040: loss 7.0120, time 126.72ms
iter 133050: loss 6.8818, time 125.49ms
iter 133060: loss 6.8928, time 126.61ms
iter 133070: loss 6.6396, time 125.74ms
iter 133080: loss 7.3299, time 125.24ms
iter 133090: loss 7.6839, time 125.48ms
iter 133100: loss 6.6663, time 125.09ms
iter 133110: loss 6.8410, time 125.71ms
iter 133120: loss 7.4080, time 126.06ms
iter 133130: loss 6.3025, time 128.75ms
iter 133140: loss 7.5665, time 124.98ms
iter 133150: loss 7.1578, time 125.70ms
iter 133160: loss 7.1191, time 125.25ms
iter 133170: loss 6.9655, time 126.33ms
iter 133180: loss 7.2648, time 125.07ms
iter 133190: loss 6.6274, time 125.32ms
iter 133200: loss 7.7294, time 126.11ms
iter 133210: loss 6.8884, time 125.92ms
iter 133220: loss 8.0172, time 125.87ms
iter 133230: loss 7.0607, time 126.03ms
iter 133240: loss 7.0581, time 128.55ms
step 133250: train loss 6.1526, val loss 6.1443
saving checkpoint to out-shakespeare-char
iter 133250: loss 7.3482, time 2905.52ms
iter 133260: loss 7.5365, time 126.14ms
iter 133270: loss 6.4229, time 125.86ms
iter 133280: loss 7.2915, time 125.77ms
iter 133290: loss 7.2314, time 125.87ms
iter 133300: loss 6.7864, time 125.63ms
iter 133310: loss 7.5235, time 125.83ms
iter 133320: loss 6.6261, time 126.30ms
iter 133330: loss 7.3813, time 126.05ms
iter 133340: loss 7.6472, time 129.02ms
iter 133350: loss 7.1034, time 126.23ms
iter 133360: loss 6.7811, time 126.09ms
iter 133370: loss 6.5659, time 126.12ms
iter 133380: loss 7.3491, time 128.89ms
iter 133390: loss 7.5205, time 126.06ms
iter 133400: loss 7.5294, time 125.94ms
iter 133410: loss 7.6255, time 127.79ms
iter 133420: loss 7.4847, time 126.26ms
iter 133430: loss 6.9659, time 126.01ms
iter 133440: loss 7.5294, time 126.04ms
iter 133450: loss 7.6190, time 126.31ms
iter 133460: loss 7.5866, time 125.89ms
iter 133470: loss 7.0087, time 125.80ms
iter 133480: loss 6.8082, time 126.70ms
iter 133490: loss 7.0474, time 128.96ms
step 133500: train loss 6.1578, val loss 6.1934
saving checkpoint to out-shakespeare-char
iter 133500: loss 7.5710, time 2871.44ms
iter 133510: loss 6.9069, time 120.36ms
iter 133520: loss 7.3097, time 119.93ms
iter 133530: loss 7.0433, time 120.03ms
iter 133540: loss 7.1198, time 119.90ms
iter 133550: loss 7.5674, time 120.30ms
iter 133560: loss 6.7972, time 119.61ms
iter 133570: loss 7.4280, time 120.00ms
iter 133580: loss 7.1991, time 119.87ms
iter 133590: loss 8.1551, time 121.43ms
iter 133600: loss 7.2198, time 119.87ms
iter 133610: loss 6.7584, time 119.76ms
iter 133620: loss 8.0246, time 119.89ms
iter 133630: loss 7.0344, time 120.58ms
iter 133640: loss 7.7085, time 121.33ms
iter 133650: loss 7.5023, time 120.22ms
iter 133660: loss 6.2206, time 119.78ms
iter 133670: loss 7.2024, time 119.92ms
iter 133680: loss 6.7430, time 119.57ms
iter 133690: loss 7.1725, time 119.88ms
iter 133700: loss 7.3430, time 119.58ms
iter 133710: loss 6.8968, time 119.69ms
iter 133720: loss 7.1935, time 119.68ms
iter 133730: loss 7.3603, time 119.65ms
iter 133740: loss 6.9659, time 119.67ms
step 133750: train loss 6.1717, val loss 6.1232
saving checkpoint to out-shakespeare-char
iter 133750: loss 6.9171, time 2902.57ms
iter 133760: loss 7.2683, time 119.68ms
iter 133770: loss 6.4238, time 119.45ms
iter 133780: loss 7.4808, time 119.61ms
iter 133790: loss 6.8418, time 119.30ms
iter 133800: loss 6.9839, time 119.34ms
iter 133810: loss 8.0412, time 119.38ms
iter 133820: loss 6.8594, time 119.68ms
iter 133830: loss 7.2592, time 119.37ms
iter 133840: loss 7.4194, time 119.48ms
iter 133850: loss 6.9475, time 120.47ms
iter 133860: loss 7.6046, time 120.21ms
iter 133870: loss 7.1941, time 119.68ms
iter 133880: loss 7.1684, time 121.01ms
iter 133890: loss 7.9513, time 119.53ms
iter 133900: loss 6.8624, time 119.91ms
iter 133910: loss 7.2392, time 120.47ms
iter 133920: loss 6.5095, time 120.11ms
iter 133930: loss 7.2937, time 120.33ms
iter 133940: loss 7.0927, time 121.50ms
iter 133950: loss 7.0440, time 119.56ms
iter 133960: loss 7.2600, time 119.63ms
iter 133970: loss 7.1026, time 119.66ms
iter 133980: loss 7.0956, time 120.35ms
iter 133990: loss 7.3924, time 120.92ms
step 134000: train loss 6.1455, val loss 6.1261
saving checkpoint to out-shakespeare-char
iter 134000: loss 6.4556, time 2901.13ms
iter 134010: loss 6.6268, time 122.80ms
iter 134020: loss 7.4797, time 121.48ms
iter 134030: loss 6.7221, time 122.70ms
iter 134040: loss 7.7247, time 122.09ms
iter 134050: loss 7.2190, time 122.76ms
iter 134060: loss 6.9131, time 121.98ms
iter 134070: loss 6.8301, time 122.77ms
iter 134080: loss 6.4747, time 121.55ms
iter 134090: loss 7.6131, time 122.49ms
iter 134100: loss 6.5954, time 121.49ms
iter 134110: loss 6.7444, time 123.13ms
iter 134120: loss 6.6694, time 121.48ms
iter 134130: loss 6.3435, time 122.77ms
iter 134140: loss 7.7824, time 121.62ms
iter 134150: loss 7.3885, time 122.75ms
iter 134160: loss 7.3875, time 121.74ms
iter 134170: loss 6.9691, time 122.65ms
iter 134180: loss 6.6903, time 121.53ms
iter 134190: loss 7.1754, time 122.84ms
iter 134200: loss 6.5138, time 122.03ms
iter 134210: loss 7.2677, time 122.77ms
iter 134220: loss 6.9739, time 121.65ms
iter 134230: loss 6.8207, time 122.74ms
iter 134240: loss 7.1894, time 121.68ms
step 134250: train loss 6.1318, val loss 6.1401
saving checkpoint to out-shakespeare-char
iter 134250: loss 7.4994, time 2895.83ms
iter 134260: loss 7.8387, time 121.76ms
iter 134270: loss 6.8453, time 124.88ms
iter 134280: loss 7.1339, time 121.80ms
iter 134290: loss 6.8329, time 121.86ms
iter 134300: loss 7.6236, time 122.18ms
iter 134310: loss 6.9263, time 123.40ms
iter 134320: loss 7.6633, time 121.92ms
iter 134330: loss 7.6908, time 121.96ms
iter 134340: loss 7.4366, time 122.09ms
iter 134350: loss 7.2451, time 123.93ms
iter 134360: loss 6.6039, time 121.94ms
iter 134370: loss 6.8889, time 122.19ms
iter 134380: loss 6.9526, time 121.78ms
iter 134390: loss 7.2192, time 123.23ms
iter 134400: loss 7.6449, time 121.94ms
iter 134410: loss 7.3453, time 121.94ms
iter 134420: loss 7.7042, time 121.66ms
iter 134430: loss 7.4029, time 122.21ms
iter 134440: loss 6.9664, time 122.37ms
iter 134450: loss 6.9447, time 122.14ms
iter 134460: loss 6.9986, time 121.89ms
iter 134470: loss 7.9607, time 122.17ms
iter 134480: loss 6.1851, time 121.88ms
iter 134490: loss 6.7307, time 123.50ms
step 134500: train loss 6.0841, val loss 6.1241
saving checkpoint to out-shakespeare-char
iter 134500: loss 6.8445, time 2901.75ms
iter 134510: loss 6.6815, time 126.00ms
iter 134520: loss 7.9516, time 125.58ms
iter 134530: loss 7.7817, time 124.95ms
iter 134540: loss 7.3782, time 125.33ms
iter 134550: loss 7.3983, time 125.43ms
iter 134560: loss 6.9538, time 125.31ms
iter 134570: loss 6.5916, time 125.01ms
iter 134580: loss 6.7917, time 125.02ms
iter 134590: loss 7.9688, time 125.18ms
iter 134600: loss 7.1666, time 125.09ms
iter 134610: loss 7.0984, time 127.65ms
iter 134620: loss 6.6934, time 125.73ms
iter 134630: loss 7.7069, time 125.52ms
iter 134640: loss 7.5617, time 125.58ms
iter 134650: loss 6.7419, time 125.59ms
iter 134660: loss 6.7521, time 125.55ms
iter 134670: loss 6.9123, time 125.57ms
iter 134680: loss 8.3077, time 125.91ms
iter 134690: loss 7.4560, time 128.55ms
iter 134700: loss 7.2983, time 125.59ms
iter 134710: loss 7.3384, time 125.57ms
iter 134720: loss 7.2499, time 125.67ms
iter 134730: loss 7.5043, time 125.87ms
iter 134740: loss 7.0425, time 126.15ms
step 134750: train loss 6.1401, val loss 6.1652
saving checkpoint to out-shakespeare-char
iter 134750: loss 7.5817, time 2865.42ms
iter 134760: loss 7.1377, time 125.34ms
iter 134770: loss 7.1452, time 125.66ms
iter 134780: loss 7.1669, time 125.68ms
iter 134790: loss 6.8092, time 126.05ms
iter 134800: loss 7.1452, time 126.26ms
iter 134810: loss 7.2242, time 127.17ms
iter 134820: loss 7.1064, time 128.64ms
iter 134830: loss 7.9713, time 124.74ms
iter 134840: loss 6.1378, time 125.73ms
iter 134850: loss 6.9014, time 126.22ms
iter 134860: loss 6.5408, time 125.78ms
iter 134870: loss 6.5611, time 125.71ms
iter 134880: loss 6.7522, time 126.58ms
iter 134890: loss 7.2683, time 125.50ms
iter 134900: loss 6.7341, time 125.59ms
iter 134910: loss 6.9338, time 125.84ms
iter 134920: loss 7.7813, time 125.45ms
iter 134930: loss 7.7942, time 125.45ms
iter 134940: loss 7.2645, time 125.80ms
iter 134950: loss 7.3469, time 125.47ms
iter 134960: loss 6.9059, time 126.27ms
iter 134970: loss 8.3174, time 125.98ms
iter 134980: loss 7.4689, time 125.62ms
iter 134990: loss 8.2506, time 125.70ms
step 135000: train loss 6.1647, val loss 6.1421
saving checkpoint to out-shakespeare-char
iter 135000: loss 6.5386, time 2902.65ms
iter 135010: loss 7.7794, time 125.87ms
iter 135020: loss 6.5226, time 125.88ms
iter 135030: loss 7.8424, time 129.49ms
iter 135040: loss 6.4480, time 125.41ms
iter 135050: loss 6.4989, time 125.62ms
iter 135060: loss 7.1590, time 125.69ms
iter 135070: loss 7.7144, time 125.20ms
iter 135080: loss 7.0018, time 125.54ms
iter 135090: loss 7.0271, time 126.17ms
iter 135100: loss 7.0570, time 125.46ms
iter 135110: loss 6.7113, time 125.59ms
iter 135120: loss 7.2391, time 125.57ms
iter 135130: loss 7.1196, time 125.81ms
iter 135140: loss 7.5896, time 128.46ms
iter 135150: loss 7.4665, time 125.55ms
iter 135160: loss 7.5918, time 125.71ms
iter 135170: loss 6.9241, time 125.94ms
iter 135180: loss 6.8004, time 125.90ms
iter 135190: loss 8.0666, time 125.42ms
iter 135200: loss 6.7664, time 125.61ms
iter 135210: loss 7.1289, time 125.36ms
iter 135220: loss 7.1923, time 125.39ms
iter 135230: loss 8.1009, time 125.59ms
iter 135240: loss 6.7889, time 125.61ms
step 135250: train loss 6.1052, val loss 6.1295
saving checkpoint to out-shakespeare-char
iter 135250: loss 7.1927, time 2899.28ms
iter 135260: loss 7.1424, time 125.59ms
iter 135270: loss 8.0742, time 125.36ms
iter 135280: loss 6.4405, time 125.59ms
iter 135290: loss 6.8368, time 125.99ms
iter 135300: loss 7.1013, time 125.03ms
iter 135310: loss 7.9898, time 128.33ms
iter 135320: loss 6.9763, time 125.14ms
iter 135330: loss 7.1717, time 126.34ms
iter 135340: loss 7.9170, time 125.16ms
iter 135350: loss 6.5739, time 124.99ms
iter 135360: loss 6.8743, time 125.06ms
iter 135370: loss 7.2837, time 126.00ms
iter 135380: loss 7.8564, time 129.33ms
iter 135390: loss 6.8546, time 125.93ms
iter 135400: loss 6.9191, time 125.85ms
iter 135410: loss 7.2477, time 126.42ms
iter 135420: loss 7.3359, time 125.69ms
iter 135430: loss 7.5296, time 125.81ms
iter 135440: loss 6.3874, time 125.74ms
iter 135450: loss 6.6446, time 125.67ms
iter 135460: loss 7.1421, time 126.02ms
iter 135470: loss 6.9861, time 126.18ms
iter 135480: loss 7.5448, time 125.73ms
iter 135490: loss 7.1644, time 128.41ms
step 135500: train loss 6.1151, val loss 6.1648
saving checkpoint to out-shakespeare-char
iter 135500: loss 7.2874, time 2900.25ms
iter 135510: loss 6.5059, time 125.99ms
iter 135520: loss 7.4603, time 128.42ms
iter 135530: loss 7.8085, time 125.93ms
iter 135540: loss 6.5425, time 126.41ms
iter 135550: loss 7.4305, time 126.29ms
iter 135560: loss 7.9271, time 125.74ms
iter 135570: loss 6.8656, time 125.72ms
iter 135580: loss 7.1654, time 125.75ms
iter 135590: loss 7.0677, time 125.74ms
iter 135600: loss 7.4860, time 125.84ms
iter 135610: loss 6.3355, time 126.04ms
iter 135620: loss 7.8014, time 126.15ms
iter 135630: loss 7.6984, time 129.45ms
iter 135640: loss 7.1446, time 125.72ms
iter 135650: loss 7.7042, time 125.78ms
iter 135660: loss 7.7922, time 125.68ms
iter 135670: loss 7.3698, time 125.78ms
iter 135680: loss 7.7269, time 125.61ms
iter 135690: loss 6.9413, time 125.75ms
iter 135700: loss 7.2717, time 125.85ms
iter 135710: loss 7.3710, time 125.70ms
iter 135720: loss 6.3212, time 125.72ms
iter 135730: loss 7.5063, time 125.06ms
iter 135740: loss 7.6831, time 125.65ms
step 135750: train loss 6.1280, val loss 6.1545
saving checkpoint to out-shakespeare-char
iter 135750: loss 6.9408, time 2879.12ms
iter 135760: loss 7.2122, time 126.01ms
iter 135770: loss 6.8756, time 125.86ms
iter 135780: loss 7.2565, time 125.34ms
iter 135790: loss 7.2538, time 123.93ms
iter 135800: loss 7.1228, time 125.58ms
iter 135810: loss 6.7877, time 125.13ms
iter 135820: loss 6.9046, time 125.27ms
iter 135830: loss 7.1508, time 125.82ms
iter 135840: loss 7.1407, time 128.47ms
iter 135850: loss 7.0616, time 124.81ms
iter 135860: loss 6.5750, time 125.63ms
iter 135870: loss 7.0884, time 125.52ms
iter 135880: loss 6.9155, time 125.44ms
iter 135890: loss 7.0588, time 125.32ms
iter 135900: loss 6.8979, time 125.77ms
iter 135910: loss 7.2574, time 126.08ms
iter 135920: loss 7.6526, time 125.63ms
iter 135930: loss 7.1305, time 125.58ms
iter 135940: loss 6.9102, time 125.57ms
iter 135950: loss 7.4789, time 125.47ms
iter 135960: loss 7.2628, time 125.48ms
iter 135970: loss 7.1543, time 125.47ms
iter 135980: loss 7.6107, time 125.72ms
iter 135990: loss 7.5687, time 128.52ms
step 136000: train loss 6.2165, val loss 6.0981
saving checkpoint to out-shakespeare-char
iter 136000: loss 7.6199, time 2888.77ms
iter 136010: loss 7.1825, time 125.56ms
iter 136020: loss 7.2565, time 125.57ms
iter 136030: loss 7.2428, time 125.96ms
iter 136040: loss 7.2552, time 125.82ms
iter 136050: loss 7.0961, time 128.12ms
iter 136060: loss 7.2642, time 125.21ms
iter 136070: loss 7.0064, time 125.07ms
iter 136080: loss 7.7780, time 124.98ms
iter 136090: loss 8.2323, time 125.07ms
iter 136100: loss 7.4447, time 124.39ms
iter 136110: loss 6.7519, time 125.43ms
iter 136120: loss 7.3874, time 128.10ms
iter 136130: loss 7.7909, time 125.87ms
iter 136140: loss 7.9343, time 125.37ms
iter 136150: loss 7.4180, time 125.31ms
iter 136160: loss 6.3363, time 124.98ms
iter 136170: loss 7.5515, time 125.02ms
iter 136180: loss 7.0972, time 125.67ms
iter 136190: loss 6.5884, time 125.13ms
iter 136200: loss 6.8556, time 125.55ms
iter 136210: loss 6.7546, time 124.91ms
iter 136220: loss 7.2953, time 125.46ms
iter 136230: loss 7.1777, time 127.94ms
iter 136240: loss 6.6027, time 125.15ms
step 136250: train loss 6.1590, val loss 6.1144
saving checkpoint to out-shakespeare-char
iter 136250: loss 7.1891, time 2892.28ms
iter 136260: loss 7.2960, time 125.27ms
iter 136270: loss 7.3299, time 125.55ms
iter 136280: loss 7.6445, time 125.56ms
iter 136290: loss 6.4414, time 125.06ms
iter 136300: loss 7.2853, time 125.48ms
iter 136310: loss 6.7723, time 126.04ms
iter 136320: loss 7.0776, time 128.89ms
iter 136330: loss 7.5591, time 125.65ms
iter 136340: loss 7.0842, time 125.93ms
iter 136350: loss 8.0040, time 125.93ms
iter 136360: loss 6.1661, time 125.55ms
iter 136370: loss 6.1426, time 125.61ms
iter 136380: loss 6.5362, time 125.54ms
iter 136390: loss 7.6951, time 125.44ms
iter 136400: loss 6.8072, time 125.71ms
iter 136410: loss 6.7832, time 125.80ms
iter 136420: loss 7.1110, time 125.88ms
iter 136430: loss 6.8753, time 128.64ms
iter 136440: loss 7.1561, time 125.45ms
iter 136450: loss 7.5142, time 125.89ms
iter 136460: loss 6.4901, time 125.47ms
iter 136470: loss 7.0502, time 125.67ms
iter 136480: loss 7.0365, time 125.52ms
iter 136490: loss 7.5808, time 125.21ms
step 136500: train loss 6.0894, val loss 6.1082
saving checkpoint to out-shakespeare-char
iter 136500: loss 7.1722, time 2895.82ms
iter 136510: loss 6.4259, time 125.32ms
iter 136520: loss 7.3747, time 125.23ms
iter 136530: loss 7.0777, time 125.09ms
iter 136540: loss 7.3319, time 125.97ms
iter 136550: loss 6.2862, time 124.97ms
iter 136560: loss 7.6635, time 126.44ms
iter 136570: loss 7.2280, time 125.07ms
iter 136580: loss 7.1968, time 125.34ms
iter 136590: loss 8.1512, time 125.09ms
iter 136600: loss 6.9629, time 124.96ms
iter 136610: loss 6.2526, time 128.19ms
iter 136620: loss 7.6773, time 125.40ms
iter 136630: loss 8.1524, time 125.45ms
iter 136640: loss 7.3839, time 125.62ms
iter 136650: loss 6.5578, time 125.36ms
iter 136660: loss 7.0250, time 125.18ms
iter 136670: loss 7.2996, time 125.16ms
iter 136680: loss 6.8457, time 125.51ms
iter 136690: loss 7.2794, time 125.28ms
iter 136700: loss 6.6888, time 125.09ms
iter 136710: loss 7.9045, time 124.95ms
iter 136720: loss 7.1093, time 128.10ms
iter 136730: loss 6.3550, time 124.82ms
iter 136740: loss 7.9585, time 125.40ms
step 136750: train loss 6.1549, val loss 6.1288
saving checkpoint to out-shakespeare-char
iter 136750: loss 6.8492, time 2886.65ms
iter 136760: loss 7.9078, time 125.08ms
iter 136770: loss 7.3649, time 124.47ms
iter 136780: loss 6.8883, time 125.23ms
iter 136790: loss 6.7006, time 125.06ms
iter 136800: loss 7.0595, time 125.29ms
iter 136810: loss 6.8257, time 128.38ms
iter 136820: loss 7.9386, time 124.46ms
iter 136830: loss 7.2262, time 124.23ms
iter 136840: loss 7.2269, time 125.42ms
iter 136850: loss 8.0995, time 128.20ms
iter 136860: loss 7.4839, time 124.87ms
iter 136870: loss 7.0986, time 125.09ms
iter 136880: loss 7.3110, time 125.17ms
iter 136890: loss 7.7341, time 124.78ms
iter 136900: loss 7.8747, time 124.21ms
iter 136910: loss 7.3422, time 124.92ms
iter 136920: loss 7.9941, time 124.85ms
iter 136930: loss 7.0945, time 125.14ms
iter 136940: loss 7.1733, time 124.86ms
iter 136950: loss 6.7022, time 125.42ms
iter 136960: loss 7.1193, time 127.65ms
iter 136970: loss 7.6738, time 125.18ms
iter 136980: loss 6.2376, time 125.22ms
iter 136990: loss 6.7409, time 125.17ms
step 137000: train loss 6.1306, val loss 6.1084
saving checkpoint to out-shakespeare-char
iter 137000: loss 6.9925, time 2877.62ms
iter 137010: loss 7.0533, time 125.47ms
iter 137020: loss 6.6354, time 127.90ms
iter 137030: loss 7.6674, time 125.30ms
iter 137040: loss 6.6102, time 126.30ms
iter 137050: loss 7.3766, time 125.65ms
iter 137060: loss 6.8799, time 125.47ms
iter 137070: loss 6.9550, time 126.82ms
iter 137080: loss 6.8646, time 125.43ms
iter 137090: loss 6.5654, time 125.70ms
iter 137100: loss 7.2616, time 125.98ms
iter 137110: loss 6.6759, time 128.44ms
iter 137120: loss 6.9972, time 125.08ms
iter 137130: loss 6.9956, time 125.01ms
iter 137140: loss 7.3150, time 125.50ms
iter 137150: loss 7.3647, time 125.04ms
iter 137160: loss 6.4746, time 125.05ms
iter 137170: loss 7.1600, time 125.37ms
iter 137180: loss 6.7934, time 125.47ms
iter 137190: loss 7.0577, time 124.96ms
iter 137200: loss 6.4245, time 125.13ms
iter 137210: loss 6.9471, time 125.05ms
iter 137220: loss 8.5699, time 128.30ms
iter 137230: loss 7.8473, time 125.20ms
iter 137240: loss 7.1027, time 125.16ms
step 137250: train loss 6.1226, val loss 6.0816
saving checkpoint to out-shakespeare-char
iter 137250: loss 6.8983, time 2870.99ms
iter 137260: loss 6.8943, time 125.32ms
iter 137270: loss 7.2056, time 124.97ms
iter 137280: loss 6.9717, time 123.97ms
iter 137290: loss 7.7467, time 125.73ms
iter 137300: loss 7.8310, time 125.04ms
iter 137310: loss 7.4307, time 125.53ms
iter 137320: loss 7.1648, time 128.10ms
iter 137330: loss 7.1036, time 124.88ms
iter 137340: loss 7.2105, time 124.94ms
iter 137350: loss 7.2516, time 125.05ms
iter 137360: loss 6.5686, time 125.21ms
iter 137370: loss 7.2761, time 125.10ms
iter 137380: loss 6.8526, time 125.18ms
iter 137390: loss 6.7514, time 124.22ms
iter 137400: loss 6.8733, time 125.06ms
iter 137410: loss 6.9220, time 125.15ms
iter 137420: loss 7.4172, time 125.40ms
iter 137430: loss 7.9015, time 128.33ms
iter 137440: loss 6.9386, time 125.33ms
iter 137450: loss 6.6738, time 125.03ms
iter 137460: loss 7.3834, time 125.34ms
iter 137470: loss 6.4089, time 125.51ms
iter 137480: loss 7.1094, time 125.84ms
iter 137490: loss 7.5624, time 125.76ms
step 137500: train loss 6.1639, val loss 6.1109
saving checkpoint to out-shakespeare-char
iter 137500: loss 6.8370, time 2867.10ms
iter 137510: loss 7.4560, time 126.00ms
iter 137520: loss 7.6515, time 125.43ms
iter 137530: loss 6.7469, time 126.05ms
iter 137540: loss 7.0726, time 125.88ms
iter 137550: loss 6.6567, time 126.13ms
iter 137560: loss 7.2660, time 128.51ms
iter 137570: loss 6.4489, time 126.37ms
iter 137580: loss 6.6199, time 125.72ms
iter 137590: loss 7.7737, time 125.65ms
iter 137600: loss 6.9392, time 125.58ms
iter 137610: loss 7.0207, time 125.53ms
iter 137620: loss 7.1563, time 125.67ms
iter 137630: loss 7.4730, time 125.51ms
iter 137640: loss 7.7213, time 125.82ms
iter 137650: loss 7.4651, time 125.82ms
iter 137660: loss 7.4054, time 125.87ms
iter 137670: loss 7.4468, time 128.85ms
iter 137680: loss 7.1180, time 125.37ms
iter 137690: loss 7.0416, time 126.22ms
iter 137700: loss 7.0679, time 125.59ms
iter 137710: loss 7.3394, time 125.72ms
iter 137720: loss 7.0832, time 125.58ms
iter 137730: loss 6.8462, time 125.78ms
iter 137740: loss 7.7563, time 125.57ms
step 137750: train loss 6.0887, val loss 6.1125
saving checkpoint to out-shakespeare-char
iter 137750: loss 6.7602, time 2868.32ms
iter 137760: loss 7.1361, time 125.61ms
iter 137770: loss 5.8856, time 125.07ms
iter 137780: loss 6.7330, time 125.17ms
iter 137790: loss 7.1302, time 124.99ms
iter 137800: loss 6.5586, time 125.26ms
iter 137810: loss 7.4863, time 125.20ms
iter 137820: loss 7.2094, time 125.10ms
iter 137830: loss 7.4826, time 125.35ms
iter 137840: loss 6.5559, time 128.13ms
iter 137850: loss 6.7425, time 125.20ms
iter 137860: loss 6.6928, time 124.93ms
iter 137870: loss 6.6825, time 125.25ms
iter 137880: loss 8.4184, time 128.50ms
iter 137890: loss 6.6994, time 127.80ms
iter 137900: loss 7.1043, time 125.79ms
iter 137910: loss 7.3954, time 125.67ms
iter 137920: loss 7.2881, time 128.96ms
iter 137930: loss 7.6383, time 125.66ms
iter 137940: loss 6.2753, time 125.19ms
iter 137950: loss 6.5802, time 125.31ms
iter 137960: loss 7.6127, time 125.16ms
iter 137970: loss 7.3691, time 125.42ms
iter 137980: loss 7.2992, time 125.43ms
iter 137990: loss 7.3432, time 124.23ms
step 138000: train loss 6.1001, val loss 6.1094
saving checkpoint to out-shakespeare-char
iter 138000: loss 7.1399, time 2879.27ms
iter 138010: loss 7.1730, time 125.01ms
iter 138020: loss 6.9403, time 125.16ms
iter 138030: loss 6.6026, time 125.27ms
iter 138040: loss 7.9100, time 125.59ms
iter 138050: loss 7.4462, time 127.96ms
iter 138060: loss 7.3812, time 125.05ms
iter 138070: loss 6.4163, time 125.36ms
iter 138080: loss 6.9345, time 126.36ms
iter 138090: loss 7.0560, time 124.98ms
iter 138100: loss 7.7490, time 125.56ms
iter 138110: loss 6.7165, time 125.78ms
iter 138120: loss 6.9963, time 125.31ms
iter 138130: loss 7.3420, time 124.19ms
iter 138140: loss 7.3394, time 125.40ms
iter 138150: loss 7.4864, time 125.53ms
iter 138160: loss 6.7125, time 128.22ms
iter 138170: loss 7.9592, time 124.66ms
iter 138180: loss 6.8194, time 125.68ms
iter 138190: loss 6.6681, time 125.17ms
iter 138200: loss 6.8913, time 124.79ms
iter 138210: loss 7.3612, time 124.93ms
iter 138220: loss 7.3821, time 124.93ms
iter 138230: loss 6.8411, time 125.09ms
iter 138240: loss 6.6985, time 125.01ms
step 138250: train loss 6.1484, val loss 6.1297
saving checkpoint to out-shakespeare-char
iter 138250: loss 6.8057, time 2867.49ms
iter 138260: loss 7.2612, time 125.66ms
iter 138270: loss 6.4720, time 124.97ms
iter 138280: loss 6.8970, time 125.01ms
iter 138290: loss 7.1793, time 124.83ms
iter 138300: loss 6.5205, time 125.54ms
iter 138310: loss 7.3349, time 124.77ms
iter 138320: loss 7.1691, time 126.69ms
iter 138330: loss 6.5399, time 123.71ms
iter 138340: loss 6.9429, time 124.81ms
iter 138350: loss 7.5490, time 125.19ms
iter 138360: loss 7.7358, time 127.76ms
iter 138370: loss 7.1815, time 124.94ms
iter 138380: loss 6.8698, time 124.02ms
iter 138390: loss 7.4223, time 125.46ms
iter 138400: loss 6.9090, time 124.82ms
iter 138410: loss 6.4929, time 124.18ms
iter 138420: loss 8.0803, time 124.61ms
iter 138430: loss 5.9554, time 124.96ms
iter 138440: loss 6.7137, time 124.90ms
iter 138450: loss 8.0648, time 124.86ms
iter 138460: loss 6.4497, time 124.82ms
iter 138470: loss 7.5639, time 128.07ms
iter 138480: loss 7.4633, time 125.35ms
iter 138490: loss 7.8127, time 124.87ms
step 138500: train loss 6.1206, val loss 6.1355
saving checkpoint to out-shakespeare-char
iter 138500: loss 7.8569, time 2900.85ms
iter 138510: loss 7.2224, time 121.89ms
iter 138520: loss 7.1667, time 124.63ms
iter 138530: loss 7.4711, time 120.97ms
iter 138540: loss 7.4301, time 124.58ms
iter 138550: loss 6.2361, time 120.64ms
iter 138560: loss 7.3292, time 124.71ms
iter 138570: loss 6.7700, time 120.89ms
iter 138580: loss 7.3961, time 124.26ms
iter 138590: loss 7.4207, time 121.53ms
iter 138600: loss 7.7147, time 124.63ms
iter 138610: loss 6.5891, time 121.03ms
iter 138620: loss 6.4362, time 124.70ms
iter 138630: loss 7.2397, time 121.48ms
iter 138640: loss 7.3124, time 124.15ms
iter 138650: loss 7.0083, time 121.46ms
iter 138660: loss 7.0896, time 123.79ms
iter 138670: loss 7.4292, time 121.39ms
iter 138680: loss 7.5820, time 124.60ms
iter 138690: loss 7.1726, time 121.28ms
iter 138700: loss 7.2665, time 124.36ms
iter 138710: loss 7.4469, time 121.51ms
iter 138720: loss 6.5830, time 123.20ms
iter 138730: loss 6.9191, time 121.33ms
iter 138740: loss 6.3213, time 123.48ms
step 138750: train loss 6.1398, val loss 6.1310
saving checkpoint to out-shakespeare-char
iter 138750: loss 6.3482, time 2898.30ms
iter 138760: loss 7.1383, time 121.40ms
iter 138770: loss 7.0324, time 122.78ms
iter 138780: loss 7.4356, time 121.63ms
iter 138790: loss 7.2045, time 122.89ms
iter 138800: loss 6.8496, time 121.66ms
iter 138810: loss 6.6461, time 122.56ms
iter 138820: loss 7.4822, time 121.11ms
iter 138830: loss 6.9162, time 123.22ms
iter 138840: loss 7.4302, time 121.55ms
iter 138850: loss 7.4171, time 121.86ms
iter 138860: loss 7.4783, time 121.62ms
iter 138870: loss 7.0132, time 122.14ms
iter 138880: loss 7.0568, time 121.71ms
iter 138890: loss 7.3295, time 122.72ms
iter 138900: loss 6.9855, time 121.80ms
iter 138910: loss 7.3974, time 122.75ms
iter 138920: loss 6.9796, time 121.58ms
iter 138930: loss 7.0112, time 122.78ms
iter 138940: loss 7.3754, time 121.77ms
iter 138950: loss 7.1405, time 122.66ms
iter 138960: loss 6.9411, time 121.67ms
iter 138970: loss 7.4853, time 123.08ms
iter 138980: loss 6.4415, time 121.62ms
iter 138990: loss 7.3335, time 123.64ms
step 139000: train loss 6.0883, val loss 6.1353
saving checkpoint to out-shakespeare-char
iter 139000: loss 7.3081, time 2910.08ms
iter 139010: loss 6.5522, time 124.59ms
iter 139020: loss 7.9572, time 121.72ms
iter 139030: loss 6.5632, time 124.63ms
iter 139040: loss 7.1940, time 121.59ms
iter 139050: loss 7.5528, time 124.53ms
iter 139060: loss 7.0182, time 121.50ms
iter 139070: loss 7.0870, time 124.62ms
iter 139080: loss 7.1762, time 121.77ms
iter 139090: loss 6.3241, time 124.62ms
iter 139100: loss 7.2739, time 121.55ms
iter 139110: loss 7.7710, time 125.22ms
iter 139120: loss 7.0109, time 121.51ms
iter 139130: loss 7.4780, time 124.88ms
iter 139140: loss 7.3007, time 121.93ms
iter 139150: loss 7.9153, time 124.77ms
iter 139160: loss 6.7812, time 121.72ms
iter 139170: loss 6.9142, time 124.36ms
iter 139180: loss 7.2466, time 122.01ms
iter 139190: loss 7.2170, time 124.75ms
iter 139200: loss 6.7589, time 121.59ms
iter 139210: loss 6.6287, time 124.57ms
iter 139220: loss 6.9093, time 121.57ms
iter 139230: loss 7.5729, time 123.11ms
iter 139240: loss 6.5663, time 121.88ms
step 139250: train loss 6.1575, val loss 6.1434
saving checkpoint to out-shakespeare-char
iter 139250: loss 6.8009, time 2912.40ms
iter 139260: loss 7.3396, time 121.68ms
iter 139270: loss 6.9146, time 121.98ms
iter 139280: loss 7.0904, time 121.67ms
iter 139290: loss 6.9206, time 121.98ms
iter 139300: loss 7.1684, time 121.63ms
iter 139310: loss 7.0843, time 121.89ms
iter 139320: loss 7.0254, time 121.71ms
iter 139330: loss 7.0235, time 121.82ms
iter 139340: loss 6.7185, time 121.99ms
iter 139350: loss 6.5375, time 121.84ms
iter 139360: loss 7.4052, time 121.67ms
iter 139370: loss 7.0777, time 121.98ms
iter 139380: loss 7.4442, time 120.87ms
iter 139390: loss 7.1469, time 122.29ms
iter 139400: loss 7.2271, time 121.75ms
iter 139410: loss 6.3948, time 121.94ms
iter 139420: loss 7.4955, time 121.84ms
iter 139430: loss 6.3937, time 122.11ms
iter 139440: loss 7.1272, time 121.59ms
iter 139450: loss 7.2773, time 121.76ms
iter 139460: loss 7.2992, time 121.62ms
iter 139470: loss 6.9993, time 121.87ms
iter 139480: loss 7.2314, time 121.89ms
iter 139490: loss 6.7769, time 122.09ms
step 139500: train loss 6.1067, val loss 6.0731
saving checkpoint to out-shakespeare-char
iter 139500: loss 7.0634, time 2910.85ms
iter 139510: loss 7.1746, time 125.63ms
iter 139520: loss 6.9075, time 124.24ms
iter 139530: loss 6.8539, time 127.37ms
iter 139540: loss 6.6576, time 126.22ms
iter 139550: loss 6.7874, time 128.50ms
iter 139560: loss 6.9085, time 125.87ms
iter 139570: loss 7.2477, time 125.69ms
iter 139580: loss 7.2529, time 125.64ms
iter 139590: loss 6.8982, time 125.56ms
iter 139600: loss 6.6688, time 125.10ms
iter 139610: loss 7.1127, time 125.57ms
iter 139620: loss 6.7491, time 125.56ms
iter 139630: loss 6.0633, time 125.89ms
iter 139640: loss 7.0732, time 125.99ms
iter 139650: loss 7.2395, time 125.71ms
iter 139660: loss 7.4337, time 128.96ms
iter 139670: loss 7.3747, time 125.59ms
iter 139680: loss 6.8846, time 126.02ms
iter 139690: loss 6.9972, time 126.04ms
iter 139700: loss 7.4789, time 125.50ms
iter 139710: loss 7.3185, time 125.59ms
iter 139720: loss 6.7054, time 125.43ms
iter 139730: loss 6.4850, time 125.57ms
iter 139740: loss 7.3852, time 125.41ms
step 139750: train loss 6.1100, val loss 6.1217
saving checkpoint to out-shakespeare-char
iter 139750: loss 7.2398, time 2873.71ms
iter 139760: loss 7.3941, time 125.81ms
iter 139770: loss 7.2220, time 125.36ms
iter 139780: loss 7.2617, time 125.26ms
iter 139790: loss 7.1371, time 124.85ms
iter 139800: loss 6.9672, time 125.36ms
iter 139810: loss 7.0858, time 124.97ms
iter 139820: loss 7.1683, time 125.64ms
iter 139830: loss 6.1646, time 128.47ms
iter 139840: loss 7.3727, time 125.90ms
iter 139850: loss 6.8139, time 124.58ms
iter 139860: loss 6.6021, time 124.90ms
iter 139870: loss 6.2879, time 125.09ms
iter 139880: loss 7.3421, time 124.94ms
iter 139890: loss 7.2556, time 125.13ms
iter 139900: loss 6.6446, time 125.53ms
iter 139910: loss 7.5079, time 124.99ms
iter 139920: loss 7.5768, time 125.12ms
iter 139930: loss 7.2760, time 125.08ms
iter 139940: loss 6.9128, time 128.24ms
iter 139950: loss 7.0285, time 125.01ms
iter 139960: loss 7.1451, time 125.02ms
iter 139970: loss 7.6277, time 124.84ms
iter 139980: loss 6.7856, time 125.09ms
iter 139990: loss 7.0325, time 124.98ms
step 140000: train loss 6.0944, val loss 6.1417
saving checkpoint to out-shakespeare-char
iter 140000: loss 7.6972, time 2872.29ms
iter 140010: loss 6.8507, time 124.14ms
iter 140020: loss 6.8305, time 123.23ms
iter 140030: loss 6.8286, time 123.87ms
iter 140040: loss 7.6544, time 125.62ms
iter 140050: loss 7.1305, time 124.55ms
iter 140060: loss 6.7976, time 124.74ms
iter 140070: loss 7.3235, time 124.37ms
iter 140080: loss 7.5268, time 124.48ms
iter 140090: loss 7.4852, time 124.64ms
iter 140100: loss 7.6470, time 124.95ms
iter 140110: loss 6.4087, time 124.36ms
iter 140120: loss 6.7974, time 127.90ms
iter 140130: loss 6.7165, time 123.07ms
iter 140140: loss 7.3217, time 123.33ms
iter 140150: loss 7.0447, time 125.33ms
iter 140160: loss 6.7969, time 124.88ms
iter 140170: loss 6.5169, time 125.94ms
iter 140180: loss 6.7652, time 125.96ms
iter 140190: loss 6.5923, time 125.90ms
iter 140200: loss 7.9578, time 125.95ms
iter 140210: loss 7.5412, time 125.98ms
iter 140220: loss 7.0832, time 125.92ms
iter 140230: loss 6.9843, time 129.62ms
iter 140240: loss 7.5286, time 125.95ms
step 140250: train loss 6.1188, val loss 6.1251
saving checkpoint to out-shakespeare-char
iter 140250: loss 7.2818, time 2884.28ms
iter 140260: loss 6.7225, time 125.61ms
iter 140270: loss 6.2113, time 125.54ms
iter 140280: loss 6.7054, time 125.82ms
iter 140290: loss 7.0256, time 125.72ms
iter 140300: loss 7.4544, time 125.90ms
iter 140310: loss 7.1809, time 125.70ms
iter 140320: loss 7.2872, time 125.76ms
iter 140330: loss 7.1815, time 124.79ms
iter 140340: loss 6.9104, time 125.78ms
iter 140350: loss 6.6037, time 126.08ms
iter 140360: loss 7.2965, time 129.37ms
iter 140370: loss 6.7356, time 124.49ms
iter 140380: loss 7.1702, time 125.17ms
iter 140390: loss 7.0675, time 124.73ms
iter 140400: loss 6.9652, time 125.98ms
iter 140410: loss 6.5183, time 126.03ms
iter 140420: loss 6.9667, time 126.39ms
iter 140430: loss 7.2870, time 128.82ms
iter 140440: loss 7.1059, time 125.92ms
iter 140450: loss 7.6503, time 125.40ms
iter 140460: loss 7.5258, time 125.09ms
iter 140470: loss 7.0415, time 125.65ms
iter 140480: loss 6.5221, time 125.85ms
iter 140490: loss 7.5028, time 125.68ms
step 140500: train loss 6.0768, val loss 6.1163
saving checkpoint to out-shakespeare-char
iter 140500: loss 7.0479, time 2880.48ms
iter 140510: loss 7.0994, time 126.19ms
iter 140520: loss 7.4948, time 127.88ms
iter 140530: loss 7.7973, time 126.47ms
iter 140540: loss 6.1717, time 125.64ms
iter 140550: loss 6.9612, time 125.76ms
iter 140560: loss 7.2654, time 125.50ms
iter 140570: loss 7.2384, time 127.58ms
iter 140580: loss 6.6141, time 125.66ms
iter 140590: loss 7.4891, time 126.60ms
iter 140600: loss 6.9402, time 125.73ms
iter 140610: loss 6.5159, time 125.40ms
iter 140620: loss 7.0423, time 125.94ms
iter 140630: loss 7.2477, time 125.86ms
iter 140640: loss 6.9788, time 126.01ms
iter 140650: loss 7.3191, time 125.54ms
iter 140660: loss 7.5867, time 126.49ms
iter 140670: loss 7.4751, time 125.81ms
iter 140680: loss 7.6608, time 125.54ms
iter 140690: loss 6.4187, time 124.75ms
iter 140700: loss 7.2965, time 125.19ms
iter 140710: loss 6.7213, time 125.77ms
iter 140720: loss 7.0018, time 128.99ms
iter 140730: loss 7.7224, time 124.98ms
iter 140740: loss 7.2313, time 125.52ms
step 140750: train loss 6.0801, val loss 6.0937
saving checkpoint to out-shakespeare-char
iter 140750: loss 7.0593, time 2914.99ms
iter 140760: loss 6.2673, time 125.78ms
iter 140770: loss 7.2749, time 125.62ms
iter 140780: loss 6.9761, time 125.61ms
iter 140790: loss 6.7961, time 125.51ms
iter 140800: loss 7.5166, time 125.41ms
iter 140810: loss 5.8566, time 125.80ms
iter 140820: loss 6.5903, time 126.49ms
iter 140830: loss 7.8262, time 128.73ms
iter 140840: loss 7.6135, time 126.02ms
iter 140850: loss 7.8443, time 125.89ms
iter 140860: loss 6.9885, time 126.61ms
iter 140870: loss 6.8234, time 125.83ms
iter 140880: loss 7.1779, time 125.89ms
iter 140890: loss 7.3759, time 125.56ms
iter 140900: loss 6.6374, time 125.68ms
iter 140910: loss 7.1074, time 125.59ms
iter 140920: loss 6.2883, time 125.59ms
iter 140930: loss 7.2436, time 128.38ms
iter 140940: loss 6.9289, time 125.41ms
iter 140950: loss 7.2025, time 125.49ms
iter 140960: loss 6.4318, time 125.54ms
iter 140970: loss 6.3873, time 125.39ms
iter 140980: loss 7.1018, time 128.55ms
iter 140990: loss 7.0630, time 125.37ms
step 141000: train loss 6.0778, val loss 6.1134
saving checkpoint to out-shakespeare-char
iter 141000: loss 7.4321, time 2878.14ms
iter 141010: loss 6.3885, time 129.00ms
iter 141020: loss 7.0978, time 125.84ms
iter 141030: loss 6.6533, time 125.89ms
iter 141040: loss 6.5660, time 125.98ms
iter 141050: loss 6.6727, time 125.24ms
iter 141060: loss 6.9150, time 125.63ms
iter 141070: loss 7.2541, time 125.40ms
iter 141080: loss 7.6679, time 125.37ms
iter 141090: loss 6.9053, time 125.46ms
iter 141100: loss 7.2010, time 125.14ms
iter 141110: loss 6.3597, time 125.35ms
iter 141120: loss 8.1287, time 128.42ms
iter 141130: loss 6.8831, time 125.31ms
iter 141140: loss 6.8701, time 125.44ms
iter 141150: loss 6.5774, time 125.59ms
iter 141160: loss 7.2299, time 125.25ms
iter 141170: loss 7.0951, time 125.29ms
iter 141180: loss 6.9467, time 125.22ms
iter 141190: loss 7.8725, time 125.60ms
iter 141200: loss 7.0644, time 125.61ms
iter 141210: loss 7.6073, time 121.74ms
iter 141220: loss 6.9115, time 121.79ms
iter 141230: loss 7.7560, time 121.53ms
iter 141240: loss 6.6259, time 121.54ms
step 141250: train loss 6.0815, val loss 6.0833
saving checkpoint to out-shakespeare-char
iter 141250: loss 6.9998, time 2890.74ms
iter 141260: loss 7.6842, time 121.77ms
iter 141270: loss 6.6237, time 121.95ms
iter 141280: loss 6.7187, time 121.56ms
iter 141290: loss 6.6955, time 121.39ms
iter 141300: loss 7.0244, time 121.65ms
iter 141310: loss 6.8941, time 122.78ms
iter 141320: loss 6.6872, time 120.67ms
iter 141330: loss 7.2912, time 122.78ms
iter 141340: loss 6.9857, time 121.73ms
iter 141350: loss 7.1977, time 122.97ms
iter 141360: loss 6.2811, time 122.30ms
iter 141370: loss 6.6465, time 122.46ms
iter 141380: loss 6.5854, time 121.60ms
iter 141390: loss 7.0704, time 122.74ms
iter 141400: loss 6.4982, time 121.58ms
iter 141410: loss 7.4475, time 122.74ms
iter 141420: loss 6.6027, time 120.74ms
iter 141430: loss 6.9551, time 122.56ms
iter 141440: loss 7.4809, time 121.62ms
iter 141450: loss 6.9587, time 121.96ms
iter 141460: loss 7.3090, time 121.67ms
iter 141470: loss 7.0573, time 122.80ms
iter 141480: loss 7.7837, time 121.28ms
iter 141490: loss 7.2637, time 122.65ms
step 141500: train loss 6.1278, val loss 6.1316
saving checkpoint to out-shakespeare-char
iter 141500: loss 6.7873, time 2899.20ms
iter 141510: loss 6.3382, time 121.96ms
iter 141520: loss 7.3834, time 121.78ms
iter 141530: loss 6.9889, time 120.91ms
iter 141540: loss 7.2320, time 121.80ms
iter 141550: loss 7.5298, time 121.77ms
iter 141560: loss 7.3899, time 121.74ms
iter 141570: loss 6.9212, time 121.77ms
iter 141580: loss 7.4609, time 121.63ms
iter 141590: loss 6.0528, time 121.72ms
iter 141600: loss 6.7187, time 121.69ms
iter 141610: loss 7.4315, time 121.77ms
iter 141620: loss 7.0172, time 121.73ms
iter 141630: loss 6.5910, time 121.84ms
iter 141640: loss 7.1484, time 121.79ms
iter 141650: loss 7.2806, time 121.87ms
iter 141660: loss 7.4097, time 121.96ms
iter 141670: loss 7.0005, time 121.95ms
iter 141680: loss 7.7104, time 121.60ms
iter 141690: loss 6.6659, time 121.67ms
iter 141700: loss 7.5720, time 123.20ms
iter 141710: loss 6.8147, time 121.46ms
iter 141720: loss 6.9491, time 121.85ms
iter 141730: loss 6.9575, time 121.91ms
iter 141740: loss 6.9704, time 121.93ms
step 141750: train loss 6.0868, val loss 6.0946
saving checkpoint to out-shakespeare-char
iter 141750: loss 7.1895, time 2901.27ms
iter 141760: loss 7.0228, time 121.81ms
iter 141770: loss 6.7791, time 121.38ms
iter 141780: loss 7.4975, time 121.98ms
iter 141790: loss 7.6362, time 121.64ms
iter 141800: loss 7.5706, time 121.91ms
iter 141810: loss 6.8195, time 121.52ms
iter 141820: loss 7.4916, time 121.21ms
iter 141830: loss 7.1886, time 120.71ms
iter 141840: loss 6.6726, time 122.26ms
iter 141850: loss 7.4135, time 121.34ms
iter 141860: loss 6.7303, time 122.52ms
iter 141870: loss 7.5699, time 121.51ms
iter 141880: loss 7.0567, time 123.06ms
iter 141890: loss 6.6019, time 121.39ms
iter 141900: loss 6.5812, time 123.11ms
iter 141910: loss 6.9766, time 121.35ms
iter 141920: loss 6.5667, time 122.56ms
iter 141930: loss 7.3926, time 121.02ms
iter 141940: loss 7.4286, time 122.44ms
iter 141950: loss 6.5441, time 121.71ms
iter 141960: loss 7.7387, time 122.54ms
iter 141970: loss 7.1758, time 121.47ms
iter 141980: loss 7.7520, time 122.65ms
iter 141990: loss 7.0293, time 121.42ms
step 142000: train loss 6.0829, val loss 6.1179
saving checkpoint to out-shakespeare-char
iter 142000: loss 7.2128, time 2889.69ms
iter 142010: loss 7.0447, time 121.94ms
iter 142020: loss 6.4531, time 121.81ms
iter 142030: loss 7.1363, time 120.97ms
iter 142040: loss 7.2648, time 121.86ms
iter 142050: loss 7.3432, time 121.84ms
iter 142060: loss 7.0598, time 121.89ms
iter 142070: loss 6.3896, time 122.45ms
iter 142080: loss 6.7358, time 121.89ms
iter 142090: loss 7.5393, time 121.85ms
iter 142100: loss 7.8250, time 121.97ms
iter 142110: loss 7.7694, time 121.96ms
iter 142120: loss 7.1575, time 121.83ms
iter 142130: loss 7.2210, time 121.95ms
iter 142140: loss 7.8048, time 122.30ms
iter 142150: loss 6.9490, time 121.80ms
iter 142160: loss 7.1649, time 122.12ms
iter 142170: loss 6.9375, time 121.93ms
iter 142180: loss 7.0932, time 121.99ms
iter 142190: loss 7.5217, time 121.96ms
iter 142200: loss 6.9830, time 121.91ms
iter 142210: loss 7.0129, time 121.96ms
iter 142220: loss 7.0769, time 121.82ms
iter 142230: loss 6.7967, time 121.85ms
iter 142240: loss 6.5050, time 122.27ms
step 142250: train loss 6.1131, val loss 6.0675
saving checkpoint to out-shakespeare-char
iter 142250: loss 6.4524, time 2898.20ms
iter 142260: loss 6.8730, time 125.96ms
iter 142270: loss 6.5549, time 126.63ms
iter 142280: loss 6.9423, time 124.94ms
iter 142290: loss 7.6077, time 125.65ms
iter 142300: loss 6.5019, time 125.60ms
iter 142310: loss 7.0389, time 125.67ms
iter 142320: loss 6.9579, time 128.54ms
iter 142330: loss 6.6163, time 126.01ms
iter 142340: loss 7.8229, time 125.79ms
iter 142350: loss 6.4338, time 126.82ms
iter 142360: loss 6.6952, time 125.56ms
iter 142370: loss 7.1087, time 125.87ms
iter 142380: loss 7.1067, time 125.73ms
iter 142390: loss 6.1808, time 125.66ms
iter 142400: loss 6.6658, time 125.79ms
iter 142410: loss 7.0789, time 125.34ms
iter 142420: loss 7.0974, time 125.68ms
iter 142430: loss 7.1674, time 128.53ms
iter 142440: loss 6.9376, time 125.87ms
iter 142450: loss 6.7153, time 125.67ms
iter 142460: loss 7.2384, time 125.40ms
iter 142470: loss 6.4964, time 127.11ms
iter 142480: loss 7.1900, time 125.68ms
iter 142490: loss 7.0631, time 125.52ms
step 142500: train loss 6.1558, val loss 6.0579
saving checkpoint to out-shakespeare-char
iter 142500: loss 7.0441, time 2862.61ms
iter 142510: loss 6.8726, time 121.97ms
iter 142520: loss 7.8102, time 123.02ms
iter 142530: loss 7.1617, time 121.91ms
iter 142540: loss 7.6641, time 122.96ms
iter 142550: loss 7.1028, time 121.81ms
iter 142560: loss 6.8359, time 122.89ms
iter 142570: loss 8.0309, time 121.69ms
iter 142580: loss 6.9842, time 123.51ms
iter 142590: loss 6.9378, time 121.77ms
iter 142600: loss 7.3001, time 123.12ms
iter 142610: loss 5.6407, time 122.22ms
iter 142620: loss 7.2294, time 123.21ms
iter 142630: loss 6.9236, time 121.93ms
iter 142640: loss 7.0925, time 122.84ms
iter 142650: loss 7.2589, time 122.05ms
iter 142660: loss 6.3906, time 123.85ms
iter 142670: loss 6.7431, time 121.81ms
iter 142680: loss 6.4286, time 123.21ms
iter 142690: loss 7.1974, time 121.91ms
iter 142700: loss 7.5558, time 122.84ms
iter 142710: loss 6.8023, time 121.78ms
iter 142720: loss 7.0606, time 123.44ms
iter 142730: loss 7.2930, time 121.90ms
iter 142740: loss 7.5237, time 123.01ms
step 142750: train loss 6.1063, val loss 6.0864
saving checkpoint to out-shakespeare-char
iter 142750: loss 7.1974, time 2902.14ms
iter 142760: loss 6.9394, time 121.84ms
iter 142770: loss 7.3738, time 122.36ms
iter 142780: loss 6.6200, time 121.93ms
iter 142790: loss 7.5212, time 122.37ms
iter 142800: loss 7.8720, time 121.99ms
iter 142810: loss 7.2111, time 121.91ms
iter 142820: loss 6.8370, time 121.84ms
iter 142830: loss 7.5035, time 122.02ms
iter 142840: loss 6.5191, time 122.11ms
iter 142850: loss 6.7712, time 121.84ms
iter 142860: loss 7.2214, time 121.92ms
iter 142870: loss 6.9957, time 121.86ms
iter 142880: loss 6.9355, time 122.26ms
iter 142890: loss 7.8742, time 122.57ms
iter 142900: loss 6.7532, time 121.98ms
iter 142910: loss 6.8043, time 122.54ms
iter 142920: loss 7.3390, time 122.05ms
iter 142930: loss 6.6617, time 122.56ms
iter 142940: loss 7.9373, time 122.06ms
iter 142950: loss 7.2860, time 122.48ms
iter 142960: loss 6.6585, time 121.86ms
iter 142970: loss 7.2287, time 121.98ms
iter 142980: loss 7.3467, time 121.86ms
iter 142990: loss 7.4513, time 122.08ms
step 143000: train loss 6.0816, val loss 6.1273
saving checkpoint to out-shakespeare-char
iter 143000: loss 7.4278, time 2898.60ms
iter 143010: loss 7.7131, time 122.77ms
iter 143020: loss 7.4664, time 122.57ms
iter 143030: loss 6.6457, time 123.62ms
iter 143040: loss 6.6322, time 122.27ms
iter 143050: loss 6.8513, time 122.12ms
iter 143060: loss 6.1778, time 122.62ms
iter 143070: loss 7.2943, time 124.26ms
iter 143080: loss 6.7141, time 122.17ms
iter 143090: loss 7.5909, time 121.78ms
iter 143100: loss 6.5929, time 122.01ms
iter 143110: loss 7.0722, time 122.20ms
iter 143120: loss 7.1970, time 122.57ms
iter 143130: loss 6.5061, time 122.03ms
iter 143140: loss 7.1239, time 121.90ms
iter 143150: loss 6.3246, time 122.30ms
iter 143160: loss 8.0351, time 122.31ms
iter 143170: loss 6.6691, time 122.59ms
iter 143180: loss 7.3312, time 121.91ms
iter 143190: loss 7.2511, time 121.87ms
iter 143200: loss 6.6350, time 121.92ms
iter 143210: loss 6.7302, time 122.23ms
iter 143220: loss 6.4566, time 121.30ms
iter 143230: loss 7.3210, time 121.81ms
iter 143240: loss 6.4338, time 121.87ms
step 143250: train loss 6.0944, val loss 6.0742
saving checkpoint to out-shakespeare-char
iter 143250: loss 7.0895, time 2889.07ms
iter 143260: loss 7.1345, time 122.26ms
iter 143270: loss 7.9022, time 121.34ms
iter 143280: loss 6.3960, time 121.16ms
iter 143290: loss 6.8711, time 122.04ms
iter 143300: loss 7.2804, time 122.38ms
iter 143310: loss 7.6945, time 121.99ms
iter 143320: loss 7.1658, time 122.05ms
iter 143330: loss 6.7032, time 120.60ms
iter 143340: loss 7.2361, time 121.75ms
iter 143350: loss 7.3576, time 121.86ms
iter 143360: loss 6.5658, time 122.03ms
iter 143370: loss 6.6908, time 121.97ms
iter 143380: loss 7.4416, time 121.93ms
iter 143390: loss 7.2749, time 121.86ms
iter 143400: loss 7.5816, time 121.71ms
iter 143410: loss 7.2317, time 121.90ms
iter 143420: loss 7.1694, time 122.03ms
iter 143430: loss 8.0172, time 122.04ms
iter 143440: loss 7.9905, time 121.81ms
iter 143450: loss 6.5637, time 121.92ms
iter 143460: loss 7.2020, time 121.78ms
iter 143470: loss 7.5489, time 121.79ms
iter 143480: loss 7.6263, time 122.12ms
iter 143490: loss 7.1129, time 123.12ms
step 143500: train loss 6.0921, val loss 6.0874
saving checkpoint to out-shakespeare-char
iter 143500: loss 6.9769, time 2899.88ms
iter 143510: loss 6.7073, time 123.14ms
iter 143520: loss 7.1955, time 121.93ms
iter 143530: loss 7.5816, time 123.53ms
iter 143540: loss 6.8619, time 121.78ms
iter 143550: loss 6.3926, time 123.46ms
iter 143560: loss 7.1058, time 121.87ms
iter 143570: loss 7.1260, time 123.33ms
iter 143580: loss 6.9093, time 121.99ms
iter 143590: loss 7.1084, time 123.12ms
iter 143600: loss 7.0089, time 121.92ms
iter 143610: loss 6.5063, time 123.34ms
iter 143620: loss 6.9783, time 121.87ms
iter 143630: loss 6.8845, time 123.03ms
iter 143640: loss 6.9140, time 121.88ms
iter 143650: loss 7.0046, time 123.47ms
iter 143660: loss 6.9763, time 121.96ms
iter 143670: loss 7.0921, time 123.16ms
iter 143680: loss 7.1503, time 122.10ms
iter 143690: loss 6.4624, time 123.92ms
iter 143700: loss 7.7335, time 121.99ms
iter 143710: loss 6.7462, time 123.35ms
iter 143720: loss 6.5197, time 121.94ms
iter 143730: loss 6.8226, time 124.43ms
iter 143740: loss 7.2111, time 121.91ms
step 143750: train loss 6.0367, val loss 6.0714
saving checkpoint to out-shakespeare-char
iter 143750: loss 7.1726, time 2899.18ms
iter 143760: loss 7.1236, time 125.73ms
iter 143770: loss 6.4849, time 125.60ms
iter 143780: loss 6.7176, time 128.05ms
iter 143790: loss 7.4525, time 125.09ms
iter 143800: loss 6.2476, time 125.30ms
iter 143810: loss 6.6418, time 125.43ms
iter 143820: loss 6.8796, time 125.13ms
iter 143830: loss 6.5951, time 125.39ms
iter 143840: loss 7.3378, time 125.22ms
iter 143850: loss 6.8530, time 125.41ms
iter 143860: loss 7.2513, time 125.93ms
iter 143870: loss 7.2354, time 125.79ms
iter 143880: loss 7.2277, time 125.84ms
iter 143890: loss 7.0088, time 128.76ms
iter 143900: loss 7.5723, time 124.88ms
iter 143910: loss 6.9767, time 125.38ms
iter 143920: loss 7.2778, time 126.26ms
iter 143930: loss 6.2098, time 126.03ms
iter 143940: loss 6.0992, time 126.31ms
iter 143950: loss 6.8182, time 125.87ms
iter 143960: loss 7.1506, time 125.58ms
iter 143970: loss 7.3911, time 125.99ms
iter 143980: loss 6.1909, time 125.82ms
iter 143990: loss 7.5796, time 125.57ms
step 144000: train loss 6.0559, val loss 6.0704
saving checkpoint to out-shakespeare-char
iter 144000: loss 7.1665, time 2890.66ms
iter 144010: loss 7.9381, time 125.73ms
iter 144020: loss 6.9570, time 126.20ms
iter 144030: loss 7.4758, time 124.96ms
iter 144040: loss 7.2251, time 125.43ms
iter 144050: loss 7.3136, time 125.12ms
iter 144060: loss 8.1203, time 125.32ms
iter 144070: loss 7.2847, time 125.81ms
iter 144080: loss 6.8818, time 125.20ms
iter 144090: loss 7.4509, time 125.61ms
iter 144100: loss 6.6308, time 125.55ms
iter 144110: loss 6.9569, time 125.27ms
iter 144120: loss 7.1406, time 125.47ms
iter 144130: loss 6.9711, time 120.89ms
iter 144140: loss 6.9087, time 119.87ms
iter 144150: loss 7.3882, time 121.24ms
iter 144160: loss 7.0576, time 119.76ms
iter 144170: loss 6.4671, time 121.19ms
iter 144180: loss 6.3764, time 120.49ms
iter 144190: loss 7.3565, time 122.68ms
iter 144200: loss 6.9843, time 121.77ms
iter 144210: loss 6.1332, time 122.69ms
iter 144220: loss 7.4768, time 122.09ms
iter 144230: loss 8.0371, time 122.70ms
iter 144240: loss 6.8677, time 121.72ms
step 144250: train loss 6.0996, val loss 6.0721
saving checkpoint to out-shakespeare-char
iter 144250: loss 6.0639, time 2880.90ms
iter 144260: loss 6.9002, time 121.78ms
iter 144270: loss 6.7879, time 121.60ms
iter 144280: loss 7.1605, time 121.49ms
iter 144290: loss 7.0484, time 121.59ms
iter 144300: loss 6.8963, time 121.62ms
iter 144310: loss 6.4776, time 121.84ms
iter 144320: loss 6.8281, time 121.97ms
iter 144330: loss 6.3040, time 121.49ms
iter 144340: loss 7.3288, time 122.09ms
iter 144350: loss 6.8707, time 121.51ms
iter 144360: loss 7.6127, time 121.57ms
iter 144370: loss 6.5722, time 121.79ms
iter 144380: loss 6.6425, time 121.71ms
iter 144390: loss 7.4392, time 121.41ms
iter 144400: loss 6.8784, time 121.52ms
iter 144410: loss 7.0659, time 121.66ms
iter 144420: loss 7.4764, time 121.82ms
iter 144430: loss 7.5247, time 121.61ms
iter 144440: loss 7.5845, time 121.66ms
iter 144450: loss 6.4852, time 121.61ms
iter 144460: loss 6.4681, time 122.05ms
iter 144470: loss 7.2013, time 121.77ms
iter 144480: loss 7.4586, time 121.60ms
iter 144490: loss 7.2186, time 121.59ms
step 144500: train loss 6.0783, val loss 6.0417
saving checkpoint to out-shakespeare-char
iter 144500: loss 7.6154, time 2883.29ms
iter 144510: loss 7.2210, time 121.68ms
iter 144520: loss 7.3419, time 121.53ms
iter 144530: loss 5.6763, time 121.64ms
iter 144540: loss 6.8612, time 121.60ms
iter 144550: loss 7.8437, time 121.64ms
iter 144560: loss 6.3282, time 121.70ms
iter 144570: loss 7.3315, time 121.67ms
iter 144580: loss 7.0783, time 122.03ms
iter 144590: loss 7.4930, time 121.56ms
iter 144600: loss 6.3279, time 121.62ms
iter 144610: loss 6.9593, time 121.53ms
iter 144620: loss 6.7365, time 121.96ms
iter 144630: loss 6.7058, time 121.82ms
iter 144640: loss 7.8669, time 121.65ms
iter 144650: loss 8.2377, time 121.50ms
iter 144660: loss 6.8604, time 121.68ms
iter 144670: loss 7.3971, time 121.65ms
iter 144680: loss 7.1457, time 121.73ms
iter 144690: loss 7.6375, time 121.59ms
iter 144700: loss 6.9743, time 121.36ms
iter 144710: loss 7.2808, time 121.61ms
iter 144720: loss 6.4678, time 121.73ms
iter 144730: loss 7.5216, time 121.91ms
iter 144740: loss 8.0307, time 121.58ms
step 144750: train loss 6.1366, val loss 6.0418
saving checkpoint to out-shakespeare-char
iter 144750: loss 6.9308, time 2892.69ms
iter 144760: loss 6.8450, time 121.81ms
iter 144770: loss 7.6313, time 121.83ms
iter 144780: loss 7.1857, time 121.71ms
iter 144790: loss 6.3599, time 121.92ms
iter 144800: loss 7.7333, time 121.98ms
iter 144810: loss 6.5378, time 121.90ms
iter 144820: loss 6.6789, time 121.91ms
iter 144830: loss 6.7721, time 122.03ms
iter 144840: loss 6.2587, time 121.84ms
iter 144850: loss 7.2104, time 121.93ms
iter 144860: loss 7.1750, time 121.85ms
iter 144870: loss 6.2859, time 121.91ms
iter 144880: loss 7.3348, time 121.67ms
iter 144890: loss 5.8257, time 121.91ms
iter 144900: loss 6.4242, time 121.87ms
iter 144910: loss 7.2965, time 121.95ms
iter 144920: loss 7.0221, time 122.42ms
iter 144930: loss 7.5233, time 121.88ms
iter 144940: loss 7.1120, time 121.59ms
iter 144950: loss 7.4367, time 121.98ms
iter 144960: loss 7.0605, time 121.87ms
iter 144970: loss 6.2971, time 121.99ms
iter 144980: loss 6.7364, time 121.37ms
iter 144990: loss 7.7931, time 121.92ms
step 145000: train loss 6.0938, val loss 6.0057
saving checkpoint to out-shakespeare-char
iter 145000: loss 7.3287, time 2901.17ms
iter 145010: loss 7.1419, time 122.72ms
iter 145020: loss 7.5210, time 121.94ms
iter 145030: loss 7.2932, time 121.42ms
iter 145040: loss 7.1378, time 122.13ms
iter 145050: loss 6.9182, time 122.04ms
iter 145060: loss 7.0022, time 122.35ms
iter 145070: loss 7.1063, time 121.95ms
iter 145080: loss 7.3920, time 121.92ms
iter 145090: loss 7.3627, time 121.06ms
iter 145100: loss 6.6129, time 121.92ms
iter 145110: loss 6.7001, time 120.66ms
iter 145120: loss 7.0332, time 120.85ms
iter 145130: loss 7.4929, time 121.71ms
iter 145140: loss 7.0393, time 121.90ms
iter 145150: loss 7.8081, time 121.69ms
iter 145160: loss 7.2543, time 121.95ms
iter 145170: loss 7.4214, time 122.03ms
iter 145180: loss 6.9498, time 121.91ms
iter 145190: loss 7.3410, time 121.94ms
iter 145200: loss 6.7997, time 121.27ms
iter 145210: loss 6.2505, time 121.23ms
iter 145220: loss 7.0472, time 121.87ms
iter 145230: loss 6.8253, time 121.72ms
iter 145240: loss 7.1026, time 122.26ms
step 145250: train loss 6.0289, val loss 6.0352
saving checkpoint to out-shakespeare-char
iter 145250: loss 7.3386, time 2894.13ms
iter 145260: loss 6.9089, time 121.60ms
iter 145270: loss 6.7269, time 121.47ms
iter 145280: loss 7.8323, time 120.99ms
iter 145290: loss 7.1334, time 120.90ms
iter 145300: loss 7.1807, time 121.24ms
iter 145310: loss 6.5681, time 121.52ms
iter 145320: loss 7.0248, time 121.73ms
iter 145330: loss 6.9436, time 121.76ms
iter 145340: loss 7.1533, time 121.70ms
iter 145350: loss 7.0826, time 121.48ms
iter 145360: loss 7.0337, time 118.72ms
iter 145370: loss 7.5390, time 119.54ms
iter 145380: loss 6.9532, time 121.65ms
iter 145390: loss 7.4645, time 121.57ms
iter 145400: loss 6.9979, time 121.66ms
iter 145410: loss 6.5126, time 122.09ms
iter 145420: loss 6.8145, time 120.76ms
iter 145430: loss 7.4525, time 121.23ms
iter 145440: loss 6.2677, time 119.70ms
iter 145450: loss 6.8053, time 121.61ms
iter 145460: loss 6.4836, time 121.70ms
iter 145470: loss 7.1282, time 121.69ms
iter 145480: loss 6.5320, time 122.15ms
iter 145490: loss 6.1068, time 121.57ms
step 145500: train loss 6.1090, val loss 5.9990
saving checkpoint to out-shakespeare-char
iter 145500: loss 7.5038, time 2899.92ms
iter 145510: loss 7.3039, time 121.53ms
iter 145520: loss 6.1678, time 124.60ms
iter 145530: loss 6.3929, time 121.62ms
iter 145540: loss 6.4554, time 124.26ms
iter 145550: loss 7.2679, time 121.53ms
iter 145560: loss 6.4342, time 124.20ms
iter 145570: loss 7.2901, time 121.94ms
iter 145580: loss 6.9326, time 124.98ms
iter 145590: loss 7.5030, time 121.94ms
iter 145600: loss 6.2630, time 124.31ms
iter 145610: loss 6.9107, time 121.53ms
iter 145620: loss 6.9717, time 124.41ms
iter 145630: loss 6.4577, time 121.67ms
iter 145640: loss 7.0833, time 124.78ms
iter 145650: loss 6.9609, time 121.51ms
iter 145660: loss 6.8181, time 124.33ms
iter 145670: loss 7.0216, time 120.76ms
iter 145680: loss 6.7758, time 124.78ms
iter 145690: loss 7.0179, time 121.59ms
iter 145700: loss 6.6869, time 124.36ms
iter 145710: loss 8.1371, time 121.70ms
iter 145720: loss 7.1402, time 124.48ms
iter 145730: loss 6.8975, time 121.39ms
iter 145740: loss 7.6977, time 124.42ms
step 145750: train loss 6.0661, val loss 6.0483
saving checkpoint to out-shakespeare-char
iter 145750: loss 6.8350, time 2888.48ms
iter 145760: loss 6.7696, time 121.85ms
iter 145770: loss 6.4455, time 124.49ms
iter 145780: loss 7.1897, time 121.58ms
iter 145790: loss 6.1530, time 124.72ms
iter 145800: loss 6.8312, time 121.73ms
iter 145810: loss 6.9725, time 124.66ms
iter 145820: loss 6.3349, time 121.70ms
iter 145830: loss 6.2978, time 124.35ms
iter 145840: loss 7.4364, time 122.05ms
iter 145850: loss 6.6078, time 124.47ms
iter 145860: loss 6.9356, time 121.97ms
iter 145870: loss 7.0215, time 124.46ms
iter 145880: loss 6.9720, time 121.71ms
iter 145890: loss 7.8025, time 124.88ms
iter 145900: loss 7.0353, time 122.27ms
iter 145910: loss 6.7749, time 124.42ms
iter 145920: loss 7.1411, time 121.73ms
iter 145930: loss 6.9431, time 124.86ms
iter 145940: loss 6.1698, time 121.72ms
iter 145950: loss 6.9179, time 124.59ms
iter 145960: loss 6.8350, time 121.56ms
iter 145970: loss 6.5789, time 124.42ms
iter 145980: loss 7.1428, time 121.80ms
iter 145990: loss 6.3933, time 124.42ms
step 146000: train loss 6.0704, val loss 6.0964
saving checkpoint to out-shakespeare-char
iter 146000: loss 7.5886, time 2885.60ms
iter 146010: loss 7.1157, time 121.29ms
iter 146020: loss 7.1184, time 121.53ms
iter 146030: loss 7.0119, time 121.52ms
iter 146040: loss 7.5115, time 121.44ms
iter 146050: loss 7.5044, time 120.65ms
iter 146060: loss 7.0441, time 121.41ms
iter 146070: loss 7.0511, time 121.49ms
iter 146080: loss 6.9112, time 122.15ms
iter 146090: loss 6.9818, time 121.67ms
iter 146100: loss 6.2505, time 121.08ms
iter 146110: loss 7.5368, time 122.00ms
iter 146120: loss 7.2301, time 121.67ms
iter 146130: loss 6.8869, time 121.88ms
iter 146140: loss 6.2922, time 121.68ms
iter 146150: loss 6.2059, time 121.62ms
iter 146160: loss 7.3556, time 121.61ms
iter 146170: loss 6.9943, time 121.62ms
iter 146180: loss 6.7254, time 121.49ms
iter 146190: loss 6.9413, time 121.76ms
iter 146200: loss 6.1634, time 121.48ms
iter 146210: loss 6.5585, time 121.66ms
iter 146220: loss 7.4837, time 121.51ms
iter 146230: loss 7.2961, time 120.85ms
iter 146240: loss 6.5796, time 120.63ms
step 146250: train loss 6.0819, val loss 6.0242
saving checkpoint to out-shakespeare-char
iter 146250: loss 7.2240, time 2898.25ms
iter 146260: loss 7.2570, time 121.63ms
iter 146270: loss 7.0089, time 118.55ms
iter 146280: loss 7.2227, time 121.91ms
iter 146290: loss 7.1740, time 121.34ms
iter 146300: loss 7.1312, time 121.75ms
iter 146310: loss 7.5724, time 121.63ms
iter 146320: loss 6.8508, time 121.83ms
iter 146330: loss 6.5590, time 121.69ms
iter 146340: loss 6.9384, time 121.40ms
iter 146350: loss 7.0867, time 121.64ms
iter 146360: loss 7.5087, time 121.81ms
iter 146370: loss 6.2228, time 122.03ms
iter 146380: loss 7.0255, time 120.54ms
iter 146390: loss 7.5819, time 121.59ms
iter 146400: loss 6.7386, time 121.88ms
iter 146410: loss 6.0701, time 122.46ms
iter 146420: loss 7.0500, time 121.63ms
iter 146430: loss 6.1344, time 121.52ms
iter 146440: loss 7.1013, time 122.64ms
iter 146450: loss 6.8602, time 121.59ms
iter 146460: loss 6.9179, time 121.75ms
iter 146470: loss 6.7547, time 121.47ms
iter 146480: loss 6.8415, time 121.80ms
iter 146490: loss 6.3366, time 121.17ms
step 146500: train loss 6.0202, val loss 6.1100
saving checkpoint to out-shakespeare-char
iter 146500: loss 7.2174, time 2893.07ms
iter 146510: loss 7.0354, time 122.17ms
iter 146520: loss 6.9108, time 121.82ms
iter 146530: loss 6.9514, time 121.64ms
iter 146540: loss 6.9795, time 121.77ms
iter 146550: loss 7.1612, time 121.75ms
iter 146560: loss 7.1140, time 122.06ms
iter 146570: loss 6.4019, time 120.79ms
iter 146580: loss 6.5299, time 122.49ms
iter 146590: loss 6.7147, time 122.40ms
iter 146600: loss 6.8286, time 121.51ms
iter 146610: loss 7.4983, time 121.61ms
iter 146620: loss 6.2071, time 122.23ms
iter 146630: loss 7.3891, time 121.72ms
iter 146640: loss 7.6151, time 121.61ms
iter 146650: loss 6.9857, time 121.94ms
iter 146660: loss 7.1181, time 120.73ms
iter 146670: loss 6.8724, time 121.57ms
iter 146680: loss 6.2929, time 121.75ms
iter 146690: loss 8.0131, time 122.43ms
iter 146700: loss 7.6859, time 121.61ms
iter 146710: loss 7.4398, time 121.56ms
iter 146720: loss 6.7315, time 121.62ms
iter 146730: loss 6.8681, time 121.77ms
iter 146740: loss 7.1561, time 121.89ms
step 146750: train loss 6.0744, val loss 6.0349
saving checkpoint to out-shakespeare-char
iter 146750: loss 7.8620, time 2902.51ms
iter 146760: loss 7.5995, time 121.48ms
iter 146770: loss 7.0600, time 121.66ms
iter 146780: loss 6.5985, time 121.54ms
iter 146790: loss 7.4439, time 121.06ms
iter 146800: loss 6.2509, time 121.38ms
iter 146810: loss 6.9721, time 122.47ms
iter 146820: loss 7.0048, time 121.73ms
iter 146830: loss 7.5321, time 124.45ms
iter 146840: loss 6.1136, time 120.72ms
iter 146850: loss 7.0974, time 121.89ms
iter 146860: loss 7.1568, time 121.54ms
iter 146870: loss 6.2385, time 122.67ms
iter 146880: loss 5.9070, time 121.60ms
iter 146890: loss 6.9254, time 122.75ms
iter 146900: loss 7.2559, time 122.15ms
iter 146910: loss 6.6371, time 122.70ms
iter 146920: loss 6.7860, time 121.72ms
iter 146930: loss 7.0565, time 122.83ms
iter 146940: loss 6.2300, time 122.12ms
iter 146950: loss 6.7808, time 122.67ms
iter 146960: loss 7.1133, time 121.62ms
iter 146970: loss 6.6661, time 122.74ms
iter 146980: loss 7.2449, time 121.43ms
iter 146990: loss 7.6836, time 123.04ms
step 147000: train loss 6.0164, val loss 6.1221
saving checkpoint to out-shakespeare-char
iter 147000: loss 7.8686, time 2885.39ms
iter 147010: loss 6.8931, time 121.66ms
iter 147020: loss 6.1335, time 122.94ms
iter 147030: loss 7.6146, time 121.75ms
iter 147040: loss 6.9179, time 121.43ms
iter 147050: loss 7.0970, time 121.60ms
iter 147060: loss 7.4050, time 121.64ms
iter 147070: loss 6.3920, time 121.74ms
iter 147080: loss 6.9653, time 121.63ms
iter 147090: loss 6.6370, time 121.95ms
iter 147100: loss 7.5843, time 121.78ms
iter 147110: loss 7.0509, time 121.67ms
iter 147120: loss 6.8277, time 121.91ms
iter 147130: loss 6.7574, time 121.80ms
iter 147140: loss 7.3016, time 121.58ms
iter 147150: loss 6.6122, time 121.77ms
iter 147160: loss 6.5190, time 121.62ms
iter 147170: loss 7.3209, time 121.72ms
iter 147180: loss 7.7576, time 121.71ms
iter 147190: loss 7.0153, time 122.14ms
iter 147200: loss 7.3102, time 121.66ms
iter 147210: loss 8.3572, time 121.85ms
iter 147220: loss 6.8338, time 121.71ms
iter 147230: loss 6.8649, time 121.79ms
iter 147240: loss 6.5834, time 121.21ms
step 147250: train loss 6.0912, val loss 6.0418
saving checkpoint to out-shakespeare-char
iter 147250: loss 8.0274, time 2884.83ms
iter 147260: loss 6.8564, time 123.75ms
iter 147270: loss 7.4692, time 121.52ms
iter 147280: loss 6.6684, time 124.69ms
iter 147290: loss 7.8128, time 121.43ms
iter 147300: loss 7.2907, time 124.65ms
iter 147310: loss 6.9308, time 121.85ms
iter 147320: loss 7.3767, time 124.67ms
iter 147330: loss 6.9601, time 121.45ms
iter 147340: loss 6.6090, time 124.38ms
iter 147350: loss 6.7909, time 121.49ms
iter 147360: loss 6.6737, time 124.60ms
iter 147370: loss 7.2361, time 121.60ms
iter 147380: loss 6.5543, time 124.24ms
iter 147390: loss 6.9622, time 121.55ms
iter 147400: loss 7.1244, time 124.24ms
iter 147410: loss 7.4523, time 121.60ms
iter 147420: loss 6.6986, time 124.65ms
iter 147430: loss 7.1370, time 121.84ms
iter 147440: loss 6.9760, time 123.74ms
iter 147450: loss 6.7231, time 121.54ms
iter 147460: loss 6.7279, time 125.24ms
iter 147470: loss 7.0619, time 120.70ms
iter 147480: loss 6.5062, time 124.31ms
iter 147490: loss 6.7968, time 121.25ms
step 147500: train loss 6.0905, val loss 6.1089
saving checkpoint to out-shakespeare-char
iter 147500: loss 7.0742, time 2892.18ms
iter 147510: loss 6.4013, time 121.74ms
iter 147520: loss 6.9925, time 120.36ms
iter 147530: loss 7.1617, time 121.74ms
iter 147540: loss 7.4999, time 121.53ms
iter 147550: loss 6.2482, time 121.57ms
iter 147560: loss 7.1008, time 121.52ms
iter 147570: loss 6.1369, time 122.52ms
iter 147580: loss 6.5035, time 121.40ms
iter 147590: loss 6.8732, time 122.62ms
iter 147600: loss 7.4143, time 121.43ms
iter 147610: loss 6.6152, time 122.59ms
iter 147620: loss 7.2345, time 121.81ms
iter 147630: loss 7.1174, time 122.50ms
iter 147640: loss 6.9355, time 121.65ms
iter 147650: loss 6.9525, time 123.68ms
iter 147660: loss 6.6143, time 121.81ms
iter 147670: loss 6.5822, time 122.85ms
iter 147680: loss 7.3322, time 122.02ms
iter 147690: loss 6.7501, time 122.60ms
iter 147700: loss 6.7296, time 121.40ms
iter 147710: loss 6.5252, time 120.85ms
iter 147720: loss 6.8337, time 121.77ms
iter 147730: loss 7.3671, time 122.91ms
iter 147740: loss 6.5497, time 121.36ms
step 147750: train loss 6.0426, val loss 6.0710
saving checkpoint to out-shakespeare-char
iter 147750: loss 6.8890, time 2893.52ms
iter 147760: loss 6.2984, time 120.74ms
iter 147770: loss 7.1827, time 121.50ms
iter 147780: loss 7.1277, time 121.49ms
iter 147790: loss 7.1345, time 121.52ms
iter 147800: loss 7.1378, time 121.38ms
iter 147810: loss 8.0618, time 122.71ms
iter 147820: loss 7.1065, time 121.39ms
iter 147830: loss 6.6128, time 121.94ms
iter 147840: loss 6.7748, time 121.53ms
iter 147850: loss 6.9167, time 121.44ms
iter 147860: loss 7.7403, time 121.49ms
iter 147870: loss 7.7177, time 121.31ms
iter 147880: loss 6.5139, time 121.42ms
iter 147890: loss 6.3462, time 121.38ms
iter 147900: loss 6.6772, time 121.67ms
iter 147910: loss 7.6210, time 121.38ms
iter 147920: loss 7.7499, time 121.50ms
iter 147930: loss 7.0350, time 121.39ms
iter 147940: loss 6.4781, time 121.49ms
iter 147950: loss 6.7509, time 121.49ms
iter 147960: loss 7.3375, time 121.60ms
iter 147970: loss 6.3571, time 121.36ms
iter 147980: loss 8.0854, time 121.39ms
iter 147990: loss 7.1881, time 121.44ms
step 148000: train loss 6.0258, val loss 6.0597
saving checkpoint to out-shakespeare-char
iter 148000: loss 7.8078, time 2886.32ms
iter 148010: loss 7.3777, time 122.56ms
iter 148020: loss 7.8256, time 121.67ms
iter 148030: loss 6.2908, time 123.29ms
iter 148040: loss 7.1323, time 121.71ms
iter 148050: loss 6.8416, time 122.27ms
iter 148060: loss 6.7443, time 121.16ms
iter 148070: loss 6.8784, time 122.83ms
iter 148080: loss 7.3743, time 121.55ms
iter 148090: loss 6.6747, time 122.69ms
iter 148100: loss 7.3738, time 121.15ms
iter 148110: loss 6.9881, time 123.03ms
iter 148120: loss 6.7448, time 121.60ms
iter 148130: loss 7.2013, time 122.75ms
iter 148140: loss 6.9122, time 121.91ms
iter 148150: loss 6.7277, time 122.62ms
iter 148160: loss 6.6891, time 121.84ms
iter 148170: loss 5.7199, time 122.38ms
iter 148180: loss 7.5037, time 121.66ms
iter 148190: loss 7.0890, time 122.49ms
iter 148200: loss 6.8198, time 121.67ms
iter 148210: loss 6.5119, time 123.60ms
iter 148220: loss 6.2447, time 121.56ms
iter 148230: loss 7.1389, time 123.71ms
iter 148240: loss 7.0878, time 122.63ms
step 148250: train loss 6.0164, val loss 6.0356
saving checkpoint to out-shakespeare-char
iter 148250: loss 5.9200, time 2879.73ms
iter 148260: loss 6.9198, time 122.14ms
iter 148270: loss 6.8480, time 121.65ms
iter 148280: loss 6.6588, time 121.61ms
iter 148290: loss 6.7689, time 121.13ms
iter 148300: loss 7.2896, time 121.34ms
iter 148310: loss 6.6286, time 121.61ms
iter 148320: loss 6.5627, time 121.92ms
iter 148330: loss 7.2498, time 121.58ms
iter 148340: loss 7.0084, time 122.01ms
iter 148350: loss 6.8353, time 121.65ms
iter 148360: loss 7.2889, time 121.53ms
iter 148370: loss 7.1477, time 121.56ms
iter 148380: loss 7.6014, time 121.76ms
iter 148390: loss 7.0197, time 121.47ms
iter 148400: loss 6.6829, time 121.75ms
iter 148410: loss 6.4833, time 121.67ms
iter 148420: loss 7.2954, time 121.74ms
iter 148430: loss 6.8837, time 121.87ms
iter 148440: loss 7.4194, time 121.64ms
iter 148450: loss 7.0385, time 121.58ms
iter 148460: loss 6.9740, time 121.54ms
iter 148470: loss 6.6422, time 121.60ms
iter 148480: loss 7.4294, time 121.62ms
iter 148490: loss 7.2018, time 121.66ms
step 148500: train loss 6.0510, val loss 6.0389
saving checkpoint to out-shakespeare-char
iter 148500: loss 6.5944, time 2893.03ms
iter 148510: loss 6.6545, time 121.67ms
iter 148520: loss 7.6270, time 122.05ms
iter 148530: loss 7.9021, time 121.94ms
iter 148540: loss 7.3168, time 121.70ms
iter 148550: loss 7.0182, time 121.88ms
iter 148560: loss 7.0275, time 122.05ms
iter 148570: loss 7.2109, time 121.31ms
iter 148580: loss 6.9446, time 121.67ms
iter 148590: loss 6.9233, time 121.64ms
iter 148600: loss 7.3758, time 121.16ms
iter 148610: loss 6.8853, time 121.60ms
iter 148620: loss 7.0170, time 121.39ms
iter 148630: loss 7.2208, time 122.05ms
iter 148640: loss 7.4433, time 121.56ms
iter 148650: loss 6.6193, time 121.38ms
iter 148660: loss 7.6323, time 121.52ms
iter 148670: loss 6.8489, time 121.46ms
iter 148680: loss 6.8612, time 121.57ms
iter 148690: loss 6.1482, time 121.56ms
iter 148700: loss 6.6499, time 121.51ms
iter 148710: loss 6.1982, time 121.47ms
iter 148720: loss 6.4515, time 121.64ms
iter 148730: loss 6.7679, time 121.28ms
iter 148740: loss 6.2467, time 121.54ms
step 148750: train loss 6.0582, val loss 6.0377
saving checkpoint to out-shakespeare-char
iter 148750: loss 6.9770, time 2890.25ms
iter 148760: loss 6.2958, time 121.15ms
iter 148770: loss 6.6384, time 124.41ms
iter 148780: loss 6.3777, time 121.53ms
iter 148790: loss 7.2000, time 124.17ms
iter 148800: loss 6.9914, time 121.03ms
iter 148810: loss 7.2648, time 123.56ms
iter 148820: loss 7.5114, time 120.84ms
iter 148830: loss 6.3925, time 124.21ms
iter 148840: loss 7.5386, time 121.40ms
iter 148850: loss 6.3792, time 124.17ms
iter 148860: loss 6.5926, time 121.39ms
iter 148870: loss 6.8012, time 124.24ms
iter 148880: loss 7.2383, time 121.43ms
iter 148890: loss 7.0916, time 124.45ms
iter 148900: loss 7.0437, time 121.52ms
iter 148910: loss 5.8311, time 124.31ms
iter 148920: loss 7.2408, time 121.51ms
iter 148930: loss 6.7166, time 124.36ms
iter 148940: loss 6.9828, time 121.31ms
iter 148950: loss 6.5406, time 124.57ms
iter 148960: loss 6.4732, time 121.63ms
iter 148970: loss 6.8595, time 124.81ms
iter 148980: loss 7.3181, time 121.80ms
iter 148990: loss 7.4569, time 124.24ms
step 149000: train loss 6.0942, val loss 6.0322
saving checkpoint to out-shakespeare-char
iter 149000: loss 7.0696, time 2892.01ms
iter 149010: loss 7.3674, time 122.13ms
iter 149020: loss 7.1064, time 121.90ms
iter 149030: loss 7.5405, time 121.88ms
iter 149040: loss 8.0791, time 121.88ms
iter 149050: loss 7.1840, time 122.22ms
iter 149060: loss 7.0660, time 121.95ms
iter 149070: loss 6.7669, time 124.84ms
iter 149080: loss 6.5019, time 121.73ms
iter 149090: loss 6.2544, time 124.95ms
iter 149100: loss 7.0099, time 121.95ms
iter 149110: loss 6.7480, time 124.75ms
iter 149120: loss 7.3722, time 121.65ms
iter 149130: loss 5.8773, time 124.62ms
iter 149140: loss 6.7668, time 121.77ms
iter 149150: loss 7.0828, time 124.73ms
iter 149160: loss 6.5674, time 122.02ms
iter 149170: loss 7.0735, time 124.76ms
iter 149180: loss 7.0703, time 121.90ms
iter 149190: loss 6.7973, time 123.86ms
iter 149200: loss 7.0743, time 121.81ms
iter 149210: loss 7.1583, time 124.22ms
iter 149220: loss 6.8110, time 121.96ms
iter 149230: loss 7.5334, time 124.86ms
iter 149240: loss 6.9243, time 121.84ms
step 149250: train loss 6.0255, val loss 6.0796
saving checkpoint to out-shakespeare-char
iter 149250: loss 7.6773, time 2891.36ms
iter 149260: loss 6.4424, time 121.92ms
iter 149270: loss 6.1055, time 121.68ms
iter 149280: loss 7.0301, time 122.45ms
iter 149290: loss 7.1353, time 121.62ms
iter 149300: loss 7.1297, time 123.08ms
iter 149310: loss 7.2330, time 121.88ms
iter 149320: loss 6.9511, time 123.12ms
iter 149330: loss 7.2387, time 121.60ms
iter 149340: loss 6.8986, time 122.80ms
iter 149350: loss 6.6041, time 121.64ms
iter 149360: loss 7.0372, time 122.76ms
iter 149370: loss 8.0197, time 121.64ms
iter 149380: loss 8.1414, time 122.76ms
iter 149390: loss 7.6557, time 121.67ms
iter 149400: loss 6.4200, time 121.80ms
iter 149410: loss 6.2133, time 121.52ms
iter 149420: loss 7.2764, time 122.99ms
iter 149430: loss 7.5129, time 121.63ms
iter 149440: loss 6.6703, time 122.71ms
iter 149450: loss 7.6121, time 121.63ms
iter 149460: loss 7.2706, time 122.20ms
iter 149470: loss 6.2642, time 121.57ms
iter 149480: loss 6.5633, time 122.63ms
iter 149490: loss 7.1761, time 121.96ms
step 149500: train loss 6.0360, val loss 6.0533
saving checkpoint to out-shakespeare-char
iter 149500: loss 7.0283, time 2897.13ms
iter 149510: loss 6.4134, time 121.67ms
iter 149520: loss 7.1960, time 121.45ms
iter 149530: loss 7.3562, time 121.56ms
iter 149540: loss 6.8233, time 121.77ms
iter 149550: loss 7.6253, time 121.51ms
iter 149560: loss 6.9141, time 121.63ms
iter 149570: loss 6.8528, time 121.52ms
iter 149580: loss 6.3512, time 120.76ms
iter 149590: loss 7.0057, time 121.64ms
iter 149600: loss 6.0868, time 121.69ms
iter 149610: loss 6.7449, time 121.48ms
iter 149620: loss 7.1710, time 120.43ms
iter 149630: loss 6.9246, time 121.45ms
iter 149640: loss 7.3230, time 121.70ms
iter 149650: loss 6.6651, time 121.55ms
iter 149660: loss 7.3342, time 121.92ms
iter 149670: loss 6.5345, time 120.44ms
iter 149680: loss 6.2746, time 121.28ms
iter 149690: loss 6.9822, time 120.82ms
iter 149700: loss 6.4864, time 121.29ms
iter 149710: loss 7.4877, time 121.52ms
iter 149720: loss 6.9431, time 121.69ms
iter 149730: loss 6.8860, time 121.47ms
iter 149740: loss 6.5555, time 121.59ms
step 149750: train loss 6.0483, val loss 6.0124
saving checkpoint to out-shakespeare-char
iter 149750: loss 7.4745, time 2887.20ms
iter 149760: loss 6.8557, time 125.52ms
iter 149770: loss 6.5756, time 125.31ms
iter 149780: loss 7.7296, time 125.01ms
iter 149790: loss 7.5112, time 125.17ms
iter 149800: loss 7.1738, time 125.06ms
iter 149810: loss 7.2090, time 126.09ms
iter 149820: loss 6.6718, time 128.51ms
iter 149830: loss 6.7646, time 125.61ms
iter 149840: loss 6.6856, time 125.72ms
iter 149850: loss 6.8233, time 125.71ms
iter 149860: loss 7.5349, time 125.60ms
iter 149870: loss 6.4433, time 125.74ms
iter 149880: loss 8.3481, time 125.60ms
iter 149890: loss 6.5671, time 125.54ms
iter 149900: loss 7.0819, time 126.98ms
iter 149910: loss 6.7081, time 125.82ms
iter 149920: loss 7.1845, time 125.84ms
iter 149930: loss 6.2777, time 128.51ms
iter 149940: loss 6.6583, time 125.53ms
iter 149950: loss 6.9460, time 125.82ms
iter 149960: loss 6.8017, time 125.72ms
iter 149970: loss 7.1712, time 125.87ms
iter 149980: loss 6.3397, time 125.60ms
iter 149990: loss 6.5607, time 125.82ms
step 150000: train loss 6.0484, val loss 6.0489
saving checkpoint to out-shakespeare-char
iter 150000: loss 7.0189, time 2877.92ms
iter 150010: loss 7.1204, time 124.93ms
iter 150020: loss 6.8440, time 124.83ms
iter 150030: loss 6.4664, time 127.92ms
iter 150040: loss 6.9818, time 125.06ms
iter 150050: loss 6.5936, time 125.00ms
iter 150060: loss 5.9725, time 125.09ms
iter 150070: loss 6.9227, time 125.00ms
iter 150080: loss 6.6393, time 125.01ms
iter 150090: loss 7.7766, time 125.05ms
iter 150100: loss 7.2449, time 125.06ms
iter 150110: loss 7.2341, time 124.94ms
iter 150120: loss 7.4956, time 125.06ms
iter 150130: loss 6.4956, time 125.00ms
iter 150140: loss 7.3484, time 128.02ms
iter 150150: loss 6.0005, time 125.04ms
iter 150160: loss 7.1096, time 124.97ms
iter 150170: loss 7.1752, time 125.10ms
iter 150180: loss 6.5312, time 124.72ms
iter 150190: loss 7.0135, time 124.82ms
iter 150200: loss 6.8371, time 126.00ms
iter 150210: loss 7.4696, time 125.06ms
iter 150220: loss 7.1716, time 124.89ms
iter 150230: loss 7.0380, time 124.87ms
iter 150240: loss 6.2639, time 125.00ms
step 150250: train loss 6.0391, val loss 6.0501
saving checkpoint to out-shakespeare-char
iter 150250: loss 7.2799, time 2898.49ms
iter 150260: loss 6.7631, time 124.99ms
iter 150270: loss 7.2368, time 125.11ms
iter 150280: loss 6.5454, time 125.00ms
iter 150290: loss 7.2183, time 125.05ms
iter 150300: loss 8.4435, time 125.07ms
iter 150310: loss 7.6461, time 125.28ms
iter 150320: loss 6.9525, time 128.30ms
iter 150330: loss 6.4894, time 125.02ms
iter 150340: loss 7.6117, time 125.18ms
iter 150350: loss 7.1355, time 125.23ms
iter 150360: loss 7.1116, time 127.88ms
iter 150370: loss 6.7698, time 125.01ms
iter 150380: loss 6.7153, time 124.95ms
iter 150390: loss 6.3567, time 125.09ms
iter 150400: loss 6.6981, time 125.00ms
iter 150410: loss 6.7148, time 124.98ms
iter 150420: loss 6.2558, time 125.10ms
iter 150430: loss 6.5120, time 124.99ms
iter 150440: loss 7.5420, time 125.14ms
iter 150450: loss 6.8151, time 124.83ms
iter 150460: loss 6.9420, time 125.09ms
iter 150470: loss 6.6192, time 127.94ms
iter 150480: loss 7.5874, time 125.05ms
iter 150490: loss 6.9835, time 125.12ms
step 150500: train loss 6.0464, val loss 6.0331
saving checkpoint to out-shakespeare-char
iter 150500: loss 6.8306, time 2892.17ms
iter 150510: loss 7.3718, time 124.76ms
iter 150520: loss 7.2695, time 124.99ms
iter 150530: loss 6.6923, time 124.99ms
iter 150540: loss 7.0139, time 124.85ms
iter 150550: loss 7.0054, time 125.20ms
iter 150560: loss 6.5842, time 124.91ms
iter 150570: loss 6.7057, time 127.58ms
iter 150580: loss 7.1960, time 125.36ms
iter 150590: loss 6.5890, time 124.96ms
iter 150600: loss 7.9662, time 125.92ms
iter 150610: loss 7.4387, time 124.92ms
iter 150620: loss 7.4126, time 124.84ms
iter 150630: loss 7.4869, time 124.84ms
iter 150640: loss 6.8695, time 124.70ms
iter 150650: loss 6.8319, time 124.85ms
iter 150660: loss 6.6836, time 125.04ms
iter 150670: loss 6.8396, time 125.25ms
iter 150680: loss 7.2140, time 128.10ms
iter 150690: loss 7.0466, time 125.43ms
iter 150700: loss 6.9636, time 125.04ms
iter 150710: loss 7.4823, time 125.12ms
iter 150720: loss 6.6386, time 124.99ms
iter 150730: loss 6.3830, time 124.97ms
iter 150740: loss 7.4613, time 124.91ms
step 150750: train loss 6.0154, val loss 6.0090
saving checkpoint to out-shakespeare-char
iter 150750: loss 6.9366, time 2896.50ms
iter 150760: loss 6.1697, time 125.24ms
iter 150770: loss 6.9395, time 125.10ms
iter 150780: loss 7.0523, time 127.88ms
iter 150790: loss 6.8373, time 125.11ms
iter 150800: loss 7.0337, time 125.15ms
iter 150810: loss 7.8903, time 125.23ms
iter 150820: loss 6.8800, time 125.12ms
iter 150830: loss 6.8098, time 125.17ms
iter 150840: loss 6.7906, time 125.26ms
iter 150850: loss 6.5479, time 125.20ms
iter 150860: loss 7.3760, time 124.92ms
iter 150870: loss 6.8065, time 124.98ms
iter 150880: loss 7.3340, time 125.14ms
iter 150890: loss 6.7125, time 127.77ms
iter 150900: loss 7.1109, time 125.04ms
iter 150910: loss 7.9224, time 125.19ms
iter 150920: loss 7.6456, time 125.04ms
iter 150930: loss 6.8006, time 125.03ms
iter 150940: loss 7.1794, time 124.90ms
iter 150950: loss 6.1144, time 125.15ms
iter 150960: loss 7.3198, time 124.93ms
iter 150970: loss 6.8383, time 125.05ms
iter 150980: loss 7.1972, time 125.93ms
iter 150990: loss 7.2008, time 125.82ms
step 151000: train loss 6.0368, val loss 6.0190
saving checkpoint to out-shakespeare-char
iter 151000: loss 6.6978, time 2875.96ms
iter 151010: loss 6.3719, time 126.01ms
iter 151020: loss 6.2696, time 125.86ms
iter 151030: loss 7.1766, time 125.63ms
iter 151040: loss 7.0350, time 125.68ms
iter 151050: loss 7.2357, time 125.61ms
iter 151060: loss 6.6273, time 126.17ms
iter 151070: loss 7.2378, time 125.60ms
iter 151080: loss 7.1604, time 127.18ms
iter 151090: loss 6.6451, time 125.95ms
iter 151100: loss 6.6752, time 126.15ms
iter 151110: loss 6.7793, time 129.15ms
iter 151120: loss 7.2926, time 125.74ms
iter 151130: loss 7.1212, time 126.13ms
iter 151140: loss 6.9482, time 125.07ms
iter 151150: loss 6.2752, time 128.16ms
iter 151160: loss 6.5490, time 125.74ms
iter 151170: loss 6.9669, time 125.86ms
iter 151180: loss 7.4361, time 124.93ms
iter 151190: loss 6.9252, time 124.82ms
iter 151200: loss 7.4789, time 124.86ms
iter 151210: loss 6.1340, time 124.81ms
iter 151220: loss 6.9637, time 124.69ms
iter 151230: loss 7.5601, time 124.93ms
iter 151240: loss 6.8768, time 124.80ms
step 151250: train loss 6.0629, val loss 6.0400
saving checkpoint to out-shakespeare-char
iter 151250: loss 7.2270, time 2899.06ms
iter 151260: loss 6.7916, time 125.29ms
iter 151270: loss 7.4927, time 125.62ms
iter 151280: loss 6.7861, time 125.33ms
iter 151290: loss 6.7490, time 125.50ms
iter 151300: loss 7.2161, time 125.60ms
iter 151310: loss 6.6233, time 125.71ms
iter 151320: loss 7.2380, time 128.59ms
iter 151330: loss 6.1514, time 125.88ms
iter 151340: loss 6.0824, time 125.80ms
iter 151350: loss 7.1491, time 125.10ms
iter 151360: loss 7.3848, time 125.81ms
iter 151370: loss 6.6589, time 125.68ms
iter 151380: loss 7.5529, time 126.12ms
iter 151390: loss 6.5398, time 125.52ms
iter 151400: loss 6.4830, time 125.79ms
iter 151410: loss 6.6530, time 125.76ms
iter 151420: loss 6.2824, time 125.78ms
iter 151430: loss 6.7497, time 127.92ms
iter 151440: loss 7.0054, time 125.82ms
iter 151450: loss 7.3794, time 125.82ms
iter 151460: loss 6.9607, time 124.95ms
iter 151470: loss 5.9873, time 126.09ms
iter 151480: loss 6.5214, time 126.03ms
iter 151490: loss 6.6880, time 125.90ms
step 151500: train loss 6.0158, val loss 6.0327
saving checkpoint to out-shakespeare-char
iter 151500: loss 6.8624, time 2878.84ms
iter 151510: loss 7.3736, time 122.09ms
iter 151520: loss 6.8545, time 122.16ms
iter 151530: loss 6.3156, time 122.27ms
iter 151540: loss 6.8989, time 122.02ms
iter 151550: loss 7.1158, time 121.89ms
iter 151560: loss 6.4469, time 121.76ms
iter 151570: loss 6.6967, time 125.62ms
iter 151580: loss 6.7756, time 124.54ms
iter 151590: loss 6.4846, time 124.38ms
iter 151600: loss 6.8630, time 126.67ms
iter 151610: loss 6.7850, time 123.38ms
iter 151620: loss 6.8128, time 122.02ms
iter 151630: loss 6.8881, time 123.02ms
iter 151640: loss 6.6489, time 122.42ms
iter 151650: loss 6.6511, time 124.11ms
iter 151660: loss 6.2343, time 122.35ms
iter 151670: loss 7.1184, time 123.31ms
iter 151680: loss 6.6900, time 122.77ms
iter 151690: loss 6.2341, time 124.29ms
iter 151700: loss 6.5927, time 122.11ms
iter 151710: loss 6.2922, time 123.25ms
iter 151720: loss 6.5786, time 121.62ms
iter 151730: loss 7.0771, time 123.59ms
iter 151740: loss 6.5708, time 121.76ms
step 151750: train loss 6.0293, val loss 6.0480
saving checkpoint to out-shakespeare-char
iter 151750: loss 7.6717, time 2893.17ms
iter 151760: loss 6.8285, time 121.50ms
iter 151770: loss 7.4019, time 121.92ms
iter 151780: loss 7.0075, time 121.48ms
iter 151790: loss 6.8465, time 122.35ms
iter 151800: loss 6.5261, time 121.48ms
iter 151810: loss 6.7811, time 121.50ms
iter 151820: loss 6.5038, time 121.31ms
iter 151830: loss 6.3750, time 121.38ms
iter 151840: loss 6.9790, time 121.53ms
iter 151850: loss 6.7854, time 121.53ms
iter 151860: loss 6.0204, time 121.36ms
iter 151870: loss 7.5667, time 120.51ms
iter 151880: loss 7.0586, time 121.83ms
iter 151890: loss 6.3808, time 121.20ms
iter 151900: loss 7.4380, time 121.44ms
iter 151910: loss 6.8111, time 121.44ms
iter 151920: loss 6.0964, time 121.89ms
iter 151930: loss 6.5998, time 122.20ms
iter 151940: loss 7.3809, time 121.87ms
iter 151950: loss 7.4591, time 121.84ms
iter 151960: loss 6.6065, time 124.86ms
iter 151970: loss 7.0910, time 124.87ms
iter 151980: loss 7.1826, time 126.00ms
iter 151990: loss 6.7275, time 124.33ms
step 152000: train loss 6.0843, val loss 6.0668
saving checkpoint to out-shakespeare-char
iter 152000: loss 7.4756, time 2916.31ms
iter 152010: loss 6.6167, time 125.49ms
iter 152020: loss 6.2703, time 125.08ms
iter 152030: loss 7.0416, time 125.31ms
iter 152040: loss 6.7564, time 126.05ms
iter 152050: loss 6.6516, time 123.55ms
iter 152060: loss 6.9490, time 127.04ms
iter 152070: loss 6.0215, time 125.29ms
iter 152080: loss 6.9837, time 125.29ms
iter 152090: loss 6.4706, time 125.90ms
iter 152100: loss 7.1744, time 125.42ms
iter 152110: loss 5.9802, time 125.26ms
iter 152120: loss 7.6633, time 125.84ms
iter 152130: loss 6.7289, time 125.26ms
iter 152140: loss 7.3215, time 125.29ms
iter 152150: loss 7.1176, time 125.12ms
iter 152160: loss 6.8232, time 125.48ms
iter 152170: loss 7.0373, time 126.09ms
iter 152180: loss 6.2624, time 125.69ms
iter 152190: loss 7.2607, time 125.49ms
iter 152200: loss 7.0272, time 125.44ms
iter 152210: loss 6.3437, time 124.96ms
iter 152220: loss 7.2364, time 125.04ms
iter 152230: loss 6.4916, time 124.34ms
iter 152240: loss 7.0013, time 125.38ms
step 152250: train loss 6.0415, val loss 6.0786
saving checkpoint to out-shakespeare-char
iter 152250: loss 7.2422, time 2898.21ms
iter 152260: loss 6.5054, time 126.41ms
iter 152270: loss 6.9022, time 125.33ms
iter 152280: loss 6.5017, time 125.40ms
iter 152290: loss 6.6885, time 125.82ms
iter 152300: loss 6.6294, time 125.18ms
iter 152310: loss 6.4017, time 124.41ms
iter 152320: loss 7.5877, time 125.22ms
iter 152330: loss 7.4848, time 125.51ms
iter 152340: loss 6.9113, time 125.06ms
iter 152350: loss 6.7736, time 128.44ms
iter 152360: loss 6.2830, time 125.34ms
iter 152370: loss 6.9293, time 126.18ms
iter 152380: loss 7.2802, time 125.47ms
iter 152390: loss 6.2088, time 128.38ms
iter 152400: loss 6.8218, time 125.12ms
iter 152410: loss 6.3797, time 126.66ms
iter 152420: loss 7.1626, time 125.74ms
iter 152430: loss 7.0048, time 125.46ms
iter 152440: loss 6.4645, time 125.82ms
iter 152450: loss 6.2275, time 125.77ms
iter 152460: loss 6.8556, time 126.78ms
iter 152470: loss 6.7119, time 125.69ms
iter 152480: loss 6.6904, time 125.48ms
iter 152490: loss 6.4443, time 125.96ms
step 152500: train loss 5.9989, val loss 6.0334
saving checkpoint to out-shakespeare-char
iter 152500: loss 6.8534, time 2878.57ms
iter 152510: loss 6.7737, time 125.82ms
iter 152520: loss 6.1680, time 125.70ms
iter 152530: loss 6.9806, time 128.67ms
iter 152540: loss 7.5299, time 125.73ms
iter 152550: loss 6.3910, time 125.57ms
iter 152560: loss 6.7225, time 125.62ms
iter 152570: loss 6.5935, time 125.66ms
iter 152580: loss 7.5064, time 125.65ms
iter 152590: loss 6.9717, time 126.02ms
iter 152600: loss 6.5322, time 125.28ms
iter 152610: loss 6.6413, time 125.56ms
iter 152620: loss 6.3917, time 123.19ms
iter 152630: loss 7.0528, time 121.55ms
iter 152640: loss 6.9123, time 119.69ms
iter 152650: loss 6.8081, time 121.86ms
iter 152660: loss 7.0474, time 119.73ms
iter 152670: loss 7.3014, time 121.27ms
iter 152680: loss 6.1511, time 119.46ms
iter 152690: loss 6.4061, time 121.80ms
iter 152700: loss 6.5429, time 121.64ms
iter 152710: loss 6.8985, time 121.91ms
iter 152720: loss 6.9808, time 121.77ms
iter 152730: loss 6.6600, time 122.12ms
iter 152740: loss 6.8550, time 121.91ms
step 152750: train loss 5.9989, val loss 6.0133
saving checkpoint to out-shakespeare-char
iter 152750: loss 6.4960, time 2906.39ms
iter 152760: loss 6.4369, time 121.69ms
iter 152770: loss 6.8225, time 124.53ms
iter 152780: loss 7.1446, time 121.42ms
iter 152790: loss 6.6730, time 124.39ms
iter 152800: loss 6.7122, time 122.02ms
iter 152810: loss 6.2213, time 124.61ms
iter 152820: loss 7.2148, time 121.51ms
iter 152830: loss 7.2971, time 123.68ms
iter 152840: loss 6.7713, time 121.56ms
iter 152850: loss 7.7184, time 124.52ms
iter 152860: loss 6.1996, time 121.08ms
iter 152870: loss 6.8841, time 124.66ms
iter 152880: loss 6.6629, time 121.47ms
iter 152890: loss 7.3640, time 123.94ms
iter 152900: loss 6.9632, time 121.55ms
iter 152910: loss 7.2740, time 124.55ms
iter 152920: loss 6.5859, time 121.70ms
iter 152930: loss 7.8699, time 124.49ms
iter 152940: loss 7.5362, time 121.56ms
iter 152950: loss 6.7821, time 124.45ms
iter 152960: loss 6.9429, time 121.55ms
iter 152970: loss 6.6826, time 123.93ms
iter 152980: loss 6.9693, time 120.88ms
iter 152990: loss 6.9571, time 124.63ms
step 153000: train loss 6.0409, val loss 6.0467
saving checkpoint to out-shakespeare-char
iter 153000: loss 6.1461, time 2909.81ms
iter 153010: loss 7.0159, time 121.53ms
iter 153020: loss 7.4237, time 124.49ms
iter 153030: loss 6.7078, time 121.56ms
iter 153040: loss 6.8732, time 124.55ms
iter 153050: loss 7.3137, time 121.49ms
iter 153060: loss 7.2415, time 124.56ms
iter 153070: loss 7.2618, time 121.58ms
iter 153080: loss 6.8279, time 124.78ms
iter 153090: loss 7.1797, time 121.62ms
iter 153100: loss 7.2163, time 125.78ms
iter 153110: loss 7.1018, time 125.72ms
iter 153120: loss 6.8682, time 125.95ms
iter 153130: loss 7.0385, time 125.96ms
iter 153140: loss 6.9304, time 128.44ms
iter 153150: loss 6.3549, time 125.73ms
iter 153160: loss 6.6319, time 125.62ms
iter 153170: loss 6.7938, time 125.83ms
iter 153180: loss 7.1784, time 125.62ms
iter 153190: loss 7.0395, time 125.63ms
iter 153200: loss 6.9468, time 125.62ms
iter 153210: loss 6.7489, time 125.70ms
iter 153220: loss 6.6133, time 125.62ms
iter 153230: loss 7.1718, time 125.61ms
iter 153240: loss 7.9061, time 125.72ms
step 153250: train loss 6.0504, val loss 6.0251
saving checkpoint to out-shakespeare-char
iter 153250: loss 7.0681, time 2895.91ms
iter 153260: loss 7.5305, time 125.95ms
iter 153270: loss 6.4161, time 128.77ms
iter 153280: loss 7.1647, time 125.77ms
iter 153290: loss 7.0066, time 125.72ms
iter 153300: loss 6.5242, time 125.56ms
iter 153310: loss 6.6140, time 125.99ms
iter 153320: loss 6.9962, time 125.67ms
iter 153330: loss 6.3956, time 125.71ms
iter 153340: loss 6.0967, time 125.57ms
iter 153350: loss 7.1112, time 125.71ms
iter 153360: loss 6.3278, time 125.85ms
iter 153370: loss 7.1649, time 125.77ms
iter 153380: loss 6.7278, time 129.19ms
iter 153390: loss 6.8822, time 125.96ms
iter 153400: loss 7.0031, time 125.95ms
iter 153410: loss 7.0636, time 125.63ms
iter 153420: loss 6.5938, time 125.16ms
iter 153430: loss 6.9458, time 125.82ms
iter 153440: loss 6.3571, time 125.74ms
iter 153450: loss 6.8463, time 125.73ms
iter 153460: loss 6.7860, time 125.30ms
iter 153470: loss 6.6120, time 124.98ms
iter 153480: loss 5.8221, time 125.30ms
iter 153490: loss 6.7838, time 127.10ms
step 153500: train loss 5.9687, val loss 6.0407
saving checkpoint to out-shakespeare-char
iter 153500: loss 7.4442, time 2897.04ms
iter 153510: loss 6.9199, time 126.15ms
iter 153520: loss 6.7578, time 125.74ms
iter 153530: loss 7.6191, time 125.65ms
iter 153540: loss 6.9340, time 125.83ms
iter 153550: loss 6.6832, time 125.87ms
iter 153560: loss 6.7856, time 125.86ms
iter 153570: loss 7.5352, time 125.96ms
iter 153580: loss 6.9861, time 126.27ms
iter 153590: loss 6.6747, time 128.97ms
iter 153600: loss 7.1148, time 125.98ms
iter 153610: loss 6.1705, time 125.75ms
iter 153620: loss 6.9331, time 125.89ms
iter 153630: loss 7.0293, time 125.96ms
iter 153640: loss 6.7114, time 125.88ms
iter 153650: loss 6.9225, time 125.60ms
iter 153660: loss 6.8134, time 125.53ms
iter 153670: loss 7.5310, time 126.07ms
iter 153680: loss 6.5531, time 125.76ms
iter 153690: loss 7.1923, time 125.73ms
iter 153700: loss 6.8777, time 125.71ms
iter 153710: loss 6.8767, time 126.13ms
iter 153720: loss 7.0042, time 121.33ms
iter 153730: loss 7.2101, time 121.41ms
iter 153740: loss 5.8654, time 121.02ms
step 153750: train loss 6.0112, val loss 6.0581
saving checkpoint to out-shakespeare-char
iter 153750: loss 6.3675, time 2901.04ms
iter 153760: loss 6.1498, time 122.43ms
iter 153770: loss 6.4885, time 121.94ms
iter 153780: loss 6.8653, time 122.67ms
iter 153790: loss 7.0601, time 122.30ms
iter 153800: loss 6.6359, time 122.51ms
iter 153810: loss 6.3033, time 122.28ms
iter 153820: loss 6.8380, time 122.82ms
iter 153830: loss 6.5364, time 122.16ms
iter 153840: loss 6.6392, time 122.39ms
iter 153850: loss 6.8704, time 121.59ms
iter 153860: loss 7.1987, time 122.53ms
iter 153870: loss 7.3055, time 121.11ms
iter 153880: loss 7.2866, time 121.97ms
iter 153890: loss 6.4797, time 121.89ms
iter 153900: loss 6.9439, time 122.49ms
iter 153910: loss 6.5835, time 121.71ms
iter 153920: loss 6.1943, time 122.91ms
iter 153930: loss 7.2831, time 121.84ms
iter 153940: loss 6.3980, time 126.39ms
iter 153950: loss 6.3692, time 124.75ms
iter 153960: loss 6.4422, time 125.83ms
iter 153970: loss 7.3208, time 125.63ms
iter 153980: loss 6.6356, time 126.30ms
iter 153990: loss 6.6377, time 128.35ms
step 154000: train loss 5.9782, val loss 6.0500
saving checkpoint to out-shakespeare-char
iter 154000: loss 7.4944, time 2876.69ms
iter 154010: loss 7.0711, time 125.39ms
iter 154020: loss 6.8640, time 125.06ms
iter 154030: loss 6.8130, time 125.12ms
iter 154040: loss 7.4686, time 124.73ms
iter 154050: loss 6.5771, time 125.05ms
iter 154060: loss 7.8107, time 125.25ms
iter 154070: loss 6.1956, time 125.20ms
iter 154080: loss 6.8089, time 125.31ms
iter 154090: loss 7.1519, time 128.57ms
iter 154100: loss 6.6369, time 125.09ms
iter 154110: loss 7.1730, time 125.18ms
iter 154120: loss 6.6523, time 125.35ms
iter 154130: loss 6.5002, time 125.51ms
iter 154140: loss 6.2556, time 125.83ms
iter 154150: loss 7.3057, time 125.60ms
iter 154160: loss 6.9774, time 125.66ms
iter 154170: loss 6.8099, time 125.83ms
iter 154180: loss 6.5925, time 125.61ms
iter 154190: loss 6.8964, time 125.73ms
iter 154200: loss 6.3550, time 128.74ms
iter 154210: loss 6.7794, time 125.72ms
iter 154220: loss 8.0658, time 126.13ms
iter 154230: loss 6.6624, time 125.91ms
iter 154240: loss 6.7786, time 125.72ms
step 154250: train loss 6.0299, val loss 6.0514
saving checkpoint to out-shakespeare-char
iter 154250: loss 6.6967, time 2854.16ms
iter 154260: loss 7.0218, time 125.24ms
iter 154270: loss 7.1157, time 125.23ms
iter 154280: loss 5.3598, time 124.69ms
iter 154290: loss 6.8598, time 125.29ms
iter 154300: loss 6.6154, time 124.41ms
iter 154310: loss 6.8646, time 127.02ms
iter 154320: loss 7.0111, time 124.86ms
iter 154330: loss 6.9250, time 125.27ms
iter 154340: loss 6.4881, time 124.90ms
iter 154350: loss 6.9476, time 125.09ms
iter 154360: loss 6.5481, time 124.27ms
iter 154370: loss 7.8028, time 124.91ms
iter 154380: loss 6.7822, time 125.22ms
iter 154390: loss 6.7755, time 125.10ms
iter 154400: loss 6.5227, time 125.93ms
iter 154410: loss 6.6671, time 125.87ms
iter 154420: loss 7.3877, time 129.15ms
iter 154430: loss 7.6529, time 125.57ms
iter 154440: loss 6.4966, time 125.54ms
iter 154450: loss 6.4751, time 125.91ms
iter 154460: loss 6.0801, time 125.62ms
iter 154470: loss 6.9061, time 125.67ms
iter 154480: loss 6.5268, time 125.42ms
iter 154490: loss 6.2209, time 125.50ms
step 154500: train loss 5.9897, val loss 6.0642
saving checkpoint to out-shakespeare-char
iter 154500: loss 6.6357, time 2873.70ms
iter 154510: loss 6.9613, time 125.52ms
iter 154520: loss 6.6891, time 128.21ms
iter 154530: loss 7.2415, time 125.28ms
iter 154540: loss 6.9047, time 125.26ms
iter 154550: loss 7.0826, time 125.45ms
iter 154560: loss 6.4337, time 125.14ms
iter 154570: loss 7.7043, time 125.32ms
iter 154580: loss 6.8960, time 126.33ms
iter 154590: loss 6.6865, time 125.29ms
iter 154600: loss 6.3937, time 124.59ms
iter 154610: loss 7.0430, time 125.54ms
iter 154620: loss 6.5323, time 125.18ms
iter 154630: loss 6.8114, time 121.63ms
iter 154640: loss 7.0894, time 121.50ms
iter 154650: loss 6.8224, time 121.21ms
iter 154660: loss 6.4862, time 121.44ms
iter 154670: loss 7.2524, time 121.12ms
iter 154680: loss 6.7712, time 121.18ms
iter 154690: loss 7.4576, time 121.35ms
iter 154700: loss 6.8874, time 121.44ms
iter 154710: loss 6.2124, time 121.48ms
iter 154720: loss 6.0534, time 121.68ms
iter 154730: loss 6.3175, time 121.49ms
iter 154740: loss 7.1932, time 121.45ms
step 154750: train loss 6.0138, val loss 6.0479
saving checkpoint to out-shakespeare-char
iter 154750: loss 6.3589, time 2901.38ms
iter 154760: loss 7.0034, time 124.75ms
iter 154770: loss 6.7798, time 121.64ms
iter 154780: loss 6.8381, time 124.91ms
iter 154790: loss 7.0876, time 122.30ms
iter 154800: loss 7.6037, time 126.22ms
iter 154810: loss 6.8825, time 121.51ms
iter 154820: loss 7.7015, time 124.86ms
iter 154830: loss 7.1177, time 121.62ms
iter 154840: loss 6.6433, time 124.92ms
iter 154850: loss 6.8373, time 121.65ms
iter 154860: loss 6.9905, time 124.66ms
iter 154870: loss 6.2861, time 120.98ms
iter 154880: loss 6.0768, time 124.55ms
iter 154890: loss 6.3091, time 121.53ms
iter 154900: loss 7.4755, time 124.54ms
iter 154910: loss 6.8295, time 121.59ms
iter 154920: loss 7.3810, time 124.54ms
iter 154930: loss 6.9705, time 121.62ms
iter 154940: loss 6.5288, time 124.60ms
iter 154950: loss 6.9076, time 121.60ms
iter 154960: loss 7.1232, time 124.65ms
iter 154970: loss 6.3948, time 121.71ms
iter 154980: loss 7.1127, time 125.09ms
iter 154990: loss 7.1684, time 121.53ms
step 155000: train loss 6.0032, val loss 5.9851
saving checkpoint to out-shakespeare-char
iter 155000: loss 6.4884, time 2894.99ms
iter 155010: loss 6.3977, time 121.45ms
iter 155020: loss 6.7971, time 121.98ms
iter 155030: loss 7.0225, time 121.03ms
iter 155040: loss 6.6271, time 121.02ms
iter 155050: loss 6.9987, time 121.44ms
iter 155060: loss 6.9676, time 121.64ms
iter 155070: loss 6.6635, time 121.69ms
iter 155080: loss 7.0520, time 121.55ms
iter 155090: loss 6.7653, time 121.30ms
iter 155100: loss 6.7367, time 121.18ms
iter 155110: loss 6.3375, time 121.61ms
iter 155120: loss 6.5875, time 121.59ms
iter 155130: loss 7.2270, time 121.48ms
iter 155140: loss 7.0679, time 125.10ms
iter 155150: loss 6.8542, time 125.42ms
iter 155160: loss 6.2812, time 125.21ms
iter 155170: loss 7.0480, time 124.91ms
iter 155180: loss 6.6773, time 124.71ms
iter 155190: loss 6.8449, time 125.56ms
iter 155200: loss 7.2418, time 125.75ms
iter 155210: loss 6.9710, time 125.67ms
iter 155220: loss 7.7920, time 126.12ms
iter 155230: loss 7.4210, time 125.74ms
iter 155240: loss 6.8470, time 125.73ms
step 155250: train loss 5.9761, val loss 5.9945
saving checkpoint to out-shakespeare-char
iter 155250: loss 8.1580, time 2869.25ms
iter 155260: loss 7.8651, time 125.92ms
iter 155270: loss 6.6572, time 125.99ms
iter 155280: loss 6.5513, time 124.56ms
iter 155290: loss 6.6790, time 126.24ms
iter 155300: loss 7.2248, time 125.55ms
iter 155310: loss 6.8518, time 125.36ms
iter 155320: loss 6.6722, time 126.89ms
iter 155330: loss 6.4968, time 125.31ms
iter 155340: loss 6.4412, time 125.55ms
iter 155350: loss 6.9671, time 128.32ms
iter 155360: loss 6.0140, time 125.52ms
iter 155370: loss 6.8207, time 126.02ms
iter 155380: loss 7.0129, time 125.81ms
iter 155390: loss 6.9436, time 125.87ms
iter 155400: loss 6.4519, time 125.95ms
iter 155410: loss 6.6946, time 125.83ms
iter 155420: loss 6.7749, time 125.91ms
iter 155430: loss 6.5633, time 125.86ms
iter 155440: loss 6.0956, time 125.93ms
iter 155450: loss 7.3668, time 125.68ms
iter 155460: loss 6.4253, time 128.24ms
iter 155470: loss 6.9220, time 125.39ms
iter 155480: loss 7.0393, time 125.16ms
iter 155490: loss 6.0736, time 125.97ms
step 155500: train loss 6.0741, val loss 5.9946
saving checkpoint to out-shakespeare-char
iter 155500: loss 7.5240, time 2875.30ms
iter 155510: loss 6.9497, time 126.03ms
iter 155520: loss 5.9155, time 127.53ms
iter 155530: loss 7.4911, time 125.29ms
iter 155540: loss 7.0498, time 125.58ms
iter 155550: loss 5.9730, time 125.59ms
iter 155560: loss 7.7504, time 128.81ms
iter 155570: loss 6.8793, time 126.62ms
iter 155580: loss 5.5609, time 126.02ms
iter 155590: loss 7.0392, time 126.18ms
iter 155600: loss 5.7406, time 125.99ms
iter 155610: loss 6.4172, time 126.76ms
iter 155620: loss 6.7700, time 125.97ms
iter 155630: loss 6.8416, time 126.01ms
iter 155640: loss 6.9017, time 125.91ms
iter 155650: loss 6.0872, time 126.14ms
iter 155660: loss 7.1024, time 125.50ms
iter 155670: loss 7.5557, time 129.04ms
iter 155680: loss 6.7068, time 126.33ms
iter 155690: loss 6.9224, time 125.59ms
iter 155700: loss 6.1342, time 125.98ms
iter 155710: loss 7.2063, time 125.77ms
iter 155720: loss 6.1032, time 125.93ms
iter 155730: loss 6.9775, time 125.85ms
iter 155740: loss 7.7619, time 125.88ms
step 155750: train loss 6.0264, val loss 6.0682
saving checkpoint to out-shakespeare-char
iter 155750: loss 6.5049, time 2872.15ms
iter 155760: loss 7.2027, time 125.56ms
iter 155770: loss 6.3403, time 126.12ms
iter 155780: loss 6.8278, time 124.99ms
iter 155790: loss 7.3608, time 125.34ms
iter 155800: loss 6.7643, time 128.01ms
iter 155810: loss 6.6188, time 125.05ms
iter 155820: loss 6.6923, time 125.10ms
iter 155830: loss 6.5724, time 125.27ms
iter 155840: loss 7.2446, time 124.58ms
iter 155850: loss 6.4663, time 125.21ms
iter 155860: loss 6.5873, time 125.07ms
iter 155870: loss 7.5147, time 122.99ms
iter 155880: loss 7.4288, time 121.45ms
iter 155890: loss 6.8570, time 121.66ms
iter 155900: loss 6.8713, time 121.62ms
iter 155910: loss 6.5145, time 122.63ms
iter 155920: loss 7.0141, time 121.80ms
iter 155930: loss 7.1539, time 122.68ms
iter 155940: loss 6.6448, time 121.60ms
iter 155950: loss 5.7825, time 123.13ms
iter 155960: loss 6.6627, time 121.59ms
iter 155970: loss 7.5776, time 122.87ms
iter 155980: loss 6.8595, time 120.29ms
iter 155990: loss 7.2505, time 121.77ms
step 156000: train loss 6.0624, val loss 6.0499
saving checkpoint to out-shakespeare-char
iter 156000: loss 7.0666, time 2897.09ms
iter 156010: loss 7.6858, time 120.91ms
iter 156020: loss 7.1631, time 121.27ms
iter 156030: loss 6.5534, time 121.53ms
iter 156040: loss 6.9836, time 121.97ms
iter 156050: loss 6.9989, time 121.57ms
iter 156060: loss 7.0578, time 121.38ms
iter 156070: loss 6.9655, time 121.63ms
iter 156080: loss 7.5459, time 121.67ms
iter 156090: loss 6.5148, time 121.59ms
iter 156100: loss 6.0276, time 122.38ms
iter 156110: loss 7.3431, time 120.80ms
iter 156120: loss 6.9457, time 120.72ms
iter 156130: loss 6.3032, time 121.61ms
iter 156140: loss 6.9391, time 121.56ms
iter 156150: loss 6.6893, time 121.70ms
iter 156160: loss 7.5586, time 121.57ms
iter 156170: loss 6.6949, time 121.50ms
iter 156180: loss 7.0510, time 121.46ms
iter 156190: loss 6.9597, time 122.11ms
iter 156200: loss 6.8230, time 121.76ms
iter 156210: loss 6.8636, time 121.55ms
iter 156220: loss 6.7321, time 120.74ms
iter 156230: loss 7.5241, time 120.60ms
iter 156240: loss 6.6990, time 120.56ms
step 156250: train loss 6.0737, val loss 6.0011
saving checkpoint to out-shakespeare-char
iter 156250: loss 6.9502, time 2883.10ms
iter 156260: loss 7.1444, time 123.37ms
iter 156270: loss 6.7888, time 121.16ms
iter 156280: loss 7.2398, time 124.57ms
iter 156290: loss 6.5496, time 121.56ms
iter 156300: loss 7.0064, time 124.55ms
iter 156310: loss 6.3309, time 121.43ms
iter 156320: loss 7.0035, time 124.29ms
iter 156330: loss 6.5664, time 121.65ms
iter 156340: loss 5.8696, time 124.41ms
iter 156350: loss 6.5355, time 121.47ms
iter 156360: loss 7.2948, time 124.44ms
iter 156370: loss 6.9860, time 121.96ms
iter 156380: loss 6.2476, time 124.34ms
iter 156390: loss 7.5276, time 121.59ms
iter 156400: loss 6.7043, time 124.54ms
iter 156410: loss 7.2826, time 121.49ms
iter 156420: loss 7.2906, time 123.80ms
iter 156430: loss 7.0642, time 121.25ms
iter 156440: loss 6.1150, time 124.33ms
iter 156450: loss 6.5289, time 121.07ms
iter 156460: loss 6.6837, time 124.76ms
iter 156470: loss 6.8554, time 121.43ms
iter 156480: loss 7.0093, time 124.28ms
iter 156490: loss 6.3029, time 121.48ms
step 156500: train loss 5.9948, val loss 6.0240
saving checkpoint to out-shakespeare-char
iter 156500: loss 6.8763, time 2890.50ms
iter 156510: loss 6.9196, time 121.66ms
iter 156520: loss 6.6210, time 121.53ms
iter 156530: loss 7.0713, time 121.66ms
iter 156540: loss 7.3160, time 121.71ms
iter 156550: loss 6.9848, time 122.01ms
iter 156560: loss 7.2056, time 121.64ms
iter 156570: loss 6.3511, time 121.89ms
iter 156580: loss 7.0873, time 121.69ms
iter 156590: loss 6.6352, time 121.60ms
iter 156600: loss 6.4560, time 121.28ms
iter 156610: loss 7.2149, time 121.47ms
iter 156620: loss 6.4982, time 121.67ms
iter 156630: loss 6.7976, time 121.40ms
iter 156640: loss 6.0767, time 121.57ms
iter 156650: loss 6.6744, time 121.41ms
iter 156660: loss 7.1030, time 121.67ms
iter 156670: loss 7.1173, time 121.89ms
iter 156680: loss 6.6895, time 121.86ms
iter 156690: loss 7.8339, time 122.22ms
iter 156700: loss 6.7221, time 121.72ms
iter 156710: loss 6.7779, time 121.57ms
iter 156720: loss 6.6049, time 121.62ms
iter 156730: loss 7.2191, time 121.42ms
iter 156740: loss 7.0217, time 121.83ms
step 156750: train loss 5.9620, val loss 5.9744
saving checkpoint to out-shakespeare-char
iter 156750: loss 6.5247, time 2886.70ms
iter 156760: loss 7.0675, time 122.02ms
iter 156770: loss 6.3447, time 122.71ms
iter 156780: loss 6.4957, time 122.03ms
iter 156790: loss 7.0224, time 121.60ms
iter 156800: loss 6.2616, time 121.12ms
iter 156810: loss 6.8552, time 121.66ms
iter 156820: loss 6.6127, time 122.27ms
iter 156830: loss 6.9507, time 121.33ms
iter 156840: loss 6.5113, time 121.20ms
iter 156850: loss 6.9644, time 121.99ms
iter 156860: loss 7.5927, time 121.66ms
iter 156870: loss 6.7047, time 121.55ms
iter 156880: loss 6.3737, time 121.25ms
iter 156890: loss 6.9537, time 121.50ms
iter 156900: loss 6.1152, time 121.57ms
iter 156910: loss 6.7556, time 121.31ms
iter 156920: loss 6.5382, time 121.69ms
iter 156930: loss 6.8115, time 121.07ms
iter 156940: loss 6.6337, time 121.42ms
iter 156950: loss 6.4769, time 121.34ms
iter 156960: loss 6.7684, time 120.92ms
iter 156970: loss 7.3134, time 121.23ms
iter 156980: loss 6.6202, time 120.81ms
iter 156990: loss 6.2992, time 121.19ms
step 157000: train loss 6.0322, val loss 6.0084
saving checkpoint to out-shakespeare-char
iter 157000: loss 7.3421, time 2874.84ms
iter 157010: loss 6.9391, time 125.26ms
iter 157020: loss 7.4580, time 125.57ms
iter 157030: loss 6.3160, time 125.49ms
iter 157040: loss 6.7221, time 128.34ms
iter 157050: loss 7.4085, time 125.01ms
iter 157060: loss 6.1190, time 125.77ms
iter 157070: loss 6.4093, time 125.16ms
iter 157080: loss 6.6520, time 125.58ms
iter 157090: loss 6.8622, time 125.23ms
iter 157100: loss 6.5017, time 125.22ms
iter 157110: loss 6.6860, time 125.44ms
iter 157120: loss 6.9767, time 125.25ms
iter 157130: loss 7.1012, time 125.08ms
iter 157140: loss 7.1287, time 125.54ms
iter 157150: loss 6.4077, time 128.71ms
iter 157160: loss 6.4031, time 124.80ms
iter 157170: loss 6.6700, time 125.57ms
iter 157180: loss 7.4171, time 125.38ms
iter 157190: loss 6.5912, time 124.12ms
iter 157200: loss 6.6853, time 126.09ms
iter 157210: loss 6.8026, time 126.11ms
iter 157220: loss 6.5179, time 125.97ms
iter 157230: loss 6.7995, time 125.95ms
iter 157240: loss 6.6718, time 125.72ms
step 157250: train loss 6.0439, val loss 6.0006
saving checkpoint to out-shakespeare-char
iter 157250: loss 6.4784, time 2878.28ms
iter 157260: loss 6.9923, time 121.64ms
iter 157270: loss 6.8815, time 121.00ms
iter 157280: loss 7.1497, time 121.63ms
iter 157290: loss 6.4248, time 122.20ms
iter 157300: loss 6.9639, time 121.75ms
iter 157310: loss 7.5493, time 121.65ms
iter 157320: loss 7.9000, time 121.34ms
iter 157330: loss 6.9976, time 121.84ms
iter 157340: loss 6.1293, time 121.53ms
iter 157350: loss 7.3754, time 121.73ms
iter 157360: loss 6.4651, time 121.23ms
iter 157370: loss 7.7013, time 121.55ms
iter 157380: loss 6.9416, time 121.55ms
iter 157390: loss 7.3981, time 121.51ms
iter 157400: loss 6.7010, time 121.34ms
iter 157410: loss 6.8106, time 121.52ms
iter 157420: loss 7.0861, time 121.71ms
iter 157430: loss 6.8023, time 121.29ms
iter 157440: loss 6.2116, time 121.53ms
iter 157450: loss 5.8396, time 121.62ms
iter 157460: loss 7.1731, time 121.54ms
iter 157470: loss 7.1137, time 121.56ms
iter 157480: loss 6.7659, time 122.14ms
iter 157490: loss 5.7678, time 121.68ms
step 157500: train loss 5.9668, val loss 5.9936
saving checkpoint to out-shakespeare-char
iter 157500: loss 6.6271, time 2895.89ms
iter 157510: loss 6.2760, time 121.92ms
iter 157520: loss 6.7782, time 121.99ms
iter 157530: loss 6.9666, time 121.81ms
iter 157540: loss 6.9105, time 121.94ms
iter 157550: loss 6.4179, time 122.95ms
iter 157560: loss 5.8974, time 122.00ms
iter 157570: loss 7.0570, time 121.58ms
iter 157580: loss 6.6027, time 121.72ms
iter 157590: loss 6.1788, time 121.72ms
iter 157600: loss 6.7009, time 121.70ms
iter 157610: loss 6.7511, time 121.56ms
iter 157620: loss 6.9346, time 120.73ms
iter 157630: loss 5.8772, time 121.53ms
iter 157640: loss 6.5569, time 120.89ms
iter 157650: loss 7.2149, time 121.40ms
iter 157660: loss 7.1153, time 121.88ms
iter 157670: loss 6.7185, time 121.77ms
iter 157680: loss 6.6358, time 121.68ms
iter 157690: loss 6.8954, time 121.77ms
iter 157700: loss 6.0377, time 121.90ms
iter 157710: loss 6.8408, time 121.69ms
iter 157720: loss 6.6667, time 121.53ms
iter 157730: loss 6.4597, time 121.48ms
iter 157740: loss 7.6616, time 121.40ms
step 157750: train loss 5.9969, val loss 5.9861
saving checkpoint to out-shakespeare-char
iter 157750: loss 7.3466, time 2896.77ms
iter 157760: loss 6.8601, time 121.57ms
iter 157770: loss 6.6428, time 124.30ms
iter 157780: loss 7.3852, time 121.53ms
iter 157790: loss 6.1811, time 123.96ms
iter 157800: loss 6.9222, time 122.17ms
iter 157810: loss 6.4542, time 124.35ms
iter 157820: loss 6.4867, time 121.53ms
iter 157830: loss 7.8409, time 128.51ms
iter 157840: loss 7.2395, time 125.08ms
iter 157850: loss 6.9852, time 124.96ms
iter 157860: loss 7.5962, time 125.28ms
iter 157870: loss 6.8324, time 125.17ms
iter 157880: loss 6.3815, time 125.20ms
iter 157890: loss 6.9141, time 124.45ms
iter 157900: loss 6.8355, time 125.36ms
iter 157910: loss 6.9644, time 125.16ms
iter 157920: loss 7.5428, time 125.36ms
iter 157930: loss 6.6333, time 125.51ms
iter 157940: loss 6.2266, time 124.59ms
iter 157950: loss 7.0765, time 121.63ms
iter 157960: loss 6.6156, time 124.42ms
iter 157970: loss 6.3403, time 121.81ms
iter 157980: loss 6.6959, time 124.49ms
iter 157990: loss 6.6451, time 122.28ms
step 158000: train loss 5.9614, val loss 5.9975
saving checkpoint to out-shakespeare-char
iter 158000: loss 6.8633, time 2913.28ms
iter 158010: loss 7.1591, time 124.84ms
iter 158020: loss 7.3475, time 125.50ms
iter 158030: loss 5.7077, time 125.21ms
iter 158040: loss 7.7011, time 125.57ms
iter 158050: loss 6.7330, time 124.71ms
iter 158060: loss 6.6387, time 127.63ms
iter 158070: loss 6.6318, time 124.76ms
iter 158080: loss 6.9477, time 125.30ms
iter 158090: loss 6.1693, time 125.01ms
iter 158100: loss 7.0464, time 125.64ms
iter 158110: loss 7.1672, time 124.89ms
iter 158120: loss 7.1664, time 125.73ms
iter 158130: loss 7.1965, time 125.52ms
iter 158140: loss 6.8837, time 124.60ms
iter 158150: loss 6.8822, time 124.92ms
iter 158160: loss 6.8857, time 124.67ms
iter 158170: loss 7.7887, time 128.56ms
iter 158180: loss 6.4101, time 124.67ms
iter 158190: loss 6.3148, time 125.80ms
iter 158200: loss 6.7917, time 124.78ms
iter 158210: loss 6.2866, time 125.76ms
iter 158220: loss 6.7934, time 125.33ms
iter 158230: loss 6.3881, time 125.68ms
iter 158240: loss 6.7743, time 121.39ms
step 158250: train loss 5.9778, val loss 6.0336
saving checkpoint to out-shakespeare-char
iter 158250: loss 6.9057, time 2898.60ms
iter 158260: loss 6.6586, time 121.52ms
iter 158270: loss 6.5651, time 121.57ms
iter 158280: loss 6.1776, time 122.20ms
iter 158290: loss 6.6344, time 121.17ms
iter 158300: loss 7.1183, time 121.38ms
iter 158310: loss 5.7046, time 121.05ms
iter 158320: loss 6.8187, time 121.52ms
iter 158330: loss 6.2615, time 120.61ms
iter 158340: loss 6.9908, time 120.84ms
iter 158350: loss 7.3370, time 121.75ms
iter 158360: loss 6.6155, time 122.12ms
iter 158370: loss 6.3602, time 121.72ms
iter 158380: loss 6.3634, time 120.00ms
iter 158390: loss 7.1366, time 121.20ms
iter 158400: loss 7.4726, time 120.82ms
iter 158410: loss 6.3056, time 121.70ms
iter 158420: loss 6.7515, time 121.63ms
iter 158430: loss 7.4012, time 121.62ms
iter 158440: loss 6.5954, time 121.78ms
iter 158450: loss 7.0123, time 121.46ms
iter 158460: loss 7.0044, time 121.56ms
iter 158470: loss 6.7860, time 121.65ms
iter 158480: loss 6.7627, time 121.62ms
iter 158490: loss 6.3634, time 121.63ms
step 158500: train loss 5.9734, val loss 6.0456
saving checkpoint to out-shakespeare-char
iter 158500: loss 7.1400, time 2884.76ms
iter 158510: loss 6.1667, time 123.25ms
iter 158520: loss 6.7582, time 121.72ms
iter 158530: loss 6.4823, time 122.59ms
iter 158540: loss 6.8219, time 121.47ms
iter 158550: loss 6.1921, time 122.01ms
iter 158560: loss 7.1563, time 121.54ms
iter 158570: loss 6.5975, time 122.54ms
iter 158580: loss 6.5762, time 122.36ms
iter 158590: loss 7.0067, time 124.52ms
iter 158600: loss 7.2163, time 121.48ms
iter 158610: loss 6.6664, time 122.67ms
iter 158620: loss 5.7120, time 121.10ms
iter 158630: loss 6.3034, time 122.98ms
iter 158640: loss 6.8047, time 121.10ms
iter 158650: loss 6.4602, time 122.00ms
iter 158660: loss 6.6461, time 121.43ms
iter 158670: loss 6.7193, time 122.00ms
iter 158680: loss 6.6091, time 121.50ms
iter 158690: loss 6.8435, time 122.61ms
iter 158700: loss 6.9553, time 121.45ms
iter 158710: loss 6.1098, time 122.61ms
iter 158720: loss 6.5430, time 121.37ms
iter 158730: loss 6.4620, time 122.68ms
iter 158740: loss 6.7152, time 121.61ms
step 158750: train loss 5.9642, val loss 5.9913
saving checkpoint to out-shakespeare-char
iter 158750: loss 7.2364, time 2879.75ms
iter 158760: loss 6.4751, time 122.62ms
iter 158770: loss 6.3289, time 121.35ms
iter 158780: loss 6.7095, time 122.24ms
iter 158790: loss 7.3551, time 121.45ms
iter 158800: loss 6.6117, time 121.25ms
iter 158810: loss 6.4668, time 121.41ms
iter 158820: loss 6.8607, time 121.48ms
iter 158830: loss 7.1513, time 121.47ms
iter 158840: loss 5.8875, time 121.61ms
iter 158850: loss 7.0475, time 121.32ms
iter 158860: loss 6.3148, time 120.66ms
iter 158870: loss 6.1407, time 121.39ms
iter 158880: loss 6.8864, time 120.50ms
iter 158890: loss 6.7153, time 121.41ms
iter 158900: loss 7.1266, time 122.15ms
iter 158910: loss 5.8410, time 122.06ms
iter 158920: loss 6.5640, time 121.02ms
iter 158930: loss 6.5199, time 121.31ms
iter 158940: loss 6.7660, time 121.20ms
iter 158950: loss 6.8550, time 121.31ms
iter 158960: loss 7.0216, time 121.14ms
iter 158970: loss 6.3702, time 121.36ms
iter 158980: loss 6.1575, time 121.18ms
iter 158990: loss 7.1346, time 121.34ms
step 159000: train loss 6.0036, val loss 6.0312
saving checkpoint to out-shakespeare-char
iter 159000: loss 6.8509, time 2890.63ms
iter 159010: loss 7.1986, time 120.90ms
iter 159020: loss 6.5872, time 124.08ms
iter 159030: loss 7.1695, time 121.34ms
iter 159040: loss 6.2586, time 124.53ms
iter 159050: loss 6.6526, time 121.49ms
iter 159060: loss 6.1345, time 124.15ms
iter 159070: loss 7.2372, time 120.56ms
iter 159080: loss 6.1804, time 123.71ms
iter 159090: loss 6.0654, time 121.35ms
iter 159100: loss 7.0813, time 123.73ms
iter 159110: loss 7.1158, time 121.49ms
iter 159120: loss 6.8332, time 124.29ms
iter 159130: loss 6.4977, time 121.47ms
iter 159140: loss 6.7086, time 124.40ms
iter 159150: loss 6.6810, time 121.38ms
iter 159160: loss 6.5593, time 124.76ms
iter 159170: loss 6.6101, time 121.41ms
iter 159180: loss 6.6426, time 124.69ms
iter 159190: loss 5.6838, time 121.52ms
iter 159200: loss 6.5957, time 123.75ms
iter 159210: loss 6.7979, time 121.45ms
iter 159220: loss 6.3286, time 124.76ms
iter 159230: loss 6.8566, time 121.51ms
iter 159240: loss 7.0234, time 124.21ms
step 159250: train loss 5.9924, val loss 5.9944
saving checkpoint to out-shakespeare-char
iter 159250: loss 7.3172, time 2884.91ms
iter 159260: loss 7.1517, time 125.24ms
iter 159270: loss 7.3436, time 124.60ms
iter 159280: loss 6.8706, time 124.68ms
iter 159290: loss 6.7342, time 128.61ms
iter 159300: loss 6.7907, time 124.81ms
iter 159310: loss 6.0947, time 125.73ms
iter 159320: loss 6.0094, time 125.69ms
iter 159330: loss 7.1635, time 126.04ms
iter 159340: loss 6.5894, time 125.40ms
iter 159350: loss 6.8168, time 125.56ms
iter 159360: loss 7.0609, time 125.47ms
iter 159370: loss 6.4071, time 126.12ms
iter 159380: loss 6.6604, time 126.17ms
iter 159390: loss 7.3337, time 126.30ms
iter 159400: loss 6.5497, time 128.47ms
iter 159410: loss 6.1581, time 126.05ms
iter 159420: loss 7.3377, time 125.96ms
iter 159430: loss 6.3668, time 126.31ms
iter 159440: loss 6.7093, time 126.03ms
iter 159450: loss 6.2007, time 126.87ms
iter 159460: loss 7.2895, time 125.97ms
iter 159470: loss 7.0657, time 125.86ms
iter 159480: loss 7.3643, time 125.97ms
iter 159490: loss 6.6436, time 125.69ms
step 159500: train loss 6.0161, val loss 6.0227
saving checkpoint to out-shakespeare-char
iter 159500: loss 7.1192, time 2900.77ms
iter 159510: loss 6.7127, time 125.80ms
iter 159520: loss 6.5032, time 125.78ms
iter 159530: loss 6.9122, time 126.16ms
iter 159540: loss 6.5409, time 125.07ms
iter 159550: loss 6.9183, time 125.52ms
iter 159560: loss 7.0823, time 125.92ms
iter 159570: loss 6.7025, time 125.57ms
iter 159580: loss 7.0351, time 125.70ms
iter 159590: loss 6.1968, time 125.58ms
iter 159600: loss 7.2124, time 121.23ms
iter 159610: loss 7.2833, time 120.81ms
iter 159620: loss 6.1004, time 121.17ms
iter 159630: loss 6.8841, time 120.17ms
iter 159640: loss 6.6618, time 121.26ms
iter 159650: loss 5.9617, time 120.06ms
iter 159660: loss 6.4270, time 121.43ms
iter 159670: loss 6.3481, time 119.88ms
iter 159680: loss 6.7910, time 121.21ms
iter 159690: loss 6.7041, time 119.96ms
iter 159700: loss 7.1501, time 121.40ms
iter 159710: loss 6.9700, time 120.03ms
iter 159720: loss 6.9881, time 121.07ms
iter 159730: loss 6.4507, time 120.64ms
iter 159740: loss 6.7613, time 123.41ms
step 159750: train loss 6.0092, val loss 5.9786
saving checkpoint to out-shakespeare-char
iter 159750: loss 6.2329, time 2872.56ms
iter 159760: loss 6.8911, time 119.95ms
iter 159770: loss 7.2225, time 121.20ms
iter 159780: loss 7.0226, time 119.98ms
iter 159790: loss 6.8540, time 122.01ms
iter 159800: loss 7.0528, time 120.89ms
iter 159810: loss 7.0888, time 121.15ms
iter 159820: loss 6.8330, time 120.02ms
iter 159830: loss 6.9635, time 121.10ms
iter 159840: loss 7.2067, time 120.60ms
iter 159850: loss 7.0600, time 121.17ms
iter 159860: loss 6.6824, time 120.03ms
iter 159870: loss 6.8469, time 121.29ms
iter 159880: loss 6.5912, time 119.85ms
iter 159890: loss 7.0620, time 123.67ms
iter 159900: loss 5.9162, time 121.95ms
iter 159910: loss 6.6251, time 123.00ms
iter 159920: loss 6.6575, time 121.88ms
iter 159930: loss 6.5731, time 123.25ms
iter 159940: loss 6.8492, time 121.88ms
iter 159950: loss 6.5813, time 123.38ms
iter 159960: loss 6.6630, time 121.14ms
iter 159970: loss 6.4600, time 123.19ms
iter 159980: loss 6.0782, time 121.85ms
iter 159990: loss 6.8965, time 123.28ms
step 160000: train loss 5.9833, val loss 5.9713
saving checkpoint to out-shakespeare-char
iter 160000: loss 6.8618, time 2895.71ms
iter 160010: loss 7.1585, time 121.89ms
iter 160020: loss 6.6550, time 121.57ms
iter 160030: loss 7.4581, time 121.47ms
iter 160040: loss 6.8056, time 121.53ms
iter 160050: loss 6.3765, time 121.44ms
iter 160060: loss 6.6401, time 121.41ms
iter 160070: loss 6.6476, time 121.10ms
iter 160080: loss 6.0488, time 120.78ms
iter 160090: loss 6.9552, time 120.80ms
iter 160100: loss 6.1152, time 121.82ms
iter 160110: loss 7.3931, time 121.83ms
iter 160120: loss 7.1458, time 121.64ms
iter 160130: loss 6.7451, time 121.79ms
iter 160140: loss 6.7196, time 121.45ms
iter 160150: loss 6.5454, time 121.58ms
iter 160160: loss 7.1109, time 121.55ms
iter 160170: loss 6.9814, time 121.69ms
iter 160180: loss 6.9800, time 121.53ms
iter 160190: loss 6.1556, time 121.63ms
iter 160200: loss 7.6797, time 121.76ms
iter 160210: loss 7.0721, time 121.51ms
iter 160220: loss 6.4575, time 120.75ms
iter 160230: loss 6.2424, time 121.52ms
iter 160240: loss 7.5032, time 121.70ms
step 160250: train loss 5.9508, val loss 5.9931
saving checkpoint to out-shakespeare-char
iter 160250: loss 6.2067, time 2891.69ms
iter 160260: loss 6.6692, time 121.06ms
iter 160270: loss 6.9091, time 120.61ms
iter 160280: loss 6.5208, time 121.42ms
iter 160290: loss 6.6535, time 121.59ms
iter 160300: loss 7.0794, time 122.01ms
iter 160310: loss 6.5049, time 121.39ms
iter 160320: loss 7.1581, time 121.54ms
iter 160330: loss 6.3326, time 121.54ms
iter 160340: loss 6.7856, time 121.94ms
iter 160350: loss 6.6351, time 121.88ms
iter 160360: loss 6.6965, time 122.00ms
iter 160370: loss 7.0632, time 122.02ms
iter 160380: loss 6.7645, time 122.06ms
iter 160390: loss 6.4824, time 121.77ms
iter 160400: loss 6.9219, time 121.59ms
iter 160410: loss 6.7266, time 121.54ms
iter 160420: loss 6.5404, time 122.59ms
iter 160430: loss 7.0119, time 121.28ms
iter 160440: loss 7.4259, time 121.69ms
iter 160450: loss 7.2796, time 121.63ms
iter 160460: loss 6.7834, time 121.68ms
iter 160470: loss 6.5007, time 121.55ms
iter 160480: loss 6.7618, time 119.90ms
iter 160490: loss 6.7936, time 120.63ms
step 160500: train loss 6.0159, val loss 6.0047
saving checkpoint to out-shakespeare-char
iter 160500: loss 6.4761, time 2882.49ms
iter 160510: loss 6.8987, time 122.85ms
iter 160520: loss 6.7050, time 121.49ms
iter 160530: loss 6.8390, time 122.68ms
iter 160540: loss 6.3064, time 122.70ms
iter 160550: loss 6.8517, time 122.72ms
iter 160560: loss 6.8544, time 121.33ms
iter 160570: loss 6.4831, time 122.67ms
iter 160580: loss 6.2844, time 121.50ms
iter 160590: loss 7.3826, time 122.59ms
iter 160600: loss 7.5040, time 121.41ms
iter 160610: loss 6.9143, time 122.66ms
iter 160620: loss 7.3016, time 121.80ms
iter 160630: loss 6.4578, time 123.12ms
iter 160640: loss 6.7366, time 121.86ms
iter 160650: loss 6.7681, time 123.06ms
iter 160660: loss 7.0294, time 121.48ms
iter 160670: loss 6.9373, time 121.76ms
iter 160680: loss 5.9968, time 121.60ms
iter 160690: loss 6.4475, time 123.23ms
iter 160700: loss 6.7595, time 121.53ms
iter 160710: loss 7.3592, time 122.34ms
iter 160720: loss 7.1984, time 121.36ms
iter 160730: loss 6.5386, time 123.81ms
iter 160740: loss 6.4990, time 121.85ms
step 160750: train loss 6.0016, val loss 5.9352
saving checkpoint to out-shakespeare-char
iter 160750: loss 7.2607, time 2889.12ms
iter 160760: loss 7.2346, time 121.67ms
iter 160770: loss 7.0554, time 121.10ms
iter 160780: loss 6.8222, time 121.38ms
iter 160790: loss 6.3091, time 121.74ms
iter 160800: loss 6.4009, time 120.79ms
iter 160810: loss 6.5889, time 121.00ms
iter 160820: loss 6.6733, time 122.23ms
iter 160830: loss 7.5019, time 121.97ms
iter 160840: loss 6.9589, time 121.66ms
iter 160850: loss 6.7622, time 122.17ms
iter 160860: loss 6.5658, time 122.15ms
iter 160870: loss 7.6186, time 122.72ms
iter 160880: loss 6.6915, time 122.38ms
iter 160890: loss 5.6371, time 121.76ms
iter 160900: loss 6.9356, time 122.02ms
iter 160910: loss 7.1419, time 122.17ms
iter 160920: loss 6.8867, time 121.90ms
iter 160930: loss 6.1517, time 121.90ms
iter 160940: loss 6.3983, time 121.47ms
iter 160950: loss 7.5028, time 121.33ms
iter 160960: loss 6.7180, time 121.72ms
iter 160970: loss 6.7980, time 121.87ms
iter 160980: loss 7.3688, time 121.70ms
iter 160990: loss 6.6852, time 121.89ms
step 161000: train loss 6.0238, val loss 5.9942
saving checkpoint to out-shakespeare-char
iter 161000: loss 5.7978, time 2894.39ms
iter 161010: loss 6.8972, time 121.59ms
iter 161020: loss 7.1554, time 121.35ms
iter 161030: loss 7.0224, time 121.26ms
iter 161040: loss 6.7228, time 120.88ms
iter 161050: loss 5.8420, time 121.53ms
iter 161060: loss 6.8103, time 121.56ms
iter 161070: loss 6.6564, time 121.54ms
iter 161080: loss 6.3912, time 121.52ms
iter 161090: loss 6.1514, time 121.50ms
iter 161100: loss 6.6488, time 121.61ms
iter 161110: loss 6.7280, time 122.44ms
iter 161120: loss 7.7865, time 121.58ms
iter 161130: loss 7.2367, time 121.55ms
iter 161140: loss 6.3758, time 121.56ms
iter 161150: loss 6.5677, time 121.41ms
iter 161160: loss 6.8789, time 121.65ms
iter 161170: loss 7.2299, time 121.41ms
iter 161180: loss 6.9719, time 121.38ms
iter 161190: loss 7.2474, time 121.40ms
iter 161200: loss 6.9704, time 121.49ms
iter 161210: loss 7.0192, time 121.50ms
iter 161220: loss 6.6689, time 122.04ms
iter 161230: loss 6.8708, time 120.81ms
iter 161240: loss 6.5057, time 122.16ms
step 161250: train loss 5.9738, val loss 6.0203
saving checkpoint to out-shakespeare-char
iter 161250: loss 7.4987, time 2886.81ms
iter 161260: loss 7.2299, time 126.00ms
iter 161270: loss 6.5841, time 125.33ms
iter 161280: loss 7.0528, time 125.61ms
iter 161290: loss 6.6735, time 125.24ms
iter 161300: loss 6.5216, time 125.23ms
iter 161310: loss 6.1801, time 125.21ms
iter 161320: loss 7.9190, time 125.86ms
iter 161330: loss 6.5983, time 127.70ms
iter 161340: loss 7.8940, time 125.02ms
iter 161350: loss 6.5747, time 124.83ms
iter 161360: loss 6.7508, time 125.18ms
iter 161370: loss 6.5856, time 125.04ms
iter 161380: loss 7.1107, time 125.20ms
iter 161390: loss 6.0768, time 125.29ms
iter 161400: loss 6.5506, time 125.35ms
iter 161410: loss 6.8494, time 125.60ms
iter 161420: loss 6.5314, time 125.16ms
iter 161430: loss 7.0541, time 125.37ms
iter 161440: loss 6.1733, time 125.18ms
iter 161450: loss 6.7880, time 125.61ms
iter 161460: loss 6.6250, time 125.30ms
iter 161470: loss 6.6462, time 124.85ms
iter 161480: loss 6.4414, time 125.19ms
iter 161490: loss 6.4341, time 125.30ms
step 161500: train loss 5.9822, val loss 5.9755
saving checkpoint to out-shakespeare-char
iter 161500: loss 6.7966, time 2896.64ms
iter 161510: loss 6.9574, time 125.21ms
iter 161520: loss 6.5081, time 125.04ms
iter 161530: loss 7.1252, time 125.21ms
iter 161540: loss 6.4870, time 127.52ms
iter 161550: loss 7.1217, time 125.33ms
iter 161560: loss 6.6136, time 123.94ms
iter 161570: loss 7.3591, time 125.85ms
iter 161580: loss 7.4163, time 124.17ms
iter 161590: loss 6.6094, time 125.01ms
iter 161600: loss 7.0710, time 125.22ms
iter 161610: loss 6.8156, time 124.98ms
iter 161620: loss 6.3721, time 125.12ms
iter 161630: loss 6.0248, time 125.27ms
iter 161640: loss 6.3040, time 125.27ms
iter 161650: loss 7.4744, time 128.44ms
iter 161660: loss 7.3305, time 125.56ms
iter 161670: loss 6.6814, time 124.91ms
iter 161680: loss 6.6171, time 125.13ms
iter 161690: loss 6.3447, time 125.01ms
iter 161700: loss 7.2509, time 125.57ms
iter 161710: loss 7.1698, time 125.72ms
iter 161720: loss 7.2647, time 125.60ms
iter 161730: loss 6.9071, time 125.72ms
iter 161740: loss 6.9548, time 125.65ms
step 161750: train loss 5.9384, val loss 6.0280
saving checkpoint to out-shakespeare-char
iter 161750: loss 6.4864, time 2899.35ms
iter 161760: loss 6.6949, time 125.61ms
iter 161770: loss 6.4505, time 125.68ms
iter 161780: loss 7.2799, time 125.48ms
iter 161790: loss 7.0542, time 125.65ms
iter 161800: loss 6.1435, time 125.30ms
iter 161810: loss 6.5346, time 125.54ms
iter 161820: loss 6.3813, time 125.50ms
iter 161830: loss 6.5563, time 125.65ms
iter 161840: loss 6.5007, time 128.66ms
iter 161850: loss 6.1223, time 125.56ms
iter 161860: loss 6.6850, time 125.73ms
iter 161870: loss 6.7677, time 125.95ms
iter 161880: loss 7.2247, time 125.43ms
iter 161890: loss 7.1100, time 125.41ms
iter 161900: loss 6.7180, time 125.50ms
iter 161910: loss 6.5430, time 124.64ms
iter 161920: loss 7.0821, time 125.50ms
iter 161930: loss 6.4521, time 125.69ms
iter 161940: loss 5.8191, time 125.82ms
iter 161950: loss 7.6432, time 128.50ms
iter 161960: loss 5.8955, time 125.71ms
iter 161970: loss 7.1897, time 126.07ms
iter 161980: loss 7.3330, time 125.03ms
iter 161990: loss 5.7176, time 125.89ms
step 162000: train loss 6.0710, val loss 6.0042
saving checkpoint to out-shakespeare-char
iter 162000: loss 7.0114, time 2903.74ms
iter 162010: loss 6.6532, time 122.99ms
iter 162020: loss 7.0591, time 121.44ms
iter 162030: loss 5.8298, time 122.09ms
iter 162040: loss 6.6092, time 121.76ms
iter 162050: loss 6.7100, time 121.89ms
iter 162060: loss 6.7426, time 121.79ms
iter 162070: loss 6.5747, time 121.03ms
iter 162080: loss 6.7643, time 121.83ms
iter 162090: loss 6.4078, time 121.96ms
iter 162100: loss 6.6758, time 121.85ms
iter 162110: loss 7.2293, time 121.34ms
iter 162120: loss 6.6938, time 121.78ms
iter 162130: loss 6.5923, time 121.99ms
iter 162140: loss 6.4455, time 121.63ms
iter 162150: loss 6.7290, time 122.37ms
iter 162160: loss 6.6202, time 121.15ms
iter 162170: loss 6.7339, time 121.80ms
iter 162180: loss 6.5822, time 121.78ms
iter 162190: loss 7.2743, time 121.86ms
iter 162200: loss 6.0163, time 122.01ms
iter 162210: loss 6.7701, time 121.79ms
iter 162220: loss 6.4454, time 121.70ms
iter 162230: loss 6.9022, time 121.78ms
iter 162240: loss 6.7319, time 121.79ms
step 162250: train loss 5.9881, val loss 5.9312
saving checkpoint to out-shakespeare-char
iter 162250: loss 6.4166, time 2879.63ms
iter 162260: loss 7.3679, time 120.14ms
iter 162270: loss 7.2306, time 119.67ms
iter 162280: loss 6.6585, time 119.83ms
iter 162290: loss 7.2527, time 119.69ms
iter 162300: loss 6.2553, time 120.95ms
iter 162310: loss 7.0074, time 119.59ms
iter 162320: loss 7.2831, time 122.32ms
iter 162330: loss 6.5776, time 125.46ms
iter 162340: loss 7.8261, time 125.28ms
iter 162350: loss 6.7468, time 125.65ms
iter 162360: loss 7.1022, time 128.19ms
iter 162370: loss 6.6768, time 125.15ms
iter 162380: loss 6.6674, time 125.35ms
iter 162390: loss 6.4038, time 124.85ms
iter 162400: loss 7.1410, time 125.05ms
iter 162410: loss 6.9158, time 125.42ms
iter 162420: loss 6.7235, time 124.59ms
iter 162430: loss 6.6400, time 128.22ms
iter 162440: loss 7.4233, time 125.40ms
iter 162450: loss 6.9063, time 125.07ms
iter 162460: loss 7.0597, time 124.70ms
iter 162470: loss 6.0576, time 125.31ms
iter 162480: loss 6.9888, time 125.33ms
iter 162490: loss 6.7157, time 125.90ms
step 162500: train loss 5.9802, val loss 6.0066
saving checkpoint to out-shakespeare-char
iter 162500: loss 6.7859, time 2884.89ms
iter 162510: loss 6.6095, time 125.96ms
iter 162520: loss 6.3225, time 129.01ms
iter 162530: loss 7.3988, time 125.33ms
iter 162540: loss 7.4239, time 125.37ms
iter 162550: loss 6.4916, time 125.29ms
iter 162560: loss 7.3824, time 125.45ms
iter 162570: loss 6.2894, time 125.22ms
iter 162580: loss 6.8405, time 125.09ms
iter 162590: loss 6.0111, time 124.32ms
iter 162600: loss 7.0520, time 125.04ms
iter 162610: loss 6.5087, time 125.15ms
iter 162620: loss 6.5454, time 125.27ms
iter 162630: loss 7.0934, time 128.03ms
iter 162640: loss 5.9659, time 125.33ms
iter 162650: loss 7.1983, time 125.05ms
iter 162660: loss 6.3968, time 125.45ms
iter 162670: loss 7.3440, time 125.35ms
iter 162680: loss 6.8457, time 125.25ms
iter 162690: loss 7.1470, time 125.33ms
iter 162700: loss 6.5630, time 125.24ms
iter 162710: loss 6.5982, time 125.14ms
iter 162720: loss 6.9391, time 125.33ms
iter 162730: loss 6.4310, time 125.33ms
iter 162740: loss 6.2945, time 128.70ms
step 162750: train loss 5.9654, val loss 5.9721
saving checkpoint to out-shakespeare-char
iter 162750: loss 6.0902, time 2887.79ms
iter 162760: loss 6.9990, time 126.46ms
iter 162770: loss 6.6114, time 124.87ms
iter 162780: loss 6.7350, time 125.81ms
iter 162790: loss 7.1184, time 125.51ms
iter 162800: loss 6.5894, time 127.78ms
iter 162810: loss 6.7613, time 125.24ms
iter 162820: loss 6.8108, time 124.80ms
iter 162830: loss 6.6187, time 125.09ms
iter 162840: loss 6.4760, time 126.16ms
iter 162850: loss 6.3919, time 125.65ms
iter 162860: loss 6.8184, time 125.63ms
iter 162870: loss 6.3309, time 125.50ms
iter 162880: loss 6.2397, time 124.79ms
iter 162890: loss 7.8242, time 125.49ms
iter 162900: loss 6.7868, time 125.65ms
iter 162910: loss 6.8162, time 128.88ms
iter 162920: loss 7.0972, time 126.39ms
iter 162930: loss 6.6148, time 125.88ms
iter 162940: loss 7.0745, time 125.88ms
iter 162950: loss 6.7016, time 125.43ms
iter 162960: loss 7.1916, time 125.90ms
iter 162970: loss 7.1554, time 125.58ms
iter 162980: loss 6.8111, time 125.54ms
iter 162990: loss 7.3097, time 125.40ms
step 163000: train loss 5.9826, val loss 5.9955
saving checkpoint to out-shakespeare-char
iter 163000: loss 6.4244, time 2899.06ms
iter 163010: loss 6.7869, time 121.62ms
iter 163020: loss 7.4859, time 122.73ms
iter 163030: loss 6.3147, time 121.70ms
iter 163040: loss 6.0904, time 122.92ms
iter 163050: loss 6.5811, time 121.75ms
iter 163060: loss 7.1004, time 122.73ms
iter 163070: loss 6.5301, time 121.44ms
iter 163080: loss 6.1436, time 123.41ms
iter 163090: loss 6.6630, time 121.11ms
iter 163100: loss 6.8632, time 123.19ms
iter 163110: loss 7.0553, time 121.37ms
iter 163120: loss 6.9546, time 122.38ms
iter 163130: loss 5.3057, time 121.59ms
iter 163140: loss 7.3287, time 122.81ms
iter 163150: loss 6.8060, time 121.46ms
iter 163160: loss 6.9546, time 122.73ms
iter 163170: loss 6.8122, time 121.36ms
iter 163180: loss 6.3467, time 122.73ms
iter 163190: loss 7.2708, time 121.85ms
iter 163200: loss 6.6763, time 122.60ms
iter 163210: loss 6.5592, time 121.42ms
iter 163220: loss 6.7921, time 122.68ms
iter 163230: loss 6.2514, time 121.21ms
iter 163240: loss 6.5725, time 122.77ms
step 163250: train loss 6.0462, val loss 5.9796
saving checkpoint to out-shakespeare-char
iter 163250: loss 7.0028, time 2902.50ms
iter 163260: loss 6.0004, time 121.52ms
iter 163270: loss 7.4435, time 123.00ms
iter 163280: loss 6.6738, time 121.26ms
iter 163290: loss 6.5393, time 121.17ms
iter 163300: loss 6.1557, time 121.53ms
iter 163310: loss 7.3217, time 121.27ms
iter 163320: loss 6.3450, time 121.75ms
iter 163330: loss 6.2142, time 121.23ms
iter 163340: loss 6.2183, time 121.46ms
iter 163350: loss 6.9509, time 121.55ms
iter 163360: loss 6.6489, time 121.50ms
iter 163370: loss 6.6330, time 121.61ms
iter 163380: loss 6.4411, time 121.75ms
iter 163390: loss 7.0271, time 121.49ms
iter 163400: loss 6.9685, time 121.37ms
iter 163410: loss 6.0596, time 121.31ms
iter 163420: loss 6.3660, time 121.65ms
iter 163430: loss 7.3267, time 121.88ms
iter 163440: loss 6.6836, time 122.72ms
iter 163450: loss 6.8531, time 121.17ms
iter 163460: loss 6.6528, time 121.40ms
iter 163470: loss 6.1174, time 121.22ms
iter 163480: loss 6.6092, time 121.44ms
iter 163490: loss 6.5009, time 122.56ms
step 163500: train loss 6.0162, val loss 5.9838
saving checkpoint to out-shakespeare-char
iter 163500: loss 6.9097, time 2897.20ms
iter 163510: loss 6.2121, time 125.08ms
iter 163520: loss 6.9157, time 125.30ms
iter 163530: loss 7.0606, time 125.02ms
iter 163540: loss 7.0521, time 124.89ms
iter 163550: loss 6.4474, time 124.93ms
iter 163560: loss 6.9983, time 124.94ms
iter 163570: loss 6.5466, time 125.54ms
iter 163580: loss 6.3362, time 125.68ms
iter 163590: loss 6.7762, time 125.36ms
iter 163600: loss 7.2254, time 128.47ms
iter 163610: loss 6.3583, time 125.59ms
iter 163620: loss 7.2683, time 125.96ms
iter 163630: loss 6.6664, time 125.56ms
iter 163640: loss 6.3306, time 125.58ms
iter 163650: loss 7.1594, time 125.92ms
iter 163660: loss 7.2311, time 125.70ms
iter 163670: loss 6.1761, time 125.59ms
iter 163680: loss 7.3920, time 125.61ms
iter 163690: loss 7.4509, time 125.83ms
iter 163700: loss 6.9410, time 125.82ms
iter 163710: loss 6.6853, time 125.53ms
iter 163720: loss 6.4152, time 125.58ms
iter 163730: loss 6.8967, time 125.52ms
iter 163740: loss 6.7044, time 125.69ms
step 163750: train loss 5.9991, val loss 5.9639
saving checkpoint to out-shakespeare-char
iter 163750: loss 6.8215, time 2878.04ms
iter 163760: loss 7.3253, time 120.73ms
iter 163770: loss 6.7116, time 121.76ms
iter 163780: loss 6.9346, time 121.49ms
iter 163790: loss 6.8571, time 121.57ms
iter 163800: loss 7.3063, time 121.70ms
iter 163810: loss 6.2416, time 120.61ms
iter 163820: loss 7.3081, time 121.60ms
iter 163830: loss 5.8307, time 121.56ms
iter 163840: loss 6.1596, time 121.39ms
iter 163850: loss 6.7020, time 120.99ms
iter 163860: loss 6.8469, time 121.72ms
iter 163870: loss 6.3579, time 121.65ms
iter 163880: loss 6.7284, time 121.50ms
iter 163890: loss 6.7306, time 121.77ms
iter 163900: loss 6.1389, time 122.03ms
iter 163910: loss 6.4585, time 121.63ms
iter 163920: loss 7.2088, time 121.61ms
iter 163930: loss 7.2515, time 121.72ms
iter 163940: loss 7.0723, time 120.72ms
iter 163950: loss 6.8052, time 121.67ms
iter 163960: loss 7.3624, time 121.74ms
iter 163970: loss 6.5804, time 121.62ms
iter 163980: loss 7.6086, time 121.57ms
iter 163990: loss 6.9820, time 120.99ms
step 164000: train loss 5.9517, val loss 5.9653
saving checkpoint to out-shakespeare-char
iter 164000: loss 5.7330, time 2897.02ms
iter 164010: loss 5.7936, time 121.97ms
iter 164020: loss 6.9774, time 121.44ms
iter 164030: loss 6.6979, time 121.70ms
iter 164040: loss 7.6542, time 120.83ms
iter 164050: loss 6.0025, time 121.71ms
iter 164060: loss 6.7133, time 121.66ms
iter 164070: loss 7.4645, time 122.17ms
iter 164080: loss 7.2788, time 121.45ms
iter 164090: loss 7.0530, time 121.81ms
iter 164100: loss 6.4052, time 121.62ms
iter 164110: loss 6.4708, time 121.74ms
iter 164120: loss 7.2179, time 121.54ms
iter 164130: loss 6.7748, time 121.10ms
iter 164140: loss 6.5640, time 121.76ms
iter 164150: loss 6.3711, time 121.84ms
iter 164160: loss 6.3125, time 121.45ms
iter 164170: loss 6.4491, time 121.70ms
iter 164180: loss 6.1422, time 121.76ms
iter 164190: loss 6.3688, time 121.63ms
iter 164200: loss 6.8847, time 121.80ms
iter 164210: loss 6.3271, time 121.60ms
iter 164220: loss 6.7502, time 121.83ms
iter 164230: loss 7.1237, time 121.84ms
iter 164240: loss 7.1536, time 121.59ms
step 164250: train loss 5.9993, val loss 5.9314
saving checkpoint to out-shakespeare-char
iter 164250: loss 6.4086, time 2885.61ms
iter 164260: loss 6.5776, time 122.01ms
iter 164270: loss 6.8217, time 122.08ms
iter 164280: loss 7.3300, time 121.71ms
iter 164290: loss 6.3932, time 121.45ms
iter 164300: loss 6.6450, time 121.70ms
iter 164310: loss 6.8252, time 121.68ms
iter 164320: loss 6.5700, time 121.43ms
iter 164330: loss 6.5488, time 121.72ms
iter 164340: loss 6.1409, time 121.77ms
iter 164350: loss 7.3752, time 121.70ms
iter 164360: loss 6.3908, time 121.65ms
iter 164370: loss 6.8132, time 121.28ms
iter 164380: loss 6.2069, time 121.71ms
iter 164390: loss 7.1896, time 121.67ms
iter 164400: loss 5.3911, time 121.34ms
iter 164410: loss 6.5615, time 121.50ms
iter 164420: loss 6.8564, time 120.10ms
iter 164430: loss 6.9270, time 121.64ms
iter 164440: loss 6.9687, time 121.63ms
iter 164450: loss 7.2073, time 121.69ms
iter 164460: loss 6.5057, time 121.58ms
iter 164470: loss 5.7224, time 121.79ms
iter 164480: loss 6.8758, time 121.76ms
iter 164490: loss 7.1457, time 122.40ms
step 164500: train loss 5.9657, val loss 5.9423
saving checkpoint to out-shakespeare-char
iter 164500: loss 5.8062, time 2889.54ms
iter 164510: loss 6.6721, time 120.37ms
iter 164520: loss 6.6821, time 121.78ms
iter 164530: loss 6.8504, time 120.75ms
iter 164540: loss 6.2837, time 121.19ms
iter 164550: loss 6.4574, time 121.68ms
iter 164560: loss 6.2490, time 120.28ms
iter 164570: loss 6.9851, time 121.19ms
iter 164580: loss 7.2202, time 121.37ms
iter 164590: loss 6.3100, time 121.65ms
iter 164600: loss 7.1311, time 121.96ms
iter 164610: loss 7.3313, time 121.58ms
iter 164620: loss 6.8671, time 121.63ms
iter 164630: loss 6.5515, time 121.80ms
iter 164640: loss 6.6494, time 121.87ms
iter 164650: loss 6.5438, time 121.48ms
iter 164660: loss 6.4969, time 122.05ms
iter 164670: loss 6.8400, time 121.95ms
iter 164680: loss 6.3496, time 121.81ms
iter 164690: loss 6.8851, time 122.21ms
iter 164700: loss 7.6462, time 122.29ms
iter 164710: loss 6.2786, time 121.58ms
iter 164720: loss 5.5856, time 120.75ms
iter 164730: loss 6.5198, time 121.58ms
iter 164740: loss 6.5549, time 122.21ms
step 164750: train loss 5.9494, val loss 5.9958
saving checkpoint to out-shakespeare-char
iter 164750: loss 6.6569, time 2893.99ms
iter 164760: loss 6.4523, time 125.30ms
iter 164770: loss 6.8192, time 125.92ms
iter 164780: loss 6.1311, time 124.39ms
iter 164790: loss 6.0894, time 125.12ms
iter 164800: loss 6.5205, time 126.19ms
iter 164810: loss 6.9153, time 128.54ms
iter 164820: loss 6.6289, time 125.35ms
iter 164830: loss 6.6303, time 125.39ms
iter 164840: loss 6.6719, time 125.68ms
iter 164850: loss 6.7338, time 125.10ms
iter 164860: loss 6.7992, time 125.28ms
iter 164870: loss 6.3959, time 125.13ms
iter 164880: loss 6.8379, time 125.90ms
iter 164890: loss 5.9537, time 125.51ms
iter 164900: loss 6.1816, time 125.37ms
iter 164910: loss 6.7507, time 126.08ms
iter 164920: loss 6.9835, time 129.08ms
iter 164930: loss 7.0297, time 125.66ms
iter 164940: loss 6.5275, time 125.33ms
iter 164950: loss 6.1436, time 127.32ms
iter 164960: loss 5.9156, time 125.58ms
iter 164970: loss 7.3505, time 127.06ms
iter 164980: loss 6.6694, time 125.90ms
iter 164990: loss 6.7291, time 125.92ms
step 165000: train loss 5.9411, val loss 6.0336
saving checkpoint to out-shakespeare-char
iter 165000: loss 7.6474, time 2899.90ms
iter 165010: loss 6.3998, time 125.70ms
iter 165020: loss 6.7954, time 125.70ms
iter 165030: loss 6.9912, time 125.83ms
iter 165040: loss 6.8434, time 125.36ms
iter 165050: loss 6.8210, time 125.57ms
iter 165060: loss 6.9911, time 125.59ms
iter 165070: loss 7.0553, time 127.43ms
iter 165080: loss 5.5733, time 124.51ms
iter 165090: loss 6.6353, time 124.78ms
iter 165100: loss 7.1441, time 126.04ms
iter 165110: loss 7.0147, time 125.52ms
iter 165120: loss 7.5409, time 125.34ms
iter 165130: loss 5.6064, time 125.51ms
iter 165140: loss 7.0586, time 125.31ms
iter 165150: loss 6.5388, time 125.39ms
iter 165160: loss 6.4388, time 125.58ms
iter 165170: loss 6.4248, time 124.96ms
iter 165180: loss 7.3041, time 128.22ms
iter 165190: loss 6.4280, time 125.81ms
iter 165200: loss 6.8360, time 125.61ms
iter 165210: loss 7.0663, time 126.41ms
iter 165220: loss 6.9399, time 125.16ms
iter 165230: loss 6.6729, time 125.17ms
iter 165240: loss 6.3900, time 121.77ms
step 165250: train loss 5.9389, val loss 5.9764
saving checkpoint to out-shakespeare-char
iter 165250: loss 5.9776, time 2880.00ms
iter 165260: loss 6.8256, time 121.79ms
iter 165270: loss 6.6461, time 121.71ms
iter 165280: loss 7.4867, time 121.37ms
iter 165290: loss 7.2431, time 121.74ms
iter 165300: loss 6.9112, time 121.54ms
iter 165310: loss 6.5699, time 121.07ms
iter 165320: loss 6.5360, time 121.80ms
iter 165330: loss 6.5216, time 121.69ms
iter 165340: loss 6.8043, time 121.72ms
iter 165350: loss 6.4832, time 121.66ms
iter 165360: loss 7.1468, time 122.07ms
iter 165370: loss 7.1373, time 122.39ms
iter 165380: loss 6.5096, time 121.74ms
iter 165390: loss 7.5066, time 121.59ms
iter 165400: loss 7.0957, time 121.85ms
iter 165410: loss 6.0193, time 122.20ms
iter 165420: loss 6.2539, time 121.95ms
iter 165430: loss 7.2047, time 121.87ms
iter 165440: loss 7.0337, time 121.85ms
iter 165450: loss 6.9144, time 121.73ms
iter 165460: loss 6.6135, time 121.82ms
iter 165470: loss 6.6347, time 121.60ms
iter 165480: loss 6.4615, time 122.14ms
iter 165490: loss 7.0046, time 121.97ms
step 165500: train loss 5.9409, val loss 5.9621
saving checkpoint to out-shakespeare-char
iter 165500: loss 6.1591, time 2899.48ms
iter 165510: loss 6.9899, time 125.92ms
iter 165520: loss 6.9160, time 125.98ms
iter 165530: loss 7.4056, time 128.77ms
iter 165540: loss 6.4362, time 125.35ms
iter 165550: loss 6.2906, time 125.21ms
iter 165560: loss 7.1900, time 125.96ms
iter 165570: loss 6.4944, time 128.73ms
iter 165580: loss 7.0105, time 126.12ms
iter 165590: loss 6.2807, time 127.33ms
iter 165600: loss 6.5778, time 126.15ms
iter 165610: loss 6.5681, time 127.52ms
iter 165620: loss 6.9697, time 125.99ms
iter 165630: loss 6.6429, time 125.94ms
iter 165640: loss 6.4435, time 127.01ms
iter 165650: loss 6.3256, time 126.12ms
iter 165660: loss 6.9503, time 126.26ms
iter 165670: loss 6.4114, time 126.83ms
iter 165680: loss 6.7063, time 121.96ms
iter 165690: loss 6.3289, time 121.34ms
iter 165700: loss 7.0900, time 121.83ms
iter 165710: loss 6.7116, time 121.84ms
iter 165720: loss 6.4885, time 122.03ms
iter 165730: loss 6.6104, time 121.76ms
iter 165740: loss 7.5596, time 122.55ms
step 165750: train loss 6.0099, val loss 5.9878
saving checkpoint to out-shakespeare-char
iter 165750: loss 6.6696, time 2909.38ms
iter 165760: loss 7.0725, time 121.90ms
iter 165770: loss 6.4568, time 121.67ms
iter 165780: loss 7.6753, time 121.93ms
iter 165790: loss 6.8139, time 122.04ms
iter 165800: loss 6.0223, time 122.23ms
iter 165810: loss 6.0342, time 121.59ms
iter 165820: loss 6.9445, time 121.48ms
iter 165830: loss 7.0050, time 122.42ms
iter 165840: loss 5.6045, time 122.35ms
iter 165850: loss 7.1667, time 122.11ms
iter 165860: loss 5.9175, time 121.91ms
iter 165870: loss 6.2630, time 122.23ms
iter 165880: loss 6.0872, time 121.43ms
iter 165890: loss 6.1657, time 122.12ms
iter 165900: loss 6.6926, time 122.03ms
iter 165910: loss 7.0994, time 121.73ms
iter 165920: loss 6.5351, time 122.58ms
iter 165930: loss 6.2515, time 122.16ms
iter 165940: loss 6.5053, time 122.18ms
iter 165950: loss 6.5077, time 121.85ms
iter 165960: loss 7.1845, time 122.38ms
iter 165970: loss 6.5385, time 122.16ms
iter 165980: loss 7.3313, time 122.57ms
iter 165990: loss 7.0987, time 121.08ms
step 166000: train loss 5.9924, val loss 5.9997
saving checkpoint to out-shakespeare-char
iter 166000: loss 7.6594, time 2918.04ms
iter 166010: loss 6.5937, time 122.33ms
iter 166020: loss 6.2134, time 123.76ms
iter 166030: loss 6.8741, time 121.94ms
iter 166040: loss 6.4944, time 122.23ms
iter 166050: loss 7.0938, time 121.51ms
iter 166060: loss 6.9436, time 123.50ms
iter 166070: loss 8.1609, time 121.57ms
iter 166080: loss 6.7112, time 122.03ms
iter 166090: loss 7.9357, time 121.82ms
iter 166100: loss 7.0875, time 122.44ms
iter 166110: loss 6.6371, time 121.43ms
iter 166120: loss 7.4922, time 122.86ms
iter 166130: loss 6.4686, time 121.68ms
iter 166140: loss 6.7779, time 122.88ms
iter 166150: loss 6.6098, time 121.73ms
iter 166160: loss 6.5184, time 122.71ms
iter 166170: loss 6.4999, time 121.67ms
iter 166180: loss 6.6875, time 122.79ms
iter 166190: loss 6.9004, time 121.55ms
iter 166200: loss 6.5446, time 122.71ms
iter 166210: loss 6.7648, time 121.52ms
iter 166220: loss 6.3865, time 123.42ms
iter 166230: loss 6.8760, time 121.47ms
iter 166240: loss 6.2813, time 122.73ms
step 166250: train loss 5.9805, val loss 5.9682
saving checkpoint to out-shakespeare-char
iter 166250: loss 6.7257, time 2902.29ms
iter 166260: loss 6.8743, time 122.28ms
iter 166270: loss 7.3034, time 121.76ms
iter 166280: loss 7.0829, time 121.76ms
iter 166290: loss 6.2740, time 121.81ms
iter 166300: loss 6.7262, time 122.02ms
iter 166310: loss 6.2444, time 122.02ms
iter 166320: loss 6.5776, time 122.23ms
iter 166330: loss 6.8173, time 121.51ms
iter 166340: loss 6.5529, time 122.00ms
iter 166350: loss 7.2424, time 121.72ms
iter 166360: loss 6.4934, time 121.44ms
iter 166370: loss 6.9035, time 121.73ms
iter 166380: loss 6.8201, time 121.57ms
iter 166390: loss 6.8781, time 121.82ms
iter 166400: loss 6.2770, time 121.76ms
iter 166410: loss 5.8391, time 121.97ms
iter 166420: loss 6.7365, time 121.76ms
iter 166430: loss 7.0813, time 121.75ms
iter 166440: loss 6.4315, time 121.45ms
iter 166450: loss 6.5166, time 121.79ms
iter 166460: loss 6.8255, time 120.98ms
iter 166470: loss 6.0028, time 121.71ms
iter 166480: loss 6.9391, time 128.08ms
iter 166490: loss 6.7703, time 125.23ms
step 166500: train loss 5.9239, val loss 6.0045
saving checkpoint to out-shakespeare-char
iter 166500: loss 6.9699, time 2900.21ms
iter 166510: loss 7.0793, time 124.48ms
iter 166520: loss 6.5758, time 126.05ms
iter 166530: loss 7.0241, time 125.15ms
iter 166540: loss 7.3474, time 125.08ms
iter 166550: loss 6.8546, time 124.34ms
iter 166560: loss 7.1394, time 124.57ms
iter 166570: loss 6.0569, time 125.14ms
iter 166580: loss 7.0228, time 128.99ms
iter 166590: loss 7.1009, time 126.12ms
iter 166600: loss 6.1987, time 125.05ms
iter 166610: loss 6.4980, time 126.59ms
iter 166620: loss 6.6371, time 126.11ms
iter 166630: loss 6.9177, time 126.24ms
iter 166640: loss 6.9461, time 125.92ms
iter 166650: loss 6.9603, time 125.80ms
iter 166660: loss 6.5284, time 125.86ms
iter 166670: loss 5.7005, time 126.27ms
iter 166680: loss 6.7035, time 125.55ms
iter 166690: loss 6.6160, time 127.58ms
iter 166700: loss 6.6896, time 125.30ms
iter 166710: loss 6.7590, time 125.60ms
iter 166720: loss 6.6970, time 125.32ms
iter 166730: loss 7.0621, time 125.38ms
iter 166740: loss 6.6176, time 126.04ms
step 166750: train loss 5.9807, val loss 5.9209
saving checkpoint to out-shakespeare-char
iter 166750: loss 6.6040, time 2904.12ms
iter 166760: loss 7.5011, time 125.25ms
iter 166770: loss 7.3828, time 124.70ms
iter 166780: loss 7.1675, time 124.73ms
iter 166790: loss 7.0113, time 124.25ms
iter 166800: loss 6.9108, time 124.72ms
iter 166810: loss 7.1859, time 125.79ms
iter 166820: loss 6.4754, time 124.84ms
iter 166830: loss 6.6675, time 124.66ms
iter 166840: loss 6.4466, time 125.49ms
iter 166850: loss 5.9537, time 125.35ms
iter 166860: loss 6.2508, time 125.30ms
iter 166870: loss 6.3078, time 125.46ms
iter 166880: loss 7.2005, time 125.95ms
iter 166890: loss 7.1153, time 128.11ms
iter 166900: loss 7.1336, time 125.47ms
iter 166910: loss 7.0245, time 124.62ms
iter 166920: loss 6.4589, time 125.66ms
iter 166930: loss 6.7235, time 125.35ms
iter 166940: loss 6.5411, time 125.47ms
iter 166950: loss 7.2365, time 125.38ms
iter 166960: loss 7.1925, time 124.54ms
iter 166970: loss 7.2674, time 125.36ms
iter 166980: loss 6.2902, time 125.04ms
iter 166990: loss 6.3183, time 125.53ms
step 167000: train loss 5.9558, val loss 5.9686
saving checkpoint to out-shakespeare-char
iter 167000: loss 7.2189, time 2863.05ms
iter 167010: loss 5.8759, time 125.11ms
iter 167020: loss 7.3898, time 125.28ms
iter 167030: loss 7.0698, time 124.78ms
iter 167040: loss 6.3848, time 124.89ms
iter 167050: loss 6.6232, time 125.31ms
iter 167060: loss 6.8714, time 127.89ms
iter 167070: loss 6.9421, time 125.02ms
iter 167080: loss 6.7052, time 125.44ms
iter 167090: loss 7.0278, time 125.02ms
iter 167100: loss 7.3540, time 124.92ms
iter 167110: loss 6.2292, time 125.07ms
iter 167120: loss 6.3596, time 125.59ms
iter 167130: loss 7.1483, time 125.30ms
iter 167140: loss 6.0781, time 125.49ms
iter 167150: loss 6.8666, time 125.40ms
iter 167160: loss 7.1969, time 125.47ms
iter 167170: loss 6.7531, time 128.32ms
iter 167180: loss 7.5732, time 125.34ms
iter 167190: loss 7.0327, time 125.33ms
iter 167200: loss 6.3242, time 125.77ms
iter 167210: loss 6.2213, time 125.60ms
iter 167220: loss 6.7028, time 125.57ms
iter 167230: loss 6.3142, time 125.49ms
iter 167240: loss 7.5357, time 125.55ms
step 167250: train loss 6.0387, val loss 5.9427
saving checkpoint to out-shakespeare-char
iter 167250: loss 6.2415, time 2889.79ms
iter 167260: loss 6.8699, time 125.55ms
iter 167270: loss 6.5635, time 125.48ms
iter 167280: loss 6.6778, time 127.25ms
iter 167290: loss 6.8709, time 125.47ms
iter 167300: loss 6.8628, time 125.06ms
iter 167310: loss 6.3783, time 124.40ms
iter 167320: loss 6.8530, time 125.37ms
iter 167330: loss 6.6090, time 125.41ms
iter 167340: loss 6.8098, time 125.51ms
iter 167350: loss 6.5680, time 125.57ms
iter 167360: loss 6.7945, time 125.43ms
iter 167370: loss 6.6189, time 125.43ms
iter 167380: loss 7.0074, time 128.19ms
iter 167390: loss 6.4523, time 125.28ms
iter 167400: loss 6.3928, time 125.35ms
iter 167410: loss 6.3647, time 125.45ms
iter 167420: loss 6.7968, time 125.03ms
iter 167430: loss 6.1145, time 124.38ms
iter 167440: loss 7.1064, time 125.48ms
iter 167450: loss 7.0757, time 125.04ms
iter 167460: loss 6.2097, time 125.44ms
iter 167470: loss 6.8310, time 124.57ms
iter 167480: loss 6.4283, time 125.50ms
iter 167490: loss 6.4800, time 128.22ms
step 167500: train loss 5.9354, val loss 5.9497
saving checkpoint to out-shakespeare-char
iter 167500: loss 6.7464, time 2904.09ms
iter 167510: loss 6.5356, time 124.46ms
iter 167520: loss 7.0032, time 125.30ms
iter 167530: loss 6.9269, time 125.20ms
iter 167540: loss 7.0445, time 125.30ms
iter 167550: loss 7.0268, time 128.05ms
iter 167560: loss 7.1085, time 124.82ms
iter 167570: loss 7.0469, time 125.00ms
iter 167580: loss 6.7644, time 125.09ms
iter 167590: loss 6.4933, time 125.36ms
iter 167600: loss 6.6220, time 125.14ms
iter 167610: loss 7.0266, time 125.12ms
iter 167620: loss 6.6534, time 125.24ms
iter 167630: loss 6.7814, time 125.24ms
iter 167640: loss 6.7929, time 125.05ms
iter 167650: loss 6.8193, time 125.02ms
iter 167660: loss 7.1509, time 124.98ms
iter 167670: loss 6.9097, time 126.33ms
iter 167680: loss 7.0698, time 125.53ms
iter 167690: loss 7.2990, time 125.13ms
iter 167700: loss 6.7416, time 125.29ms
iter 167710: loss 6.9441, time 125.59ms
iter 167720: loss 7.1976, time 125.68ms
iter 167730: loss 7.4474, time 125.07ms
iter 167740: loss 7.0885, time 127.12ms
step 167750: train loss 5.9133, val loss 5.9483
saving checkpoint to out-shakespeare-char
iter 167750: loss 6.1605, time 2877.85ms
iter 167760: loss 6.7319, time 125.30ms
iter 167770: loss 5.9636, time 125.43ms
iter 167780: loss 7.0631, time 125.68ms
iter 167790: loss 6.6160, time 124.60ms
iter 167800: loss 6.6809, time 125.46ms
iter 167810: loss 6.8376, time 124.95ms
iter 167820: loss 6.4260, time 125.04ms
iter 167830: loss 5.6320, time 124.99ms
iter 167840: loss 6.0619, time 128.14ms
iter 167850: loss 7.4125, time 125.09ms
iter 167860: loss 6.9023, time 125.21ms
iter 167870: loss 6.6041, time 125.25ms
iter 167880: loss 5.9974, time 124.92ms
iter 167890: loss 6.5508, time 125.06ms
iter 167900: loss 6.1105, time 125.12ms
iter 167910: loss 7.1068, time 124.93ms
iter 167920: loss 6.6802, time 124.93ms
iter 167930: loss 7.0553, time 126.33ms
iter 167940: loss 7.3753, time 125.65ms
iter 167950: loss 6.5557, time 126.06ms
iter 167960: loss 6.5685, time 125.21ms
iter 167970: loss 6.7781, time 124.99ms
iter 167980: loss 6.4875, time 124.93ms
iter 167990: loss 6.5291, time 124.93ms
step 168000: train loss 6.0314, val loss 5.9463
saving checkpoint to out-shakespeare-char
iter 168000: loss 6.8151, time 2878.39ms
iter 168010: loss 5.8975, time 128.00ms
iter 168020: loss 5.7168, time 125.31ms
iter 168030: loss 6.7448, time 124.77ms
iter 168040: loss 6.6910, time 125.58ms
iter 168050: loss 6.7510, time 121.30ms
iter 168060: loss 6.8412, time 121.42ms
iter 168070: loss 6.0079, time 120.95ms
iter 168080: loss 7.1566, time 122.92ms
iter 168090: loss 6.5650, time 121.11ms
iter 168100: loss 6.0908, time 120.94ms
iter 168110: loss 6.7871, time 121.06ms
iter 168120: loss 6.9358, time 121.20ms
iter 168130: loss 6.7395, time 121.96ms
iter 168140: loss 6.3855, time 121.81ms
iter 168150: loss 7.0150, time 121.52ms
iter 168160: loss 6.8709, time 121.60ms
iter 168170: loss 6.3719, time 121.45ms
iter 168180: loss 6.5059, time 122.08ms
iter 168190: loss 6.4977, time 122.00ms
iter 168200: loss 6.9651, time 121.59ms
iter 168210: loss 6.5981, time 121.69ms
iter 168220: loss 6.8481, time 121.78ms
iter 168230: loss 6.4818, time 122.04ms
iter 168240: loss 6.3719, time 121.50ms
step 168250: train loss 5.9687, val loss 5.9516
saving checkpoint to out-shakespeare-char
iter 168250: loss 6.4515, time 2908.82ms
iter 168260: loss 6.9768, time 124.60ms
iter 168270: loss 6.6917, time 121.55ms
iter 168280: loss 6.7080, time 124.56ms
iter 168290: loss 6.2207, time 121.53ms
iter 168300: loss 6.2142, time 124.65ms
iter 168310: loss 6.1426, time 121.60ms
iter 168320: loss 6.4976, time 124.64ms
iter 168330: loss 7.1210, time 121.45ms
iter 168340: loss 5.9389, time 124.56ms
iter 168350: loss 6.7180, time 121.52ms
iter 168360: loss 6.5749, time 125.15ms
iter 168370: loss 6.5712, time 121.66ms
iter 168380: loss 6.4151, time 123.86ms
iter 168390: loss 6.1289, time 121.62ms
iter 168400: loss 6.6059, time 124.81ms
iter 168410: loss 6.3755, time 121.60ms
iter 168420: loss 7.4448, time 124.60ms
iter 168430: loss 7.7496, time 121.53ms
iter 168440: loss 6.6395, time 124.34ms
iter 168450: loss 6.3020, time 121.77ms
iter 168460: loss 6.7068, time 123.67ms
iter 168470: loss 7.0390, time 122.11ms
iter 168480: loss 7.0334, time 124.86ms
iter 168490: loss 6.6511, time 121.12ms
step 168500: train loss 5.9429, val loss 5.8890
saving checkpoint to out-shakespeare-char
iter 168500: loss 7.2385, time 2897.13ms
iter 168510: loss 6.2936, time 123.08ms
iter 168520: loss 6.4995, time 121.94ms
iter 168530: loss 5.9140, time 123.13ms
iter 168540: loss 6.5455, time 121.32ms
iter 168550: loss 6.5781, time 123.23ms
iter 168560: loss 6.6798, time 121.44ms
iter 168570: loss 6.7504, time 122.86ms
iter 168580: loss 6.4733, time 121.91ms
iter 168590: loss 6.2821, time 123.06ms
iter 168600: loss 7.2124, time 121.77ms
iter 168610: loss 6.9007, time 123.26ms
iter 168620: loss 7.3017, time 121.69ms
iter 168630: loss 6.9516, time 122.90ms
iter 168640: loss 6.2176, time 125.09ms
iter 168650: loss 5.7852, time 125.05ms
iter 168660: loss 7.2843, time 125.58ms
iter 168670: loss 6.4212, time 125.60ms
iter 168680: loss 6.3374, time 125.68ms
iter 168690: loss 6.0840, time 125.35ms
iter 168700: loss 6.7928, time 125.21ms
iter 168710: loss 6.1133, time 125.15ms
iter 168720: loss 6.9756, time 125.44ms
iter 168730: loss 6.7310, time 125.63ms
iter 168740: loss 6.2733, time 127.93ms
step 168750: train loss 6.0129, val loss 5.9440
saving checkpoint to out-shakespeare-char
iter 168750: loss 6.8870, time 2885.15ms
iter 168760: loss 7.2156, time 125.26ms
iter 168770: loss 7.5067, time 126.06ms
iter 168780: loss 7.1884, time 125.19ms
iter 168790: loss 6.5868, time 125.35ms
iter 168800: loss 6.1048, time 124.92ms
iter 168810: loss 6.7806, time 125.15ms
iter 168820: loss 7.5019, time 125.65ms
iter 168830: loss 6.5550, time 125.01ms
iter 168840: loss 6.5656, time 128.32ms
iter 168850: loss 6.7024, time 125.26ms
iter 168860: loss 6.8509, time 125.64ms
iter 168870: loss 6.1018, time 125.19ms
iter 168880: loss 6.6565, time 125.70ms
iter 168890: loss 7.1238, time 125.29ms
iter 168900: loss 6.6194, time 125.54ms
iter 168910: loss 6.7070, time 128.15ms
iter 168920: loss 6.4457, time 125.48ms
iter 168930: loss 6.3643, time 125.37ms
iter 168940: loss 6.9720, time 125.28ms
iter 168950: loss 6.6378, time 121.27ms
iter 168960: loss 6.3989, time 121.49ms
iter 168970: loss 7.4448, time 121.63ms
iter 168980: loss 6.9977, time 121.99ms
iter 168990: loss 6.4615, time 121.95ms
step 169000: train loss 5.9253, val loss 5.9755
saving checkpoint to out-shakespeare-char
iter 169000: loss 7.2611, time 2907.36ms
iter 169010: loss 7.4903, time 121.69ms
iter 169020: loss 6.8204, time 121.91ms
iter 169030: loss 7.6634, time 121.58ms
iter 169040: loss 6.8777, time 121.33ms
iter 169050: loss 6.5644, time 120.76ms
iter 169060: loss 6.8200, time 121.71ms
iter 169070: loss 7.0922, time 121.46ms
iter 169080: loss 7.1584, time 121.79ms
iter 169090: loss 7.4671, time 121.82ms
iter 169100: loss 6.7888, time 121.29ms
iter 169110: loss 6.5189, time 121.30ms
iter 169120: loss 6.8212, time 121.71ms
iter 169130: loss 6.6091, time 121.22ms
iter 169140: loss 6.6618, time 121.56ms
iter 169150: loss 6.0184, time 121.50ms
iter 169160: loss 5.7375, time 122.07ms
iter 169170: loss 7.1207, time 122.08ms
iter 169180: loss 6.5183, time 121.72ms
iter 169190: loss 7.2509, time 121.76ms
iter 169200: loss 7.0859, time 121.04ms
iter 169210: loss 6.5703, time 121.45ms
iter 169220: loss 6.8578, time 121.68ms
iter 169230: loss 6.3752, time 121.26ms
iter 169240: loss 6.0317, time 121.50ms
step 169250: train loss 5.9721, val loss 5.9505
saving checkpoint to out-shakespeare-char
iter 169250: loss 7.1391, time 2906.62ms
iter 169260: loss 6.2484, time 120.74ms
iter 169270: loss 6.4440, time 123.04ms
iter 169280: loss 6.9143, time 121.67ms
iter 169290: loss 7.2123, time 123.06ms
iter 169300: loss 6.5675, time 121.85ms
iter 169310: loss 6.3782, time 123.17ms
iter 169320: loss 7.7523, time 120.90ms
iter 169330: loss 6.8806, time 122.96ms
iter 169340: loss 7.1551, time 121.49ms
iter 169350: loss 7.0824, time 123.13ms
iter 169360: loss 6.6003, time 121.80ms
iter 169370: loss 5.9929, time 122.79ms
iter 169380: loss 6.8362, time 121.69ms
iter 169390: loss 6.8006, time 123.25ms
iter 169400: loss 7.3780, time 121.59ms
iter 169410: loss 6.4180, time 122.95ms
iter 169420: loss 7.0163, time 121.57ms
iter 169430: loss 5.9961, time 122.92ms
iter 169440: loss 6.2065, time 121.86ms
iter 169450: loss 6.3708, time 123.28ms
iter 169460: loss 6.1511, time 121.64ms
iter 169470: loss 7.4801, time 123.25ms
iter 169480: loss 7.0726, time 121.56ms
iter 169490: loss 6.1426, time 123.16ms
step 169500: train loss 5.9810, val loss 5.9695
saving checkpoint to out-shakespeare-char
iter 169500: loss 6.3894, time 2896.54ms
iter 169510: loss 6.6259, time 121.01ms
iter 169520: loss 5.9953, time 121.93ms
iter 169530: loss 7.0290, time 121.87ms
iter 169540: loss 6.3609, time 121.40ms
iter 169550: loss 6.7503, time 121.78ms
iter 169560: loss 6.5661, time 121.68ms
iter 169570: loss 6.1522, time 121.78ms
iter 169580: loss 6.6474, time 120.91ms
iter 169590: loss 7.0799, time 121.57ms
iter 169600: loss 6.8447, time 121.48ms
iter 169610: loss 6.1039, time 121.58ms
iter 169620: loss 7.4038, time 121.73ms
iter 169630: loss 5.7018, time 121.51ms
iter 169640: loss 7.3787, time 121.84ms
iter 169650: loss 7.6155, time 121.67ms
iter 169660: loss 6.7868, time 120.88ms
iter 169670: loss 6.2987, time 121.56ms
iter 169680: loss 7.2975, time 121.82ms
iter 169690: loss 6.9878, time 121.46ms
iter 169700: loss 6.0678, time 121.80ms
iter 169710: loss 6.4589, time 121.71ms
iter 169720: loss 6.2047, time 121.83ms
iter 169730: loss 7.2207, time 121.57ms
iter 169740: loss 6.7443, time 122.28ms
step 169750: train loss 5.9537, val loss 5.9099
saving checkpoint to out-shakespeare-char
iter 169750: loss 6.7257, time 2898.61ms
iter 169760: loss 6.6327, time 124.48ms
iter 169770: loss 6.8475, time 121.65ms
iter 169780: loss 7.0620, time 124.78ms
iter 169790: loss 7.0455, time 122.12ms
iter 169800: loss 6.3906, time 124.78ms
iter 169810: loss 7.1766, time 121.59ms
iter 169820: loss 5.8764, time 124.88ms
iter 169830: loss 7.1528, time 121.59ms
iter 169840: loss 6.9706, time 124.65ms
iter 169850: loss 6.8849, time 121.71ms
iter 169860: loss 6.2961, time 124.68ms
iter 169870: loss 7.1744, time 121.68ms
iter 169880: loss 6.4609, time 125.07ms
iter 169890: loss 6.8394, time 121.78ms
iter 169900: loss 7.2230, time 124.54ms
iter 169910: loss 6.0835, time 121.49ms
iter 169920: loss 6.0170, time 124.57ms
iter 169930: loss 7.3636, time 121.53ms
iter 169940: loss 6.2505, time 124.71ms
iter 169950: loss 5.9831, time 120.50ms
iter 169960: loss 6.7376, time 124.43ms
iter 169970: loss 6.6213, time 121.66ms
iter 169980: loss 6.5833, time 124.21ms
iter 169990: loss 6.1111, time 122.01ms
step 170000: train loss 5.9604, val loss 6.0264
saving checkpoint to out-shakespeare-char
iter 170000: loss 6.7619, time 2903.99ms
iter 170010: loss 6.0216, time 123.19ms
iter 170020: loss 7.2182, time 122.44ms
iter 170030: loss 7.2991, time 122.54ms
iter 170040: loss 6.6440, time 121.72ms
iter 170050: loss 7.1558, time 123.32ms
iter 170060: loss 6.1086, time 121.58ms
iter 170070: loss 7.3549, time 122.94ms
iter 170080: loss 6.6058, time 121.63ms
iter 170090: loss 6.3548, time 122.89ms
iter 170100: loss 7.2775, time 121.89ms
iter 170110: loss 6.7959, time 123.08ms
iter 170120: loss 6.5511, time 121.60ms
iter 170130: loss 5.6528, time 122.98ms
iter 170140: loss 6.8462, time 121.57ms
iter 170150: loss 6.9314, time 122.86ms
iter 170160: loss 6.4974, time 122.09ms
iter 170170: loss 6.9609, time 122.96ms
iter 170180: loss 6.7841, time 121.60ms
iter 170190: loss 6.9291, time 123.38ms
iter 170200: loss 6.6266, time 121.57ms
iter 170210: loss 6.5402, time 122.88ms
iter 170220: loss 7.1557, time 121.65ms
iter 170230: loss 6.7794, time 122.99ms
iter 170240: loss 6.4332, time 121.51ms
step 170250: train loss 5.9281, val loss 5.9598
saving checkpoint to out-shakespeare-char
iter 170250: loss 6.8476, time 2898.46ms
iter 170260: loss 6.8211, time 121.60ms
iter 170270: loss 7.1697, time 121.61ms
iter 170280: loss 7.2799, time 121.58ms
iter 170290: loss 6.7827, time 121.61ms
iter 170300: loss 6.1300, time 121.51ms
iter 170310: loss 6.8130, time 121.68ms
iter 170320: loss 6.8024, time 121.43ms
iter 170330: loss 7.3284, time 121.79ms
iter 170340: loss 6.7384, time 121.42ms
iter 170350: loss 6.3451, time 121.64ms
iter 170360: loss 6.3807, time 121.46ms
iter 170370: loss 6.8123, time 121.56ms
iter 170380: loss 7.3746, time 121.53ms
iter 170390: loss 6.7793, time 121.56ms
iter 170400: loss 6.5678, time 121.49ms
iter 170410: loss 7.5814, time 121.43ms
iter 170420: loss 6.6016, time 121.40ms
iter 170430: loss 6.5461, time 122.07ms
iter 170440: loss 6.6344, time 121.52ms
iter 170450: loss 6.2546, time 121.66ms
iter 170460: loss 6.7221, time 121.50ms
iter 170470: loss 6.7776, time 121.97ms
iter 170480: loss 6.1417, time 121.71ms
iter 170490: loss 6.6850, time 121.80ms
step 170500: train loss 5.9758, val loss 6.0019
saving checkpoint to out-shakespeare-char
iter 170500: loss 6.8532, time 2907.47ms
iter 170510: loss 6.4079, time 125.33ms
iter 170520: loss 6.0497, time 125.11ms
iter 170530: loss 6.7006, time 125.39ms
iter 170540: loss 6.6148, time 127.94ms
iter 170550: loss 6.5518, time 124.08ms
iter 170560: loss 6.4648, time 125.88ms
iter 170570: loss 6.5754, time 127.98ms
iter 170580: loss 6.1979, time 125.57ms
iter 170590: loss 6.5083, time 125.89ms
iter 170600: loss 6.4426, time 125.97ms
iter 170610: loss 7.0219, time 126.56ms
iter 170620: loss 6.1744, time 126.10ms
iter 170630: loss 7.0096, time 126.29ms
iter 170640: loss 7.0636, time 125.54ms
iter 170650: loss 6.5284, time 125.92ms
iter 170660: loss 6.3511, time 126.02ms
iter 170670: loss 6.6389, time 125.72ms
iter 170680: loss 5.8526, time 125.71ms
iter 170690: loss 5.5538, time 125.86ms
iter 170700: loss 7.3619, time 125.77ms
iter 170710: loss 6.7980, time 125.95ms
iter 170720: loss 6.4624, time 129.03ms
iter 170730: loss 6.5622, time 125.70ms
iter 170740: loss 6.7415, time 125.93ms
step 170750: train loss 5.9606, val loss 5.9619
saving checkpoint to out-shakespeare-char
iter 170750: loss 6.4528, time 2886.34ms
iter 170760: loss 6.7633, time 125.99ms
iter 170770: loss 6.2435, time 125.55ms
iter 170780: loss 6.5134, time 128.29ms
iter 170790: loss 6.3666, time 125.56ms
iter 170800: loss 6.7382, time 124.55ms
iter 170810: loss 7.3391, time 124.76ms
iter 170820: loss 6.4481, time 125.13ms
iter 170830: loss 6.8011, time 125.29ms
iter 170840: loss 5.7221, time 125.37ms
iter 170850: loss 6.4803, time 125.17ms
iter 170860: loss 6.6591, time 125.82ms
iter 170870: loss 6.3629, time 125.52ms
iter 170880: loss 5.9039, time 125.51ms
iter 170890: loss 7.1140, time 125.60ms
iter 170900: loss 7.0438, time 125.55ms
iter 170910: loss 6.4393, time 125.92ms
iter 170920: loss 6.8415, time 128.44ms
iter 170930: loss 6.3970, time 125.26ms
iter 170940: loss 7.0916, time 125.31ms
iter 170950: loss 7.0389, time 125.28ms
iter 170960: loss 6.7599, time 125.47ms
iter 170970: loss 6.9457, time 125.27ms
iter 170980: loss 6.8939, time 125.24ms
iter 170990: loss 6.7833, time 125.10ms
step 171000: train loss 5.9586, val loss 5.9956
saving checkpoint to out-shakespeare-char
iter 171000: loss 6.6260, time 2873.81ms
iter 171010: loss 6.5506, time 125.53ms
iter 171020: loss 7.2276, time 128.45ms
iter 171030: loss 6.1628, time 125.40ms
iter 171040: loss 6.4948, time 125.37ms
iter 171050: loss 7.0406, time 125.36ms
iter 171060: loss 7.0371, time 125.47ms
iter 171070: loss 5.7630, time 125.08ms
iter 171080: loss 5.9207, time 126.14ms
iter 171090: loss 7.0884, time 125.33ms
iter 171100: loss 6.5577, time 125.39ms
iter 171110: loss 6.2972, time 125.25ms
iter 171120: loss 6.9280, time 125.26ms
iter 171130: loss 6.6469, time 125.29ms
iter 171140: loss 7.1747, time 126.05ms
iter 171150: loss 6.5469, time 125.08ms
iter 171160: loss 6.4155, time 125.65ms
iter 171170: loss 5.9360, time 128.36ms
iter 171180: loss 6.2701, time 125.08ms
iter 171190: loss 6.3297, time 125.45ms
iter 171200: loss 6.3486, time 125.45ms
iter 171210: loss 6.9319, time 125.16ms
iter 171220: loss 6.4544, time 125.15ms
iter 171230: loss 6.4116, time 125.71ms
iter 171240: loss 6.4910, time 125.69ms
step 171250: train loss 5.9611, val loss 5.9380
saving checkpoint to out-shakespeare-char
iter 171250: loss 6.4428, time 2905.45ms
iter 171260: loss 6.2663, time 125.71ms
iter 171270: loss 7.0193, time 128.20ms
iter 171280: loss 7.0180, time 125.55ms
iter 171290: loss 6.1310, time 125.14ms
iter 171300: loss 6.7220, time 125.89ms
iter 171310: loss 6.6032, time 126.03ms
iter 171320: loss 6.4856, time 125.90ms
iter 171330: loss 6.6140, time 125.58ms
iter 171340: loss 7.1909, time 125.23ms
iter 171350: loss 5.5250, time 125.65ms
iter 171360: loss 6.9851, time 126.49ms
iter 171370: loss 6.2684, time 125.82ms
iter 171380: loss 6.3903, time 128.70ms
iter 171390: loss 7.0222, time 125.57ms
iter 171400: loss 6.1457, time 125.49ms
iter 171410: loss 7.0897, time 125.95ms
iter 171420: loss 6.7007, time 128.40ms
iter 171430: loss 6.1718, time 124.60ms
iter 171440: loss 6.1479, time 125.51ms
iter 171450: loss 6.9022, time 125.24ms
iter 171460: loss 6.3017, time 125.45ms
iter 171470: loss 6.2790, time 125.36ms
iter 171480: loss 5.8168, time 125.36ms
iter 171490: loss 7.0608, time 125.47ms
step 171500: train loss 5.9052, val loss 5.9375
saving checkpoint to out-shakespeare-char
iter 171500: loss 6.3210, time 2913.33ms
iter 171510: loss 6.8826, time 126.11ms
iter 171520: loss 7.7585, time 126.00ms
iter 171530: loss 5.8668, time 124.65ms
iter 171540: loss 7.0761, time 125.41ms
iter 171550: loss 7.1911, time 128.41ms
iter 171560: loss 6.6712, time 124.70ms
iter 171570: loss 6.4529, time 125.50ms
iter 171580: loss 7.7562, time 125.85ms
iter 171590: loss 6.8072, time 125.96ms
iter 171600: loss 6.7689, time 125.37ms
iter 171610: loss 6.2834, time 125.12ms
iter 171620: loss 7.4078, time 125.37ms
iter 171630: loss 6.6595, time 125.29ms
iter 171640: loss 6.6418, time 124.61ms
iter 171650: loss 6.3996, time 125.65ms
iter 171660: loss 6.9889, time 128.91ms
iter 171670: loss 7.2887, time 125.34ms
iter 171680: loss 6.6294, time 125.69ms
iter 171690: loss 6.3656, time 125.17ms
iter 171700: loss 6.6929, time 125.23ms
iter 171710: loss 6.7960, time 125.25ms
iter 171720: loss 7.0936, time 124.90ms
iter 171730: loss 7.0905, time 128.31ms
iter 171740: loss 6.8919, time 125.56ms
step 171750: train loss 5.8598, val loss 5.9479
saving checkpoint to out-shakespeare-char
iter 171750: loss 6.8640, time 2896.98ms
iter 171760: loss 5.8070, time 125.35ms
iter 171770: loss 6.5038, time 125.22ms
iter 171780: loss 6.5125, time 125.01ms
iter 171790: loss 6.0406, time 127.69ms
iter 171800: loss 6.4386, time 125.32ms
iter 171810: loss 6.2049, time 125.57ms
iter 171820: loss 6.8024, time 125.71ms
iter 171830: loss 5.5274, time 125.57ms
iter 171840: loss 7.1148, time 125.10ms
iter 171850: loss 5.8679, time 125.06ms
iter 171860: loss 6.2556, time 125.18ms
iter 171870: loss 6.6310, time 125.31ms
iter 171880: loss 6.5071, time 125.27ms
iter 171890: loss 6.3997, time 125.69ms
iter 171900: loss 6.4714, time 128.02ms
iter 171910: loss 6.9499, time 124.64ms
iter 171920: loss 7.1779, time 125.33ms
iter 171930: loss 6.5988, time 124.43ms
iter 171940: loss 6.8380, time 125.36ms
iter 171950: loss 6.2261, time 124.76ms
iter 171960: loss 7.2450, time 125.68ms
iter 171970: loss 7.1306, time 127.62ms
iter 171980: loss 7.0842, time 125.47ms
iter 171990: loss 6.3643, time 125.11ms
step 172000: train loss 5.8559, val loss 5.9124
saving checkpoint to out-shakespeare-char
iter 172000: loss 6.0915, time 2896.17ms
iter 172010: loss 6.6241, time 125.55ms
iter 172020: loss 6.4842, time 125.81ms
iter 172030: loss 6.5025, time 127.14ms
iter 172040: loss 6.7716, time 125.41ms
iter 172050: loss 5.9040, time 125.31ms
iter 172060: loss 6.8715, time 125.07ms
iter 172070: loss 6.2952, time 125.18ms
iter 172080: loss 6.8170, time 125.17ms
iter 172090: loss 6.8220, time 124.92ms
iter 172100: loss 6.3474, time 128.59ms
iter 172110: loss 7.2287, time 125.64ms
iter 172120: loss 5.7820, time 125.65ms
iter 172130: loss 6.8365, time 125.81ms
iter 172140: loss 6.9830, time 125.29ms
iter 172150: loss 7.4222, time 125.57ms
iter 172160: loss 7.0126, time 125.64ms
iter 172170: loss 6.1507, time 129.07ms
iter 172180: loss 6.3144, time 125.74ms
iter 172190: loss 6.3113, time 125.56ms
iter 172200: loss 6.7876, time 125.67ms
iter 172210: loss 6.6417, time 125.89ms
iter 172220: loss 6.6898, time 126.12ms
iter 172230: loss 6.6929, time 125.49ms
iter 172240: loss 6.7518, time 125.66ms
step 172250: train loss 5.9059, val loss 5.9773
saving checkpoint to out-shakespeare-char
iter 172250: loss 6.7782, time 2900.51ms
iter 172260: loss 6.5436, time 125.86ms
iter 172270: loss 6.7019, time 128.65ms
iter 172280: loss 5.7217, time 125.62ms
iter 172290: loss 6.5842, time 125.62ms
iter 172300: loss 6.7280, time 125.36ms
iter 172310: loss 6.8368, time 125.05ms
iter 172320: loss 6.3972, time 125.18ms
iter 172330: loss 6.4448, time 125.60ms
iter 172340: loss 6.5598, time 125.45ms
iter 172350: loss 6.1450, time 125.53ms
iter 172360: loss 6.2767, time 125.55ms
iter 172370: loss 6.2702, time 125.59ms
iter 172380: loss 7.0090, time 128.63ms
iter 172390: loss 6.8710, time 125.11ms
iter 172400: loss 7.0184, time 125.26ms
iter 172410: loss 6.4436, time 125.02ms
iter 172420: loss 6.7065, time 125.21ms
iter 172430: loss 6.5334, time 125.20ms
iter 172440: loss 6.5845, time 125.14ms
iter 172450: loss 6.6689, time 125.16ms
iter 172460: loss 7.1039, time 125.20ms
iter 172470: loss 6.6610, time 125.27ms
iter 172480: loss 6.9030, time 125.20ms
iter 172490: loss 6.6472, time 125.20ms
step 172500: train loss 5.9628, val loss 5.9558
saving checkpoint to out-shakespeare-char
iter 172500: loss 7.3215, time 2897.28ms
iter 172510: loss 7.4498, time 125.86ms
iter 172520: loss 7.0492, time 126.02ms
iter 172530: loss 6.6165, time 125.65ms
iter 172540: loss 6.7611, time 125.74ms
iter 172550: loss 6.9351, time 125.79ms
iter 172560: loss 6.5941, time 128.82ms
iter 172570: loss 6.6245, time 125.61ms
iter 172580: loss 6.9450, time 125.76ms
iter 172590: loss 6.8090, time 126.08ms
iter 172600: loss 7.1733, time 125.73ms
iter 172610: loss 7.0266, time 125.54ms
iter 172620: loss 6.6690, time 125.21ms
iter 172630: loss 6.5070, time 125.19ms
iter 172640: loss 6.2119, time 125.56ms
iter 172650: loss 6.3522, time 125.60ms
iter 172660: loss 6.0092, time 125.29ms
iter 172670: loss 7.0719, time 128.48ms
iter 172680: loss 6.4631, time 125.33ms
iter 172690: loss 6.9250, time 125.24ms
iter 172700: loss 6.4531, time 125.66ms
iter 172710: loss 6.9619, time 124.93ms
iter 172720: loss 5.9162, time 125.21ms
iter 172730: loss 6.2344, time 124.63ms
iter 172740: loss 6.7360, time 125.26ms
step 172750: train loss 5.9530, val loss 5.9788
saving checkpoint to out-shakespeare-char
iter 172750: loss 6.2912, time 2895.72ms
iter 172760: loss 6.9831, time 125.69ms
iter 172770: loss 6.4160, time 125.36ms
iter 172780: loss 6.0802, time 124.73ms
iter 172790: loss 6.5258, time 125.59ms
iter 172800: loss 6.4416, time 125.23ms
iter 172810: loss 7.1827, time 125.71ms
iter 172820: loss 6.1420, time 125.53ms
iter 172830: loss 6.8256, time 127.65ms
iter 172840: loss 7.1154, time 125.51ms
iter 172850: loss 5.8872, time 122.24ms
iter 172860: loss 6.1666, time 121.65ms
iter 172870: loss 6.6754, time 125.26ms
iter 172880: loss 6.8186, time 125.09ms
iter 172890: loss 7.2950, time 122.22ms
iter 172900: loss 6.5344, time 125.56ms
iter 172910: loss 6.9893, time 124.54ms
iter 172920: loss 6.3176, time 124.70ms
iter 172930: loss 6.6262, time 128.49ms
iter 172940: loss 7.8895, time 125.98ms
iter 172950: loss 6.8166, time 125.37ms
iter 172960: loss 5.9669, time 125.75ms
iter 172970: loss 6.3778, time 128.64ms
iter 172980: loss 6.9552, time 125.69ms
iter 172990: loss 6.5477, time 125.06ms
step 173000: train loss 5.8870, val loss 6.0038
saving checkpoint to out-shakespeare-char
iter 173000: loss 6.7058, time 2899.79ms
iter 173010: loss 6.5185, time 125.36ms
iter 173020: loss 6.5864, time 123.51ms
iter 173030: loss 6.7384, time 120.93ms
iter 173040: loss 6.4355, time 123.23ms
iter 173050: loss 6.3826, time 122.28ms
iter 173060: loss 6.6399, time 122.68ms
iter 173070: loss 6.0436, time 121.29ms
iter 173080: loss 5.8445, time 122.89ms
iter 173090: loss 6.6408, time 121.80ms
iter 173100: loss 7.0358, time 122.44ms
iter 173110: loss 5.9248, time 121.29ms
iter 173120: loss 6.9318, time 122.98ms
iter 173130: loss 6.5490, time 121.19ms
iter 173140: loss 6.5590, time 122.69ms
iter 173150: loss 6.8115, time 120.82ms
iter 173160: loss 7.2708, time 123.32ms
iter 173170: loss 6.6757, time 121.78ms
iter 173180: loss 6.5503, time 123.02ms
iter 173190: loss 6.5633, time 121.46ms
iter 173200: loss 6.7568, time 122.92ms
iter 173210: loss 6.8714, time 121.73ms
iter 173220: loss 7.3368, time 122.98ms
iter 173230: loss 6.2826, time 121.51ms
iter 173240: loss 7.0836, time 123.09ms
step 173250: train loss 5.9085, val loss 5.9295
saving checkpoint to out-shakespeare-char
iter 173250: loss 5.8641, time 2901.95ms
iter 173260: loss 6.6010, time 121.39ms
iter 173270: loss 6.4653, time 121.47ms
iter 173280: loss 7.0710, time 121.23ms
iter 173290: loss 6.7712, time 121.15ms
iter 173300: loss 5.9536, time 121.53ms
iter 173310: loss 6.8200, time 121.68ms
iter 173320: loss 6.5821, time 121.58ms
iter 173330: loss 6.9411, time 121.69ms
iter 173340: loss 6.3283, time 121.69ms
iter 173350: loss 7.0383, time 121.71ms
iter 173360: loss 6.1784, time 121.14ms
iter 173370: loss 7.5875, time 122.36ms
iter 173380: loss 6.7834, time 121.16ms
iter 173390: loss 5.9495, time 120.30ms
iter 173400: loss 6.6739, time 121.75ms
iter 173410: loss 6.0975, time 121.35ms
iter 173420: loss 6.6225, time 122.19ms
iter 173430: loss 6.9040, time 122.19ms
iter 173440: loss 6.5370, time 121.68ms
iter 173450: loss 6.6545, time 122.14ms
iter 173460: loss 6.7562, time 121.46ms
iter 173470: loss 6.3374, time 121.06ms
iter 173480: loss 6.5504, time 121.45ms
iter 173490: loss 6.0025, time 121.54ms
step 173500: train loss 5.9105, val loss 5.9216
saving checkpoint to out-shakespeare-char
iter 173500: loss 6.6123, time 2894.82ms
iter 173510: loss 7.0284, time 121.81ms
iter 173520: loss 7.1858, time 121.39ms
iter 173530: loss 6.4953, time 121.84ms
iter 173540: loss 5.8989, time 121.89ms
iter 173550: loss 6.4658, time 121.68ms
iter 173560: loss 6.0994, time 121.42ms
iter 173570: loss 6.1880, time 121.90ms
iter 173580: loss 6.8988, time 121.81ms
iter 173590: loss 6.9744, time 121.54ms
iter 173600: loss 6.4136, time 121.74ms
iter 173610: loss 6.6197, time 122.18ms
iter 173620: loss 6.8240, time 121.74ms
iter 173630: loss 5.6281, time 123.42ms
iter 173640: loss 6.5827, time 127.59ms
iter 173650: loss 6.8363, time 127.25ms
iter 173660: loss 6.7177, time 125.43ms
iter 173670: loss 7.2710, time 125.51ms
iter 173680: loss 6.5236, time 125.76ms
iter 173690: loss 6.8890, time 125.41ms
iter 173700: loss 6.3031, time 125.41ms
iter 173710: loss 6.8513, time 127.91ms
iter 173720: loss 6.8586, time 125.43ms
iter 173730: loss 6.0831, time 121.34ms
iter 173740: loss 6.6531, time 122.43ms
step 173750: train loss 5.8781, val loss 5.9042
saving checkpoint to out-shakespeare-char
iter 173750: loss 6.5640, time 2880.86ms
iter 173760: loss 7.7902, time 121.68ms
iter 173770: loss 6.1581, time 121.82ms
iter 173780: loss 6.5883, time 120.91ms
iter 173790: loss 6.2257, time 122.44ms
iter 173800: loss 6.0182, time 121.76ms
iter 173810: loss 6.2856, time 121.45ms
iter 173820: loss 6.5038, time 121.39ms
iter 173830: loss 7.2346, time 122.85ms
iter 173840: loss 5.9313, time 121.78ms
iter 173850: loss 7.0176, time 121.98ms
iter 173860: loss 7.6745, time 121.80ms
iter 173870: loss 6.9527, time 121.42ms
iter 173880: loss 5.9675, time 121.51ms
iter 173890: loss 6.8598, time 121.79ms
iter 173900: loss 7.2353, time 121.60ms
iter 173910: loss 6.3542, time 120.75ms
iter 173920: loss 6.4132, time 122.69ms
iter 173930: loss 6.6476, time 121.12ms
iter 173940: loss 6.6127, time 121.79ms
iter 173950: loss 7.1274, time 121.29ms
iter 173960: loss 6.7675, time 121.63ms
iter 173970: loss 6.9858, time 121.89ms
iter 173980: loss 6.9174, time 121.69ms
iter 173990: loss 6.6586, time 121.29ms
step 174000: train loss 5.9233, val loss 5.9404
saving checkpoint to out-shakespeare-char
iter 174000: loss 6.2551, time 2891.53ms
iter 174010: loss 6.7023, time 120.42ms
iter 174020: loss 6.8593, time 123.73ms
iter 174030: loss 7.8409, time 121.24ms
iter 174040: loss 6.7127, time 121.82ms
iter 174050: loss 6.3612, time 121.09ms
iter 174060: loss 6.4468, time 121.60ms
iter 174070: loss 6.3945, time 120.83ms
iter 174080: loss 6.7783, time 122.35ms
iter 174090: loss 6.1410, time 122.24ms
iter 174100: loss 6.8911, time 121.92ms
iter 174110: loss 6.1733, time 121.01ms
iter 174120: loss 6.1573, time 121.75ms
iter 174130: loss 6.6023, time 121.93ms
iter 174140: loss 6.5451, time 121.90ms
iter 174150: loss 6.1067, time 121.56ms
iter 174160: loss 7.3078, time 121.45ms
iter 174170: loss 7.6312, time 121.48ms
iter 174180: loss 6.5056, time 121.55ms
iter 174190: loss 6.2834, time 122.18ms
iter 174200: loss 6.8084, time 121.19ms
iter 174210: loss 6.6324, time 121.79ms
iter 174220: loss 5.8282, time 121.49ms
iter 174230: loss 6.7024, time 121.61ms
iter 174240: loss 6.1960, time 121.66ms
step 174250: train loss 5.9968, val loss 5.8784
saving checkpoint to out-shakespeare-char
iter 174250: loss 6.2436, time 2898.17ms
iter 174260: loss 6.7074, time 123.71ms
iter 174270: loss 7.0234, time 122.13ms
iter 174280: loss 7.0992, time 123.30ms
iter 174290: loss 6.6270, time 121.64ms
iter 174300: loss 6.9389, time 123.33ms
iter 174310: loss 6.7873, time 121.71ms
iter 174320: loss 6.5191, time 122.02ms
iter 174330: loss 6.2349, time 121.88ms
iter 174340: loss 6.2260, time 123.09ms
iter 174350: loss 6.7664, time 121.15ms
iter 174360: loss 6.4621, time 122.11ms
iter 174370: loss 6.5509, time 119.67ms
iter 174380: loss 6.3596, time 121.87ms
iter 174390: loss 6.5494, time 121.52ms
iter 174400: loss 7.0827, time 121.94ms
iter 174410: loss 6.2717, time 122.20ms
iter 174420: loss 7.3729, time 123.54ms
iter 174430: loss 7.1915, time 121.72ms
iter 174440: loss 6.7237, time 122.78ms
iter 174450: loss 6.3489, time 122.10ms
iter 174460: loss 7.0744, time 124.10ms
iter 174470: loss 6.2720, time 121.63ms
iter 174480: loss 7.3476, time 126.02ms
iter 174490: loss 5.5744, time 125.74ms
step 174500: train loss 5.9000, val loss 5.9755
saving checkpoint to out-shakespeare-char
iter 174500: loss 6.8078, time 2899.18ms
iter 174510: loss 7.0462, time 125.95ms
iter 174520: loss 6.2531, time 126.00ms
iter 174530: loss 6.0279, time 128.64ms
iter 174540: loss 6.6276, time 125.82ms
iter 174550: loss 6.1945, time 125.81ms
iter 174560: loss 6.6944, time 125.11ms
iter 174570: loss 6.7559, time 125.71ms
iter 174580: loss 6.4849, time 125.84ms
iter 174590: loss 6.3774, time 125.71ms
iter 174600: loss 6.4274, time 124.50ms
iter 174610: loss 6.4686, time 125.68ms
iter 174620: loss 6.9111, time 125.82ms
iter 174630: loss 6.1952, time 125.67ms
iter 174640: loss 6.3693, time 128.18ms
iter 174650: loss 5.7041, time 124.40ms
iter 174660: loss 6.5133, time 125.70ms
iter 174670: loss 6.5430, time 125.04ms
iter 174680: loss 6.7582, time 125.75ms
iter 174690: loss 6.5409, time 125.75ms
iter 174700: loss 6.8028, time 125.77ms
iter 174710: loss 6.4254, time 124.67ms
iter 174720: loss 6.8906, time 125.62ms
iter 174730: loss 6.5025, time 125.55ms
iter 174740: loss 6.6932, time 125.75ms
step 174750: train loss 5.9108, val loss 5.9256
saving checkpoint to out-shakespeare-char
iter 174750: loss 7.6983, time 2878.78ms
iter 174760: loss 6.3847, time 125.93ms
iter 174770: loss 6.2635, time 129.10ms
iter 174780: loss 5.9634, time 125.80ms
iter 174790: loss 5.6735, time 123.64ms
iter 174800: loss 7.0877, time 125.18ms
iter 174810: loss 6.8146, time 126.30ms
iter 174820: loss 6.2704, time 125.73ms
iter 174830: loss 6.3646, time 125.40ms
iter 174840: loss 6.6257, time 124.56ms
iter 174850: loss 7.5573, time 125.32ms
iter 174860: loss 6.8653, time 125.75ms
iter 174870: loss 7.3195, time 125.90ms
iter 174880: loss 7.0177, time 128.61ms
iter 174890: loss 6.4440, time 125.62ms
iter 174900: loss 6.8871, time 125.71ms
iter 174910: loss 6.2701, time 125.22ms
iter 174920: loss 6.5667, time 125.76ms
iter 174930: loss 6.6071, time 125.57ms
iter 174940: loss 6.2627, time 125.65ms
iter 174950: loss 6.8275, time 124.56ms
iter 174960: loss 6.5538, time 125.99ms
iter 174970: loss 7.0571, time 125.62ms
iter 174980: loss 6.8916, time 126.21ms
iter 174990: loss 6.3697, time 128.66ms
step 175000: train loss 5.9067, val loss 5.9799
saving checkpoint to out-shakespeare-char
iter 175000: loss 6.5639, time 2889.80ms
iter 175010: loss 6.2283, time 125.77ms
iter 175020: loss 6.1792, time 125.80ms
iter 175030: loss 6.4225, time 125.78ms
iter 175040: loss 6.8321, time 125.61ms
iter 175050: loss 6.5145, time 128.81ms
iter 175060: loss 7.7130, time 125.63ms
iter 175070: loss 7.0151, time 125.75ms
iter 175080: loss 7.2091, time 126.25ms
iter 175090: loss 7.4136, time 125.71ms
iter 175100: loss 6.6942, time 125.85ms
iter 175110: loss 7.0880, time 125.22ms
iter 175120: loss 6.3574, time 125.34ms
iter 175130: loss 6.7044, time 125.49ms
iter 175140: loss 6.6271, time 125.80ms
iter 175150: loss 6.6860, time 124.57ms
iter 175160: loss 6.1200, time 128.79ms
iter 175170: loss 6.9496, time 125.74ms
iter 175180: loss 7.4626, time 125.49ms
iter 175190: loss 7.2500, time 125.68ms
iter 175200: loss 6.6490, time 125.66ms
iter 175210: loss 7.2744, time 125.50ms
iter 175220: loss 6.6918, time 125.45ms
iter 175230: loss 5.8754, time 125.35ms
iter 175240: loss 6.5803, time 125.78ms
step 175250: train loss 5.9176, val loss 5.9389
saving checkpoint to out-shakespeare-char
iter 175250: loss 7.1533, time 2915.02ms
iter 175260: loss 6.9660, time 125.40ms
iter 175270: loss 6.7712, time 125.47ms
iter 175280: loss 6.5072, time 125.41ms
iter 175290: loss 6.5324, time 128.12ms
iter 175300: loss 6.9156, time 125.43ms
iter 175310: loss 6.3222, time 125.40ms
iter 175320: loss 6.4954, time 125.23ms
iter 175330: loss 6.4704, time 125.09ms
iter 175340: loss 7.0975, time 125.79ms
iter 175350: loss 6.3534, time 125.27ms
iter 175360: loss 6.4878, time 125.33ms
iter 175370: loss 7.2206, time 124.72ms
iter 175380: loss 6.4245, time 124.80ms
iter 175390: loss 6.9952, time 124.64ms
iter 175400: loss 7.1230, time 127.99ms
iter 175410: loss 7.0232, time 124.78ms
iter 175420: loss 6.6177, time 125.44ms
iter 175430: loss 6.6574, time 125.97ms
iter 175440: loss 6.6429, time 124.74ms
iter 175450: loss 6.1804, time 125.06ms
iter 175460: loss 7.4484, time 125.66ms
iter 175470: loss 6.8439, time 125.93ms
iter 175480: loss 6.9773, time 124.34ms
iter 175490: loss 6.5721, time 125.50ms
step 175500: train loss 5.9281, val loss 5.9474
saving checkpoint to out-shakespeare-char
iter 175500: loss 6.8052, time 2859.92ms
iter 175510: loss 6.4624, time 121.67ms
iter 175520: loss 6.1144, time 121.93ms
iter 175530: loss 6.3363, time 121.69ms
iter 175540: loss 6.1312, time 121.52ms
iter 175550: loss 6.3929, time 121.70ms
iter 175560: loss 6.0804, time 121.79ms
iter 175570: loss 6.5643, time 121.37ms
iter 175580: loss 6.6071, time 121.70ms
iter 175590: loss 6.0516, time 121.66ms
iter 175600: loss 6.7450, time 121.69ms
iter 175610: loss 6.2908, time 121.66ms
iter 175620: loss 6.7664, time 121.83ms
iter 175630: loss 6.6073, time 121.66ms
iter 175640: loss 5.7750, time 121.72ms
iter 175650: loss 6.4754, time 121.56ms
iter 175660: loss 6.9668, time 121.82ms
iter 175670: loss 7.1776, time 121.74ms
iter 175680: loss 6.5826, time 120.89ms
iter 175690: loss 6.9339, time 121.80ms
iter 175700: loss 6.3889, time 123.71ms
iter 175710: loss 6.9471, time 121.87ms
iter 175720: loss 6.8851, time 124.50ms
iter 175730: loss 5.9247, time 121.57ms
iter 175740: loss 6.2432, time 124.42ms
step 175750: train loss 5.9330, val loss 5.9317
saving checkpoint to out-shakespeare-char
iter 175750: loss 5.5400, time 2906.49ms
iter 175760: loss 6.7523, time 128.37ms
iter 175770: loss 6.4579, time 125.20ms
iter 175780: loss 6.7005, time 125.42ms
iter 175790: loss 5.8078, time 125.24ms
iter 175800: loss 6.8520, time 125.48ms
iter 175810: loss 6.2935, time 124.87ms
iter 175820: loss 6.0731, time 125.44ms
iter 175830: loss 7.1525, time 125.12ms
iter 175840: loss 7.5364, time 125.76ms
iter 175850: loss 6.0348, time 125.15ms
iter 175860: loss 6.5391, time 125.46ms
iter 175870: loss 6.2103, time 127.96ms
iter 175880: loss 6.5432, time 124.92ms
iter 175890: loss 6.7381, time 124.95ms
iter 175900: loss 6.1976, time 125.24ms
iter 175910: loss 6.7626, time 125.10ms
iter 175920: loss 6.6770, time 125.09ms
iter 175930: loss 6.2699, time 125.23ms
iter 175940: loss 6.4021, time 124.99ms
iter 175950: loss 6.0815, time 125.53ms
iter 175960: loss 6.5839, time 125.07ms
iter 175970: loss 6.3518, time 125.21ms
iter 175980: loss 6.1188, time 127.97ms
iter 175990: loss 7.6452, time 124.00ms
step 176000: train loss 5.8950, val loss 5.9244
saving checkpoint to out-shakespeare-char
iter 176000: loss 6.3341, time 2890.42ms
iter 176010: loss 6.3547, time 125.26ms
iter 176020: loss 6.6028, time 125.30ms
iter 176030: loss 6.3298, time 125.36ms
iter 176040: loss 6.7358, time 125.15ms
iter 176050: loss 6.0423, time 125.59ms
iter 176060: loss 7.6040, time 125.12ms
iter 176070: loss 5.8317, time 124.83ms
iter 176080: loss 7.2334, time 125.17ms
iter 176090: loss 6.3877, time 125.08ms
iter 176100: loss 6.3899, time 125.15ms
iter 176110: loss 6.6016, time 127.87ms
iter 176120: loss 6.4418, time 125.48ms
iter 176130: loss 6.7496, time 125.07ms
iter 176140: loss 6.9628, time 126.67ms
iter 176150: loss 6.8943, time 125.56ms
iter 176160: loss 6.4439, time 125.58ms
iter 176170: loss 7.0231, time 125.93ms
iter 176180: loss 7.0463, time 125.96ms
iter 176190: loss 7.0296, time 125.80ms
iter 176200: loss 6.7325, time 124.96ms
iter 176210: loss 6.1748, time 125.86ms
iter 176220: loss 6.7971, time 127.62ms
iter 176230: loss 6.8556, time 125.64ms
iter 176240: loss 7.3378, time 124.26ms
step 176250: train loss 5.9093, val loss 5.9087
saving checkpoint to out-shakespeare-char
iter 176250: loss 6.8628, time 2885.30ms
iter 176260: loss 6.4324, time 121.53ms
iter 176270: loss 7.1905, time 121.55ms
iter 176280: loss 6.4971, time 121.54ms
iter 176290: loss 7.0752, time 121.25ms
iter 176300: loss 6.7923, time 121.53ms
iter 176310: loss 6.7117, time 121.41ms
iter 176320: loss 5.9516, time 121.45ms
iter 176330: loss 6.9739, time 121.55ms
iter 176340: loss 7.2161, time 121.46ms
iter 176350: loss 6.2723, time 121.56ms
iter 176360: loss 6.5171, time 121.28ms
iter 176370: loss 7.2447, time 121.39ms
iter 176380: loss 7.0673, time 121.31ms
iter 176390: loss 6.7217, time 121.44ms
iter 176400: loss 7.1926, time 121.39ms
iter 176410: loss 6.5587, time 121.50ms
iter 176420: loss 7.2161, time 121.44ms
iter 176430: loss 6.8061, time 121.45ms
iter 176440: loss 6.6923, time 121.36ms
iter 176450: loss 6.6364, time 121.72ms
iter 176460: loss 6.3875, time 121.41ms
iter 176470: loss 6.4586, time 121.29ms
iter 176480: loss 7.4532, time 120.21ms
iter 176490: loss 6.5651, time 121.54ms
step 176500: train loss 5.9282, val loss 5.9447
saving checkpoint to out-shakespeare-char
iter 176500: loss 6.6158, time 2885.92ms
iter 176510: loss 6.3724, time 121.47ms
iter 176520: loss 6.7673, time 121.58ms
iter 176530: loss 6.2684, time 121.45ms
iter 176540: loss 6.9069, time 121.52ms
iter 176550: loss 6.5805, time 121.39ms
iter 176560: loss 6.7892, time 121.46ms
iter 176570: loss 7.1759, time 121.49ms
iter 176580: loss 6.7351, time 121.82ms
iter 176590: loss 7.0063, time 121.34ms
iter 176600: loss 6.1672, time 121.67ms
iter 176610: loss 7.1437, time 121.56ms
iter 176620: loss 6.7838, time 121.60ms
iter 176630: loss 7.0087, time 121.52ms
iter 176640: loss 6.9999, time 121.81ms
iter 176650: loss 6.7196, time 121.79ms
iter 176660: loss 7.3001, time 121.78ms
iter 176670: loss 6.4225, time 121.78ms
iter 176680: loss 6.1126, time 121.68ms
iter 176690: loss 6.8743, time 121.47ms
iter 176700: loss 6.1757, time 121.59ms
iter 176710: loss 6.5180, time 121.49ms
iter 176720: loss 7.1081, time 121.79ms
iter 176730: loss 6.5799, time 121.28ms
iter 176740: loss 7.3942, time 121.52ms
step 176750: train loss 5.9548, val loss 5.9751
saving checkpoint to out-shakespeare-char
iter 176750: loss 6.2946, time 2896.15ms
iter 176760: loss 6.5934, time 120.73ms
iter 176770: loss 6.4873, time 121.46ms
iter 176780: loss 6.3367, time 121.62ms
iter 176790: loss 6.4179, time 121.27ms
iter 176800: loss 6.7300, time 121.37ms
iter 176810: loss 6.9691, time 121.46ms
iter 176820: loss 6.8099, time 121.73ms
iter 176830: loss 6.3023, time 121.32ms
iter 176840: loss 7.0035, time 121.43ms
iter 176850: loss 6.8485, time 121.47ms
iter 176860: loss 5.9209, time 121.48ms
iter 176870: loss 6.6785, time 121.35ms
iter 176880: loss 7.1620, time 121.70ms
iter 176890: loss 6.5856, time 121.51ms
iter 176900: loss 6.3476, time 120.69ms
iter 176910: loss 6.2239, time 122.14ms
iter 176920: loss 6.7963, time 121.59ms
iter 176930: loss 7.2975, time 122.74ms
iter 176940: loss 6.3048, time 121.87ms
iter 176950: loss 6.9972, time 122.79ms
iter 176960: loss 6.4153, time 121.60ms
iter 176970: loss 6.8583, time 121.69ms
iter 176980: loss 6.3971, time 120.17ms
iter 176990: loss 6.6778, time 122.53ms
step 177000: train loss 5.9508, val loss 5.8870
saving checkpoint to out-shakespeare-char
iter 177000: loss 7.0079, time 2901.89ms
iter 177010: loss 6.6048, time 128.54ms
iter 177020: loss 6.5080, time 124.68ms
iter 177030: loss 6.8202, time 124.42ms
iter 177040: loss 6.5779, time 125.12ms
iter 177050: loss 7.4015, time 121.86ms
iter 177060: loss 6.4735, time 121.46ms
iter 177070: loss 6.7179, time 121.26ms
iter 177080: loss 6.2241, time 121.46ms
iter 177090: loss 5.9510, time 121.41ms
iter 177100: loss 6.2560, time 121.54ms
iter 177110: loss 6.7244, time 121.56ms
iter 177120: loss 6.4612, time 121.59ms
iter 177130: loss 6.2642, time 120.79ms
iter 177140: loss 7.0211, time 121.57ms
iter 177150: loss 6.5312, time 121.25ms
iter 177160: loss 6.6052, time 121.39ms
iter 177170: loss 6.5724, time 121.64ms
iter 177180: loss 6.6870, time 121.57ms
iter 177190: loss 6.6905, time 121.61ms
iter 177200: loss 6.3173, time 121.83ms
iter 177210: loss 7.1287, time 121.67ms
iter 177220: loss 6.2636, time 121.40ms
iter 177230: loss 7.2399, time 121.13ms
iter 177240: loss 5.8887, time 121.87ms
step 177250: train loss 5.9467, val loss 5.9354
saving checkpoint to out-shakespeare-char
iter 177250: loss 6.4230, time 2873.41ms
iter 177260: loss 6.3852, time 121.46ms
iter 177270: loss 6.4157, time 121.55ms
iter 177280: loss 7.1043, time 121.41ms
iter 177290: loss 6.6002, time 121.43ms
iter 177300: loss 6.7653, time 120.97ms
iter 177310: loss 6.9348, time 121.57ms
iter 177320: loss 6.5111, time 121.46ms
iter 177330: loss 6.2294, time 121.53ms
iter 177340: loss 6.0570, time 121.40ms
iter 177350: loss 5.6000, time 121.58ms
iter 177360: loss 6.7215, time 121.51ms
iter 177370: loss 6.5766, time 120.78ms
iter 177380: loss 6.6974, time 121.57ms
iter 177390: loss 6.0316, time 121.31ms
iter 177400: loss 6.9794, time 121.56ms
iter 177410: loss 6.9441, time 121.27ms
iter 177420: loss 6.8291, time 120.98ms
iter 177430: loss 6.8739, time 121.68ms
iter 177440: loss 6.0722, time 121.47ms
iter 177450: loss 5.9190, time 121.23ms
iter 177460: loss 7.0630, time 121.47ms
iter 177470: loss 6.5160, time 121.31ms
iter 177480: loss 6.9725, time 120.88ms
iter 177490: loss 6.3990, time 121.65ms
step 177500: train loss 5.9592, val loss 5.9053
saving checkpoint to out-shakespeare-char
iter 177500: loss 6.4678, time 2866.28ms
iter 177510: loss 6.9743, time 123.97ms
iter 177520: loss 6.7586, time 122.47ms
iter 177530: loss 7.0247, time 124.45ms
iter 177540: loss 5.8843, time 121.70ms
iter 177550: loss 6.6470, time 125.00ms
iter 177560: loss 7.3132, time 121.56ms
iter 177570: loss 7.0050, time 124.54ms
iter 177580: loss 6.7279, time 121.59ms
iter 177590: loss 6.2874, time 124.37ms
iter 177600: loss 6.7941, time 121.63ms
iter 177610: loss 6.2549, time 124.17ms
iter 177620: loss 6.2047, time 121.82ms
iter 177630: loss 7.1452, time 123.79ms
iter 177640: loss 6.0270, time 121.67ms
iter 177650: loss 6.2946, time 124.46ms
iter 177660: loss 7.4304, time 121.65ms
iter 177670: loss 7.6521, time 123.80ms
iter 177680: loss 6.6088, time 121.92ms
iter 177690: loss 6.3346, time 124.53ms
iter 177700: loss 6.0249, time 122.18ms
iter 177710: loss 7.0597, time 125.01ms
iter 177720: loss 6.6781, time 121.69ms
iter 177730: loss 7.2686, time 124.30ms
iter 177740: loss 5.8799, time 122.35ms
step 177750: train loss 5.8769, val loss 5.9131
saving checkpoint to out-shakespeare-char
iter 177750: loss 6.6613, time 2891.14ms
iter 177760: loss 6.6271, time 122.72ms
iter 177770: loss 6.9582, time 121.66ms
iter 177780: loss 7.1966, time 122.96ms
iter 177790: loss 5.8426, time 121.66ms
iter 177800: loss 6.4409, time 122.74ms
iter 177810: loss 6.2512, time 121.73ms
iter 177820: loss 6.6711, time 122.55ms
iter 177830: loss 6.0039, time 122.15ms
iter 177840: loss 7.0740, time 122.68ms
iter 177850: loss 6.6195, time 121.24ms
iter 177860: loss 6.0972, time 122.13ms
iter 177870: loss 5.9268, time 121.83ms
iter 177880: loss 6.9082, time 122.59ms
iter 177890: loss 7.0018, time 121.53ms
iter 177900: loss 6.4796, time 122.63ms
iter 177910: loss 7.1206, time 121.41ms
iter 177920: loss 6.5752, time 123.14ms
iter 177930: loss 5.7655, time 121.50ms
iter 177940: loss 5.7680, time 122.23ms
iter 177950: loss 6.0523, time 121.53ms
iter 177960: loss 6.5060, time 122.82ms
iter 177970: loss 6.3852, time 121.58ms
iter 177980: loss 6.6884, time 122.62ms
iter 177990: loss 6.8225, time 121.71ms
step 178000: train loss 5.8892, val loss 5.9626
saving checkpoint to out-shakespeare-char
iter 178000: loss 6.6715, time 2894.40ms
iter 178010: loss 7.2770, time 125.81ms
iter 178020: loss 6.7871, time 125.57ms
iter 178030: loss 6.5389, time 125.43ms
iter 178040: loss 6.1475, time 125.57ms
iter 178050: loss 6.3265, time 125.49ms
iter 178060: loss 6.6548, time 125.61ms
iter 178070: loss 6.5234, time 128.20ms
iter 178080: loss 6.4363, time 125.34ms
iter 178090: loss 6.3514, time 125.83ms
iter 178100: loss 6.2821, time 125.77ms
iter 178110: loss 6.3112, time 125.30ms
iter 178120: loss 6.2499, time 125.83ms
iter 178130: loss 6.8032, time 125.79ms
iter 178140: loss 6.4555, time 125.37ms
iter 178150: loss 7.5061, time 125.40ms
iter 178160: loss 7.2385, time 125.43ms
iter 178170: loss 6.1490, time 125.48ms
iter 178180: loss 6.1810, time 128.55ms
iter 178190: loss 7.2943, time 125.28ms
iter 178200: loss 6.9062, time 125.57ms
iter 178210: loss 7.2066, time 125.65ms
iter 178220: loss 6.2333, time 128.17ms
iter 178230: loss 6.5244, time 125.31ms
iter 178240: loss 6.6263, time 125.56ms
step 178250: train loss 5.9251, val loss 5.8847
saving checkpoint to out-shakespeare-char
iter 178250: loss 6.2741, time 2885.26ms
iter 178260: loss 6.6616, time 125.48ms
iter 178270: loss 6.3857, time 125.23ms
iter 178280: loss 6.4651, time 128.25ms
iter 178290: loss 5.9163, time 125.24ms
iter 178300: loss 6.2615, time 125.37ms
iter 178310: loss 6.7852, time 126.21ms
iter 178320: loss 6.1653, time 124.60ms
iter 178330: loss 7.2469, time 124.89ms
iter 178340: loss 6.5371, time 125.65ms
iter 178350: loss 5.8116, time 128.53ms
iter 178360: loss 6.5895, time 126.87ms
iter 178370: loss 6.5587, time 125.54ms
iter 178380: loss 6.7636, time 125.40ms
iter 178390: loss 6.3096, time 125.58ms
iter 178400: loss 6.3575, time 125.37ms
iter 178410: loss 6.4433, time 125.67ms
iter 178420: loss 7.0997, time 127.36ms
iter 178430: loss 6.4487, time 125.59ms
iter 178440: loss 6.7781, time 125.29ms
iter 178450: loss 7.0294, time 124.59ms
iter 178460: loss 7.2879, time 125.45ms
iter 178470: loss 6.6719, time 125.52ms
iter 178480: loss 6.2233, time 125.50ms
iter 178490: loss 5.8238, time 125.54ms
step 178500: train loss 5.9301, val loss 5.9571
saving checkpoint to out-shakespeare-char
iter 178500: loss 6.4341, time 2882.24ms
iter 178510: loss 5.7007, time 124.37ms
iter 178520: loss 6.5641, time 121.67ms
iter 178530: loss 5.8166, time 124.50ms
iter 178540: loss 7.0836, time 121.18ms
iter 178550: loss 7.2166, time 124.29ms
iter 178560: loss 5.7683, time 121.49ms
iter 178570: loss 5.9677, time 124.41ms
iter 178580: loss 7.0570, time 121.55ms
iter 178590: loss 6.5313, time 124.90ms
iter 178600: loss 6.7377, time 121.49ms
iter 178610: loss 6.7121, time 124.47ms
iter 178620: loss 6.2280, time 121.88ms
iter 178630: loss 6.6528, time 124.49ms
iter 178640: loss 6.6951, time 122.33ms
iter 178650: loss 6.0290, time 124.71ms
iter 178660: loss 6.4660, time 123.11ms
iter 178670: loss 6.8519, time 124.32ms
iter 178680: loss 7.0949, time 122.25ms
iter 178690: loss 6.2084, time 124.25ms
iter 178700: loss 6.2344, time 121.55ms
iter 178710: loss 6.3907, time 124.38ms
iter 178720: loss 6.8342, time 121.61ms
iter 178730: loss 6.5593, time 124.43ms
iter 178740: loss 6.3646, time 121.27ms
step 178750: train loss 5.9157, val loss 5.9374
saving checkpoint to out-shakespeare-char
iter 178750: loss 6.6637, time 2906.70ms
iter 178760: loss 6.2322, time 125.18ms
iter 178770: loss 6.0466, time 125.68ms
iter 178780: loss 6.7487, time 125.66ms
iter 178790: loss 6.7359, time 125.71ms
iter 178800: loss 7.3591, time 126.21ms
iter 178810: loss 6.7351, time 126.01ms
iter 178820: loss 6.4764, time 125.88ms
iter 178830: loss 6.3843, time 125.95ms
iter 178840: loss 7.0391, time 126.33ms
iter 178850: loss 6.4557, time 128.88ms
iter 178860: loss 6.6282, time 126.04ms
iter 178870: loss 6.3487, time 125.90ms
iter 178880: loss 6.2066, time 126.38ms
iter 178890: loss 6.9538, time 125.91ms
iter 178900: loss 6.4327, time 125.73ms
iter 178910: loss 6.9887, time 125.58ms
iter 178920: loss 6.6960, time 125.69ms
iter 178930: loss 6.6285, time 126.12ms
iter 178940: loss 6.7563, time 125.94ms
iter 178950: loss 7.0838, time 125.16ms
iter 178960: loss 6.0920, time 125.04ms
iter 178970: loss 6.2657, time 125.86ms
iter 178980: loss 5.7374, time 126.04ms
iter 178990: loss 6.3445, time 122.76ms
step 179000: train loss 5.8915, val loss 5.9266
saving checkpoint to out-shakespeare-char
iter 179000: loss 6.3122, time 2876.01ms
iter 179010: loss 6.2185, time 121.72ms
iter 179020: loss 6.6106, time 121.51ms
iter 179030: loss 6.1792, time 121.64ms
iter 179040: loss 6.2011, time 121.57ms
iter 179050: loss 6.0756, time 121.55ms
iter 179060: loss 6.6649, time 121.70ms
iter 179070: loss 6.5023, time 122.10ms
iter 179080: loss 6.2292, time 121.46ms
iter 179090: loss 6.6349, time 121.65ms
iter 179100: loss 6.1926, time 121.36ms
iter 179110: loss 6.6675, time 121.62ms
iter 179120: loss 6.6106, time 121.23ms
iter 179130: loss 6.7720, time 121.67ms
iter 179140: loss 5.9223, time 121.46ms
iter 179150: loss 6.4353, time 121.24ms
iter 179160: loss 6.7495, time 121.58ms
iter 179170: loss 6.6069, time 121.56ms
iter 179180: loss 6.6623, time 121.65ms
iter 179190: loss 6.3624, time 121.67ms
iter 179200: loss 6.4765, time 122.09ms
iter 179210: loss 6.1688, time 121.67ms
iter 179220: loss 7.3432, time 121.45ms
iter 179230: loss 6.0058, time 121.46ms
iter 179240: loss 6.8478, time 121.58ms
step 179250: train loss 5.8832, val loss 5.9598
saving checkpoint to out-shakespeare-char
iter 179250: loss 7.4648, time 2888.75ms
iter 179260: loss 6.8267, time 122.77ms
iter 179270: loss 5.7500, time 121.96ms
iter 179280: loss 6.9634, time 122.69ms
iter 179290: loss 6.0553, time 121.75ms
iter 179300: loss 7.2191, time 123.00ms
iter 179310: loss 6.2848, time 120.73ms
iter 179320: loss 7.2399, time 122.66ms
iter 179330: loss 6.3991, time 121.54ms
iter 179340: loss 6.4358, time 122.76ms
iter 179350: loss 7.3649, time 121.71ms
iter 179360: loss 6.8552, time 122.60ms
iter 179370: loss 6.7349, time 120.88ms
iter 179380: loss 6.5858, time 122.77ms
iter 179390: loss 6.7146, time 121.08ms
iter 179400: loss 5.5385, time 121.77ms
iter 179410: loss 6.0821, time 121.93ms
iter 179420: loss 6.3524, time 122.74ms
iter 179430: loss 6.8126, time 121.60ms
iter 179440: loss 6.7425, time 122.71ms
iter 179450: loss 6.6544, time 121.69ms
iter 179460: loss 6.3581, time 122.59ms
iter 179470: loss 6.3542, time 121.88ms
iter 179480: loss 6.3935, time 122.74ms
iter 179490: loss 6.2664, time 121.61ms
step 179500: train loss 5.9113, val loss 5.8923
saving checkpoint to out-shakespeare-char
iter 179500: loss 6.7102, time 2891.30ms
iter 179510: loss 6.9622, time 121.63ms
iter 179520: loss 6.6031, time 121.72ms
iter 179530: loss 5.3641, time 120.57ms
iter 179540: loss 7.5056, time 120.94ms
iter 179550: loss 5.8823, time 121.88ms
iter 179560: loss 6.5034, time 121.58ms
iter 179570: loss 6.9318, time 121.31ms
iter 179580: loss 6.4018, time 121.67ms
iter 179590: loss 6.3610, time 121.46ms
iter 179600: loss 6.6679, time 121.61ms
iter 179610: loss 7.0135, time 121.32ms
iter 179620: loss 6.3465, time 121.76ms
iter 179630: loss 7.0617, time 121.45ms
iter 179640: loss 6.7387, time 121.90ms
iter 179650: loss 6.6417, time 121.58ms
iter 179660: loss 6.8088, time 121.56ms
iter 179670: loss 7.1635, time 121.45ms
iter 179680: loss 6.9300, time 121.71ms
iter 179690: loss 6.1204, time 121.64ms
iter 179700: loss 6.8455, time 121.55ms
iter 179710: loss 6.1219, time 121.18ms
iter 179720: loss 6.3102, time 121.57ms
iter 179730: loss 6.5873, time 121.30ms
iter 179740: loss 6.5050, time 121.81ms
step 179750: train loss 5.8900, val loss 5.9430
saving checkpoint to out-shakespeare-char
iter 179750: loss 6.9603, time 2892.58ms
iter 179760: loss 5.8659, time 121.68ms
iter 179770: loss 6.5646, time 121.38ms
iter 179780: loss 6.8609, time 121.68ms
iter 179790: loss 5.9915, time 120.99ms
iter 179800: loss 6.2470, time 121.47ms
iter 179810: loss 6.2253, time 120.97ms
iter 179820: loss 6.1664, time 121.01ms
iter 179830: loss 6.7532, time 121.91ms
iter 179840: loss 6.7044, time 122.11ms
iter 179850: loss 6.2722, time 124.23ms
iter 179860: loss 7.0101, time 121.52ms
iter 179870: loss 6.6362, time 123.65ms
iter 179880: loss 6.1416, time 121.35ms
iter 179890: loss 6.7707, time 124.10ms
iter 179900: loss 6.9806, time 121.43ms
iter 179910: loss 6.4153, time 124.69ms
iter 179920: loss 6.0708, time 121.80ms
iter 179930: loss 6.9926, time 124.43ms
iter 179940: loss 6.6800, time 121.69ms
iter 179950: loss 6.4392, time 124.47ms
iter 179960: loss 6.4648, time 121.60ms
iter 179970: loss 7.3091, time 124.46ms
iter 179980: loss 6.2585, time 121.61ms
iter 179990: loss 7.4477, time 124.57ms
step 180000: train loss 5.9198, val loss 5.8886
saving checkpoint to out-shakespeare-char
iter 180000: loss 5.8771, time 2878.60ms
iter 180010: loss 6.5737, time 121.59ms
iter 180020: loss 6.2019, time 121.63ms
iter 180030: loss 7.0783, time 121.35ms
iter 180040: loss 6.1960, time 121.81ms
iter 180050: loss 6.7642, time 122.83ms
iter 180060: loss 6.4246, time 121.48ms
iter 180070: loss 6.6747, time 121.27ms
iter 180080: loss 6.7238, time 121.58ms
iter 180090: loss 7.0808, time 121.22ms
iter 180100: loss 6.6699, time 122.06ms
iter 180110: loss 5.8399, time 120.81ms
iter 180120: loss 6.1973, time 121.54ms
iter 180130: loss 6.8508, time 121.64ms
iter 180140: loss 5.9801, time 121.92ms
iter 180150: loss 7.0053, time 121.57ms
iter 180160: loss 5.9275, time 121.58ms
iter 180170: loss 6.0054, time 121.39ms
iter 180180: loss 6.6320, time 121.61ms
iter 180190: loss 6.8777, time 120.99ms
iter 180200: loss 6.3307, time 121.69ms
iter 180210: loss 6.8509, time 121.43ms
iter 180220: loss 6.4125, time 121.42ms
iter 180230: loss 5.8390, time 120.61ms
iter 180240: loss 7.3720, time 121.04ms
step 180250: train loss 5.8990, val loss 5.8496
saving checkpoint to out-shakespeare-char
iter 180250: loss 7.2277, time 2893.47ms
iter 180260: loss 6.3424, time 121.64ms
iter 180270: loss 6.4765, time 124.47ms
iter 180280: loss 6.9438, time 121.72ms
iter 180290: loss 6.6830, time 124.83ms
iter 180300: loss 7.1164, time 121.48ms
iter 180310: loss 6.1435, time 123.77ms
iter 180320: loss 7.5369, time 121.62ms
iter 180330: loss 6.1798, time 124.28ms
iter 180340: loss 6.8975, time 121.50ms
iter 180350: loss 6.1638, time 124.53ms
iter 180360: loss 6.2636, time 122.95ms
iter 180370: loss 6.7792, time 124.48ms
iter 180380: loss 7.1003, time 120.87ms
iter 180390: loss 7.0258, time 124.36ms
iter 180400: loss 6.5967, time 121.55ms
iter 180410: loss 6.3682, time 124.78ms
iter 180420: loss 6.3594, time 120.61ms
iter 180430: loss 6.5872, time 124.64ms
iter 180440: loss 6.9559, time 121.55ms
iter 180450: loss 6.6959, time 124.40ms
iter 180460: loss 6.0893, time 121.63ms
iter 180470: loss 6.1009, time 124.67ms
iter 180480: loss 7.0620, time 121.38ms
iter 180490: loss 6.7133, time 124.41ms
step 180500: train loss 5.8896, val loss 5.9129
saving checkpoint to out-shakespeare-char
iter 180500: loss 6.3131, time 2895.96ms
iter 180510: loss 6.6255, time 121.90ms
iter 180520: loss 7.3438, time 123.19ms
iter 180530: loss 6.3664, time 121.90ms
iter 180540: loss 6.1755, time 123.31ms
iter 180550: loss 6.3356, time 121.86ms
iter 180560: loss 6.6634, time 123.06ms
iter 180570: loss 6.0759, time 122.18ms
iter 180580: loss 6.9162, time 122.98ms
iter 180590: loss 7.1554, time 121.88ms
iter 180600: loss 6.2867, time 123.02ms
iter 180610: loss 6.2985, time 121.83ms
iter 180620: loss 6.5032, time 123.05ms
iter 180630: loss 6.9100, time 121.97ms
iter 180640: loss 6.1595, time 124.09ms
iter 180650: loss 6.4183, time 121.69ms
iter 180660: loss 6.3485, time 124.85ms
iter 180670: loss 5.8519, time 122.14ms
iter 180680: loss 6.3722, time 122.59ms
iter 180690: loss 6.3857, time 121.82ms
iter 180700: loss 6.4583, time 122.61ms
iter 180710: loss 7.1332, time 121.45ms
iter 180720: loss 6.5866, time 122.61ms
iter 180730: loss 7.0600, time 121.37ms
iter 180740: loss 6.4277, time 122.91ms
step 180750: train loss 5.9215, val loss 5.8953
saving checkpoint to out-shakespeare-char
iter 180750: loss 6.8521, time 2902.85ms
iter 180760: loss 6.0405, time 126.07ms
iter 180770: loss 6.5970, time 128.95ms
iter 180780: loss 6.1755, time 126.09ms
iter 180790: loss 6.5225, time 125.69ms
iter 180800: loss 6.6224, time 126.03ms
iter 180810: loss 5.6178, time 125.35ms
iter 180820: loss 6.2434, time 125.19ms
iter 180830: loss 7.0908, time 124.33ms
iter 180840: loss 6.3771, time 124.86ms
iter 180850: loss 6.8329, time 124.93ms
iter 180860: loss 6.0605, time 125.31ms
iter 180870: loss 6.5718, time 125.31ms
iter 180880: loss 6.3087, time 128.86ms
iter 180890: loss 6.8968, time 125.73ms
iter 180900: loss 6.5626, time 126.03ms
iter 180910: loss 7.3876, time 125.03ms
iter 180920: loss 6.9794, time 124.59ms
iter 180930: loss 6.7863, time 124.99ms
iter 180940: loss 6.7169, time 124.90ms
iter 180950: loss 6.8377, time 124.88ms
iter 180960: loss 7.1860, time 124.85ms
iter 180970: loss 6.5658, time 125.18ms
iter 180980: loss 7.2187, time 125.07ms
iter 180990: loss 5.7128, time 123.45ms
step 181000: train loss 5.8980, val loss 5.9341
saving checkpoint to out-shakespeare-char
iter 181000: loss 6.5265, time 2891.98ms
iter 181010: loss 6.8244, time 126.75ms
iter 181020: loss 5.6361, time 124.47ms
iter 181030: loss 6.7753, time 127.55ms
iter 181040: loss 7.1778, time 127.71ms
iter 181050: loss 6.8938, time 126.19ms
iter 181060: loss 6.1612, time 128.84ms
iter 181070: loss 6.4539, time 126.03ms
iter 181080: loss 6.4314, time 125.75ms
iter 181090: loss 6.5801, time 126.11ms
iter 181100: loss 6.4959, time 128.85ms
iter 181110: loss 7.0269, time 124.96ms
iter 181120: loss 6.6338, time 125.87ms
iter 181130: loss 6.7009, time 125.93ms
iter 181140: loss 7.4269, time 125.82ms
iter 181150: loss 6.6301, time 125.71ms
iter 181160: loss 6.6573, time 125.71ms
iter 181170: loss 6.5172, time 125.54ms
iter 181180: loss 6.8565, time 126.65ms
iter 181190: loss 5.5624, time 126.20ms
iter 181200: loss 7.0578, time 126.02ms
iter 181210: loss 7.0139, time 127.66ms
iter 181220: loss 7.2487, time 125.70ms
iter 181230: loss 6.9984, time 125.92ms
iter 181240: loss 7.0146, time 125.80ms
step 181250: train loss 5.8872, val loss 5.8836
saving checkpoint to out-shakespeare-char
iter 181250: loss 5.7370, time 2898.37ms
iter 181260: loss 7.0983, time 122.58ms
iter 181270: loss 6.4380, time 121.53ms
iter 181280: loss 6.8198, time 123.52ms
iter 181290: loss 7.8597, time 121.50ms
iter 181300: loss 6.2688, time 122.95ms
iter 181310: loss 7.0942, time 121.31ms
iter 181320: loss 6.7928, time 122.61ms
iter 181330: loss 6.2427, time 122.09ms
iter 181340: loss 7.5882, time 122.64ms
iter 181350: loss 6.7826, time 121.52ms
iter 181360: loss 6.6492, time 122.58ms
iter 181370: loss 6.7763, time 121.51ms
iter 181380: loss 7.1587, time 123.83ms
iter 181390: loss 6.6251, time 121.43ms
iter 181400: loss 6.4934, time 122.48ms
iter 181410: loss 6.8324, time 121.44ms
iter 181420: loss 6.2540, time 121.66ms
iter 181430: loss 6.9512, time 121.51ms
iter 181440: loss 7.2859, time 122.65ms
iter 181450: loss 6.1449, time 121.67ms
iter 181460: loss 6.4133, time 122.56ms
iter 181470: loss 6.6063, time 121.13ms
iter 181480: loss 6.5224, time 122.46ms
iter 181490: loss 6.8065, time 121.34ms
step 181500: train loss 5.8227, val loss 5.9127
saving checkpoint to out-shakespeare-char
iter 181500: loss 6.4914, time 2887.81ms
iter 181510: loss 6.8421, time 125.15ms
iter 181520: loss 6.2054, time 125.71ms
iter 181530: loss 6.2022, time 128.58ms
iter 181540: loss 6.7028, time 125.18ms
iter 181550: loss 7.0145, time 125.42ms
iter 181560: loss 6.3115, time 124.84ms
iter 181570: loss 6.4666, time 125.48ms
iter 181580: loss 7.5516, time 125.08ms
iter 181590: loss 6.7934, time 125.84ms
iter 181600: loss 7.1687, time 125.51ms
iter 181610: loss 7.4771, time 124.83ms
iter 181620: loss 7.1847, time 125.73ms
iter 181630: loss 7.2719, time 125.90ms
iter 181640: loss 6.3813, time 127.65ms
iter 181650: loss 5.9338, time 125.04ms
iter 181660: loss 6.8920, time 124.85ms
iter 181670: loss 7.4371, time 125.28ms
iter 181680: loss 6.5495, time 124.90ms
iter 181690: loss 6.8161, time 125.17ms
iter 181700: loss 6.2199, time 125.22ms
iter 181710: loss 6.0827, time 125.35ms
iter 181720: loss 6.5318, time 125.29ms
iter 181730: loss 6.1926, time 125.62ms
iter 181740: loss 6.1759, time 125.12ms
step 181750: train loss 5.8569, val loss 5.9293
saving checkpoint to out-shakespeare-char
iter 181750: loss 6.1701, time 2895.31ms
iter 181760: loss 6.9693, time 123.28ms
iter 181770: loss 6.4342, time 125.21ms
iter 181780: loss 6.6833, time 124.13ms
iter 181790: loss 7.1287, time 124.77ms
iter 181800: loss 5.9249, time 123.85ms
iter 181810: loss 6.2052, time 125.23ms
iter 181820: loss 7.2877, time 126.72ms
iter 181830: loss 6.2514, time 124.78ms
iter 181840: loss 7.0864, time 124.10ms
iter 181850: loss 6.3125, time 124.99ms
iter 181860: loss 6.4050, time 125.01ms
iter 181870: loss 6.8535, time 125.10ms
iter 181880: loss 7.1408, time 125.12ms
iter 181890: loss 6.3004, time 125.12ms
iter 181900: loss 6.7741, time 125.26ms
iter 181910: loss 5.6034, time 126.21ms
iter 181920: loss 7.1916, time 125.04ms
iter 181930: loss 5.9967, time 127.92ms
iter 181940: loss 5.7684, time 125.04ms
iter 181950: loss 6.7766, time 125.13ms
iter 181960: loss 6.5543, time 124.31ms
iter 181970: loss 7.0610, time 125.61ms
iter 181980: loss 6.4740, time 124.41ms
iter 181990: loss 6.2326, time 125.21ms
step 182000: train loss 5.9066, val loss 5.8704
saving checkpoint to out-shakespeare-char
iter 182000: loss 6.4745, time 2893.33ms
iter 182010: loss 6.5005, time 125.16ms
iter 182020: loss 7.2807, time 125.29ms
iter 182030: loss 6.7436, time 125.15ms
iter 182040: loss 6.3648, time 125.45ms
iter 182050: loss 6.4318, time 125.27ms
iter 182060: loss 7.0361, time 124.82ms
iter 182070: loss 5.9227, time 124.84ms
iter 182080: loss 6.1102, time 125.20ms
iter 182090: loss 6.2692, time 125.49ms
iter 182100: loss 6.8360, time 128.30ms
iter 182110: loss 6.6220, time 125.16ms
iter 182120: loss 5.8287, time 125.25ms
iter 182130: loss 5.5620, time 125.43ms
iter 182140: loss 7.2244, time 124.50ms
iter 182150: loss 5.8419, time 125.35ms
iter 182160: loss 7.2143, time 125.04ms
iter 182170: loss 7.1715, time 125.22ms
iter 182180: loss 6.5560, time 125.86ms
iter 182190: loss 6.1192, time 125.38ms
iter 182200: loss 6.7816, time 125.26ms
iter 182210: loss 6.6368, time 128.43ms
iter 182220: loss 6.6075, time 125.29ms
iter 182230: loss 6.6125, time 125.59ms
iter 182240: loss 6.3357, time 126.04ms
step 182250: train loss 5.8780, val loss 5.9094
saving checkpoint to out-shakespeare-char
iter 182250: loss 6.5417, time 2899.41ms
iter 182260: loss 6.2981, time 125.78ms
iter 182270: loss 6.2542, time 125.49ms
iter 182280: loss 6.5116, time 125.50ms
iter 182290: loss 6.1925, time 125.28ms
iter 182300: loss 6.9950, time 124.92ms
iter 182310: loss 7.0280, time 125.11ms
iter 182320: loss 6.4444, time 124.89ms
iter 182330: loss 6.2686, time 124.62ms
iter 182340: loss 6.1061, time 125.11ms
iter 182350: loss 6.1360, time 125.21ms
iter 182360: loss 6.6723, time 127.98ms
iter 182370: loss 6.7668, time 125.02ms
iter 182380: loss 6.1528, time 125.17ms
iter 182390: loss 6.6325, time 125.21ms
iter 182400: loss 6.6815, time 125.24ms
iter 182410: loss 7.0956, time 125.03ms
iter 182420: loss 7.2285, time 125.00ms
iter 182430: loss 6.7261, time 125.03ms
iter 182440: loss 6.5676, time 125.03ms
iter 182450: loss 6.3691, time 124.97ms
iter 182460: loss 6.6519, time 125.11ms
iter 182470: loss 6.1678, time 127.03ms
iter 182480: loss 5.7835, time 124.99ms
iter 182490: loss 6.5683, time 125.41ms
step 182500: train loss 5.8821, val loss 5.9419
saving checkpoint to out-shakespeare-char
iter 182500: loss 6.2104, time 2867.93ms
iter 182510: loss 6.6712, time 126.37ms
iter 182520: loss 7.1828, time 125.13ms
iter 182530: loss 6.9263, time 127.85ms
iter 182540: loss 6.2856, time 124.92ms
iter 182550: loss 6.5043, time 125.01ms
iter 182560: loss 6.1454, time 125.22ms
iter 182570: loss 6.5084, time 125.57ms
iter 182580: loss 6.8805, time 126.57ms
iter 182590: loss 6.6565, time 124.87ms
iter 182600: loss 6.6450, time 125.06ms
iter 182610: loss 7.0603, time 125.04ms
iter 182620: loss 5.9233, time 124.97ms
iter 182630: loss 6.6633, time 125.07ms
iter 182640: loss 6.5751, time 125.02ms
iter 182650: loss 5.8866, time 125.06ms
iter 182660: loss 5.4598, time 124.83ms
iter 182670: loss 6.6453, time 125.06ms
iter 182680: loss 6.6881, time 125.16ms
iter 182690: loss 6.2653, time 125.09ms
iter 182700: loss 6.4807, time 124.96ms
iter 182710: loss 6.9185, time 125.50ms
iter 182720: loss 7.5389, time 125.16ms
iter 182730: loss 6.0862, time 125.00ms
iter 182740: loss 6.6958, time 125.28ms
step 182750: train loss 5.9353, val loss 5.8404
saving checkpoint to out-shakespeare-char
iter 182750: loss 7.1379, time 2903.72ms
iter 182760: loss 6.6504, time 125.25ms
iter 182770: loss 7.0902, time 125.08ms
iter 182780: loss 7.1824, time 125.51ms
iter 182790: loss 6.1431, time 124.67ms
iter 182800: loss 6.6309, time 125.11ms
iter 182810: loss 5.9411, time 125.18ms
iter 182820: loss 6.2118, time 125.42ms
iter 182830: loss 6.5558, time 125.12ms
iter 182840: loss 6.9711, time 125.33ms
iter 182850: loss 7.0677, time 128.37ms
iter 182860: loss 5.7582, time 122.07ms
iter 182870: loss 6.4311, time 123.37ms
iter 182880: loss 6.8132, time 121.58ms
iter 182890: loss 7.0635, time 123.36ms
iter 182900: loss 6.3188, time 118.93ms
iter 182910: loss 5.8633, time 122.85ms
iter 182920: loss 6.4733, time 122.02ms
iter 182930: loss 7.2265, time 122.88ms
iter 182940: loss 6.6167, time 121.64ms
iter 182950: loss 6.7331, time 122.40ms
iter 182960: loss 6.9820, time 125.65ms
iter 182970: loss 7.0875, time 125.51ms
iter 182980: loss 5.5301, time 125.59ms
iter 182990: loss 6.9191, time 128.14ms
step 183000: train loss 5.9167, val loss 5.9051
saving checkpoint to out-shakespeare-char
iter 183000: loss 6.2427, time 2872.23ms
iter 183010: loss 6.7377, time 122.30ms
iter 183020: loss 5.6931, time 121.89ms
iter 183030: loss 7.0847, time 121.93ms
iter 183040: loss 6.4986, time 122.30ms
iter 183050: loss 6.2852, time 122.25ms
iter 183060: loss 6.9699, time 122.22ms
iter 183070: loss 5.9860, time 122.15ms
iter 183080: loss 6.8633, time 121.67ms
iter 183090: loss 7.3624, time 121.77ms
iter 183100: loss 6.0414, time 121.72ms
iter 183110: loss 7.1577, time 121.84ms
iter 183120: loss 6.1228, time 121.91ms
iter 183130: loss 6.7131, time 121.90ms
iter 183140: loss 5.9441, time 121.86ms
iter 183150: loss 6.7308, time 121.95ms
iter 183160: loss 6.5573, time 121.86ms
iter 183170: loss 6.8807, time 121.92ms
iter 183180: loss 6.0634, time 121.75ms
iter 183190: loss 6.5369, time 121.88ms
iter 183200: loss 6.5400, time 121.89ms
iter 183210: loss 6.0787, time 121.87ms
iter 183220: loss 6.2708, time 121.92ms
iter 183230: loss 5.7701, time 122.20ms
iter 183240: loss 6.2119, time 121.30ms
step 183250: train loss 5.9182, val loss 5.9341
saving checkpoint to out-shakespeare-char
iter 183250: loss 6.2326, time 2886.96ms
iter 183260: loss 6.2688, time 122.23ms
iter 183270: loss 6.1872, time 121.34ms
iter 183280: loss 6.7033, time 122.97ms
iter 183290: loss 6.1568, time 122.12ms
iter 183300: loss 6.2380, time 124.20ms
iter 183310: loss 6.9956, time 121.93ms
iter 183320: loss 6.4180, time 123.00ms
iter 183330: loss 6.5424, time 121.87ms
iter 183340: loss 6.2957, time 122.20ms
iter 183350: loss 6.5019, time 121.73ms
iter 183360: loss 6.8183, time 122.77ms
iter 183370: loss 6.7628, time 121.46ms
iter 183380: loss 6.3148, time 122.73ms
iter 183390: loss 6.5697, time 121.59ms
iter 183400: loss 6.5026, time 123.07ms
iter 183410: loss 5.8063, time 121.66ms
iter 183420: loss 6.1091, time 122.75ms
iter 183430: loss 7.1399, time 121.31ms
iter 183440: loss 7.0077, time 123.21ms
iter 183450: loss 6.6511, time 121.48ms
iter 183460: loss 6.5273, time 122.66ms
iter 183470: loss 6.5938, time 121.68ms
iter 183480: loss 6.3595, time 122.90ms
iter 183490: loss 7.0505, time 121.69ms
step 183500: train loss 5.8836, val loss 5.8677
saving checkpoint to out-shakespeare-char
iter 183500: loss 7.0549, time 2895.91ms
iter 183510: loss 6.6091, time 120.26ms
iter 183520: loss 6.0034, time 121.54ms
iter 183530: loss 7.4673, time 121.59ms
iter 183540: loss 5.7734, time 121.75ms
iter 183550: loss 7.1382, time 121.54ms
iter 183560: loss 6.0725, time 121.47ms
iter 183570: loss 7.4856, time 121.49ms
iter 183580: loss 6.4102, time 121.44ms
iter 183590: loss 6.5402, time 121.43ms
iter 183600: loss 6.3321, time 120.97ms
iter 183610: loss 5.8360, time 121.33ms
iter 183620: loss 6.3055, time 121.58ms
iter 183630: loss 5.6779, time 121.58ms
iter 183640: loss 6.6231, time 120.75ms
iter 183650: loss 6.2392, time 121.26ms
iter 183660: loss 7.3398, time 121.68ms
iter 183670: loss 6.2780, time 121.37ms
iter 183680: loss 6.5390, time 121.71ms
iter 183690: loss 6.1965, time 121.53ms
iter 183700: loss 6.8127, time 121.62ms
iter 183710: loss 6.1530, time 121.50ms
iter 183720: loss 6.6419, time 121.94ms
iter 183730: loss 6.1736, time 121.74ms
iter 183740: loss 6.2196, time 121.80ms
step 183750: train loss 5.8978, val loss 5.8545
saving checkpoint to out-shakespeare-char
iter 183750: loss 6.1557, time 2885.33ms
iter 183760: loss 6.2113, time 121.51ms
iter 183770: loss 6.7045, time 121.54ms
iter 183780: loss 7.1804, time 121.74ms
iter 183790: loss 6.1668, time 121.73ms
iter 183800: loss 6.9648, time 121.46ms
iter 183810: loss 6.9188, time 121.70ms
iter 183820: loss 6.9189, time 122.42ms
iter 183830: loss 6.2830, time 121.82ms
iter 183840: loss 7.1790, time 121.56ms
iter 183850: loss 6.8279, time 121.58ms
iter 183860: loss 5.9398, time 121.72ms
iter 183870: loss 7.2491, time 121.87ms
iter 183880: loss 6.0750, time 121.61ms
iter 183890: loss 6.7627, time 121.56ms
iter 183900: loss 6.9497, time 121.99ms
iter 183910: loss 6.3855, time 121.46ms
iter 183920: loss 5.5262, time 121.83ms
iter 183930: loss 6.5671, time 121.53ms
iter 183940: loss 6.3136, time 122.11ms
iter 183950: loss 6.6196, time 121.61ms
iter 183960: loss 6.4562, time 121.61ms
iter 183970: loss 7.0554, time 121.99ms
iter 183980: loss 6.5834, time 121.24ms
iter 183990: loss 6.4609, time 121.67ms
step 184000: train loss 5.9392, val loss 5.9412
saving checkpoint to out-shakespeare-char
iter 184000: loss 7.3444, time 2877.45ms
iter 184010: loss 6.6503, time 121.28ms
iter 184020: loss 7.6999, time 122.69ms
iter 184030: loss 6.2734, time 121.73ms
iter 184040: loss 6.7189, time 123.51ms
iter 184050: loss 6.5875, time 121.58ms
iter 184060: loss 6.4801, time 123.58ms
iter 184070: loss 6.9847, time 121.69ms
iter 184080: loss 6.8245, time 122.41ms
iter 184090: loss 6.8248, time 121.99ms
iter 184100: loss 6.6698, time 121.97ms
iter 184110: loss 6.8661, time 122.98ms
iter 184120: loss 6.6773, time 122.85ms
iter 184130: loss 6.6060, time 122.36ms
iter 184140: loss 5.6893, time 122.70ms
iter 184150: loss 5.4430, time 121.74ms
iter 184160: loss 5.9911, time 122.79ms
iter 184170: loss 6.1564, time 121.74ms
iter 184180: loss 6.8878, time 122.77ms
iter 184190: loss 6.1610, time 121.88ms
iter 184200: loss 6.2465, time 122.68ms
iter 184210: loss 7.0064, time 120.96ms
iter 184220: loss 6.3459, time 123.06ms
iter 184230: loss 5.9047, time 121.66ms
iter 184240: loss 7.1856, time 123.17ms
step 184250: train loss 5.8968, val loss 5.9097
saving checkpoint to out-shakespeare-char
iter 184250: loss 6.0489, time 2898.76ms
iter 184260: loss 6.5569, time 121.21ms
iter 184270: loss 6.2655, time 121.54ms
iter 184280: loss 5.4784, time 121.54ms
iter 184290: loss 6.4303, time 121.56ms
iter 184300: loss 6.5247, time 121.69ms
iter 184310: loss 7.2926, time 121.68ms
iter 184320: loss 5.7104, time 121.74ms
iter 184330: loss 6.4212, time 121.55ms
iter 184340: loss 7.0172, time 121.94ms
iter 184350: loss 6.6581, time 121.54ms
iter 184360: loss 6.1655, time 121.54ms
iter 184370: loss 6.1728, time 121.75ms
iter 184380: loss 7.1770, time 121.56ms
iter 184390: loss 6.2630, time 121.45ms
iter 184400: loss 6.4198, time 121.64ms
iter 184410: loss 5.6883, time 121.47ms
iter 184420: loss 6.4392, time 121.03ms
iter 184430: loss 6.4938, time 121.30ms
iter 184440: loss 6.5797, time 121.62ms
iter 184450: loss 6.4850, time 121.83ms
iter 184460: loss 6.4362, time 121.67ms
iter 184470: loss 6.0767, time 121.32ms
iter 184480: loss 5.9865, time 120.91ms
iter 184490: loss 6.5061, time 121.43ms
step 184500: train loss 5.9672, val loss 5.8654
saving checkpoint to out-shakespeare-char
iter 184500: loss 6.2018, time 2870.54ms
iter 184510: loss 7.2472, time 121.54ms
iter 184520: loss 6.4993, time 121.50ms
iter 184530: loss 6.5807, time 121.55ms
iter 184540: loss 7.3386, time 121.31ms
iter 184550: loss 6.4943, time 121.61ms
iter 184560: loss 6.2070, time 121.53ms
iter 184570: loss 5.7400, time 121.41ms
iter 184580: loss 6.4472, time 121.38ms
iter 184590: loss 6.9587, time 121.41ms
iter 184600: loss 6.3278, time 121.67ms
iter 184610: loss 6.9410, time 120.44ms
iter 184620: loss 7.4043, time 121.21ms
iter 184630: loss 6.2766, time 120.18ms
iter 184640: loss 7.0146, time 121.66ms
iter 184650: loss 6.4792, time 121.63ms
iter 184660: loss 7.0363, time 121.67ms
iter 184670: loss 6.7253, time 121.97ms
iter 184680: loss 6.7419, time 121.61ms
iter 184690: loss 6.7551, time 121.59ms
iter 184700: loss 6.4372, time 121.70ms
iter 184710: loss 5.8391, time 122.00ms
iter 184720: loss 6.3283, time 121.65ms
iter 184730: loss 6.7431, time 121.46ms
iter 184740: loss 6.3742, time 121.60ms
step 184750: train loss 5.9296, val loss 5.8802
saving checkpoint to out-shakespeare-char
iter 184750: loss 7.3819, time 2899.20ms
iter 184760: loss 5.8531, time 121.41ms
iter 184770: loss 5.9131, time 124.49ms
iter 184780: loss 7.0188, time 121.60ms
iter 184790: loss 6.9006, time 125.00ms
iter 184800: loss 6.5539, time 121.63ms
iter 184810: loss 6.8561, time 124.50ms
iter 184820: loss 6.1566, time 121.65ms
iter 184830: loss 6.4365, time 124.68ms
iter 184840: loss 6.1305, time 122.07ms
iter 184850: loss 6.1252, time 124.50ms
iter 184860: loss 6.7564, time 121.70ms
iter 184870: loss 6.5037, time 124.47ms
iter 184880: loss 6.3876, time 121.91ms
iter 184890: loss 6.5157, time 124.55ms
iter 184900: loss 6.1488, time 121.93ms
iter 184910: loss 6.4525, time 123.93ms
iter 184920: loss 7.0777, time 121.12ms
iter 184930: loss 6.2348, time 124.48ms
iter 184940: loss 6.1086, time 123.14ms
iter 184950: loss 6.9616, time 124.39ms
iter 184960: loss 6.8520, time 122.14ms
iter 184970: loss 5.9025, time 124.44ms
iter 184980: loss 6.6121, time 121.72ms
iter 184990: loss 6.4591, time 125.08ms
step 185000: train loss 5.8641, val loss 5.8698
saving checkpoint to out-shakespeare-char
iter 185000: loss 6.5863, time 2861.28ms
iter 185010: loss 6.7064, time 126.16ms
iter 185020: loss 7.0318, time 125.95ms
iter 185030: loss 7.3230, time 125.06ms
iter 185040: loss 6.4822, time 125.64ms
iter 185050: loss 6.1760, time 125.41ms
iter 185060: loss 6.3001, time 125.26ms
iter 185070: loss 6.5493, time 125.89ms
iter 185080: loss 6.4655, time 125.51ms
iter 185090: loss 6.3088, time 125.40ms
iter 185100: loss 6.8638, time 125.88ms
iter 185110: loss 6.5318, time 126.10ms
iter 185120: loss 6.9211, time 126.51ms
iter 185130: loss 6.8903, time 126.21ms
iter 185140: loss 7.1509, time 125.76ms
iter 185150: loss 7.4652, time 125.40ms
iter 185160: loss 6.1136, time 125.36ms
iter 185170: loss 6.5951, time 126.12ms
iter 185180: loss 6.5495, time 129.39ms
iter 185190: loss 6.0223, time 125.46ms
iter 185200: loss 7.0430, time 125.51ms
iter 185210: loss 6.5243, time 125.46ms
iter 185220: loss 6.9245, time 125.48ms
iter 185230: loss 6.6489, time 125.81ms
iter 185240: loss 7.0068, time 125.76ms
step 185250: train loss 5.9252, val loss 5.8807
saving checkpoint to out-shakespeare-char
iter 185250: loss 6.7600, time 2895.37ms
iter 185260: loss 6.4877, time 125.16ms
iter 185270: loss 6.5229, time 125.21ms
iter 185280: loss 6.2674, time 125.16ms
iter 185290: loss 6.4469, time 125.36ms
iter 185300: loss 7.2033, time 125.39ms
iter 185310: loss 6.3039, time 128.14ms
iter 185320: loss 7.1818, time 125.15ms
iter 185330: loss 6.1817, time 124.20ms
iter 185340: loss 6.2989, time 125.14ms
iter 185350: loss 5.9700, time 124.58ms
iter 185360: loss 6.2089, time 124.83ms
iter 185370: loss 7.3499, time 126.40ms
iter 185380: loss 6.8120, time 125.04ms
iter 185390: loss 6.0988, time 126.21ms
iter 185400: loss 6.4600, time 125.87ms
iter 185410: loss 5.8218, time 126.18ms
iter 185420: loss 6.5509, time 129.30ms
iter 185430: loss 6.6878, time 125.77ms
iter 185440: loss 6.4580, time 126.23ms
iter 185450: loss 7.0595, time 126.15ms
iter 185460: loss 6.4011, time 126.26ms
iter 185470: loss 6.8818, time 125.75ms
iter 185480: loss 6.9587, time 126.23ms
iter 185490: loss 6.5705, time 126.07ms
step 185500: train loss 5.8069, val loss 5.8873
saving checkpoint to out-shakespeare-char
iter 185500: loss 6.3676, time 2877.62ms
iter 185510: loss 7.1185, time 128.26ms
iter 185520: loss 6.3216, time 125.92ms
iter 185530: loss 6.8407, time 124.88ms
iter 185540: loss 6.5856, time 125.84ms
iter 185550: loss 7.1165, time 125.82ms
iter 185560: loss 5.9984, time 126.24ms
iter 185570: loss 6.8260, time 126.15ms
iter 185580: loss 6.8055, time 125.46ms
iter 185590: loss 6.7798, time 125.71ms
iter 185600: loss 6.7680, time 125.57ms
iter 185610: loss 6.3929, time 124.82ms
iter 185620: loss 7.0486, time 128.60ms
iter 185630: loss 6.3003, time 125.74ms
iter 185640: loss 6.3847, time 125.15ms
iter 185650: loss 7.0617, time 125.50ms
iter 185660: loss 6.7627, time 125.11ms
iter 185670: loss 6.6461, time 125.64ms
iter 185680: loss 6.2676, time 125.35ms
iter 185690: loss 6.9648, time 125.34ms
iter 185700: loss 5.9833, time 124.31ms
iter 185710: loss 7.1120, time 125.69ms
iter 185720: loss 5.5140, time 125.82ms
iter 185730: loss 7.1557, time 125.77ms
iter 185740: loss 6.1968, time 127.27ms
step 185750: train loss 5.8545, val loss 5.8434
saving checkpoint to out-shakespeare-char
iter 185750: loss 6.1238, time 2889.42ms
iter 185760: loss 6.4231, time 125.23ms
iter 185770: loss 5.9563, time 125.92ms
iter 185780: loss 6.1249, time 125.50ms
iter 185790: loss 7.0737, time 125.67ms
iter 185800: loss 6.3344, time 125.44ms
iter 185810: loss 6.4294, time 125.42ms
iter 185820: loss 6.6933, time 125.15ms
iter 185830: loss 6.7521, time 125.05ms
iter 185840: loss 6.1050, time 125.51ms
iter 185850: loss 7.3110, time 125.85ms
iter 185860: loss 6.1353, time 125.95ms
iter 185870: loss 6.1783, time 128.87ms
iter 185880: loss 6.3623, time 126.00ms
iter 185890: loss 7.2844, time 125.62ms
iter 185900: loss 6.0816, time 125.58ms
iter 185910: loss 6.9119, time 126.18ms
iter 185920: loss 6.8045, time 125.67ms
iter 185930: loss 6.1275, time 126.21ms
iter 185940: loss 6.1951, time 125.98ms
iter 185950: loss 5.8458, time 125.76ms
iter 185960: loss 6.1461, time 125.73ms
iter 185970: loss 6.5571, time 126.12ms
iter 185980: loss 5.5874, time 128.61ms
iter 185990: loss 6.5347, time 125.58ms
step 186000: train loss 5.8907, val loss 5.8901
saving checkpoint to out-shakespeare-char
iter 186000: loss 6.2269, time 2878.73ms
iter 186010: loss 6.7249, time 125.68ms
iter 186020: loss 6.5579, time 125.88ms
iter 186030: loss 6.9209, time 125.76ms
iter 186040: loss 7.1853, time 125.27ms
iter 186050: loss 5.9871, time 125.79ms
iter 186060: loss 6.6420, time 125.92ms
iter 186070: loss 6.0868, time 125.27ms
iter 186080: loss 6.3237, time 125.37ms
iter 186090: loss 6.2518, time 124.92ms
iter 186100: loss 6.2787, time 125.11ms
iter 186110: loss 6.7435, time 128.04ms
iter 186120: loss 6.1885, time 124.97ms
iter 186130: loss 6.9634, time 125.60ms
iter 186140: loss 6.6678, time 124.82ms
iter 186150: loss 5.6436, time 125.01ms
iter 186160: loss 6.9622, time 124.93ms
iter 186170: loss 6.3224, time 124.94ms
iter 186180: loss 6.9434, time 124.91ms
iter 186190: loss 6.5341, time 125.57ms
iter 186200: loss 6.7039, time 125.69ms
iter 186210: loss 6.7186, time 125.53ms
iter 186220: loss 7.1270, time 128.34ms
iter 186230: loss 6.8052, time 124.88ms
iter 186240: loss 6.1888, time 124.95ms
step 186250: train loss 5.8856, val loss 5.8702
saving checkpoint to out-shakespeare-char
iter 186250: loss 7.0199, time 2861.64ms
iter 186260: loss 6.3551, time 125.79ms
iter 186270: loss 6.2772, time 125.90ms
iter 186280: loss 5.5351, time 127.90ms
iter 186290: loss 5.8064, time 124.83ms
iter 186300: loss 6.8874, time 125.30ms
iter 186310: loss 6.1744, time 124.95ms
iter 186320: loss 6.4600, time 124.94ms
iter 186330: loss 6.3388, time 124.99ms
iter 186340: loss 6.4847, time 125.00ms
iter 186350: loss 6.8102, time 124.87ms
iter 186360: loss 6.6620, time 124.99ms
iter 186370: loss 6.8624, time 124.93ms
iter 186380: loss 6.3795, time 125.34ms
iter 186390: loss 6.5841, time 127.98ms
iter 186400: loss 7.1211, time 125.33ms
iter 186410: loss 7.3352, time 125.18ms
iter 186420: loss 6.1177, time 125.54ms
iter 186430: loss 6.6687, time 124.99ms
iter 186440: loss 6.3636, time 125.03ms
iter 186450: loss 6.3827, time 125.03ms
iter 186460: loss 7.1089, time 124.73ms
iter 186470: loss 7.7295, time 124.79ms
iter 186480: loss 7.3853, time 125.17ms
iter 186490: loss 6.6567, time 125.27ms
step 186500: train loss 5.9382, val loss 5.8439
saving checkpoint to out-shakespeare-char
iter 186500: loss 7.2528, time 2864.86ms
iter 186510: loss 5.9660, time 125.02ms
iter 186520: loss 6.9055, time 124.67ms
iter 186530: loss 6.5739, time 124.96ms
iter 186540: loss 6.0750, time 125.23ms
iter 186550: loss 6.5431, time 125.43ms
iter 186560: loss 6.8287, time 127.74ms
iter 186570: loss 7.1179, time 125.37ms
iter 186580: loss 7.2889, time 125.19ms
iter 186590: loss 7.1589, time 125.11ms
iter 186600: loss 6.7734, time 124.20ms
iter 186610: loss 6.9477, time 124.76ms
iter 186620: loss 5.9679, time 125.48ms
iter 186630: loss 7.2670, time 125.42ms
iter 186640: loss 6.3396, time 125.32ms
iter 186650: loss 7.0600, time 125.31ms
iter 186660: loss 6.1776, time 125.71ms
iter 186670: loss 5.8214, time 128.21ms
iter 186680: loss 7.1834, time 124.74ms
iter 186690: loss 6.3804, time 124.77ms
iter 186700: loss 7.0412, time 125.41ms
iter 186710: loss 6.2827, time 125.42ms
iter 186720: loss 5.9741, time 125.32ms
iter 186730: loss 6.6094, time 124.91ms
iter 186740: loss 6.1319, time 125.04ms
step 186750: train loss 5.8376, val loss 5.8766
saving checkpoint to out-shakespeare-char
iter 186750: loss 6.7940, time 2872.83ms
iter 186760: loss 5.8600, time 125.40ms
iter 186770: loss 6.4527, time 127.87ms
iter 186780: loss 6.2934, time 124.97ms
iter 186790: loss 6.0717, time 124.96ms
iter 186800: loss 6.9287, time 125.36ms
iter 186810: loss 6.6920, time 125.15ms
iter 186820: loss 6.7271, time 125.05ms
iter 186830: loss 6.3660, time 125.15ms
iter 186840: loss 5.8103, time 125.21ms
iter 186850: loss 6.7957, time 125.32ms
iter 186860: loss 6.8225, time 125.01ms
iter 186870: loss 6.7261, time 125.37ms
iter 186880: loss 6.0543, time 128.06ms
iter 186890: loss 6.1181, time 125.16ms
iter 186900: loss 7.2843, time 125.10ms
iter 186910: loss 6.1579, time 125.78ms
iter 186920: loss 6.6607, time 125.30ms
iter 186930: loss 6.7069, time 125.36ms
iter 186940: loss 7.1107, time 125.43ms
iter 186950: loss 6.8958, time 125.39ms
iter 186960: loss 6.6055, time 125.17ms
iter 186970: loss 6.3802, time 125.51ms
iter 186980: loss 6.5666, time 125.08ms
iter 186990: loss 6.6544, time 125.24ms
step 187000: train loss 5.9056, val loss 5.8553
saving checkpoint to out-shakespeare-char
iter 187000: loss 6.6587, time 2900.67ms
iter 187010: loss 6.9068, time 125.60ms
iter 187020: loss 6.0863, time 125.87ms
iter 187030: loss 6.6155, time 125.98ms
iter 187040: loss 6.4915, time 125.58ms
iter 187050: loss 6.0852, time 126.70ms
iter 187060: loss 5.9475, time 124.81ms
iter 187070: loss 6.1868, time 126.09ms
iter 187080: loss 6.1960, time 125.57ms
iter 187090: loss 7.1755, time 125.51ms
iter 187100: loss 6.5233, time 128.61ms
iter 187110: loss 6.0513, time 126.18ms
iter 187120: loss 6.5420, time 125.96ms
iter 187130: loss 5.8612, time 126.09ms
iter 187140: loss 6.6237, time 128.95ms
iter 187150: loss 6.5710, time 126.19ms
iter 187160: loss 6.4207, time 126.32ms
iter 187170: loss 6.5652, time 125.58ms
iter 187180: loss 6.5524, time 125.08ms
iter 187190: loss 6.0781, time 125.15ms
iter 187200: loss 7.1147, time 125.16ms
iter 187210: loss 6.5746, time 126.00ms
iter 187220: loss 6.5084, time 125.73ms
iter 187230: loss 6.2979, time 125.61ms
iter 187240: loss 6.8807, time 125.47ms
step 187250: train loss 5.8617, val loss 5.9094
saving checkpoint to out-shakespeare-char
iter 187250: loss 6.9410, time 2885.10ms
iter 187260: loss 6.9640, time 125.51ms
iter 187270: loss 7.1402, time 125.05ms
iter 187280: loss 6.8517, time 125.50ms
iter 187290: loss 5.4799, time 126.78ms
iter 187300: loss 5.7935, time 125.85ms
iter 187310: loss 7.5693, time 127.69ms
iter 187320: loss 6.6088, time 125.45ms
iter 187330: loss 6.7701, time 125.47ms
iter 187340: loss 6.6874, time 124.32ms
iter 187350: loss 6.0025, time 125.36ms
iter 187360: loss 6.9344, time 125.39ms
iter 187370: loss 6.5625, time 125.18ms
iter 187380: loss 5.8453, time 127.66ms
iter 187390: loss 6.7018, time 125.16ms
iter 187400: loss 5.8288, time 125.13ms
iter 187410: loss 6.3851, time 124.19ms
iter 187420: loss 6.1406, time 124.89ms
iter 187430: loss 7.0979, time 125.01ms
iter 187440: loss 6.2569, time 124.94ms
iter 187450: loss 6.0517, time 124.99ms
iter 187460: loss 6.7999, time 125.21ms
iter 187470: loss 6.6501, time 124.52ms
iter 187480: loss 6.8072, time 125.31ms
iter 187490: loss 7.0231, time 126.53ms
step 187500: train loss 5.8749, val loss 5.9326
saving checkpoint to out-shakespeare-char
iter 187500: loss 6.7948, time 2884.53ms
iter 187510: loss 6.6559, time 124.53ms
iter 187520: loss 6.6771, time 125.09ms
iter 187530: loss 6.7980, time 125.00ms
iter 187540: loss 6.5284, time 125.28ms
iter 187550: loss 6.6285, time 123.62ms
iter 187560: loss 6.0112, time 125.05ms
iter 187570: loss 7.1000, time 125.46ms
iter 187580: loss 6.3008, time 125.25ms
iter 187590: loss 6.8971, time 127.94ms
iter 187600: loss 7.4394, time 124.95ms
iter 187610: loss 5.9873, time 125.65ms
iter 187620: loss 6.0484, time 129.29ms
iter 187630: loss 6.0232, time 125.51ms
iter 187640: loss 6.2380, time 126.37ms
iter 187650: loss 7.0697, time 124.89ms
iter 187660: loss 6.9387, time 126.60ms
iter 187670: loss 5.7995, time 125.92ms
iter 187680: loss 5.8084, time 124.83ms
iter 187690: loss 6.7085, time 124.33ms
iter 187700: loss 6.8023, time 125.52ms
iter 187710: loss 6.6597, time 126.08ms
iter 187720: loss 7.8171, time 126.29ms
iter 187730: loss 6.2033, time 128.63ms
iter 187740: loss 6.4153, time 125.81ms
step 187750: train loss 5.8835, val loss 5.8727
saving checkpoint to out-shakespeare-char
iter 187750: loss 6.4477, time 2889.05ms
iter 187760: loss 6.3221, time 129.11ms
iter 187770: loss 6.2874, time 126.22ms
iter 187780: loss 6.5454, time 125.74ms
iter 187790: loss 6.1212, time 126.19ms
iter 187800: loss 6.9419, time 125.68ms
iter 187810: loss 6.7842, time 126.10ms
iter 187820: loss 6.8609, time 125.75ms
iter 187830: loss 6.2440, time 126.02ms
iter 187840: loss 7.7174, time 125.71ms
iter 187850: loss 6.6138, time 126.52ms
iter 187860: loss 6.3070, time 125.26ms
iter 187870: loss 6.4008, time 128.70ms
iter 187880: loss 6.1290, time 125.57ms
iter 187890: loss 5.7582, time 125.59ms
iter 187900: loss 6.0188, time 126.06ms
iter 187910: loss 6.3751, time 125.75ms
iter 187920: loss 7.2078, time 126.87ms
iter 187930: loss 6.2621, time 125.81ms
iter 187940: loss 6.5626, time 125.93ms
iter 187950: loss 6.6176, time 126.06ms
iter 187960: loss 5.9125, time 125.82ms
iter 187970: loss 6.7163, time 125.93ms
iter 187980: loss 6.2227, time 129.94ms
iter 187990: loss 5.9749, time 125.55ms
step 188000: train loss 5.8929, val loss 5.8709
saving checkpoint to out-shakespeare-char
iter 188000: loss 6.3805, time 2870.94ms
iter 188010: loss 6.6330, time 125.91ms
iter 188020: loss 6.3447, time 125.81ms
iter 188030: loss 6.2091, time 125.60ms
iter 188040: loss 6.1262, time 128.73ms
iter 188050: loss 7.0004, time 125.66ms
iter 188060: loss 6.7515, time 124.82ms
iter 188070: loss 6.7121, time 126.35ms
iter 188080: loss 7.0445, time 125.97ms
iter 188090: loss 7.0991, time 124.96ms
iter 188100: loss 6.1127, time 125.91ms
iter 188110: loss 6.1230, time 126.29ms
iter 188120: loss 6.5478, time 125.97ms
iter 188130: loss 6.4611, time 125.36ms
iter 188140: loss 6.8672, time 125.35ms
iter 188150: loss 6.8929, time 124.86ms
iter 188160: loss 6.6766, time 126.05ms
iter 188170: loss 6.3273, time 125.00ms
iter 188180: loss 6.4430, time 125.02ms
iter 188190: loss 7.0665, time 125.10ms
iter 188200: loss 6.3270, time 125.10ms
iter 188210: loss 6.2193, time 124.77ms
iter 188220: loss 5.9198, time 125.15ms
iter 188230: loss 6.1657, time 127.85ms
iter 188240: loss 6.3014, time 124.85ms
step 188250: train loss 5.8537, val loss 5.8717
saving checkpoint to out-shakespeare-char
iter 188250: loss 6.6949, time 2875.30ms
iter 188260: loss 6.9455, time 126.02ms
iter 188270: loss 5.8051, time 125.72ms
iter 188280: loss 6.9053, time 125.84ms
iter 188290: loss 6.2263, time 125.20ms
iter 188300: loss 6.7115, time 124.98ms
iter 188310: loss 6.3442, time 125.15ms
iter 188320: loss 6.8902, time 125.03ms
iter 188330: loss 6.4554, time 124.54ms
iter 188340: loss 6.2498, time 125.12ms
iter 188350: loss 5.9522, time 125.83ms
iter 188360: loss 6.0447, time 128.97ms
iter 188370: loss 6.5920, time 126.04ms
iter 188380: loss 6.6119, time 125.90ms
iter 188390: loss 6.2162, time 126.10ms
iter 188400: loss 6.2534, time 126.08ms
iter 188410: loss 6.9267, time 126.04ms
iter 188420: loss 6.4477, time 125.43ms
iter 188430: loss 6.5452, time 125.44ms
iter 188440: loss 6.1888, time 125.64ms
iter 188450: loss 5.9040, time 125.86ms
iter 188460: loss 6.8311, time 125.65ms
iter 188470: loss 6.4950, time 127.06ms
iter 188480: loss 5.7616, time 125.57ms
iter 188490: loss 7.0222, time 125.40ms
step 188500: train loss 5.8488, val loss 5.9098
saving checkpoint to out-shakespeare-char
iter 188500: loss 6.5688, time 2897.37ms
iter 188510: loss 6.4004, time 125.98ms
iter 188520: loss 6.4530, time 125.85ms
iter 188530: loss 6.7515, time 125.70ms
iter 188540: loss 6.2965, time 125.62ms
iter 188550: loss 6.4688, time 125.55ms
iter 188560: loss 6.8892, time 125.55ms
iter 188570: loss 6.6205, time 124.83ms
iter 188580: loss 5.8473, time 125.52ms
iter 188590: loss 6.6815, time 125.97ms
iter 188600: loss 6.8093, time 128.51ms
iter 188610: loss 7.5369, time 125.52ms
iter 188620: loss 6.8001, time 125.69ms
iter 188630: loss 7.3278, time 126.54ms
iter 188640: loss 6.2943, time 125.13ms
iter 188650: loss 6.7898, time 125.45ms
iter 188660: loss 5.9709, time 125.51ms
iter 188670: loss 5.9672, time 126.22ms
iter 188680: loss 6.5982, time 125.43ms
iter 188690: loss 6.2464, time 125.60ms
iter 188700: loss 5.4893, time 125.54ms
iter 188710: loss 5.9867, time 124.91ms
iter 188720: loss 6.9313, time 124.82ms
iter 188730: loss 6.4188, time 125.25ms
iter 188740: loss 6.9426, time 125.00ms
step 188750: train loss 5.9468, val loss 5.8692
saving checkpoint to out-shakespeare-char
iter 188750: loss 6.7225, time 2887.06ms
iter 188760: loss 6.6988, time 123.22ms
iter 188770: loss 6.3784, time 121.70ms
iter 188780: loss 6.2393, time 122.97ms
iter 188790: loss 6.1719, time 121.77ms
iter 188800: loss 5.9216, time 123.67ms
iter 188810: loss 6.4251, time 121.89ms
iter 188820: loss 6.8771, time 122.89ms
iter 188830: loss 6.7005, time 121.49ms
iter 188840: loss 5.8789, time 123.75ms
iter 188850: loss 5.9362, time 121.67ms
iter 188860: loss 6.5276, time 123.05ms
iter 188870: loss 5.9711, time 121.60ms
iter 188880: loss 6.5624, time 122.85ms
iter 188890: loss 6.8790, time 121.50ms
iter 188900: loss 6.5875, time 123.23ms
iter 188910: loss 7.0227, time 121.86ms
iter 188920: loss 6.0788, time 122.87ms
iter 188930: loss 6.7142, time 121.02ms
iter 188940: loss 5.5945, time 123.37ms
iter 188950: loss 6.5535, time 121.50ms
iter 188960: loss 6.0979, time 123.08ms
iter 188970: loss 6.2766, time 121.54ms
iter 188980: loss 6.9188, time 122.96ms
iter 188990: loss 7.0040, time 121.25ms
step 189000: train loss 5.8574, val loss 5.9051
saving checkpoint to out-shakespeare-char
iter 189000: loss 6.3632, time 2902.47ms
iter 189010: loss 6.5263, time 121.90ms
iter 189020: loss 5.9431, time 121.67ms
iter 189030: loss 6.1985, time 121.34ms
iter 189040: loss 5.6844, time 121.66ms
iter 189050: loss 6.7815, time 121.79ms
iter 189060: loss 6.7946, time 121.65ms
iter 189070: loss 6.4382, time 121.81ms
iter 189080: loss 7.0956, time 121.52ms
iter 189090: loss 6.5602, time 122.08ms
iter 189100: loss 6.6628, time 121.61ms
iter 189110: loss 6.2449, time 121.75ms
iter 189120: loss 6.2634, time 121.57ms
iter 189130: loss 6.4659, time 121.73ms
iter 189140: loss 5.8841, time 121.86ms
iter 189150: loss 6.5895, time 121.69ms
iter 189160: loss 6.4619, time 121.80ms
iter 189170: loss 6.5056, time 122.02ms
iter 189180: loss 5.9231, time 121.80ms
iter 189190: loss 6.4541, time 121.70ms
iter 189200: loss 7.1107, time 126.08ms
iter 189210: loss 7.2911, time 124.67ms
iter 189220: loss 7.4540, time 125.79ms
iter 189230: loss 5.7637, time 125.81ms
iter 189240: loss 5.8582, time 125.15ms
step 189250: train loss 5.8543, val loss 5.8492
saving checkpoint to out-shakespeare-char
iter 189250: loss 6.2767, time 2886.27ms
iter 189260: loss 6.2856, time 128.16ms
iter 189270: loss 6.0415, time 125.04ms
iter 189280: loss 6.9141, time 125.14ms
iter 189290: loss 6.1501, time 125.30ms
iter 189300: loss 6.4801, time 125.41ms
iter 189310: loss 6.4756, time 124.98ms
iter 189320: loss 6.2972, time 124.91ms
iter 189330: loss 5.6926, time 124.64ms
iter 189340: loss 5.8297, time 124.78ms
iter 189350: loss 7.0241, time 124.84ms
iter 189360: loss 6.9295, time 125.23ms
iter 189370: loss 6.3226, time 127.86ms
iter 189380: loss 6.6969, time 124.95ms
iter 189390: loss 6.3836, time 125.06ms
iter 189400: loss 6.3191, time 125.46ms
iter 189410: loss 5.8869, time 125.09ms
iter 189420: loss 6.9996, time 125.13ms
iter 189430: loss 6.2896, time 124.51ms
iter 189440: loss 6.7495, time 125.15ms
iter 189450: loss 7.1471, time 125.22ms
iter 189460: loss 6.4821, time 125.22ms
iter 189470: loss 6.3513, time 125.18ms
iter 189480: loss 7.4623, time 128.02ms
iter 189490: loss 7.1386, time 124.83ms
step 189500: train loss 5.9214, val loss 5.8374
saving checkpoint to out-shakespeare-char
iter 189500: loss 6.4188, time 2904.09ms
iter 189510: loss 6.2537, time 123.77ms
iter 189520: loss 6.6369, time 121.69ms
iter 189530: loss 6.8681, time 123.30ms
iter 189540: loss 6.7897, time 121.78ms
iter 189550: loss 6.7630, time 122.15ms
iter 189560: loss 5.7238, time 121.60ms
iter 189570: loss 5.9711, time 121.91ms
iter 189580: loss 6.4307, time 121.76ms
iter 189590: loss 6.3909, time 121.73ms
iter 189600: loss 5.9541, time 121.62ms
iter 189610: loss 6.5391, time 121.83ms
iter 189620: loss 6.6207, time 121.46ms
iter 189630: loss 6.3754, time 121.63ms
iter 189640: loss 6.0396, time 121.75ms
iter 189650: loss 5.9148, time 121.66ms
iter 189660: loss 6.5976, time 121.76ms
iter 189670: loss 6.1902, time 121.51ms
iter 189680: loss 6.2566, time 121.63ms
iter 189690: loss 6.5585, time 121.70ms
iter 189700: loss 6.3453, time 121.71ms
iter 189710: loss 6.1809, time 122.01ms
iter 189720: loss 6.8066, time 121.72ms
iter 189730: loss 6.2392, time 122.13ms
iter 189740: loss 5.9073, time 121.67ms
step 189750: train loss 5.8793, val loss 5.8789
saving checkpoint to out-shakespeare-char
iter 189750: loss 6.1328, time 2902.03ms
iter 189760: loss 5.9558, time 126.71ms
iter 189770: loss 5.9169, time 125.02ms
iter 189780: loss 5.8117, time 125.96ms
iter 189790: loss 6.0043, time 125.99ms
iter 189800: loss 5.5891, time 125.28ms
iter 189810: loss 6.4441, time 125.85ms
iter 189820: loss 6.8184, time 125.92ms
iter 189830: loss 6.8113, time 126.07ms
iter 189840: loss 6.2913, time 128.89ms
iter 189850: loss 6.6782, time 125.92ms
iter 189860: loss 6.7292, time 125.75ms
iter 189870: loss 6.0310, time 126.03ms
iter 189880: loss 6.2468, time 126.16ms
iter 189890: loss 6.6704, time 125.99ms
iter 189900: loss 6.3027, time 125.88ms
iter 189910: loss 6.4389, time 126.09ms
iter 189920: loss 6.5711, time 125.89ms
iter 189930: loss 5.6139, time 125.86ms
iter 189940: loss 6.2605, time 125.56ms
iter 189950: loss 7.0881, time 123.75ms
iter 189960: loss 6.8142, time 125.70ms
iter 189970: loss 6.9230, time 125.36ms
iter 189980: loss 6.9622, time 125.86ms
iter 189990: loss 5.9523, time 128.52ms
step 190000: train loss 5.8786, val loss 5.8373
saving checkpoint to out-shakespeare-char
iter 190000: loss 6.8951, time 2892.53ms
iter 190010: loss 5.9042, time 126.10ms
iter 190020: loss 6.3479, time 125.30ms
iter 190030: loss 6.2189, time 126.43ms
iter 190040: loss 5.8268, time 125.14ms
iter 190050: loss 6.7420, time 125.61ms
iter 190060: loss 6.6404, time 124.95ms
iter 190070: loss 6.5387, time 125.95ms
iter 190080: loss 6.4955, time 126.23ms
iter 190090: loss 6.5808, time 127.88ms
iter 190100: loss 6.4184, time 125.96ms
iter 190110: loss 6.3181, time 125.11ms
iter 190120: loss 6.6395, time 126.14ms
iter 190130: loss 6.3765, time 124.78ms
iter 190140: loss 6.5003, time 126.03ms
iter 190150: loss 6.2521, time 125.26ms
iter 190160: loss 6.2902, time 124.90ms
iter 190170: loss 6.4393, time 125.22ms
iter 190180: loss 6.7655, time 124.69ms
iter 190190: loss 6.0573, time 125.94ms
iter 190200: loss 6.0648, time 127.52ms
iter 190210: loss 6.3850, time 121.28ms
iter 190220: loss 6.5482, time 124.04ms
iter 190230: loss 6.2431, time 121.60ms
iter 190240: loss 7.0522, time 124.39ms
step 190250: train loss 5.8359, val loss 5.8520
saving checkpoint to out-shakespeare-char
iter 190250: loss 7.0112, time 2904.40ms
iter 190260: loss 6.2344, time 121.82ms
iter 190270: loss 6.0067, time 121.92ms
iter 190280: loss 6.7749, time 122.68ms
iter 190290: loss 6.2904, time 121.61ms
iter 190300: loss 5.4041, time 122.57ms
iter 190310: loss 6.4173, time 121.71ms
iter 190320: loss 6.3366, time 122.60ms
iter 190330: loss 6.9170, time 121.87ms
iter 190340: loss 5.9955, time 122.55ms
iter 190350: loss 5.8028, time 121.56ms
iter 190360: loss 6.6792, time 122.66ms
iter 190370: loss 5.9687, time 121.49ms
iter 190380: loss 6.2977, time 122.56ms
iter 190390: loss 6.2758, time 121.71ms
iter 190400: loss 6.4350, time 121.72ms
iter 190410: loss 6.1670, time 121.54ms
iter 190420: loss 6.2449, time 121.87ms
iter 190430: loss 6.8816, time 121.11ms
iter 190440: loss 6.7321, time 121.58ms
iter 190450: loss 6.6084, time 121.58ms
iter 190460: loss 6.4237, time 121.20ms
iter 190470: loss 5.1541, time 121.97ms
iter 190480: loss 6.6369, time 121.97ms
iter 190490: loss 7.3763, time 121.17ms
step 190500: train loss 5.8680, val loss 5.8756
saving checkpoint to out-shakespeare-char
iter 190500: loss 6.4918, time 2899.51ms
iter 190510: loss 6.6204, time 121.65ms
iter 190520: loss 6.5995, time 121.43ms
iter 190530: loss 6.1750, time 121.01ms
iter 190540: loss 5.5684, time 121.41ms
iter 190550: loss 6.4109, time 121.35ms
iter 190560: loss 6.0099, time 121.26ms
iter 190570: loss 6.8329, time 121.46ms
iter 190580: loss 7.1511, time 121.43ms
iter 190590: loss 6.4718, time 121.51ms
iter 190600: loss 6.5013, time 121.49ms
iter 190610: loss 6.7189, time 121.50ms
iter 190620: loss 6.2825, time 121.86ms
iter 190630: loss 6.9188, time 121.47ms
iter 190640: loss 6.7751, time 120.79ms
iter 190650: loss 6.5535, time 121.50ms
iter 190660: loss 6.4597, time 121.64ms
iter 190670: loss 6.9287, time 121.28ms
iter 190680: loss 6.3799, time 121.28ms
iter 190690: loss 6.4642, time 121.55ms
iter 190700: loss 6.7481, time 120.74ms
iter 190710: loss 6.9225, time 121.44ms
iter 190720: loss 5.6271, time 121.46ms
iter 190730: loss 6.6873, time 120.59ms
iter 190740: loss 7.0687, time 121.56ms
step 190750: train loss 5.8696, val loss 5.8726
saving checkpoint to out-shakespeare-char
iter 190750: loss 6.9741, time 2890.13ms
iter 190760: loss 7.0812, time 121.93ms
iter 190770: loss 7.1205, time 121.85ms
iter 190780: loss 6.2899, time 123.13ms
iter 190790: loss 6.0873, time 121.74ms
iter 190800: loss 6.7708, time 122.69ms
iter 190810: loss 7.3882, time 120.78ms
iter 190820: loss 5.9909, time 122.64ms
iter 190830: loss 6.8705, time 121.84ms
iter 190840: loss 6.9408, time 123.32ms
iter 190850: loss 6.0434, time 123.17ms
iter 190860: loss 6.6091, time 122.67ms
iter 190870: loss 6.1454, time 122.23ms
iter 190880: loss 6.7088, time 123.27ms
iter 190890: loss 6.3357, time 122.52ms
iter 190900: loss 5.6711, time 123.00ms
iter 190910: loss 6.4054, time 121.71ms
iter 190920: loss 6.5375, time 123.07ms
iter 190930: loss 6.4810, time 122.01ms
iter 190940: loss 6.9239, time 122.90ms
iter 190950: loss 6.5032, time 121.90ms
iter 190960: loss 6.5419, time 123.21ms
iter 190970: loss 6.6745, time 122.06ms
iter 190980: loss 6.1743, time 123.22ms
iter 190990: loss 6.4741, time 120.54ms
step 191000: train loss 5.8943, val loss 5.8455
saving checkpoint to out-shakespeare-char
iter 191000: loss 5.9372, time 2898.97ms
iter 191010: loss 6.7530, time 121.57ms
iter 191020: loss 5.5113, time 124.49ms
iter 191030: loss 7.3510, time 121.74ms
iter 191040: loss 5.9231, time 124.93ms
iter 191050: loss 6.9863, time 121.63ms
iter 191060: loss 5.8211, time 124.63ms
iter 191070: loss 6.3028, time 122.09ms
iter 191080: loss 6.3605, time 125.27ms
iter 191090: loss 6.1989, time 122.02ms
iter 191100: loss 6.4893, time 125.09ms
iter 191110: loss 6.6128, time 121.69ms
iter 191120: loss 6.7558, time 124.47ms
iter 191130: loss 5.4583, time 121.59ms
iter 191140: loss 7.1517, time 124.45ms
iter 191150: loss 6.8747, time 121.67ms
iter 191160: loss 6.6626, time 124.88ms
iter 191170: loss 6.6608, time 121.70ms
iter 191180: loss 5.8529, time 125.29ms
iter 191190: loss 7.3699, time 121.46ms
iter 191200: loss 5.6224, time 124.48ms
iter 191210: loss 6.5555, time 121.65ms
iter 191220: loss 6.0804, time 124.97ms
iter 191230: loss 6.3029, time 121.49ms
iter 191240: loss 6.0540, time 124.13ms
step 191250: train loss 5.8889, val loss 5.8870
saving checkpoint to out-shakespeare-char
iter 191250: loss 6.6827, time 2896.17ms
iter 191260: loss 5.7953, time 125.66ms
iter 191270: loss 7.3267, time 125.74ms
iter 191280: loss 5.8788, time 125.08ms
iter 191290: loss 6.3729, time 128.42ms
iter 191300: loss 7.0852, time 125.86ms
iter 191310: loss 6.5889, time 125.81ms
iter 191320: loss 6.3987, time 125.15ms
iter 191330: loss 6.3373, time 125.71ms
iter 191340: loss 7.5523, time 126.06ms
iter 191350: loss 6.1877, time 125.47ms
iter 191360: loss 6.3365, time 126.08ms
iter 191370: loss 6.0424, time 125.56ms
iter 191380: loss 6.2149, time 126.15ms
iter 191390: loss 6.5748, time 125.74ms
iter 191400: loss 7.0238, time 128.12ms
iter 191410: loss 6.5290, time 126.35ms
iter 191420: loss 6.5115, time 126.13ms
iter 191430: loss 7.1658, time 125.92ms
iter 191440: loss 6.0669, time 123.89ms
iter 191450: loss 6.1025, time 126.34ms
iter 191460: loss 6.4050, time 127.19ms
iter 191470: loss 6.8712, time 126.00ms
iter 191480: loss 6.6831, time 126.17ms
iter 191490: loss 6.7401, time 126.22ms
step 191500: train loss 5.8644, val loss 5.8885
saving checkpoint to out-shakespeare-char
iter 191500: loss 5.7155, time 2897.54ms
iter 191510: loss 6.7422, time 122.40ms
iter 191520: loss 6.7804, time 122.05ms
iter 191530: loss 6.3941, time 121.59ms
iter 191540: loss 7.0321, time 122.00ms
iter 191550: loss 6.3636, time 121.89ms
iter 191560: loss 6.7654, time 121.84ms
iter 191570: loss 6.1751, time 120.61ms
iter 191580: loss 6.4388, time 120.94ms
iter 191590: loss 6.3366, time 122.00ms
iter 191600: loss 6.6805, time 121.98ms
iter 191610: loss 6.4342, time 121.72ms
iter 191620: loss 6.6711, time 122.18ms
iter 191630: loss 5.6183, time 121.77ms
iter 191640: loss 6.5520, time 121.88ms
iter 191650: loss 7.3122, time 122.18ms
iter 191660: loss 5.9079, time 121.20ms
iter 191670: loss 6.1138, time 121.16ms
iter 191680: loss 6.3123, time 121.75ms
iter 191690: loss 6.0448, time 121.51ms
iter 191700: loss 6.2625, time 121.99ms
iter 191710: loss 5.6311, time 121.68ms
iter 191720: loss 5.9962, time 121.40ms
iter 191730: loss 6.1300, time 121.64ms
iter 191740: loss 6.6040, time 122.60ms
step 191750: train loss 5.8375, val loss 5.8398
saving checkpoint to out-shakespeare-char
iter 191750: loss 6.3065, time 2920.51ms
iter 191760: loss 6.7506, time 125.51ms
iter 191770: loss 6.0951, time 125.23ms
iter 191780: loss 5.9451, time 125.44ms
iter 191790: loss 6.6282, time 125.37ms
iter 191800: loss 6.6978, time 125.30ms
iter 191810: loss 5.8864, time 125.76ms
iter 191820: loss 6.4722, time 126.01ms
iter 191830: loss 6.2153, time 123.69ms
iter 191840: loss 6.8746, time 128.16ms
iter 191850: loss 6.6855, time 124.65ms
iter 191860: loss 5.8209, time 125.53ms
iter 191870: loss 6.3537, time 125.41ms
iter 191880: loss 6.3297, time 125.87ms
iter 191890: loss 6.3853, time 126.02ms
iter 191900: loss 6.2818, time 126.11ms
iter 191910: loss 5.8263, time 129.04ms
iter 191920: loss 6.7344, time 125.75ms
iter 191930: loss 6.2582, time 126.07ms
iter 191940: loss 6.2101, time 125.93ms
iter 191950: loss 6.4120, time 126.02ms
iter 191960: loss 6.8361, time 124.89ms
iter 191970: loss 6.3370, time 125.45ms
iter 191980: loss 6.4955, time 125.60ms
iter 191990: loss 6.4352, time 125.44ms
step 192000: train loss 5.8212, val loss 5.8815
saving checkpoint to out-shakespeare-char
iter 192000: loss 5.9385, time 2869.20ms
iter 192010: loss 6.9239, time 125.55ms
iter 192020: loss 7.1915, time 125.43ms
iter 192030: loss 6.3775, time 125.51ms
iter 192040: loss 6.7580, time 125.14ms
iter 192050: loss 6.4417, time 125.15ms
iter 192060: loss 6.8933, time 125.20ms
iter 192070: loss 6.2169, time 125.72ms
iter 192080: loss 6.5195, time 128.47ms
iter 192090: loss 5.8928, time 125.13ms
iter 192100: loss 7.0547, time 125.50ms
iter 192110: loss 6.1725, time 125.03ms
iter 192120: loss 6.6697, time 125.07ms
iter 192130: loss 6.6273, time 125.00ms
iter 192140: loss 5.9059, time 125.20ms
iter 192150: loss 6.4661, time 124.96ms
iter 192160: loss 5.9174, time 125.48ms
iter 192170: loss 6.6996, time 125.36ms
iter 192180: loss 5.8952, time 125.98ms
iter 192190: loss 6.4975, time 124.47ms
iter 192200: loss 6.8270, time 121.66ms
iter 192210: loss 6.7464, time 124.37ms
iter 192220: loss 6.9629, time 121.58ms
iter 192230: loss 6.6264, time 124.35ms
iter 192240: loss 5.3917, time 121.83ms
step 192250: train loss 5.8763, val loss 5.7971
saving checkpoint to out-shakespeare-char
iter 192250: loss 6.6100, time 2879.86ms
iter 192260: loss 6.4124, time 122.68ms
iter 192270: loss 6.2139, time 121.47ms
iter 192280: loss 6.3027, time 122.70ms
iter 192290: loss 6.1710, time 121.56ms
iter 192300: loss 6.5355, time 122.52ms
iter 192310: loss 6.4580, time 121.53ms
iter 192320: loss 6.4261, time 122.68ms
iter 192330: loss 7.1825, time 121.28ms
iter 192340: loss 6.3583, time 122.80ms
iter 192350: loss 6.0938, time 121.47ms
iter 192360: loss 6.3393, time 121.96ms
iter 192370: loss 5.9807, time 121.54ms
iter 192380: loss 6.8738, time 122.58ms
iter 192390: loss 6.3212, time 121.64ms
iter 192400: loss 5.5423, time 122.70ms
iter 192410: loss 6.6589, time 121.45ms
iter 192420: loss 5.6652, time 123.03ms
iter 192430: loss 5.8181, time 121.49ms
iter 192440: loss 6.7672, time 121.54ms
iter 192450: loss 5.8228, time 121.61ms
iter 192460: loss 5.7450, time 121.51ms
iter 192470: loss 6.8512, time 121.70ms
iter 192480: loss 7.4247, time 121.56ms
iter 192490: loss 6.1427, time 122.04ms
step 192500: train loss 5.8409, val loss 5.8430
saving checkpoint to out-shakespeare-char
iter 192500: loss 6.8138, time 2896.66ms
iter 192510: loss 6.5825, time 121.74ms
iter 192520: loss 6.1750, time 119.09ms
iter 192530: loss 6.9865, time 123.43ms
iter 192540: loss 6.4849, time 121.33ms
iter 192550: loss 7.3610, time 123.04ms
iter 192560: loss 6.8335, time 121.81ms
iter 192570: loss 6.2663, time 122.95ms
iter 192580: loss 6.5003, time 122.00ms
iter 192590: loss 6.8445, time 123.31ms
iter 192600: loss 5.4455, time 121.89ms
iter 192610: loss 6.3888, time 123.14ms
iter 192620: loss 6.1547, time 122.30ms
iter 192630: loss 6.3455, time 122.99ms
iter 192640: loss 5.5871, time 121.92ms
iter 192650: loss 6.3088, time 123.05ms
iter 192660: loss 6.1937, time 121.87ms
iter 192670: loss 6.7738, time 123.29ms
iter 192680: loss 6.4717, time 121.79ms
iter 192690: loss 5.7658, time 123.39ms
iter 192700: loss 6.6709, time 121.92ms
iter 192710: loss 6.3266, time 122.71ms
iter 192720: loss 5.6554, time 121.83ms
iter 192730: loss 6.1442, time 123.05ms
iter 192740: loss 6.1137, time 121.69ms
step 192750: train loss 5.8315, val loss 5.8548
saving checkpoint to out-shakespeare-char
iter 192750: loss 5.9809, time 2889.97ms
iter 192760: loss 7.1342, time 121.54ms
iter 192770: loss 6.7896, time 121.40ms
iter 192780: loss 6.8765, time 121.35ms
iter 192790: loss 6.8356, time 121.85ms
iter 192800: loss 6.2715, time 121.66ms
iter 192810: loss 6.0948, time 121.19ms
iter 192820: loss 6.0272, time 121.43ms
iter 192830: loss 7.2292, time 121.38ms
iter 192840: loss 6.5659, time 121.31ms
iter 192850: loss 6.4593, time 121.39ms
iter 192860: loss 5.7812, time 121.54ms
iter 192870: loss 6.4070, time 122.09ms
iter 192880: loss 6.1005, time 121.35ms
iter 192890: loss 6.7049, time 121.47ms
iter 192900: loss 6.7612, time 121.12ms
iter 192910: loss 6.0899, time 121.38ms
iter 192920: loss 6.6683, time 121.40ms
iter 192930: loss 5.4554, time 121.44ms
iter 192940: loss 6.4965, time 121.43ms
iter 192950: loss 6.6544, time 121.60ms
iter 192960: loss 6.7133, time 121.52ms
iter 192970: loss 6.8150, time 121.54ms
iter 192980: loss 7.0986, time 121.57ms
iter 192990: loss 7.3041, time 121.43ms
step 193000: train loss 5.9095, val loss 5.8557
saving checkpoint to out-shakespeare-char
iter 193000: loss 6.7171, time 2891.63ms
iter 193010: loss 6.7941, time 120.53ms
iter 193020: loss 7.0096, time 123.83ms
iter 193030: loss 6.5203, time 121.64ms
iter 193040: loss 6.6809, time 124.35ms
iter 193050: loss 6.3460, time 121.42ms
iter 193060: loss 6.6077, time 123.81ms
iter 193070: loss 6.4623, time 121.38ms
iter 193080: loss 6.9029, time 125.27ms
iter 193090: loss 6.4886, time 121.49ms
iter 193100: loss 6.1623, time 124.28ms
iter 193110: loss 6.1370, time 121.71ms
iter 193120: loss 6.4741, time 124.66ms
iter 193130: loss 6.4602, time 121.32ms
iter 193140: loss 6.4361, time 125.18ms
iter 193150: loss 6.3366, time 121.91ms
iter 193160: loss 6.0998, time 124.16ms
iter 193170: loss 6.9037, time 121.61ms
iter 193180: loss 6.7994, time 123.53ms
iter 193190: loss 6.8683, time 121.08ms
iter 193200: loss 6.4908, time 124.25ms
iter 193210: loss 6.4865, time 121.47ms
iter 193220: loss 6.2964, time 124.17ms
iter 193230: loss 6.3033, time 121.80ms
iter 193240: loss 6.5795, time 124.27ms
step 193250: train loss 5.8861, val loss 5.9194
saving checkpoint to out-shakespeare-char
iter 193250: loss 6.6617, time 2885.17ms
iter 193260: loss 6.6828, time 125.55ms
iter 193270: loss 6.5616, time 125.56ms
iter 193280: loss 5.6918, time 125.38ms
iter 193290: loss 5.6413, time 128.28ms
iter 193300: loss 6.5042, time 125.09ms
iter 193310: loss 6.4615, time 125.40ms
iter 193320: loss 5.8305, time 126.56ms
iter 193330: loss 7.2154, time 124.42ms
iter 193340: loss 6.6089, time 125.44ms
iter 193350: loss 6.4475, time 125.55ms
iter 193360: loss 5.8914, time 125.17ms
iter 193370: loss 6.6224, time 125.29ms
iter 193380: loss 6.6981, time 125.53ms
iter 193390: loss 6.4290, time 125.39ms
iter 193400: loss 6.1734, time 128.39ms
iter 193410: loss 6.4693, time 125.33ms
iter 193420: loss 6.0991, time 125.26ms
iter 193430: loss 7.0177, time 126.22ms
iter 193440: loss 6.7081, time 125.54ms
iter 193450: loss 7.2079, time 125.17ms
iter 193460: loss 6.5523, time 125.58ms
iter 193470: loss 6.8788, time 125.28ms
iter 193480: loss 7.1029, time 125.55ms
iter 193490: loss 6.3726, time 121.15ms
step 193500: train loss 5.8022, val loss 5.8701
saving checkpoint to out-shakespeare-char
iter 193500: loss 6.9852, time 2909.40ms
iter 193510: loss 6.7699, time 123.25ms
iter 193520: loss 6.0815, time 122.01ms
iter 193530: loss 6.4829, time 123.79ms
iter 193540: loss 6.0297, time 121.74ms
iter 193550: loss 6.8988, time 122.92ms
iter 193560: loss 5.6181, time 121.98ms
iter 193570: loss 6.7927, time 122.86ms
iter 193580: loss 6.7133, time 121.81ms
iter 193590: loss 6.3700, time 122.98ms
iter 193600: loss 6.7836, time 120.88ms
iter 193610: loss 6.4910, time 123.26ms
iter 193620: loss 5.9256, time 121.90ms
iter 193630: loss 6.3204, time 122.98ms
iter 193640: loss 6.3617, time 121.88ms
iter 193650: loss 5.5793, time 123.00ms
iter 193660: loss 6.5437, time 121.92ms
iter 193670: loss 6.7724, time 123.10ms
iter 193680: loss 6.8713, time 122.05ms
iter 193690: loss 6.3455, time 122.19ms
iter 193700: loss 6.3381, time 121.93ms
iter 193710: loss 6.3994, time 123.82ms
iter 193720: loss 6.1902, time 121.87ms
iter 193730: loss 6.6952, time 123.34ms
iter 193740: loss 6.8774, time 121.20ms
step 193750: train loss 5.8262, val loss 5.8241
saving checkpoint to out-shakespeare-char
iter 193750: loss 6.4868, time 2885.23ms
iter 193760: loss 6.2510, time 125.45ms
iter 193770: loss 6.9246, time 124.97ms
iter 193780: loss 6.3068, time 128.05ms
iter 193790: loss 6.9364, time 124.92ms
iter 193800: loss 6.5305, time 124.78ms
iter 193810: loss 6.5030, time 125.44ms
iter 193820: loss 7.6273, time 125.71ms
iter 193830: loss 5.9260, time 124.85ms
iter 193840: loss 5.6224, time 125.65ms
iter 193850: loss 6.4541, time 125.73ms
iter 193860: loss 5.9932, time 125.59ms
iter 193870: loss 6.6173, time 125.82ms
iter 193880: loss 6.6681, time 124.92ms
iter 193890: loss 6.6268, time 128.39ms
iter 193900: loss 6.1386, time 125.15ms
iter 193910: loss 6.3586, time 124.44ms
iter 193920: loss 6.3569, time 125.73ms
iter 193930: loss 6.2623, time 125.09ms
iter 193940: loss 7.2426, time 125.19ms
iter 193950: loss 5.9850, time 124.89ms
iter 193960: loss 6.9204, time 124.13ms
iter 193970: loss 6.1153, time 126.23ms
iter 193980: loss 5.8475, time 126.87ms
iter 193990: loss 5.9155, time 130.39ms
step 194000: train loss 5.8725, val loss 5.8257
saving checkpoint to out-shakespeare-char
iter 194000: loss 7.5020, time 2912.62ms
iter 194010: loss 6.4900, time 124.44ms
iter 194020: loss 6.8730, time 121.80ms
iter 194030: loss 6.6740, time 122.26ms
iter 194040: loss 6.3168, time 119.40ms
iter 194050: loss 6.7021, time 122.23ms
iter 194060: loss 6.4159, time 121.89ms
iter 194070: loss 6.6841, time 122.30ms
iter 194080: loss 6.0252, time 120.18ms
iter 194090: loss 6.0827, time 121.82ms
iter 194100: loss 6.8423, time 121.40ms
iter 194110: loss 6.0299, time 121.54ms
iter 194120: loss 6.2036, time 121.56ms
iter 194130: loss 6.0991, time 123.06ms
iter 194140: loss 5.5146, time 121.31ms
iter 194150: loss 6.7163, time 121.53ms
iter 194160: loss 6.5406, time 120.29ms
iter 194170: loss 6.2821, time 121.55ms
iter 194180: loss 6.6970, time 121.79ms
iter 194190: loss 5.6430, time 121.30ms
iter 194200: loss 6.7627, time 121.65ms
iter 194210: loss 6.4473, time 121.37ms
iter 194220: loss 6.4188, time 121.21ms
iter 194230: loss 6.3715, time 120.33ms
iter 194240: loss 7.1325, time 121.35ms
step 194250: train loss 5.8714, val loss 5.8509
saving checkpoint to out-shakespeare-char
iter 194250: loss 6.3894, time 2877.16ms
iter 194260: loss 6.3186, time 121.56ms
iter 194270: loss 6.7042, time 122.45ms
iter 194280: loss 6.8573, time 122.67ms
iter 194290: loss 6.8036, time 120.85ms
iter 194300: loss 6.1931, time 120.47ms
iter 194310: loss 6.9284, time 121.15ms
iter 194320: loss 5.9950, time 122.83ms
iter 194330: loss 6.8405, time 121.58ms
iter 194340: loss 6.2358, time 122.90ms
iter 194350: loss 7.2880, time 121.67ms
iter 194360: loss 6.1265, time 122.30ms
iter 194370: loss 6.0171, time 118.28ms
iter 194380: loss 7.1395, time 122.87ms
iter 194390: loss 6.0144, time 121.50ms
iter 194400: loss 6.4045, time 122.60ms
iter 194410: loss 6.7239, time 121.63ms
iter 194420: loss 6.9693, time 123.11ms
iter 194430: loss 6.9813, time 121.50ms
iter 194440: loss 6.4781, time 121.66ms
iter 194450: loss 6.1756, time 121.22ms
iter 194460: loss 6.0889, time 122.51ms
iter 194470: loss 6.4580, time 121.56ms
iter 194480: loss 6.7849, time 122.51ms
iter 194490: loss 7.3594, time 121.56ms
step 194500: train loss 5.8454, val loss 5.8639
saving checkpoint to out-shakespeare-char
iter 194500: loss 6.2516, time 2896.72ms
iter 194510: loss 6.7478, time 125.17ms
iter 194520: loss 6.5759, time 125.27ms
iter 194530: loss 6.8040, time 125.21ms
iter 194540: loss 6.7720, time 125.09ms
iter 194550: loss 6.1833, time 124.99ms
iter 194560: loss 6.2812, time 125.30ms
iter 194570: loss 6.8620, time 124.95ms
iter 194580: loss 6.9411, time 124.60ms
iter 194590: loss 6.6133, time 125.22ms
iter 194600: loss 6.5160, time 128.53ms
iter 194610: loss 6.4863, time 126.79ms
iter 194620: loss 6.1401, time 125.42ms
iter 194630: loss 6.7791, time 124.71ms
iter 194640: loss 6.1847, time 125.52ms
iter 194650: loss 5.9116, time 125.78ms
iter 194660: loss 6.7035, time 125.79ms
iter 194670: loss 6.4908, time 125.81ms
iter 194680: loss 6.5552, time 125.50ms
iter 194690: loss 5.8905, time 124.71ms
iter 194700: loss 6.0049, time 125.54ms
iter 194710: loss 6.4586, time 128.82ms
iter 194720: loss 6.2295, time 125.41ms
iter 194730: loss 6.0732, time 125.36ms
iter 194740: loss 6.5471, time 125.67ms
step 194750: train loss 5.8549, val loss 5.8303
saving checkpoint to out-shakespeare-char
iter 194750: loss 6.0429, time 2897.52ms
iter 194760: loss 6.6418, time 125.74ms
iter 194770: loss 6.2220, time 129.07ms
iter 194780: loss 5.8148, time 125.71ms
iter 194790: loss 6.6856, time 125.89ms
iter 194800: loss 6.5397, time 125.73ms
iter 194810: loss 6.7094, time 125.97ms
iter 194820: loss 6.8014, time 125.78ms
iter 194830: loss 6.3911, time 126.25ms
iter 194840: loss 6.3864, time 126.12ms
iter 194850: loss 7.5362, time 125.85ms
iter 194860: loss 6.3228, time 125.74ms
iter 194870: loss 7.1080, time 125.73ms
iter 194880: loss 6.7905, time 128.31ms
iter 194890: loss 6.7175, time 125.45ms
iter 194900: loss 6.3618, time 125.43ms
iter 194910: loss 5.9318, time 125.89ms
iter 194920: loss 6.4198, time 126.12ms
iter 194930: loss 6.6699, time 126.17ms
iter 194940: loss 6.7563, time 125.70ms
iter 194950: loss 6.3797, time 126.36ms
iter 194960: loss 7.0698, time 125.89ms
iter 194970: loss 6.2562, time 124.03ms
iter 194980: loss 6.6684, time 124.77ms
iter 194990: loss 6.8988, time 125.32ms
step 195000: train loss 5.8621, val loss 5.8793
saving checkpoint to out-shakespeare-char
iter 195000: loss 6.7146, time 2850.86ms
iter 195010: loss 6.4643, time 124.18ms
iter 195020: loss 6.2559, time 124.10ms
iter 195030: loss 6.2033, time 124.07ms
iter 195040: loss 6.1697, time 127.94ms
iter 195050: loss 7.0337, time 125.09ms
iter 195060: loss 6.4236, time 124.20ms
iter 195070: loss 5.8475, time 125.13ms
iter 195080: loss 5.9587, time 125.22ms
iter 195090: loss 6.0895, time 124.38ms
iter 195100: loss 5.8316, time 124.33ms
iter 195110: loss 5.4245, time 125.25ms
iter 195120: loss 6.8143, time 124.99ms
iter 195130: loss 6.0429, time 125.04ms
iter 195140: loss 6.7382, time 124.35ms
iter 195150: loss 6.7525, time 127.90ms
iter 195160: loss 6.9336, time 124.98ms
iter 195170: loss 7.0501, time 124.13ms
iter 195180: loss 6.7083, time 124.32ms
iter 195190: loss 6.2986, time 125.31ms
iter 195200: loss 7.4366, time 125.53ms
iter 195210: loss 6.2273, time 124.97ms
iter 195220: loss 7.1154, time 125.22ms
iter 195230: loss 6.5385, time 125.17ms
iter 195240: loss 6.4723, time 125.24ms
step 195250: train loss 5.8248, val loss 5.8491
saving checkpoint to out-shakespeare-char
iter 195250: loss 6.8378, time 2880.02ms
iter 195260: loss 6.4130, time 125.64ms
iter 195270: loss 6.4055, time 125.91ms
iter 195280: loss 5.9093, time 125.97ms
iter 195290: loss 6.3927, time 124.93ms
iter 195300: loss 6.1893, time 125.07ms
iter 195310: loss 6.4808, time 125.16ms
iter 195320: loss 7.5202, time 124.99ms
iter 195330: loss 5.5843, time 125.43ms
iter 195340: loss 6.9586, time 124.87ms
iter 195350: loss 6.5993, time 125.28ms
iter 195360: loss 6.1086, time 128.15ms
iter 195370: loss 6.0744, time 124.96ms
iter 195380: loss 6.1548, time 125.00ms
iter 195390: loss 6.7252, time 125.11ms
iter 195400: loss 6.1474, time 124.98ms
iter 195410: loss 6.4181, time 124.86ms
iter 195420: loss 7.2887, time 124.97ms
iter 195430: loss 6.3602, time 124.88ms
iter 195440: loss 6.3601, time 125.28ms
iter 195450: loss 6.6157, time 125.67ms
iter 195460: loss 5.9819, time 125.04ms
iter 195470: loss 6.0414, time 127.92ms
iter 195480: loss 6.7462, time 124.94ms
iter 195490: loss 7.0349, time 125.41ms
step 195500: train loss 5.8155, val loss 5.8286
saving checkpoint to out-shakespeare-char
iter 195500: loss 6.5446, time 2887.82ms
iter 195510: loss 6.5866, time 126.35ms
iter 195520: loss 7.1752, time 126.43ms
iter 195530: loss 6.5673, time 122.32ms
iter 195540: loss 6.1833, time 122.06ms
iter 195550: loss 7.2859, time 121.19ms
iter 195560: loss 6.9724, time 121.99ms
iter 195570: loss 6.9588, time 122.07ms
iter 195580: loss 6.2024, time 121.58ms
iter 195590: loss 6.7649, time 121.08ms
iter 195600: loss 6.6635, time 120.83ms
iter 195610: loss 6.4233, time 122.00ms
iter 195620: loss 6.3100, time 121.66ms
iter 195630: loss 6.3928, time 121.99ms
iter 195640: loss 6.5993, time 122.55ms
iter 195650: loss 6.1581, time 122.08ms
iter 195660: loss 5.6033, time 121.95ms
iter 195670: loss 6.3682, time 123.01ms
iter 195680: loss 6.5614, time 121.85ms
iter 195690: loss 5.2049, time 121.50ms
iter 195700: loss 6.1433, time 121.81ms
iter 195710: loss 5.9537, time 121.72ms
iter 195720: loss 6.0813, time 121.83ms
iter 195730: loss 6.0851, time 124.86ms
iter 195740: loss 6.2791, time 121.97ms
step 195750: train loss 5.8223, val loss 5.9401
saving checkpoint to out-shakespeare-char
iter 195750: loss 7.0055, time 2904.42ms
iter 195760: loss 6.7936, time 121.57ms
iter 195770: loss 6.0025, time 122.33ms
iter 195780: loss 6.7668, time 122.26ms
iter 195790: loss 6.4938, time 121.68ms
iter 195800: loss 6.4630, time 123.15ms
iter 195810: loss 5.8024, time 122.01ms
iter 195820: loss 6.0062, time 123.46ms
iter 195830: loss 6.3032, time 120.14ms
iter 195840: loss 6.5384, time 122.69ms
iter 195850: loss 6.8759, time 121.32ms
iter 195860: loss 5.9459, time 122.97ms
iter 195870: loss 6.0853, time 121.40ms
iter 195880: loss 6.8843, time 121.75ms
iter 195890: loss 6.6668, time 120.93ms
iter 195900: loss 6.1827, time 123.64ms
iter 195910: loss 5.9958, time 122.16ms
iter 195920: loss 7.0648, time 123.06ms
iter 195930: loss 7.2198, time 121.21ms
iter 195940: loss 6.4978, time 122.95ms
iter 195950: loss 6.2341, time 122.17ms
iter 195960: loss 7.2551, time 123.49ms
iter 195970: loss 7.1937, time 121.96ms
iter 195980: loss 7.6768, time 123.10ms
iter 195990: loss 6.4173, time 122.42ms
step 196000: train loss 5.8299, val loss 5.8323
saving checkpoint to out-shakespeare-char
iter 196000: loss 5.9910, time 2907.61ms
iter 196010: loss 6.3829, time 125.33ms
iter 196020: loss 5.9665, time 125.67ms
iter 196030: loss 6.7299, time 129.44ms
iter 196040: loss 6.3632, time 125.93ms
iter 196050: loss 6.0191, time 125.82ms
iter 196060: loss 5.9805, time 126.45ms
iter 196070: loss 6.3741, time 127.55ms
iter 196080: loss 7.2954, time 123.98ms
iter 196090: loss 5.7489, time 124.43ms
iter 196100: loss 6.2974, time 124.93ms
iter 196110: loss 6.6639, time 124.73ms
iter 196120: loss 6.7128, time 124.41ms
iter 196130: loss 6.7745, time 126.10ms
iter 196140: loss 6.0029, time 125.25ms
iter 196150: loss 6.3168, time 125.27ms
iter 196160: loss 5.8917, time 123.86ms
iter 196170: loss 6.7500, time 124.05ms
iter 196180: loss 6.7204, time 125.41ms
iter 196190: loss 6.4190, time 125.52ms
iter 196200: loss 6.7732, time 125.14ms
iter 196210: loss 7.2339, time 124.08ms
iter 196220: loss 6.5976, time 124.33ms
iter 196230: loss 6.0963, time 124.49ms
iter 196240: loss 6.2394, time 124.74ms
step 196250: train loss 5.8485, val loss 5.8373
saving checkpoint to out-shakespeare-char
iter 196250: loss 6.6394, time 2856.20ms
iter 196260: loss 6.8972, time 124.75ms
iter 196270: loss 6.4476, time 124.90ms
iter 196280: loss 6.5232, time 125.90ms
iter 196290: loss 5.9341, time 125.61ms
iter 196300: loss 6.9802, time 125.53ms
iter 196310: loss 6.7871, time 125.13ms
iter 196320: loss 6.5354, time 125.41ms
iter 196330: loss 6.4126, time 125.32ms
iter 196340: loss 6.8242, time 125.91ms
iter 196350: loss 6.3400, time 123.42ms
iter 196360: loss 6.6593, time 125.39ms
iter 196370: loss 6.8688, time 125.08ms
iter 196380: loss 6.9879, time 126.33ms
iter 196390: loss 6.9144, time 125.99ms
iter 196400: loss 6.5867, time 125.94ms
iter 196410: loss 5.8131, time 126.09ms
iter 196420: loss 6.2289, time 126.18ms
iter 196430: loss 6.3790, time 126.21ms
iter 196440: loss 6.9681, time 125.60ms
iter 196450: loss 6.6909, time 124.81ms
iter 196460: loss 6.1016, time 125.52ms
iter 196470: loss 7.0287, time 125.16ms
iter 196480: loss 6.3642, time 124.75ms
iter 196490: loss 6.5837, time 124.79ms
step 196500: train loss 5.8689, val loss 5.8772
saving checkpoint to out-shakespeare-char
iter 196500: loss 6.2843, time 2881.34ms
iter 196510: loss 6.5907, time 124.56ms
iter 196520: loss 5.4187, time 125.64ms
iter 196530: loss 6.1872, time 124.72ms
iter 196540: loss 5.9897, time 124.94ms
iter 196550: loss 7.0864, time 124.91ms
iter 196560: loss 6.9896, time 125.71ms
iter 196570: loss 6.4772, time 125.72ms
iter 196580: loss 7.1182, time 124.63ms
iter 196590: loss 6.8737, time 129.12ms
iter 196600: loss 6.5733, time 125.59ms
iter 196610: loss 6.9714, time 125.06ms
iter 196620: loss 6.8513, time 125.75ms
iter 196630: loss 6.2937, time 129.55ms
iter 196640: loss 7.0110, time 124.75ms
iter 196650: loss 7.1427, time 126.26ms
iter 196660: loss 6.9214, time 125.62ms
iter 196670: loss 6.3329, time 129.12ms
iter 196680: loss 5.3613, time 125.49ms
iter 196690: loss 6.5196, time 125.59ms
iter 196700: loss 6.1275, time 126.44ms
iter 196710: loss 5.9346, time 125.75ms
iter 196720: loss 6.6982, time 126.23ms
iter 196730: loss 6.7201, time 126.52ms
iter 196740: loss 6.3083, time 124.94ms
step 196750: train loss 5.8765, val loss 5.8665
saving checkpoint to out-shakespeare-char
iter 196750: loss 6.5645, time 2880.13ms
iter 196760: loss 7.0967, time 126.39ms
iter 196770: loss 6.3861, time 126.08ms
iter 196780: loss 6.1321, time 125.69ms
iter 196790: loss 6.4155, time 125.90ms
iter 196800: loss 5.9756, time 128.99ms
iter 196810: loss 6.3770, time 124.88ms
iter 196820: loss 5.8379, time 125.21ms
iter 196830: loss 6.4348, time 125.17ms
iter 196840: loss 6.1735, time 125.80ms
iter 196850: loss 6.5954, time 125.37ms
iter 196860: loss 6.1374, time 124.97ms
iter 196870: loss 5.7578, time 128.65ms
iter 196880: loss 6.6170, time 125.35ms
iter 196890: loss 6.8313, time 125.26ms
iter 196900: loss 5.4941, time 125.51ms
iter 196910: loss 6.7228, time 125.27ms
iter 196920: loss 7.1604, time 124.94ms
iter 196930: loss 6.0920, time 125.30ms
iter 196940: loss 5.3183, time 128.05ms
iter 196950: loss 6.4077, time 124.92ms
iter 196960: loss 5.7099, time 124.93ms
iter 196970: loss 7.0210, time 124.72ms
iter 196980: loss 6.9325, time 125.75ms
iter 196990: loss 5.8299, time 125.33ms
step 197000: train loss 5.8334, val loss 5.8421
saving checkpoint to out-shakespeare-char
iter 197000: loss 6.7836, time 2869.49ms
iter 197010: loss 6.0679, time 125.39ms
iter 197020: loss 5.5340, time 125.30ms
iter 197030: loss 6.5030, time 127.96ms
iter 197040: loss 6.1193, time 124.35ms
iter 197050: loss 5.9517, time 125.17ms
iter 197060: loss 5.5770, time 125.52ms
iter 197070: loss 6.7442, time 125.35ms
iter 197080: loss 6.0106, time 125.03ms
iter 197090: loss 6.8230, time 125.41ms
iter 197100: loss 6.1336, time 125.44ms
iter 197110: loss 6.8762, time 125.27ms
iter 197120: loss 6.6437, time 125.26ms
iter 197130: loss 6.7001, time 125.13ms
iter 197140: loss 6.6742, time 128.18ms
iter 197150: loss 7.0871, time 125.09ms
iter 197160: loss 7.0264, time 125.24ms
iter 197170: loss 6.3470, time 125.30ms
iter 197180: loss 6.3269, time 125.13ms
iter 197190: loss 6.4947, time 124.50ms
iter 197200: loss 6.1832, time 125.27ms
iter 197210: loss 5.7439, time 125.09ms
iter 197220: loss 6.2510, time 124.96ms
iter 197230: loss 6.5859, time 124.91ms
iter 197240: loss 5.7699, time 125.20ms
step 197250: train loss 5.8544, val loss 5.8453
saving checkpoint to out-shakespeare-char
iter 197250: loss 6.2626, time 2871.26ms
iter 197260: loss 6.5295, time 125.62ms
iter 197270: loss 6.2948, time 128.25ms
iter 197280: loss 6.2011, time 124.80ms
iter 197290: loss 6.5475, time 125.14ms
iter 197300: loss 6.4614, time 125.34ms
iter 197310: loss 5.9851, time 125.68ms
iter 197320: loss 5.9239, time 125.29ms
iter 197330: loss 6.7742, time 125.37ms
iter 197340: loss 6.7805, time 125.05ms
iter 197350: loss 6.9198, time 125.28ms
iter 197360: loss 6.9162, time 125.70ms
iter 197370: loss 6.6661, time 125.57ms
iter 197380: loss 6.1222, time 128.64ms
iter 197390: loss 5.9712, time 125.99ms
iter 197400: loss 6.8794, time 125.77ms
iter 197410: loss 6.6397, time 126.33ms
iter 197420: loss 7.3315, time 125.53ms
iter 197430: loss 6.4722, time 125.54ms
iter 197440: loss 6.3595, time 125.68ms
iter 197450: loss 6.1352, time 125.89ms
iter 197460: loss 7.1202, time 125.86ms
iter 197470: loss 6.0363, time 126.05ms
iter 197480: loss 6.8446, time 126.10ms
iter 197490: loss 5.7979, time 128.84ms
step 197500: train loss 5.7870, val loss 5.8113
saving checkpoint to out-shakespeare-char
iter 197500: loss 5.7668, time 2902.91ms
iter 197510: loss 6.8956, time 129.38ms
iter 197520: loss 7.0778, time 125.64ms
iter 197530: loss 6.3178, time 125.57ms
iter 197540: loss 5.4910, time 125.57ms
iter 197550: loss 6.9553, time 125.58ms
iter 197560: loss 5.8043, time 125.56ms
iter 197570: loss 6.4410, time 125.55ms
iter 197580: loss 7.1399, time 125.65ms
iter 197590: loss 6.1645, time 125.62ms
iter 197600: loss 5.5621, time 125.64ms
iter 197610: loss 6.3665, time 125.52ms
iter 197620: loss 6.4991, time 128.88ms
iter 197630: loss 6.9074, time 125.85ms
iter 197640: loss 6.1353, time 126.31ms
iter 197650: loss 6.5136, time 125.74ms
iter 197660: loss 6.6027, time 125.75ms
iter 197670: loss 6.0061, time 125.78ms
iter 197680: loss 6.4321, time 125.87ms
iter 197690: loss 6.1581, time 126.11ms
iter 197700: loss 6.1951, time 125.61ms
iter 197710: loss 6.3950, time 125.61ms
iter 197720: loss 5.6644, time 125.61ms
iter 197730: loss 6.1829, time 125.69ms
iter 197740: loss 6.5721, time 125.87ms
step 197750: train loss 5.8179, val loss 5.8165
saving checkpoint to out-shakespeare-char
iter 197750: loss 5.6009, time 2884.49ms
iter 197760: loss 6.4874, time 125.94ms
iter 197770: loss 5.9162, time 125.84ms
iter 197780: loss 6.7883, time 125.96ms
iter 197790: loss 6.7295, time 125.62ms
iter 197800: loss 6.2326, time 125.74ms
iter 197810: loss 6.1859, time 125.60ms
iter 197820: loss 5.8981, time 125.93ms
iter 197830: loss 5.2943, time 128.68ms
iter 197840: loss 6.7876, time 125.85ms
iter 197850: loss 6.9682, time 126.13ms
iter 197860: loss 5.7626, time 125.67ms
iter 197870: loss 6.7453, time 126.11ms
iter 197880: loss 6.2213, time 126.81ms
iter 197890: loss 7.3829, time 124.97ms
iter 197900: loss 6.3446, time 126.09ms
iter 197910: loss 6.4975, time 128.83ms
iter 197920: loss 6.6481, time 125.97ms
iter 197930: loss 6.6137, time 126.07ms
iter 197940: loss 7.1579, time 124.87ms
iter 197950: loss 6.6361, time 125.66ms
iter 197960: loss 6.8983, time 125.68ms
iter 197970: loss 6.1608, time 125.67ms
iter 197980: loss 6.0261, time 125.65ms
iter 197990: loss 5.9828, time 125.72ms
step 198000: train loss 5.8160, val loss 5.8621
saving checkpoint to out-shakespeare-char
iter 198000: loss 6.6390, time 2869.44ms
iter 198010: loss 5.7560, time 125.87ms
iter 198020: loss 6.7001, time 126.21ms
iter 198030: loss 6.1754, time 125.98ms
iter 198040: loss 6.7003, time 127.61ms
iter 198050: loss 6.5677, time 124.65ms
iter 198060: loss 6.8891, time 125.97ms
iter 198070: loss 6.3612, time 126.00ms
iter 198080: loss 6.5066, time 125.83ms
iter 198090: loss 5.8235, time 125.73ms
iter 198100: loss 6.5430, time 125.94ms
iter 198110: loss 6.0467, time 124.74ms
iter 198120: loss 6.1580, time 125.87ms
iter 198130: loss 6.3194, time 125.84ms
iter 198140: loss 6.3228, time 126.13ms
iter 198150: loss 5.8728, time 129.01ms
iter 198160: loss 6.5277, time 125.95ms
iter 198170: loss 6.4069, time 126.01ms
iter 198180: loss 5.9347, time 125.98ms
iter 198190: loss 6.9202, time 125.83ms
iter 198200: loss 6.1534, time 126.13ms
iter 198210: loss 6.3016, time 125.96ms
iter 198220: loss 6.7859, time 126.11ms
iter 198230: loss 6.6976, time 125.83ms
iter 198240: loss 6.7843, time 125.52ms
step 198250: train loss 5.8309, val loss 5.8654
saving checkpoint to out-shakespeare-char
iter 198250: loss 6.6794, time 2895.90ms
iter 198260: loss 5.7967, time 125.73ms
iter 198270: loss 6.8135, time 125.96ms
iter 198280: loss 7.1619, time 128.41ms
iter 198290: loss 5.8534, time 125.40ms
iter 198300: loss 6.1141, time 125.72ms
iter 198310: loss 6.0033, time 125.58ms
iter 198320: loss 6.4439, time 128.76ms
iter 198330: loss 6.3852, time 125.65ms
iter 198340: loss 6.9501, time 125.88ms
iter 198350: loss 6.0308, time 125.74ms
iter 198360: loss 6.6250, time 125.02ms
iter 198370: loss 6.0626, time 125.21ms
iter 198380: loss 5.8742, time 126.15ms
iter 198390: loss 6.6624, time 125.22ms
iter 198400: loss 6.6702, time 124.87ms
iter 198410: loss 6.1468, time 125.15ms
iter 198420: loss 6.2717, time 124.56ms
iter 198430: loss 6.2279, time 128.17ms
iter 198440: loss 6.0848, time 125.31ms
iter 198450: loss 6.1245, time 125.83ms
iter 198460: loss 5.9241, time 125.53ms
iter 198470: loss 6.8515, time 125.23ms
iter 198480: loss 6.6498, time 125.28ms
iter 198490: loss 6.3285, time 125.41ms
step 198500: train loss 5.8816, val loss 5.8784
saving checkpoint to out-shakespeare-char
iter 198500: loss 6.8868, time 2896.82ms
iter 198510: loss 6.0767, time 125.49ms
iter 198520: loss 6.9985, time 125.46ms
iter 198530: loss 6.0541, time 128.14ms
iter 198540: loss 5.9533, time 125.42ms
iter 198550: loss 6.6433, time 125.43ms
iter 198560: loss 6.3009, time 125.42ms
iter 198570: loss 6.5558, time 125.17ms
iter 198580: loss 6.1412, time 125.32ms
iter 198590: loss 6.4164, time 125.83ms
iter 198600: loss 7.0699, time 125.06ms
iter 198610: loss 6.6817, time 125.57ms
iter 198620: loss 6.0321, time 125.64ms
iter 198630: loss 6.5332, time 125.88ms
iter 198640: loss 6.5727, time 128.51ms
iter 198650: loss 7.0164, time 125.18ms
iter 198660: loss 6.6560, time 125.67ms
iter 198670: loss 5.9569, time 125.91ms
iter 198680: loss 6.4660, time 125.78ms
iter 198690: loss 6.8175, time 125.65ms
iter 198700: loss 6.3177, time 125.63ms
iter 198710: loss 5.7436, time 125.64ms
iter 198720: loss 6.6537, time 125.73ms
iter 198730: loss 6.9387, time 124.27ms
iter 198740: loss 6.8785, time 125.56ms
step 198750: train loss 5.8049, val loss 5.8613
saving checkpoint to out-shakespeare-char
iter 198750: loss 6.1658, time 2899.22ms
iter 198760: loss 6.1290, time 125.14ms
iter 198770: loss 6.8244, time 127.87ms
iter 198780: loss 6.5867, time 125.05ms
iter 198790: loss 5.6545, time 124.91ms
iter 198800: loss 6.5729, time 125.09ms
iter 198810: loss 6.3545, time 125.02ms
iter 198820: loss 5.9637, time 125.26ms
iter 198830: loss 6.3111, time 125.05ms
iter 198840: loss 5.9879, time 124.76ms
iter 198850: loss 5.9256, time 125.49ms
iter 198860: loss 7.0441, time 124.95ms
iter 198870: loss 5.9169, time 125.29ms
iter 198880: loss 6.7177, time 127.44ms
iter 198890: loss 6.2947, time 123.95ms
iter 198900: loss 7.0411, time 124.93ms
iter 198910: loss 5.8807, time 125.13ms
iter 198920: loss 6.1637, time 123.78ms
iter 198930: loss 6.1988, time 124.94ms
iter 198940: loss 7.1321, time 125.03ms
iter 198950: loss 6.2566, time 124.23ms
iter 198960: loss 6.1888, time 125.06ms
iter 198970: loss 6.5344, time 124.19ms
iter 198980: loss 6.7959, time 124.94ms
iter 198990: loss 6.3081, time 127.89ms
step 199000: train loss 5.9118, val loss 5.8503
saving checkpoint to out-shakespeare-char
iter 199000: loss 6.0103, time 2891.60ms
iter 199010: loss 6.4205, time 125.30ms
iter 199020: loss 7.0616, time 125.29ms
iter 199030: loss 7.2899, time 125.23ms
iter 199040: loss 6.4809, time 125.55ms
iter 199050: loss 6.5850, time 128.30ms
iter 199060: loss 6.5517, time 125.66ms
iter 199070: loss 5.2366, time 125.43ms
iter 199080: loss 6.8462, time 125.41ms
iter 199090: loss 6.5823, time 125.35ms
iter 199100: loss 7.0029, time 125.43ms
iter 199110: loss 6.2684, time 125.50ms
iter 199120: loss 6.8042, time 125.29ms
iter 199130: loss 7.0208, time 125.42ms
iter 199140: loss 6.8302, time 125.25ms
iter 199150: loss 7.1587, time 125.68ms
iter 199160: loss 6.3727, time 127.73ms
iter 199170: loss 6.6473, time 125.06ms
iter 199180: loss 6.6704, time 124.38ms
iter 199190: loss 6.0945, time 125.70ms
iter 199200: loss 6.2405, time 125.06ms
iter 199210: loss 5.6040, time 124.10ms
iter 199220: loss 6.0362, time 125.60ms
iter 199230: loss 6.3699, time 128.22ms
iter 199240: loss 6.8763, time 125.22ms
step 199250: train loss 5.8376, val loss 5.8098
saving checkpoint to out-shakespeare-char
iter 199250: loss 6.6511, time 2889.89ms
iter 199260: loss 6.4185, time 125.23ms
iter 199270: loss 6.6635, time 124.99ms
iter 199280: loss 7.3731, time 123.19ms
iter 199290: loss 6.6501, time 123.96ms
iter 199300: loss 6.3200, time 124.07ms
iter 199310: loss 5.8960, time 124.75ms
iter 199320: loss 5.4203, time 123.49ms
iter 199330: loss 5.9081, time 124.47ms
iter 199340: loss 7.4092, time 125.30ms
iter 199350: loss 6.6378, time 125.81ms
iter 199360: loss 5.7040, time 128.25ms
iter 199370: loss 6.5834, time 125.33ms
iter 199380: loss 6.7889, time 124.28ms
iter 199390: loss 5.5797, time 125.34ms
iter 199400: loss 6.2800, time 124.99ms
iter 199410: loss 5.5507, time 124.84ms
iter 199420: loss 5.7341, time 127.88ms
iter 199430: loss 6.3415, time 124.78ms
iter 199440: loss 5.8219, time 124.84ms
iter 199450: loss 6.3298, time 123.87ms
iter 199460: loss 5.6599, time 124.81ms
iter 199470: loss 6.7681, time 124.96ms
iter 199480: loss 6.1549, time 125.07ms
iter 199490: loss 5.7325, time 122.78ms
step 199500: train loss 5.9072, val loss 5.8560
saving checkpoint to out-shakespeare-char
iter 199500: loss 7.1689, time 2889.61ms
iter 199510: loss 6.4672, time 126.32ms
iter 199520: loss 6.3903, time 125.89ms
iter 199530: loss 5.9882, time 127.87ms
iter 199540: loss 6.0249, time 124.99ms
iter 199550: loss 5.9448, time 124.91ms
iter 199560: loss 5.9760, time 124.87ms
iter 199570: loss 6.9761, time 124.82ms
iter 199580: loss 6.7555, time 124.79ms
iter 199590: loss 6.6702, time 124.63ms
iter 199600: loss 6.2349, time 124.73ms
iter 199610: loss 5.9509, time 124.80ms
iter 199620: loss 6.6400, time 125.39ms
iter 199630: loss 6.7858, time 125.53ms
iter 199640: loss 6.9584, time 128.37ms
iter 199650: loss 6.1292, time 126.54ms
iter 199660: loss 5.8081, time 125.37ms
iter 199670: loss 6.7402, time 125.51ms
iter 199680: loss 5.7965, time 126.16ms
iter 199690: loss 6.6509, time 125.87ms
iter 199700: loss 6.4103, time 126.41ms
iter 199710: loss 6.8176, time 126.25ms
iter 199720: loss 6.7532, time 126.24ms
iter 199730: loss 6.3281, time 126.11ms
iter 199740: loss 6.3524, time 126.01ms
step 199750: train loss 5.8537, val loss 5.8150
saving checkpoint to out-shakespeare-char
iter 199750: loss 6.1535, time 2896.68ms
iter 199760: loss 5.3782, time 122.35ms
iter 199770: loss 6.7851, time 121.44ms
iter 199780: loss 6.6052, time 121.71ms
iter 199790: loss 6.2873, time 120.47ms
iter 199800: loss 6.3510, time 121.62ms
iter 199810: loss 6.6766, time 121.48ms
iter 199820: loss 6.4185, time 122.05ms
iter 199830: loss 6.2489, time 121.86ms
iter 199840: loss 6.1758, time 121.65ms
iter 199850: loss 6.8579, time 121.34ms
iter 199860: loss 6.8270, time 121.65ms
iter 199870: loss 6.5908, time 121.27ms
iter 199880: loss 6.9034, time 121.37ms
iter 199890: loss 6.7136, time 120.91ms
iter 199900: loss 5.6964, time 121.64ms
iter 199910: loss 5.3446, time 121.59ms
iter 199920: loss 6.8389, time 121.68ms
iter 199930: loss 6.4190, time 121.46ms
iter 199940: loss 6.6814, time 121.80ms
iter 199950: loss 6.4158, time 121.52ms
iter 199960: loss 5.3743, time 121.63ms
iter 199970: loss 6.3050, time 121.50ms
iter 199980: loss 6.5412, time 121.59ms
iter 199990: loss 5.9871, time 121.62ms
step 200000: train loss 5.8027, val loss 5.7960
saving checkpoint to out-shakespeare-char
iter 200000: loss 6.4429, time 2907.92ms
iter 200010: loss 6.3227, time 124.81ms
iter 200020: loss 7.0694, time 121.99ms
iter 200030: loss 7.2424, time 124.70ms
iter 200040: loss 6.2664, time 121.73ms
iter 200050: loss 7.1330, time 124.67ms
iter 200060: loss 5.8266, time 121.07ms
iter 200070: loss 6.2958, time 124.79ms
iter 200080: loss 6.1156, time 121.36ms
iter 200090: loss 6.3141, time 124.11ms
iter 200100: loss 7.0421, time 121.67ms
iter 200110: loss 6.3726, time 123.97ms
iter 200120: loss 5.9271, time 121.78ms
iter 200130: loss 6.9786, time 124.72ms
iter 200140: loss 6.2662, time 121.81ms
iter 200150: loss 6.4073, time 124.67ms
iter 200160: loss 6.1726, time 121.53ms
iter 200170: loss 6.4259, time 124.44ms
iter 200180: loss 6.4032, time 121.58ms
iter 200190: loss 6.0664, time 124.76ms
iter 200200: loss 6.7943, time 121.41ms
iter 200210: loss 6.2909, time 124.33ms
iter 200220: loss 5.6703, time 121.57ms
iter 200230: loss 5.9756, time 124.55ms
iter 200240: loss 6.4397, time 121.73ms
step 200250: train loss 5.8806, val loss 5.8264
saving checkpoint to out-shakespeare-char
iter 200250: loss 6.5305, time 2901.95ms
iter 200260: loss 6.3817, time 124.77ms
iter 200270: loss 6.1011, time 125.86ms
iter 200280: loss 6.4112, time 124.68ms
iter 200290: loss 6.5229, time 126.06ms
iter 200300: loss 6.3199, time 125.53ms
iter 200310: loss 5.8994, time 125.37ms
iter 200320: loss 6.6732, time 125.34ms
iter 200330: loss 5.8932, time 125.60ms
iter 200340: loss 6.7983, time 124.44ms
iter 200350: loss 7.0007, time 127.47ms
iter 200360: loss 6.2914, time 125.67ms
iter 200370: loss 6.1359, time 125.73ms
iter 200380: loss 5.9394, time 125.51ms
iter 200390: loss 6.2350, time 125.53ms
iter 200400: loss 5.8643, time 125.44ms
iter 200410: loss 6.3055, time 125.59ms
iter 200420: loss 5.7348, time 128.51ms
iter 200430: loss 6.3613, time 125.78ms
iter 200440: loss 6.1463, time 126.40ms
iter 200450: loss 6.3299, time 125.93ms
iter 200460: loss 7.0258, time 125.73ms
iter 200470: loss 6.9601, time 125.75ms
iter 200480: loss 6.4552, time 125.81ms
iter 200490: loss 5.6578, time 125.41ms
step 200500: train loss 5.8617, val loss 5.8326
saving checkpoint to out-shakespeare-char
iter 200500: loss 6.2608, time 2879.86ms
iter 200510: loss 5.8646, time 124.65ms
iter 200520: loss 6.3169, time 128.27ms
iter 200530: loss 6.3488, time 124.34ms
iter 200540: loss 6.4609, time 125.62ms
iter 200550: loss 5.5053, time 124.32ms
iter 200560: loss 6.2477, time 124.83ms
iter 200570: loss 6.5084, time 124.28ms
iter 200580: loss 6.7634, time 125.41ms
iter 200590: loss 6.1914, time 125.08ms
iter 200600: loss 5.6486, time 125.41ms
iter 200610: loss 5.9888, time 125.84ms
iter 200620: loss 6.0092, time 125.66ms
iter 200630: loss 6.5661, time 125.43ms
iter 200640: loss 6.6948, time 127.81ms
iter 200650: loss 6.2142, time 125.41ms
iter 200660: loss 6.4674, time 125.47ms
iter 200670: loss 6.0041, time 125.51ms
iter 200680: loss 6.5237, time 125.25ms
iter 200690: loss 6.5836, time 125.36ms
iter 200700: loss 6.3725, time 125.83ms
iter 200710: loss 6.4021, time 125.85ms
iter 200720: loss 6.8946, time 125.74ms
iter 200730: loss 5.9925, time 125.39ms
iter 200740: loss 6.0239, time 125.05ms
step 200750: train loss 5.7987, val loss 5.7956
saving checkpoint to out-shakespeare-char
iter 200750: loss 5.9686, time 2901.25ms
iter 200760: loss 6.4364, time 125.41ms
iter 200770: loss 6.1700, time 127.50ms
iter 200780: loss 5.8106, time 124.57ms
iter 200790: loss 6.3379, time 125.17ms
iter 200800: loss 7.0935, time 125.77ms
iter 200810: loss 6.2498, time 128.31ms
iter 200820: loss 5.8708, time 125.58ms
iter 200830: loss 5.9481, time 128.37ms
iter 200840: loss 6.1216, time 125.47ms
iter 200850: loss 6.4799, time 125.59ms
iter 200860: loss 6.1602, time 125.79ms
iter 200870: loss 5.7807, time 125.77ms
iter 200880: loss 6.3321, time 125.67ms
iter 200890: loss 6.2810, time 125.55ms
iter 200900: loss 5.9412, time 125.68ms
iter 200910: loss 6.0204, time 126.31ms
iter 200920: loss 7.5443, time 128.63ms
iter 200930: loss 6.8784, time 125.80ms
iter 200940: loss 6.2084, time 125.53ms
iter 200950: loss 6.9442, time 126.56ms
iter 200960: loss 5.7797, time 125.70ms
iter 200970: loss 6.4926, time 125.65ms
iter 200980: loss 6.4703, time 125.93ms
iter 200990: loss 6.8678, time 125.79ms
step 201000: train loss 5.8520, val loss 5.8490
saving checkpoint to out-shakespeare-char
iter 201000: loss 6.3332, time 2859.57ms
iter 201010: loss 6.3852, time 123.43ms
iter 201020: loss 6.3301, time 122.14ms
iter 201030: loss 6.2670, time 123.71ms
iter 201040: loss 6.1318, time 122.20ms
iter 201050: loss 6.6165, time 123.35ms
iter 201060: loss 7.0294, time 122.14ms
iter 201070: loss 6.6122, time 123.30ms
iter 201080: loss 6.7003, time 122.04ms
iter 201090: loss 6.4824, time 123.26ms
iter 201100: loss 6.9892, time 122.14ms
iter 201110: loss 6.8396, time 123.38ms
iter 201120: loss 6.8036, time 122.18ms
iter 201130: loss 6.0635, time 123.27ms
iter 201140: loss 6.6267, time 122.05ms
iter 201150: loss 6.4163, time 123.19ms
iter 201160: loss 6.0875, time 122.13ms
iter 201170: loss 5.8222, time 123.41ms
iter 201180: loss 6.8703, time 121.96ms
iter 201190: loss 6.7900, time 123.26ms
iter 201200: loss 6.4313, time 122.09ms
iter 201210: loss 5.6982, time 123.53ms
iter 201220: loss 6.9397, time 122.01ms
iter 201230: loss 6.8132, time 123.20ms
iter 201240: loss 6.4081, time 122.16ms
step 201250: train loss 5.8274, val loss 5.8295
saving checkpoint to out-shakespeare-char
iter 201250: loss 6.6256, time 2895.48ms
iter 201260: loss 5.7943, time 122.20ms
iter 201270: loss 6.5358, time 124.22ms
iter 201280: loss 6.3000, time 122.20ms
iter 201290: loss 6.8719, time 124.75ms
iter 201300: loss 6.4877, time 121.95ms
iter 201310: loss 5.8664, time 124.70ms
iter 201320: loss 6.3677, time 122.00ms
iter 201330: loss 6.6899, time 124.75ms
iter 201340: loss 6.7603, time 121.81ms
iter 201350: loss 5.5951, time 124.76ms
iter 201360: loss 6.9119, time 121.85ms
iter 201370: loss 5.8302, time 124.74ms
iter 201380: loss 6.4465, time 122.29ms
iter 201390: loss 6.4187, time 124.85ms
iter 201400: loss 6.2152, time 121.30ms
iter 201410: loss 6.9711, time 124.77ms
iter 201420: loss 6.6857, time 121.78ms
iter 201430: loss 6.1398, time 124.88ms
iter 201440: loss 7.2012, time 121.87ms
iter 201450: loss 6.7221, time 125.06ms
iter 201460: loss 6.2156, time 122.38ms
iter 201470: loss 6.7427, time 124.79ms
iter 201480: loss 6.0363, time 122.05ms
iter 201490: loss 6.6536, time 124.64ms
step 201500: train loss 5.8357, val loss 5.8321
saving checkpoint to out-shakespeare-char
iter 201500: loss 6.2744, time 2891.44ms
iter 201510: loss 6.4115, time 122.11ms
iter 201520: loss 5.7385, time 123.15ms
iter 201530: loss 6.1120, time 121.91ms
iter 201540: loss 6.3740, time 122.95ms
iter 201550: loss 6.7513, time 121.80ms
iter 201560: loss 5.9379, time 122.81ms
iter 201570: loss 6.6702, time 122.01ms
iter 201580: loss 6.5775, time 122.96ms
iter 201590: loss 7.3039, time 121.85ms
iter 201600: loss 6.3391, time 122.68ms
iter 201610: loss 6.5781, time 122.66ms
iter 201620: loss 6.0616, time 123.35ms
iter 201630: loss 6.0438, time 121.46ms
iter 201640: loss 6.6567, time 123.13ms
iter 201650: loss 6.2323, time 121.83ms
iter 201660: loss 6.6537, time 122.75ms
iter 201670: loss 5.5212, time 121.75ms
iter 201680: loss 6.4989, time 122.89ms
iter 201690: loss 6.7229, time 121.75ms
iter 201700: loss 6.6375, time 123.30ms
iter 201710: loss 6.7743, time 121.76ms
iter 201720: loss 6.4615, time 122.96ms
iter 201730: loss 6.1306, time 121.84ms
iter 201740: loss 7.0340, time 123.06ms
step 201750: train loss 5.8243, val loss 5.8500
saving checkpoint to out-shakespeare-char
iter 201750: loss 5.6038, time 2904.00ms
iter 201760: loss 6.4625, time 124.86ms
iter 201770: loss 6.5230, time 121.81ms
iter 201780: loss 6.2367, time 124.59ms
iter 201790: loss 6.7151, time 122.01ms
iter 201800: loss 6.6856, time 124.98ms
iter 201810: loss 6.4819, time 121.80ms
iter 201820: loss 6.7041, time 124.14ms
iter 201830: loss 5.6549, time 121.82ms
iter 201840: loss 6.4370, time 124.54ms
iter 201850: loss 6.6699, time 121.91ms
iter 201860: loss 6.5347, time 124.64ms
iter 201870: loss 6.6193, time 120.84ms
iter 201880: loss 6.7119, time 124.67ms
iter 201890: loss 6.7338, time 121.98ms
iter 201900: loss 6.8431, time 124.72ms
iter 201910: loss 5.5856, time 122.17ms
iter 201920: loss 6.2014, time 124.98ms
iter 201930: loss 6.7293, time 122.06ms
iter 201940: loss 7.4652, time 124.77ms
iter 201950: loss 6.2462, time 122.86ms
iter 201960: loss 6.6432, time 123.84ms
iter 201970: loss 7.0184, time 122.16ms
iter 201980: loss 6.6786, time 124.96ms
iter 201990: loss 5.5636, time 121.93ms
step 202000: train loss 5.8700, val loss 5.8037
saving checkpoint to out-shakespeare-char
iter 202000: loss 6.3560, time 2905.16ms
iter 202010: loss 5.5196, time 126.20ms
iter 202020: loss 6.3228, time 124.91ms
iter 202030: loss 6.1880, time 125.23ms
iter 202040: loss 6.1139, time 124.83ms
iter 202050: loss 6.3923, time 125.53ms
iter 202060: loss 5.7344, time 125.25ms
iter 202070: loss 6.0694, time 125.44ms
iter 202080: loss 6.4470, time 125.42ms
iter 202090: loss 5.7264, time 128.03ms
iter 202100: loss 6.3871, time 125.82ms
iter 202110: loss 6.1343, time 125.90ms
iter 202120: loss 6.0777, time 126.12ms
iter 202130: loss 6.5039, time 128.64ms
iter 202140: loss 7.2619, time 126.11ms
iter 202150: loss 6.0151, time 125.76ms
iter 202160: loss 5.7993, time 125.17ms
iter 202170: loss 6.4138, time 125.16ms
iter 202180: loss 6.2943, time 125.81ms
iter 202190: loss 5.7503, time 125.95ms
iter 202200: loss 6.4504, time 124.78ms
iter 202210: loss 6.3505, time 125.88ms
iter 202220: loss 7.1422, time 124.22ms
iter 202230: loss 6.2872, time 125.18ms
iter 202240: loss 6.7183, time 128.02ms
step 202250: train loss 5.8785, val loss 5.8518
saving checkpoint to out-shakespeare-char
iter 202250: loss 6.4379, time 2897.02ms
iter 202260: loss 6.3567, time 122.13ms
iter 202270: loss 5.7308, time 124.46ms
iter 202280: loss 6.7899, time 122.00ms
iter 202290: loss 6.8801, time 125.03ms
iter 202300: loss 6.3951, time 122.07ms
iter 202310: loss 5.9817, time 125.03ms
iter 202320: loss 6.7121, time 122.07ms
iter 202330: loss 6.3564, time 125.06ms
iter 202340: loss 6.6839, time 122.97ms
iter 202350: loss 5.4086, time 122.28ms
iter 202360: loss 6.1401, time 122.78ms
iter 202370: loss 6.2368, time 124.25ms
iter 202380: loss 6.3739, time 121.73ms
iter 202390: loss 6.7859, time 124.88ms
iter 202400: loss 6.2475, time 122.04ms
iter 202410: loss 6.6241, time 124.50ms
iter 202420: loss 6.1413, time 121.66ms
iter 202430: loss 6.1404, time 124.88ms
iter 202440: loss 6.3754, time 121.58ms
iter 202450: loss 6.0479, time 124.59ms
iter 202460: loss 6.1865, time 121.70ms
iter 202470: loss 7.0371, time 124.11ms
iter 202480: loss 6.4379, time 121.09ms
iter 202490: loss 6.8238, time 124.98ms
step 202500: train loss 5.7928, val loss 5.8293
saving checkpoint to out-shakespeare-char
iter 202500: loss 6.1632, time 2899.36ms
iter 202510: loss 6.3696, time 122.06ms
iter 202520: loss 7.2897, time 121.71ms
iter 202530: loss 5.7438, time 121.76ms
iter 202540: loss 5.7418, time 121.83ms
iter 202550: loss 5.4298, time 121.74ms
iter 202560: loss 6.2051, time 121.48ms
iter 202570: loss 6.4509, time 121.48ms
iter 202580: loss 6.7405, time 121.31ms
iter 202590: loss 6.3892, time 121.89ms
iter 202600: loss 6.3676, time 121.61ms
iter 202610: loss 5.4298, time 121.75ms
iter 202620: loss 6.0143, time 121.69ms
iter 202630: loss 6.0981, time 121.68ms
iter 202640: loss 6.0132, time 121.83ms
iter 202650: loss 6.1563, time 124.57ms
iter 202660: loss 6.7127, time 121.28ms
iter 202670: loss 6.9966, time 123.57ms
iter 202680: loss 6.5293, time 119.97ms
iter 202690: loss 6.2344, time 123.92ms
iter 202700: loss 6.6063, time 120.80ms
iter 202710: loss 6.0037, time 124.85ms
iter 202720: loss 6.5401, time 121.10ms
iter 202730: loss 6.5281, time 123.21ms
iter 202740: loss 6.5515, time 121.70ms
step 202750: train loss 5.8496, val loss 5.7828
saving checkpoint to out-shakespeare-char
iter 202750: loss 6.6817, time 2891.91ms
iter 202760: loss 6.7416, time 121.42ms
iter 202770: loss 6.7759, time 121.57ms
iter 202780: loss 6.8734, time 121.98ms
iter 202790: loss 6.6340, time 123.16ms
iter 202800: loss 6.6959, time 122.06ms
iter 202810: loss 6.0327, time 122.35ms
iter 202820: loss 6.2797, time 121.23ms
iter 202830: loss 6.2271, time 123.30ms
iter 202840: loss 6.7286, time 121.90ms
iter 202850: loss 7.1456, time 122.80ms
iter 202860: loss 6.8276, time 121.32ms
iter 202870: loss 6.8507, time 123.14ms
iter 202880: loss 6.4098, time 121.93ms
iter 202890: loss 6.5951, time 123.54ms
iter 202900: loss 6.7067, time 121.59ms
iter 202910: loss 6.0526, time 121.94ms
iter 202920: loss 5.9706, time 120.18ms
iter 202930: loss 6.3921, time 121.22ms
iter 202940: loss 6.5291, time 121.71ms
iter 202950: loss 6.4550, time 121.94ms
iter 202960: loss 6.2496, time 121.75ms
iter 202970: loss 6.5819, time 121.80ms
iter 202980: loss 6.2644, time 121.81ms
iter 202990: loss 6.2955, time 121.54ms
step 203000: train loss 5.8066, val loss 5.7788
saving checkpoint to out-shakespeare-char
iter 203000: loss 6.6859, time 2896.23ms
iter 203010: loss 6.8568, time 125.91ms
iter 203020: loss 6.2782, time 124.93ms
iter 203030: loss 6.3878, time 125.90ms
iter 203040: loss 6.5994, time 124.99ms
iter 203050: loss 6.6957, time 126.77ms
iter 203060: loss 6.5797, time 125.54ms
iter 203070: loss 6.0699, time 124.03ms
iter 203080: loss 6.0489, time 125.28ms
iter 203090: loss 6.5343, time 125.01ms
iter 203100: loss 6.1272, time 124.93ms
iter 203110: loss 6.9417, time 127.95ms
iter 203120: loss 6.9408, time 124.81ms
iter 203130: loss 6.1016, time 124.85ms
iter 203140: loss 5.4236, time 124.71ms
iter 203150: loss 5.9633, time 125.09ms
iter 203160: loss 5.9601, time 125.16ms
iter 203170: loss 5.8725, time 125.29ms
iter 203180: loss 6.7030, time 124.01ms
iter 203190: loss 6.4447, time 124.90ms
iter 203200: loss 6.8765, time 125.07ms
iter 203210: loss 6.6995, time 125.21ms
iter 203220: loss 6.0359, time 127.84ms
iter 203230: loss 5.6133, time 125.33ms
iter 203240: loss 5.8889, time 125.20ms
step 203250: train loss 5.8477, val loss 5.8109
saving checkpoint to out-shakespeare-char
iter 203250: loss 7.1396, time 2850.07ms
iter 203260: loss 6.2343, time 121.77ms
iter 203270: loss 6.1060, time 121.52ms
iter 203280: loss 6.5424, time 121.73ms
iter 203290: loss 7.1750, time 121.35ms
iter 203300: loss 6.5058, time 121.80ms
iter 203310: loss 6.6865, time 121.63ms
iter 203320: loss 6.3717, time 121.78ms
iter 203330: loss 6.2984, time 121.51ms
iter 203340: loss 6.6971, time 121.68ms
iter 203350: loss 6.6680, time 121.54ms
iter 203360: loss 5.8872, time 121.71ms
iter 203370: loss 6.9116, time 121.43ms
iter 203380: loss 6.7827, time 121.68ms
iter 203390: loss 6.2839, time 121.53ms
iter 203400: loss 6.7982, time 121.72ms
iter 203410: loss 5.8443, time 121.60ms
iter 203420: loss 5.8141, time 121.64ms
iter 203430: loss 6.2513, time 121.48ms
iter 203440: loss 6.3185, time 121.66ms
iter 203450: loss 6.6322, time 121.55ms
iter 203460: loss 6.5614, time 121.44ms
iter 203470: loss 7.0031, time 121.46ms
iter 203480: loss 6.5508, time 121.44ms
iter 203490: loss 5.9681, time 121.78ms
step 203500: train loss 5.8269, val loss 5.8154
saving checkpoint to out-shakespeare-char
iter 203500: loss 6.8066, time 2899.28ms
iter 203510: loss 6.8760, time 125.82ms
iter 203520: loss 6.8613, time 125.75ms
iter 203530: loss 5.9309, time 126.00ms
iter 203540: loss 6.0746, time 125.61ms
iter 203550: loss 6.6203, time 125.66ms
iter 203560: loss 6.4640, time 125.10ms
iter 203570: loss 6.3473, time 125.49ms
iter 203580: loss 5.8254, time 125.96ms
iter 203590: loss 6.6174, time 125.53ms
iter 203600: loss 7.3002, time 125.51ms
iter 203610: loss 6.0167, time 128.39ms
iter 203620: loss 6.4230, time 125.71ms
iter 203630: loss 6.5600, time 125.35ms
iter 203640: loss 6.8039, time 125.67ms
iter 203650: loss 5.9537, time 125.39ms
iter 203660: loss 6.3763, time 125.65ms
iter 203670: loss 6.4217, time 125.60ms
iter 203680: loss 5.8678, time 125.41ms
iter 203690: loss 6.9585, time 125.59ms
iter 203700: loss 6.1801, time 125.61ms
iter 203710: loss 5.9629, time 125.63ms
iter 203720: loss 6.3298, time 125.48ms
iter 203730: loss 5.8465, time 124.74ms
iter 203740: loss 6.1980, time 125.49ms
step 203750: train loss 5.7684, val loss 5.7829
saving checkpoint to out-shakespeare-char
iter 203750: loss 6.3689, time 2897.90ms
iter 203760: loss 6.4386, time 125.60ms
iter 203770: loss 5.8445, time 125.41ms
iter 203780: loss 5.4786, time 128.82ms
iter 203790: loss 6.1219, time 125.75ms
iter 203800: loss 5.7803, time 125.88ms
iter 203810: loss 5.5280, time 125.41ms
iter 203820: loss 6.5591, time 125.15ms
iter 203830: loss 6.1803, time 125.61ms
iter 203840: loss 6.7628, time 125.51ms
iter 203850: loss 6.0604, time 125.89ms
iter 203860: loss 6.7068, time 125.44ms
iter 203870: loss 6.5095, time 125.61ms
iter 203880: loss 6.8889, time 125.48ms
iter 203890: loss 6.2537, time 128.19ms
iter 203900: loss 6.5796, time 125.43ms
iter 203910: loss 6.2210, time 125.46ms
iter 203920: loss 6.6146, time 125.40ms
iter 203930: loss 6.0255, time 125.43ms
iter 203940: loss 5.9556, time 125.36ms
iter 203950: loss 6.0855, time 124.97ms
iter 203960: loss 6.3908, time 124.93ms
iter 203970: loss 6.0483, time 124.74ms
iter 203980: loss 6.2738, time 125.44ms
iter 203990: loss 5.8562, time 125.93ms
step 204000: train loss 5.8475, val loss 5.8137
saving checkpoint to out-shakespeare-char
iter 204000: loss 6.0780, time 2888.45ms
iter 204010: loss 6.5260, time 124.91ms
iter 204020: loss 6.7001, time 128.03ms
iter 204030: loss 6.5335, time 124.75ms
iter 204040: loss 6.9996, time 125.14ms
iter 204050: loss 6.1506, time 124.11ms
iter 204060: loss 6.5847, time 124.57ms
iter 204070: loss 7.0282, time 124.55ms
iter 204080: loss 6.0438, time 125.55ms
iter 204090: loss 6.6011, time 129.03ms
iter 204100: loss 6.9782, time 125.42ms
iter 204110: loss 6.1805, time 125.93ms
iter 204120: loss 6.3918, time 125.60ms
iter 204130: loss 5.9538, time 126.61ms
iter 204140: loss 5.8645, time 125.61ms
iter 204150: loss 6.4631, time 125.70ms
iter 204160: loss 5.8912, time 125.70ms
iter 204170: loss 6.6889, time 126.56ms
iter 204180: loss 5.7685, time 126.02ms
iter 204190: loss 7.8718, time 125.57ms
iter 204200: loss 7.0164, time 128.20ms
iter 204210: loss 6.0998, time 125.25ms
iter 204220: loss 5.8510, time 125.83ms
iter 204230: loss 5.5785, time 125.61ms
iter 204240: loss 6.7010, time 125.37ms
step 204250: train loss 5.8257, val loss 5.8230
saving checkpoint to out-shakespeare-char
iter 204250: loss 6.7563, time 2876.05ms
iter 204260: loss 7.1328, time 121.77ms
iter 204270: loss 6.5717, time 121.88ms
iter 204280: loss 6.6406, time 121.51ms
iter 204290: loss 7.2889, time 121.93ms
iter 204300: loss 6.5911, time 121.64ms
iter 204310: loss 6.6631, time 121.61ms
iter 204320: loss 6.5407, time 121.71ms
iter 204330: loss 6.5649, time 121.77ms
iter 204340: loss 6.9164, time 121.65ms
iter 204350: loss 6.5484, time 121.75ms
iter 204360: loss 6.8070, time 121.79ms
iter 204370: loss 5.5958, time 121.72ms
iter 204380: loss 5.9519, time 120.69ms
iter 204390: loss 6.4706, time 121.81ms
iter 204400: loss 6.7944, time 121.75ms
iter 204410: loss 5.3852, time 121.65ms
iter 204420: loss 6.1688, time 121.71ms
iter 204430: loss 7.2853, time 121.37ms
iter 204440: loss 6.3206, time 121.68ms
iter 204450: loss 5.6804, time 121.80ms
iter 204460: loss 7.0688, time 121.84ms
iter 204470: loss 6.5868, time 121.80ms
iter 204480: loss 6.2588, time 121.67ms
iter 204490: loss 7.0879, time 121.59ms
step 204500: train loss 5.8195, val loss 5.7865
saving checkpoint to out-shakespeare-char
iter 204500: loss 6.7782, time 2893.46ms
iter 204510: loss 6.1782, time 123.24ms
iter 204520: loss 6.1545, time 122.00ms
iter 204530: loss 6.3535, time 123.18ms
iter 204540: loss 6.6145, time 121.83ms
iter 204550: loss 6.9716, time 123.36ms
iter 204560: loss 5.9908, time 122.00ms
iter 204570: loss 6.7954, time 122.99ms
iter 204580: loss 6.4868, time 122.33ms
iter 204590: loss 5.8930, time 123.23ms
iter 204600: loss 5.9858, time 122.15ms
iter 204610: loss 6.0942, time 123.20ms
iter 204620: loss 6.7252, time 121.98ms
iter 204630: loss 6.0886, time 123.32ms
iter 204640: loss 6.0782, time 122.05ms
iter 204650: loss 6.2908, time 122.99ms
iter 204660: loss 6.1526, time 121.97ms
iter 204670: loss 6.9598, time 123.14ms
iter 204680: loss 6.0850, time 121.89ms
iter 204690: loss 6.8929, time 123.15ms
iter 204700: loss 5.9718, time 121.54ms
iter 204710: loss 6.1364, time 123.06ms
iter 204720: loss 5.8830, time 121.87ms
iter 204730: loss 6.1867, time 123.29ms
iter 204740: loss 6.4512, time 121.87ms
step 204750: train loss 5.8291, val loss 5.7947
saving checkpoint to out-shakespeare-char
iter 204750: loss 6.3265, time 2892.45ms
iter 204760: loss 6.8133, time 121.57ms
iter 204770: loss 6.6358, time 121.59ms
iter 204780: loss 6.1533, time 121.60ms
iter 204790: loss 7.0655, time 121.26ms
iter 204800: loss 5.7822, time 121.42ms
iter 204810: loss 5.7507, time 121.64ms
iter 204820: loss 6.2825, time 121.63ms
iter 204830: loss 6.1079, time 121.57ms
iter 204840: loss 6.8890, time 121.43ms
iter 204850: loss 6.4297, time 121.55ms
iter 204860: loss 5.9634, time 121.44ms
iter 204870: loss 5.6496, time 121.40ms
iter 204880: loss 6.4271, time 121.38ms
iter 204890: loss 6.4413, time 127.33ms
iter 204900: loss 5.9707, time 125.18ms
iter 204910: loss 6.6934, time 125.19ms
iter 204920: loss 6.1657, time 124.61ms
iter 204930: loss 6.5158, time 125.13ms
iter 204940: loss 7.0684, time 125.29ms
iter 204950: loss 5.8035, time 125.53ms
iter 204960: loss 5.3379, time 125.89ms
iter 204970: loss 5.7883, time 125.67ms
iter 204980: loss 6.8206, time 124.36ms
iter 204990: loss 6.6289, time 123.74ms
step 205000: train loss 5.8034, val loss 5.8307
saving checkpoint to out-shakespeare-char
iter 205000: loss 6.5634, time 2889.42ms
iter 205010: loss 7.1669, time 125.44ms
iter 205020: loss 5.5617, time 125.12ms
iter 205030: loss 5.9423, time 125.50ms
iter 205040: loss 5.8683, time 125.22ms
iter 205050: loss 6.9009, time 125.61ms
iter 205060: loss 6.4266, time 125.09ms
iter 205070: loss 6.0200, time 124.62ms
iter 205080: loss 6.3341, time 125.37ms
iter 205090: loss 6.5640, time 126.16ms
iter 205100: loss 6.5005, time 128.24ms
iter 205110: loss 6.3670, time 125.46ms
iter 205120: loss 5.3401, time 125.39ms
iter 205130: loss 6.1385, time 124.14ms
iter 205140: loss 7.1452, time 128.44ms
iter 205150: loss 6.1789, time 125.35ms
iter 205160: loss 6.5507, time 125.48ms
iter 205170: loss 5.9975, time 124.79ms
iter 205180: loss 5.7370, time 125.43ms
iter 205190: loss 6.5043, time 125.53ms
iter 205200: loss 6.3453, time 124.00ms
iter 205210: loss 5.2241, time 124.17ms
iter 205220: loss 6.4511, time 124.86ms
iter 205230: loss 6.8822, time 125.78ms
iter 205240: loss 6.1238, time 124.80ms
step 205250: train loss 5.8221, val loss 5.7716
saving checkpoint to out-shakespeare-char
iter 205250: loss 6.7227, time 2895.18ms
iter 205260: loss 6.5680, time 125.32ms
iter 205270: loss 6.3597, time 124.63ms
iter 205280: loss 6.9498, time 125.64ms
iter 205290: loss 6.8447, time 124.59ms
iter 205300: loss 6.9461, time 125.33ms
iter 205310: loss 6.2056, time 128.14ms
iter 205320: loss 6.8038, time 125.28ms
iter 205330: loss 6.3491, time 125.27ms
iter 205340: loss 5.8219, time 125.39ms
iter 205350: loss 6.5656, time 124.24ms
iter 205360: loss 6.4478, time 125.26ms
iter 205370: loss 6.5959, time 125.58ms
iter 205380: loss 6.4734, time 125.41ms
iter 205390: loss 6.8685, time 125.06ms
iter 205400: loss 6.4943, time 125.32ms
iter 205410: loss 6.3234, time 125.30ms
iter 205420: loss 6.3012, time 128.43ms
iter 205430: loss 6.9394, time 125.28ms
iter 205440: loss 6.9149, time 125.19ms
iter 205450: loss 6.4671, time 125.58ms
iter 205460: loss 6.6625, time 125.20ms
iter 205470: loss 6.1816, time 125.69ms
iter 205480: loss 6.6643, time 125.14ms
iter 205490: loss 6.2800, time 125.81ms
step 205500: train loss 5.8277, val loss 5.7690
saving checkpoint to out-shakespeare-char
iter 205500: loss 6.3838, time 2903.27ms
iter 205510: loss 5.6468, time 125.53ms
iter 205520: loss 5.9105, time 124.84ms
iter 205530: loss 6.8744, time 125.55ms
iter 205540: loss 7.1257, time 125.31ms
iter 205550: loss 6.7667, time 125.61ms
iter 205560: loss 6.7817, time 125.25ms
iter 205570: loss 5.9481, time 125.51ms
iter 205580: loss 6.1242, time 125.61ms
iter 205590: loss 6.7028, time 128.24ms
iter 205600: loss 6.3060, time 125.22ms
iter 205610: loss 7.4498, time 124.50ms
iter 205620: loss 6.3323, time 125.45ms
iter 205630: loss 5.3244, time 125.34ms
iter 205640: loss 5.7945, time 125.37ms
iter 205650: loss 6.3986, time 124.53ms
iter 205660: loss 6.7707, time 125.08ms
iter 205670: loss 6.7032, time 125.28ms
iter 205680: loss 6.1555, time 125.35ms
iter 205690: loss 6.1677, time 125.84ms
iter 205700: loss 6.5325, time 121.05ms
iter 205710: loss 6.4366, time 120.04ms
iter 205720: loss 6.1685, time 120.77ms
iter 205730: loss 6.5261, time 119.97ms
iter 205740: loss 6.4995, time 120.06ms
step 205750: train loss 5.8601, val loss 5.8078
saving checkpoint to out-shakespeare-char
iter 205750: loss 6.2350, time 2896.82ms
iter 205760: loss 6.6906, time 120.51ms
iter 205770: loss 6.5866, time 119.72ms
iter 205780: loss 6.6893, time 119.56ms
iter 205790: loss 5.9177, time 119.57ms
iter 205800: loss 6.5807, time 119.65ms
iter 205810: loss 6.6827, time 120.00ms
iter 205820: loss 6.0202, time 120.62ms
iter 205830: loss 6.1215, time 119.61ms
iter 205840: loss 7.1621, time 119.54ms
iter 205850: loss 6.1186, time 119.54ms
iter 205860: loss 7.3265, time 120.27ms
iter 205870: loss 6.6329, time 119.55ms
iter 205880: loss 5.7091, time 119.91ms
iter 205890: loss 5.8442, time 120.17ms
iter 205900: loss 6.7341, time 120.66ms
iter 205910: loss 6.5447, time 120.84ms
iter 205920: loss 5.9462, time 120.57ms
iter 205930: loss 6.5130, time 119.73ms
iter 205940: loss 6.0760, time 119.68ms
iter 205950: loss 5.6929, time 119.07ms
iter 205960: loss 6.4592, time 119.99ms
iter 205970: loss 6.9623, time 119.56ms
iter 205980: loss 6.9100, time 119.46ms
iter 205990: loss 6.7930, time 120.44ms
step 206000: train loss 5.7699, val loss 5.8144
saving checkpoint to out-shakespeare-char
iter 206000: loss 5.3628, time 2881.99ms
iter 206010: loss 6.6127, time 120.48ms
iter 206020: loss 6.8370, time 119.73ms
iter 206030: loss 6.6251, time 119.77ms
iter 206040: loss 6.5815, time 119.77ms
iter 206050: loss 5.9710, time 119.60ms
iter 206060: loss 6.3010, time 119.65ms
iter 206070: loss 6.2826, time 120.32ms
iter 206080: loss 6.4178, time 120.53ms
iter 206090: loss 6.7552, time 119.54ms
iter 206100: loss 5.7997, time 120.36ms
iter 206110: loss 5.9848, time 119.45ms
iter 206120: loss 6.2774, time 119.71ms
iter 206130: loss 5.8824, time 119.77ms
iter 206140: loss 6.3576, time 119.68ms
iter 206150: loss 6.1627, time 119.76ms
iter 206160: loss 6.6886, time 120.29ms
iter 206170: loss 6.3766, time 119.90ms
iter 206180: loss 6.4703, time 119.58ms
iter 206190: loss 6.0641, time 121.16ms
iter 206200: loss 6.8558, time 121.87ms
iter 206210: loss 5.3776, time 121.72ms
iter 206220: loss 6.7578, time 121.62ms
iter 206230: loss 6.3555, time 121.70ms
iter 206240: loss 7.1574, time 120.49ms
step 206250: train loss 5.8154, val loss 5.8068
saving checkpoint to out-shakespeare-char
iter 206250: loss 7.5496, time 2873.71ms
iter 206260: loss 6.7700, time 121.64ms
iter 206270: loss 5.9641, time 124.49ms
iter 206280: loss 6.1883, time 121.87ms
iter 206290: loss 6.6801, time 123.74ms
iter 206300: loss 5.8932, time 121.57ms
iter 206310: loss 6.1963, time 124.75ms
iter 206320: loss 5.8834, time 121.52ms
iter 206330: loss 6.3177, time 124.29ms
iter 206340: loss 6.0507, time 121.66ms
iter 206350: loss 5.9679, time 123.32ms
iter 206360: loss 5.8948, time 121.06ms
iter 206370: loss 5.5238, time 124.24ms
iter 206380: loss 6.2376, time 121.46ms
iter 206390: loss 6.3262, time 122.48ms
iter 206400: loss 6.4783, time 119.70ms
iter 206410: loss 5.7486, time 123.57ms
iter 206420: loss 6.0211, time 118.84ms
iter 206430: loss 5.3558, time 122.49ms
iter 206440: loss 6.6580, time 120.58ms
iter 206450: loss 6.4650, time 122.42ms
iter 206460: loss 6.3509, time 120.30ms
iter 206470: loss 5.7042, time 122.52ms
iter 206480: loss 6.5423, time 119.65ms
iter 206490: loss 6.4197, time 123.61ms
step 206500: train loss 5.8486, val loss 5.8189
saving checkpoint to out-shakespeare-char
iter 206500: loss 6.3727, time 2890.68ms
iter 206510: loss 6.4879, time 119.68ms
iter 206520: loss 5.2194, time 121.28ms
iter 206530: loss 6.4372, time 119.73ms
iter 206540: loss 5.4064, time 120.67ms
iter 206550: loss 5.6078, time 119.91ms
iter 206560: loss 6.3584, time 120.79ms
iter 206570: loss 5.9914, time 120.22ms
iter 206580: loss 6.4938, time 120.66ms
iter 206590: loss 5.9172, time 119.63ms
iter 206600: loss 6.0785, time 120.64ms
iter 206610: loss 5.8672, time 119.28ms
iter 206620: loss 6.1241, time 120.57ms
iter 206630: loss 6.5960, time 119.58ms
iter 206640: loss 6.5233, time 120.70ms
iter 206650: loss 5.9377, time 119.57ms
iter 206660: loss 6.1766, time 120.72ms
iter 206670: loss 5.9894, time 120.58ms
iter 206680: loss 6.7723, time 120.67ms
iter 206690: loss 5.7821, time 118.81ms
iter 206700: loss 5.9545, time 120.76ms
iter 206710: loss 6.9980, time 119.70ms
iter 206720: loss 6.0923, time 120.72ms
iter 206730: loss 6.6848, time 119.42ms
iter 206740: loss 5.7846, time 120.68ms
step 206750: train loss 5.7958, val loss 5.8139
saving checkpoint to out-shakespeare-char
iter 206750: loss 5.7909, time 2888.94ms
iter 206760: loss 6.5292, time 121.29ms
iter 206770: loss 6.8084, time 123.05ms
iter 206780: loss 6.0697, time 120.99ms
iter 206790: loss 6.7892, time 121.49ms
iter 206800: loss 5.7995, time 121.63ms
iter 206810: loss 6.0249, time 121.61ms
iter 206820: loss 6.3042, time 122.05ms
iter 206830: loss 6.2558, time 121.56ms
iter 206840: loss 6.9443, time 121.21ms
iter 206850: loss 6.3150, time 121.43ms
iter 206860: loss 6.4964, time 121.89ms
iter 206870: loss 6.5872, time 121.46ms
iter 206880: loss 6.3706, time 121.63ms
iter 206890: loss 6.8031, time 121.33ms
iter 206900: loss 6.3117, time 121.94ms
iter 206910: loss 6.6306, time 121.58ms
iter 206920: loss 6.6199, time 121.61ms
iter 206930: loss 6.3511, time 121.47ms
iter 206940: loss 5.5357, time 121.66ms
iter 206950: loss 6.0385, time 121.54ms
iter 206960: loss 7.3003, time 121.61ms
iter 206970: loss 6.2416, time 121.53ms
iter 206980: loss 6.1636, time 121.62ms
iter 206990: loss 6.4863, time 121.53ms
step 207000: train loss 5.8017, val loss 5.7924
saving checkpoint to out-shakespeare-char
iter 207000: loss 6.3210, time 2883.52ms
iter 207010: loss 6.2629, time 124.25ms
iter 207020: loss 7.0282, time 122.06ms
iter 207030: loss 6.3867, time 124.36ms
iter 207040: loss 6.5928, time 121.56ms
iter 207050: loss 6.8760, time 124.46ms
iter 207060: loss 6.7751, time 121.47ms
iter 207070: loss 5.5586, time 123.73ms
iter 207080: loss 6.7165, time 121.51ms
iter 207090: loss 6.2908, time 124.43ms
iter 207100: loss 6.8134, time 120.74ms
iter 207110: loss 6.5823, time 124.28ms
iter 207120: loss 6.0730, time 121.52ms
iter 207130: loss 6.1564, time 124.09ms
iter 207140: loss 5.8340, time 122.05ms
iter 207150: loss 6.0601, time 123.46ms
iter 207160: loss 6.9663, time 119.63ms
iter 207170: loss 5.7110, time 123.76ms
iter 207180: loss 6.2683, time 121.48ms
iter 207190: loss 6.2262, time 123.93ms
iter 207200: loss 6.2726, time 121.99ms
iter 207210: loss 6.5733, time 124.38ms
iter 207220: loss 5.8414, time 121.46ms
iter 207230: loss 5.8660, time 124.32ms
iter 207240: loss 6.0825, time 122.17ms
step 207250: train loss 5.8034, val loss 5.8325
saving checkpoint to out-shakespeare-char
iter 207250: loss 6.1272, time 2891.70ms
iter 207260: loss 6.6366, time 121.09ms
iter 207270: loss 6.3367, time 121.86ms
iter 207280: loss 6.7146, time 121.98ms
iter 207290: loss 6.8733, time 121.77ms
iter 207300: loss 6.4383, time 121.86ms
iter 207310: loss 6.7765, time 121.56ms
iter 207320: loss 6.1973, time 121.75ms
iter 207330: loss 6.4521, time 121.56ms
iter 207340: loss 6.6389, time 121.41ms
iter 207350: loss 6.0082, time 121.55ms
iter 207360: loss 6.6810, time 121.53ms
iter 207370: loss 6.8606, time 121.50ms
iter 207380: loss 6.3378, time 121.78ms
iter 207390: loss 6.8835, time 121.90ms
iter 207400: loss 6.6068, time 121.49ms
iter 207410: loss 6.5991, time 121.50ms
iter 207420: loss 6.5965, time 121.70ms
iter 207430: loss 5.4160, time 121.62ms
iter 207440: loss 6.2927, time 121.55ms
iter 207450: loss 7.3750, time 121.58ms
iter 207460: loss 6.4095, time 121.56ms
iter 207470: loss 6.3678, time 121.47ms
iter 207480: loss 5.9508, time 121.69ms
iter 207490: loss 7.1010, time 121.57ms
step 207500: train loss 5.7707, val loss 5.8649
saving checkpoint to out-shakespeare-char
iter 207500: loss 6.1540, time 2886.66ms
iter 207510: loss 6.6776, time 121.47ms
iter 207520: loss 6.6018, time 121.65ms
iter 207530: loss 6.7192, time 121.54ms
iter 207540: loss 6.9230, time 121.48ms
iter 207550: loss 6.0756, time 121.34ms
iter 207560: loss 6.7032, time 121.44ms
iter 207570: loss 6.1990, time 121.69ms
iter 207580: loss 6.1187, time 121.73ms
iter 207590: loss 5.9243, time 121.43ms
iter 207600: loss 6.7398, time 121.46ms
iter 207610: loss 6.2659, time 121.41ms
iter 207620: loss 5.9940, time 121.98ms
iter 207630: loss 6.7267, time 121.51ms
iter 207640: loss 6.9090, time 121.60ms
iter 207650: loss 5.6546, time 121.44ms
iter 207660: loss 6.3322, time 121.10ms
iter 207670: loss 6.2579, time 121.48ms
iter 207680: loss 6.5897, time 122.12ms
iter 207690: loss 5.6659, time 121.51ms
iter 207700: loss 6.2351, time 121.46ms
iter 207710: loss 5.9218, time 121.73ms
iter 207720: loss 6.8193, time 121.55ms
iter 207730: loss 6.5474, time 122.47ms
iter 207740: loss 6.2393, time 121.96ms
step 207750: train loss 5.8160, val loss 5.8301
saving checkpoint to out-shakespeare-char
iter 207750: loss 6.6783, time 2881.73ms
iter 207760: loss 6.1880, time 121.72ms
iter 207770: loss 6.5396, time 121.48ms
iter 207780: loss 6.7375, time 121.42ms
iter 207790: loss 5.9090, time 121.54ms
iter 207800: loss 6.8647, time 121.55ms
iter 207810: loss 6.7753, time 121.71ms
iter 207820: loss 5.9421, time 121.45ms
iter 207830: loss 6.8623, time 121.51ms
iter 207840: loss 6.6209, time 121.42ms
iter 207850: loss 6.2133, time 121.12ms
iter 207860: loss 6.5612, time 121.76ms
iter 207870: loss 6.1681, time 121.67ms
iter 207880: loss 7.0247, time 121.50ms
iter 207890: loss 6.6971, time 121.40ms
iter 207900: loss 5.7355, time 121.75ms
iter 207910: loss 6.4776, time 121.44ms
iter 207920: loss 6.4315, time 121.43ms
iter 207930: loss 6.3149, time 120.84ms
iter 207940: loss 6.7130, time 121.41ms
iter 207950: loss 7.0643, time 121.55ms
iter 207960: loss 5.8375, time 121.34ms
iter 207970: loss 6.4794, time 120.77ms
iter 207980: loss 5.5646, time 121.29ms
iter 207990: loss 6.1474, time 121.38ms
step 208000: train loss 5.8337, val loss 5.7995
saving checkpoint to out-shakespeare-char
iter 208000: loss 6.5372, time 2884.65ms
iter 208010: loss 6.0997, time 121.31ms
iter 208020: loss 7.3106, time 121.62ms
iter 208030: loss 5.7358, time 121.51ms
iter 208040: loss 6.1270, time 121.80ms
iter 208050: loss 6.7647, time 121.43ms
iter 208060: loss 6.1640, time 121.89ms
iter 208070: loss 6.8658, time 121.69ms
iter 208080: loss 6.9028, time 121.57ms
iter 208090: loss 6.1009, time 121.29ms
iter 208100: loss 6.4977, time 121.70ms
iter 208110: loss 6.0796, time 121.69ms
iter 208120: loss 6.6243, time 121.56ms
iter 208130: loss 6.8525, time 121.46ms
iter 208140: loss 6.5032, time 120.15ms
iter 208150: loss 6.6551, time 121.45ms
iter 208160: loss 6.7248, time 121.47ms
iter 208170: loss 6.9367, time 121.32ms
iter 208180: loss 5.9091, time 121.40ms
iter 208190: loss 6.0392, time 121.35ms
iter 208200: loss 6.0163, time 121.52ms
iter 208210: loss 6.6159, time 121.26ms
iter 208220: loss 5.7158, time 121.41ms
iter 208230: loss 6.2250, time 121.58ms
iter 208240: loss 5.7169, time 121.43ms
step 208250: train loss 5.7775, val loss 5.8143
saving checkpoint to out-shakespeare-char
iter 208250: loss 6.6836, time 2875.59ms
iter 208260: loss 6.8870, time 121.74ms
iter 208270: loss 5.7894, time 121.20ms
iter 208280: loss 5.1663, time 121.48ms
iter 208290: loss 6.6003, time 121.39ms
iter 208300: loss 6.2908, time 121.40ms
iter 208310: loss 6.1169, time 121.62ms
iter 208320: loss 7.1701, time 121.62ms
iter 208330: loss 5.7966, time 121.55ms
iter 208340: loss 6.6259, time 121.73ms
iter 208350: loss 5.4921, time 121.70ms
iter 208360: loss 5.5604, time 121.54ms
iter 208370: loss 6.2593, time 121.32ms
iter 208380: loss 6.0694, time 121.12ms
iter 208390: loss 6.1344, time 121.58ms
iter 208400: loss 7.0712, time 121.49ms
iter 208410: loss 5.8139, time 121.70ms
iter 208420: loss 6.1945, time 121.51ms
iter 208430: loss 5.8436, time 121.60ms
iter 208440: loss 6.0711, time 121.58ms
iter 208450: loss 5.9326, time 121.58ms
iter 208460: loss 6.1028, time 121.45ms
iter 208470: loss 6.2855, time 121.67ms
iter 208480: loss 5.7581, time 121.47ms
iter 208490: loss 6.0996, time 121.03ms
step 208500: train loss 5.8593, val loss 5.8537
saving checkpoint to out-shakespeare-char
iter 208500: loss 6.0286, time 2882.14ms
iter 208510: loss 6.7160, time 122.44ms
iter 208520: loss 6.4237, time 121.57ms
iter 208530: loss 6.8016, time 120.97ms
iter 208540: loss 6.8903, time 121.63ms
iter 208550: loss 6.3131, time 121.54ms
iter 208560: loss 6.0384, time 121.49ms
iter 208570: loss 7.0914, time 121.47ms
iter 208580: loss 6.6345, time 121.44ms
iter 208590: loss 6.6065, time 121.29ms
iter 208600: loss 5.6665, time 121.66ms
iter 208610: loss 6.5869, time 121.19ms
iter 208620: loss 7.1399, time 121.47ms
iter 208630: loss 6.2095, time 121.17ms
iter 208640: loss 6.1052, time 121.29ms
iter 208650: loss 5.7920, time 121.56ms
iter 208660: loss 6.6567, time 121.54ms
iter 208670: loss 6.6973, time 121.64ms
iter 208680: loss 6.6656, time 121.49ms
iter 208690: loss 6.1393, time 121.60ms
iter 208700: loss 6.2348, time 121.47ms
iter 208710: loss 6.8471, time 121.57ms
iter 208720: loss 6.2111, time 121.66ms
iter 208730: loss 5.4225, time 121.41ms
iter 208740: loss 5.5997, time 121.70ms
step 208750: train loss 5.7982, val loss 5.8447
saving checkpoint to out-shakespeare-char
iter 208750: loss 6.7283, time 2896.23ms
iter 208760: loss 6.9457, time 121.86ms
iter 208770: loss 6.8861, time 124.95ms
iter 208780: loss 6.2016, time 121.93ms
iter 208790: loss 6.5494, time 124.96ms
iter 208800: loss 6.5802, time 121.68ms
iter 208810: loss 5.9789, time 124.98ms
iter 208820: loss 6.3503, time 121.93ms
iter 208830: loss 7.0158, time 124.64ms
iter 208840: loss 6.1295, time 121.95ms
iter 208850: loss 6.0990, time 124.50ms
iter 208860: loss 7.2732, time 121.79ms
iter 208870: loss 6.2760, time 124.64ms
iter 208880: loss 6.3961, time 121.77ms
iter 208890: loss 5.3875, time 124.29ms
iter 208900: loss 6.2127, time 121.49ms
iter 208910: loss 6.4417, time 124.32ms
iter 208920: loss 6.8179, time 121.40ms
iter 208930: loss 6.2800, time 124.26ms
iter 208940: loss 6.2524, time 121.41ms
iter 208950: loss 6.5779, time 124.52ms
iter 208960: loss 7.2721, time 121.43ms
iter 208970: loss 6.4747, time 123.94ms
iter 208980: loss 5.5801, time 121.43ms
iter 208990: loss 6.0210, time 124.34ms
step 209000: train loss 5.8044, val loss 5.8242
saving checkpoint to out-shakespeare-char
iter 209000: loss 6.7719, time 2882.44ms
iter 209010: loss 6.8542, time 121.45ms
iter 209020: loss 6.4122, time 121.75ms
iter 209030: loss 6.2396, time 121.43ms
iter 209040: loss 6.4061, time 121.55ms
iter 209050: loss 7.3297, time 121.28ms
iter 209060: loss 6.7716, time 121.38ms
iter 209070: loss 6.5983, time 121.78ms
iter 209080: loss 6.6698, time 121.20ms
iter 209090: loss 5.5230, time 121.98ms
iter 209100: loss 5.9982, time 121.37ms
iter 209110: loss 6.7306, time 121.29ms
iter 209120: loss 6.3802, time 121.42ms
iter 209130: loss 6.3675, time 121.35ms
iter 209140: loss 6.4112, time 121.52ms
iter 209150: loss 5.9387, time 121.45ms
iter 209160: loss 5.6078, time 121.34ms
iter 209170: loss 6.0329, time 121.08ms
iter 209180: loss 6.5807, time 121.30ms
iter 209190: loss 5.8814, time 121.25ms
iter 209200: loss 6.3269, time 122.03ms
iter 209210: loss 7.0677, time 121.26ms
iter 209220: loss 5.5888, time 121.32ms
iter 209230: loss 5.9860, time 121.55ms
iter 209240: loss 6.4185, time 121.54ms
step 209250: train loss 5.7731, val loss 5.8492
saving checkpoint to out-shakespeare-char
iter 209250: loss 5.9480, time 2877.89ms
iter 209260: loss 6.7531, time 121.52ms
iter 209270: loss 6.8706, time 120.75ms
iter 209280: loss 5.8030, time 121.76ms
iter 209290: loss 6.7242, time 121.49ms
iter 209300: loss 5.6584, time 121.45ms
iter 209310: loss 6.2579, time 121.34ms
iter 209320: loss 6.4450, time 121.18ms
iter 209330: loss 6.5571, time 121.41ms
iter 209340: loss 5.7172, time 121.61ms
iter 209350: loss 6.4328, time 121.55ms
iter 209360: loss 6.1841, time 121.55ms
iter 209370: loss 6.3152, time 121.53ms
iter 209380: loss 6.0742, time 121.41ms
iter 209390: loss 5.8267, time 121.42ms
iter 209400: loss 5.6746, time 121.48ms
iter 209410: loss 5.7901, time 121.40ms
iter 209420: loss 6.3124, time 121.48ms
iter 209430: loss 5.7047, time 121.53ms
iter 209440: loss 6.2931, time 120.65ms
iter 209450: loss 6.7163, time 121.58ms
iter 209460: loss 6.2150, time 121.34ms
iter 209470: loss 6.0060, time 121.39ms
iter 209480: loss 5.9580, time 121.37ms
iter 209490: loss 6.0812, time 121.60ms
step 209500: train loss 5.8058, val loss 5.7686
saving checkpoint to out-shakespeare-char
iter 209500: loss 6.3085, time 2892.80ms
iter 209510: loss 5.8886, time 120.83ms
iter 209520: loss 6.1688, time 124.36ms
iter 209530: loss 5.5718, time 121.97ms
iter 209540: loss 6.2834, time 124.34ms
iter 209550: loss 6.9301, time 121.54ms
iter 209560: loss 6.5924, time 124.62ms
iter 209570: loss 5.9376, time 121.48ms
iter 209580: loss 5.7657, time 124.25ms
iter 209590: loss 6.0510, time 121.43ms
iter 209600: loss 6.4345, time 124.28ms
iter 209610: loss 6.1625, time 121.40ms
iter 209620: loss 6.0171, time 124.64ms
iter 209630: loss 6.6703, time 121.57ms
iter 209640: loss 6.9987, time 124.24ms
iter 209650: loss 6.5111, time 121.46ms
iter 209660: loss 6.6064, time 124.40ms
iter 209670: loss 6.2571, time 121.37ms
iter 209680: loss 5.8588, time 124.41ms
iter 209690: loss 6.2621, time 121.36ms
iter 209700: loss 5.8682, time 124.38ms
iter 209710: loss 6.7193, time 121.38ms
iter 209720: loss 5.6557, time 124.38ms
iter 209730: loss 6.2647, time 121.50ms
iter 209740: loss 6.3692, time 126.08ms
step 209750: train loss 5.7993, val loss 5.8545
saving checkpoint to out-shakespeare-char
iter 209750: loss 6.0710, time 2896.80ms
iter 209760: loss 5.9727, time 125.71ms
iter 209770: loss 6.3338, time 125.27ms
iter 209780: loss 6.5642, time 125.94ms
iter 209790: loss 6.2106, time 125.63ms
iter 209800: loss 6.0157, time 126.40ms
iter 209810: loss 6.7788, time 126.43ms
iter 209820: loss 7.0880, time 128.63ms
iter 209830: loss 6.1690, time 125.42ms
iter 209840: loss 6.2959, time 125.29ms
iter 209850: loss 6.0531, time 125.26ms
iter 209860: loss 5.8987, time 125.09ms
iter 209870: loss 5.7360, time 125.51ms
iter 209880: loss 6.5592, time 125.65ms
iter 209890: loss 6.1393, time 124.67ms
iter 209900: loss 6.4779, time 125.60ms
iter 209910: loss 6.8388, time 125.81ms
iter 209920: loss 6.6646, time 125.81ms
iter 209930: loss 5.3229, time 128.47ms
iter 209940: loss 6.1438, time 126.80ms
iter 209950: loss 6.7165, time 127.16ms
iter 209960: loss 6.9984, time 125.90ms
iter 209970: loss 6.5015, time 125.94ms
iter 209980: loss 6.6890, time 126.12ms
iter 209990: loss 6.4902, time 125.85ms
step 210000: train loss 5.7857, val loss 5.8680
saving checkpoint to out-shakespeare-char
iter 210000: loss 6.5352, time 2888.62ms
iter 210010: loss 6.7467, time 125.09ms
iter 210020: loss 7.1204, time 125.73ms
iter 210030: loss 6.7462, time 125.49ms
iter 210040: loss 6.6990, time 125.67ms
iter 210050: loss 6.5252, time 125.68ms
iter 210060: loss 7.0199, time 125.87ms
iter 210070: loss 6.5866, time 125.82ms
iter 210080: loss 6.3623, time 125.84ms
iter 210090: loss 6.9754, time 125.76ms
iter 210100: loss 6.4111, time 128.63ms
iter 210110: loss 6.6057, time 125.64ms
iter 210120: loss 6.6040, time 125.66ms
iter 210130: loss 6.4316, time 125.04ms
iter 210140: loss 6.3675, time 126.35ms
iter 210150: loss 6.6162, time 125.77ms
iter 210160: loss 6.5405, time 125.46ms
iter 210170: loss 5.9782, time 125.37ms
iter 210180: loss 6.1981, time 125.75ms
iter 210190: loss 6.4280, time 125.91ms
iter 210200: loss 6.1983, time 125.92ms
iter 210210: loss 6.0052, time 128.66ms
iter 210220: loss 5.9429, time 125.80ms
iter 210230: loss 6.7730, time 125.28ms
iter 210240: loss 6.7853, time 125.49ms
step 210250: train loss 5.8391, val loss 5.8438
saving checkpoint to out-shakespeare-char
iter 210250: loss 6.3374, time 2865.55ms
iter 210260: loss 6.5799, time 125.86ms
iter 210270: loss 6.0642, time 125.25ms
iter 210280: loss 6.6512, time 125.42ms
iter 210290: loss 5.4804, time 124.43ms
iter 210300: loss 5.9722, time 126.15ms
iter 210310: loss 6.4888, time 128.27ms
iter 210320: loss 5.9113, time 125.63ms
iter 210330: loss 6.1472, time 126.02ms
iter 210340: loss 6.4318, time 126.30ms
iter 210350: loss 6.4490, time 125.84ms
iter 210360: loss 6.7933, time 125.90ms
iter 210370: loss 6.6255, time 125.82ms
iter 210380: loss 5.8022, time 128.11ms
iter 210390: loss 6.3289, time 125.64ms
iter 210400: loss 5.7171, time 125.96ms
iter 210410: loss 6.5476, time 125.32ms
iter 210420: loss 6.0530, time 126.38ms
iter 210430: loss 5.8315, time 125.96ms
iter 210440: loss 6.6415, time 125.20ms
iter 210450: loss 6.5837, time 125.64ms
iter 210460: loss 6.7756, time 125.24ms
iter 210470: loss 5.7863, time 125.28ms
iter 210480: loss 6.0701, time 125.58ms
iter 210490: loss 7.0724, time 128.42ms
step 210500: train loss 5.8398, val loss 5.8094
saving checkpoint to out-shakespeare-char
iter 210500: loss 6.1536, time 2898.22ms
iter 210510: loss 6.5664, time 125.29ms
iter 210520: loss 6.2410, time 125.72ms
iter 210530: loss 6.0159, time 125.99ms
iter 210540: loss 6.5599, time 125.95ms
iter 210550: loss 5.8170, time 125.89ms
iter 210560: loss 6.3583, time 125.87ms
iter 210570: loss 6.5621, time 126.03ms
iter 210580: loss 5.8474, time 126.03ms
iter 210590: loss 7.4335, time 128.75ms
iter 210600: loss 6.5949, time 125.76ms
iter 210610: loss 6.3894, time 125.74ms
iter 210620: loss 5.9351, time 125.31ms
iter 210630: loss 6.2552, time 124.97ms
iter 210640: loss 6.0692, time 126.07ms
iter 210650: loss 6.4956, time 125.80ms
iter 210660: loss 6.7801, time 124.99ms
iter 210670: loss 6.2518, time 125.86ms
iter 210680: loss 6.5530, time 125.72ms
iter 210690: loss 7.5565, time 126.22ms
iter 210700: loss 6.2437, time 128.75ms
iter 210710: loss 6.4831, time 125.74ms
iter 210720: loss 6.0796, time 127.61ms
iter 210730: loss 6.7169, time 124.72ms
iter 210740: loss 6.1065, time 125.46ms
step 210750: train loss 5.7950, val loss 5.8157
saving checkpoint to out-shakespeare-char
iter 210750: loss 6.7916, time 2853.23ms
iter 210760: loss 5.9800, time 126.30ms
iter 210770: loss 6.2690, time 128.03ms
iter 210780: loss 5.7823, time 125.81ms
iter 210790: loss 6.2507, time 124.50ms
iter 210800: loss 6.2462, time 125.32ms
iter 210810: loss 6.2343, time 124.99ms
iter 210820: loss 6.0390, time 125.49ms
iter 210830: loss 6.1211, time 125.14ms
iter 210840: loss 6.5204, time 128.30ms
iter 210850: loss 6.3138, time 125.04ms
iter 210860: loss 6.9597, time 125.32ms
iter 210870: loss 5.9534, time 126.23ms
iter 210880: loss 6.0144, time 125.00ms
iter 210890: loss 6.4061, time 125.79ms
iter 210900: loss 5.9180, time 125.00ms
iter 210910: loss 6.1051, time 125.78ms
iter 210920: loss 6.0638, time 125.41ms
iter 210930: loss 6.8227, time 125.69ms
iter 210940: loss 6.3050, time 125.94ms
iter 210950: loss 6.0652, time 124.90ms
iter 210960: loss 6.6843, time 126.06ms
iter 210970: loss 6.6518, time 125.12ms
iter 210980: loss 6.5518, time 125.24ms
iter 210990: loss 6.9510, time 127.60ms
step 211000: train loss 5.8145, val loss 5.8394
saving checkpoint to out-shakespeare-char
iter 211000: loss 6.0998, time 2892.25ms
iter 211010: loss 6.8688, time 125.17ms
iter 211020: loss 6.7243, time 125.66ms
iter 211030: loss 6.3810, time 125.08ms
iter 211040: loss 6.0900, time 126.06ms
iter 211050: loss 6.2677, time 125.12ms
iter 211060: loss 6.0169, time 125.46ms
iter 211070: loss 6.4525, time 125.81ms
iter 211080: loss 6.0754, time 124.61ms
iter 211090: loss 6.3490, time 125.77ms
iter 211100: loss 6.1264, time 125.51ms
iter 211110: loss 6.4694, time 125.46ms
iter 211120: loss 6.1725, time 125.21ms
iter 211130: loss 5.2991, time 126.05ms
iter 211140: loss 6.7990, time 125.00ms
iter 211150: loss 6.7285, time 125.66ms
iter 211160: loss 6.9607, time 125.60ms
iter 211170: loss 6.8381, time 128.53ms
iter 211180: loss 6.1373, time 125.34ms
iter 211190: loss 6.0131, time 125.53ms
iter 211200: loss 5.8208, time 125.95ms
iter 211210: loss 7.0540, time 125.96ms
iter 211220: loss 6.3138, time 125.61ms
iter 211230: loss 6.6697, time 125.91ms
iter 211240: loss 6.5976, time 126.19ms
step 211250: train loss 5.8727, val loss 5.7589
saving checkpoint to out-shakespeare-char
iter 211250: loss 6.4931, time 2898.43ms
iter 211260: loss 6.9582, time 125.62ms
iter 211270: loss 7.0824, time 125.19ms
iter 211280: loss 6.9061, time 128.96ms
iter 211290: loss 6.7993, time 125.77ms
iter 211300: loss 6.2248, time 126.05ms
iter 211310: loss 6.1221, time 125.48ms
iter 211320: loss 5.5138, time 125.89ms
iter 211330: loss 6.5074, time 125.52ms
iter 211340: loss 6.6269, time 125.40ms
iter 211350: loss 6.9454, time 125.58ms
iter 211360: loss 6.6046, time 125.51ms
iter 211370: loss 6.8048, time 125.59ms
iter 211380: loss 6.0216, time 125.52ms
iter 211390: loss 6.0880, time 128.29ms
iter 211400: loss 6.5574, time 125.69ms
iter 211410: loss 6.6876, time 126.41ms
iter 211420: loss 6.1734, time 126.13ms
iter 211430: loss 6.1296, time 128.42ms
iter 211440: loss 5.9656, time 126.19ms
iter 211450: loss 6.7734, time 125.39ms
iter 211460: loss 6.3201, time 127.60ms
iter 211470: loss 6.4088, time 123.48ms
iter 211480: loss 6.5918, time 125.63ms
iter 211490: loss 5.7396, time 125.78ms
step 211500: train loss 5.8310, val loss 5.8024
saving checkpoint to out-shakespeare-char
iter 211500: loss 6.7458, time 2913.76ms
iter 211510: loss 6.6294, time 124.34ms
iter 211520: loss 6.6399, time 124.50ms
iter 211530: loss 6.5468, time 125.89ms
iter 211540: loss 6.4549, time 125.01ms
iter 211550: loss 6.5789, time 128.66ms
iter 211560: loss 5.9972, time 125.63ms
iter 211570: loss 6.4969, time 125.30ms
iter 211580: loss 6.6077, time 125.75ms
iter 211590: loss 6.6670, time 125.55ms
iter 211600: loss 5.7581, time 125.69ms
iter 211610: loss 6.2880, time 125.82ms
iter 211620: loss 6.0548, time 125.71ms
iter 211630: loss 5.9994, time 125.61ms
iter 211640: loss 6.4495, time 125.60ms
iter 211650: loss 6.8847, time 125.86ms
iter 211660: loss 5.7751, time 128.84ms
iter 211670: loss 6.1492, time 125.69ms
iter 211680: loss 6.0910, time 125.53ms
iter 211690: loss 6.3044, time 125.55ms
iter 211700: loss 6.6665, time 125.29ms
iter 211710: loss 6.2043, time 125.22ms
iter 211720: loss 6.3804, time 124.90ms
iter 211730: loss 6.6311, time 125.04ms
iter 211740: loss 5.5352, time 125.10ms
step 211750: train loss 5.7858, val loss 5.8320
saving checkpoint to out-shakespeare-char
iter 211750: loss 6.3050, time 2894.09ms
iter 211760: loss 5.9345, time 126.15ms
iter 211770: loss 6.4929, time 126.15ms
iter 211780: loss 6.2144, time 125.52ms
iter 211790: loss 6.1899, time 125.98ms
iter 211800: loss 6.1213, time 129.23ms
iter 211810: loss 6.3859, time 125.89ms
iter 211820: loss 6.6844, time 126.03ms
iter 211830: loss 6.4365, time 126.42ms
iter 211840: loss 6.1627, time 127.78ms
iter 211850: loss 6.2746, time 125.68ms
iter 211860: loss 5.9000, time 125.62ms
iter 211870: loss 6.3330, time 123.86ms
iter 211880: loss 6.7646, time 125.49ms
iter 211890: loss 6.3770, time 125.40ms
iter 211900: loss 6.4504, time 125.51ms
iter 211910: loss 5.5751, time 126.25ms
iter 211920: loss 6.0359, time 126.12ms
iter 211930: loss 6.2771, time 127.20ms
iter 211940: loss 6.7953, time 124.19ms
iter 211950: loss 6.2492, time 126.51ms
iter 211960: loss 5.7948, time 125.69ms
iter 211970: loss 5.3295, time 125.69ms
iter 211980: loss 6.8129, time 125.23ms
iter 211990: loss 6.2305, time 125.63ms
step 212000: train loss 5.8270, val loss 5.7925
saving checkpoint to out-shakespeare-char
iter 212000: loss 6.3735, time 2909.34ms
iter 212010: loss 6.9826, time 124.82ms
iter 212020: loss 6.8102, time 125.46ms
iter 212030: loss 6.1061, time 125.20ms
iter 212040: loss 5.4855, time 125.31ms
iter 212050: loss 5.8095, time 127.62ms
iter 212060: loss 6.3892, time 125.10ms
iter 212070: loss 5.6453, time 125.83ms
iter 212080: loss 6.6094, time 124.36ms
iter 212090: loss 6.0546, time 125.53ms
iter 212100: loss 6.2576, time 125.37ms
iter 212110: loss 6.5645, time 125.79ms
iter 212120: loss 7.3934, time 124.50ms
iter 212130: loss 6.2283, time 125.03ms
iter 212140: loss 5.7969, time 124.93ms
iter 212150: loss 6.0232, time 125.61ms
iter 212160: loss 6.8387, time 128.16ms
iter 212170: loss 5.5957, time 125.95ms
iter 212180: loss 5.6928, time 125.98ms
iter 212190: loss 6.5575, time 125.51ms
iter 212200: loss 5.6342, time 125.17ms
iter 212210: loss 5.6726, time 125.09ms
iter 212220: loss 6.8126, time 125.57ms
iter 212230: loss 6.8757, time 124.15ms
iter 212240: loss 6.5775, time 125.56ms
step 212250: train loss 5.8498, val loss 5.8324
saving checkpoint to out-shakespeare-char
iter 212250: loss 6.7829, time 2895.16ms
iter 212260: loss 7.1716, time 125.84ms
iter 212270: loss 6.1424, time 124.62ms
iter 212280: loss 6.8797, time 123.49ms
iter 212290: loss 6.3717, time 125.10ms
iter 212300: loss 6.8386, time 125.61ms
iter 212310: loss 7.1278, time 124.33ms
iter 212320: loss 6.4936, time 124.66ms
iter 212330: loss 6.0247, time 125.10ms
iter 212340: loss 6.6879, time 125.71ms
iter 212350: loss 6.0745, time 124.99ms
iter 212360: loss 6.2783, time 128.86ms
iter 212370: loss 6.3761, time 126.01ms
iter 212380: loss 7.3039, time 125.30ms
iter 212390: loss 5.8685, time 123.23ms
iter 212400: loss 6.4329, time 125.14ms
iter 212410: loss 6.1439, time 125.68ms
iter 212420: loss 5.7168, time 125.18ms
iter 212430: loss 5.8195, time 123.92ms
iter 212440: loss 6.3360, time 125.60ms
iter 212450: loss 5.8879, time 125.61ms
iter 212460: loss 5.7907, time 125.69ms
iter 212470: loss 6.3423, time 128.69ms
iter 212480: loss 6.1467, time 125.80ms
iter 212490: loss 5.7327, time 125.46ms
step 212500: train loss 5.7677, val loss 5.8074
saving checkpoint to out-shakespeare-char
iter 212500: loss 6.1988, time 2899.23ms
iter 212510: loss 6.4707, time 121.61ms
iter 212520: loss 6.3316, time 121.68ms
iter 212530: loss 6.5025, time 121.56ms
iter 212540: loss 6.4210, time 121.55ms
iter 212550: loss 6.2054, time 121.39ms
iter 212560: loss 5.9787, time 120.79ms
iter 212570: loss 6.3068, time 121.51ms
iter 212580: loss 5.6901, time 121.78ms
iter 212590: loss 6.2285, time 121.46ms
iter 212600: loss 6.5971, time 121.45ms
iter 212610: loss 6.8333, time 122.23ms
iter 212620: loss 7.4533, time 121.64ms
iter 212630: loss 6.7799, time 121.32ms
iter 212640: loss 5.8905, time 122.00ms
iter 212650: loss 6.2197, time 121.33ms
iter 212660: loss 6.3070, time 122.02ms
iter 212670: loss 6.1118, time 121.46ms
iter 212680: loss 5.9577, time 121.49ms
iter 212690: loss 6.3631, time 121.60ms
iter 212700: loss 6.5136, time 122.09ms
iter 212710: loss 6.9277, time 120.53ms
iter 212720: loss 6.7217, time 121.89ms
iter 212730: loss 6.5349, time 121.62ms
iter 212740: loss 6.5032, time 121.68ms
step 212750: train loss 5.7342, val loss 5.7929
saving checkpoint to out-shakespeare-char
iter 212750: loss 7.4255, time 2873.79ms
iter 212760: loss 6.6559, time 121.30ms
iter 212770: loss 5.9682, time 121.38ms
iter 212780: loss 5.9206, time 121.36ms
iter 212790: loss 5.9852, time 121.31ms
iter 212800: loss 6.4015, time 121.34ms
iter 212810: loss 6.2842, time 121.49ms
iter 212820: loss 6.7776, time 121.97ms
iter 212830: loss 6.4983, time 121.50ms
iter 212840: loss 6.1394, time 121.68ms
iter 212850: loss 5.5447, time 121.49ms
iter 212860: loss 5.8425, time 121.27ms
iter 212870: loss 6.9162, time 121.83ms
iter 212880: loss 6.6950, time 121.89ms
iter 212890: loss 6.0581, time 121.30ms
iter 212900: loss 6.4294, time 121.87ms
iter 212910: loss 6.4338, time 121.18ms
iter 212920: loss 6.6817, time 120.98ms
iter 212930: loss 6.9291, time 121.36ms
iter 212940: loss 6.3186, time 121.27ms
iter 212950: loss 6.6191, time 121.47ms
iter 212960: loss 6.5064, time 121.69ms
iter 212970: loss 6.5047, time 121.44ms
iter 212980: loss 6.7589, time 121.89ms
iter 212990: loss 6.1832, time 121.30ms
step 213000: train loss 5.7603, val loss 5.8072
saving checkpoint to out-shakespeare-char
iter 213000: loss 6.4731, time 2893.61ms
iter 213010: loss 6.8108, time 121.29ms
iter 213020: loss 6.4356, time 122.59ms
iter 213030: loss 5.6124, time 121.43ms
iter 213040: loss 6.5201, time 122.52ms
iter 213050: loss 6.5525, time 122.07ms
iter 213060: loss 6.5190, time 122.63ms
iter 213070: loss 6.0334, time 121.38ms
iter 213080: loss 6.0945, time 123.14ms
iter 213090: loss 6.8303, time 121.50ms
iter 213100: loss 6.6592, time 122.60ms
iter 213110: loss 6.8941, time 121.59ms
iter 213120: loss 6.0272, time 122.66ms
iter 213130: loss 6.2731, time 121.65ms
iter 213140: loss 5.7752, time 123.44ms
iter 213150: loss 6.7567, time 122.03ms
iter 213160: loss 5.7349, time 122.62ms
iter 213170: loss 5.9380, time 121.58ms
iter 213180: loss 5.5918, time 123.62ms
iter 213190: loss 5.3271, time 121.59ms
iter 213200: loss 6.6067, time 122.95ms
iter 213210: loss 6.7839, time 121.60ms
iter 213220: loss 5.7669, time 121.57ms
iter 213230: loss 6.2125, time 121.62ms
iter 213240: loss 5.9453, time 122.62ms
step 213250: train loss 5.8271, val loss 5.8007
saving checkpoint to out-shakespeare-char
iter 213250: loss 6.1396, time 2901.46ms
iter 213260: loss 6.5080, time 124.98ms
iter 213270: loss 6.7594, time 125.83ms
iter 213280: loss 7.2455, time 126.09ms
iter 213290: loss 6.8221, time 125.63ms
iter 213300: loss 6.5949, time 128.35ms
iter 213310: loss 6.5147, time 125.18ms
iter 213320: loss 6.6948, time 125.77ms
iter 213330: loss 7.1069, time 125.80ms
iter 213340: loss 6.5440, time 125.96ms
iter 213350: loss 5.7983, time 125.87ms
iter 213360: loss 6.6859, time 125.75ms
iter 213370: loss 6.1397, time 124.67ms
iter 213380: loss 5.4897, time 125.63ms
iter 213390: loss 6.7061, time 125.96ms
iter 213400: loss 6.5319, time 125.64ms
iter 213410: loss 6.4019, time 128.80ms
iter 213420: loss 6.6045, time 125.72ms
iter 213430: loss 6.0943, time 124.80ms
iter 213440: loss 6.4482, time 125.63ms
iter 213450: loss 5.8128, time 125.74ms
iter 213460: loss 6.5826, time 125.51ms
iter 213470: loss 6.6252, time 125.55ms
iter 213480: loss 6.1158, time 125.51ms
iter 213490: loss 6.5358, time 126.10ms
step 213500: train loss 5.8040, val loss 5.7860
saving checkpoint to out-shakespeare-char
iter 213500: loss 5.3003, time 2901.02ms
iter 213510: loss 5.7237, time 125.34ms
iter 213520: loss 6.7195, time 125.47ms
iter 213530: loss 6.1782, time 124.99ms
iter 213540: loss 6.3904, time 125.95ms
iter 213550: loss 6.9326, time 125.77ms
iter 213560: loss 6.2033, time 125.88ms
iter 213570: loss 6.6099, time 125.90ms
iter 213580: loss 6.4115, time 127.94ms
iter 213590: loss 6.1382, time 125.70ms
iter 213600: loss 6.2208, time 125.82ms
iter 213610: loss 6.1990, time 125.79ms
iter 213620: loss 6.0853, time 125.23ms
iter 213630: loss 6.4105, time 125.77ms
iter 213640: loss 6.0081, time 125.92ms
iter 213650: loss 5.5568, time 125.51ms
iter 213660: loss 5.8180, time 124.82ms
iter 213670: loss 6.3566, time 124.87ms
iter 213680: loss 6.2393, time 125.59ms
iter 213690: loss 6.0515, time 128.19ms
iter 213700: loss 6.2788, time 124.91ms
iter 213710: loss 6.5922, time 124.34ms
iter 213720: loss 6.8472, time 125.13ms
iter 213730: loss 6.9500, time 124.97ms
iter 213740: loss 6.1043, time 125.21ms
step 213750: train loss 5.7333, val loss 5.8233
saving checkpoint to out-shakespeare-char
iter 213750: loss 6.5256, time 2896.93ms
iter 213760: loss 6.5043, time 122.07ms
iter 213770: loss 6.2565, time 122.87ms
iter 213780: loss 6.3350, time 130.52ms
iter 213790: loss 5.7888, time 122.11ms
iter 213800: loss 5.8471, time 122.16ms
iter 213810: loss 6.3022, time 122.69ms
iter 213820: loss 6.4171, time 121.75ms
iter 213830: loss 5.8930, time 122.35ms
iter 213840: loss 5.9786, time 121.35ms
iter 213850: loss 6.2084, time 122.76ms
iter 213860: loss 6.6856, time 121.73ms
iter 213870: loss 6.2864, time 122.79ms
iter 213880: loss 6.5478, time 121.60ms
iter 213890: loss 5.8015, time 122.75ms
iter 213900: loss 6.1588, time 121.83ms
iter 213910: loss 5.8538, time 122.77ms
iter 213920: loss 6.2374, time 121.74ms
iter 213930: loss 6.6119, time 122.75ms
iter 213940: loss 5.4838, time 122.09ms
iter 213950: loss 6.6778, time 123.18ms
iter 213960: loss 6.2439, time 122.36ms
iter 213970: loss 5.8332, time 122.38ms
iter 213980: loss 6.8950, time 121.98ms
iter 213990: loss 6.6205, time 122.93ms
step 214000: train loss 5.7819, val loss 5.7978
saving checkpoint to out-shakespeare-char
iter 214000: loss 6.2562, time 2896.77ms
iter 214010: loss 6.5952, time 121.73ms
iter 214020: loss 7.0008, time 122.30ms
iter 214030: loss 6.6538, time 122.06ms
iter 214040: loss 6.2492, time 121.48ms
iter 214050: loss 6.3771, time 122.16ms
iter 214060: loss 6.4332, time 121.01ms
iter 214070: loss 6.1074, time 122.14ms
iter 214080: loss 6.3978, time 121.95ms
iter 214090: loss 6.5487, time 121.77ms
iter 214100: loss 6.0719, time 121.62ms
iter 214110: loss 6.0922, time 121.99ms
iter 214120: loss 6.4175, time 121.93ms
iter 214130: loss 6.0252, time 121.96ms
iter 214140: loss 6.6440, time 121.77ms
iter 214150: loss 5.7348, time 121.90ms
iter 214160: loss 6.7707, time 121.73ms
iter 214170: loss 5.9781, time 121.92ms
iter 214180: loss 5.5985, time 121.83ms
iter 214190: loss 5.9039, time 122.82ms
iter 214200: loss 6.4021, time 121.73ms
iter 214210: loss 5.7582, time 121.87ms
iter 214220: loss 6.7427, time 121.68ms
iter 214230: loss 6.1377, time 121.97ms
iter 214240: loss 6.6114, time 121.33ms
step 214250: train loss 5.7411, val loss 5.8262
saving checkpoint to out-shakespeare-char
iter 214250: loss 6.4962, time 2907.42ms
iter 214260: loss 6.6303, time 125.06ms
iter 214270: loss 6.4964, time 121.73ms
iter 214280: loss 6.1585, time 121.56ms
iter 214290: loss 5.8363, time 121.77ms
iter 214300: loss 6.4662, time 121.75ms
iter 214310: loss 5.5639, time 121.43ms
iter 214320: loss 6.1078, time 121.43ms
iter 214330: loss 7.4668, time 121.66ms
iter 214340: loss 6.6601, time 121.74ms
iter 214350: loss 5.9017, time 121.76ms
iter 214360: loss 7.2559, time 121.63ms
iter 214370: loss 5.9599, time 121.80ms
iter 214380: loss 6.5360, time 122.72ms
iter 214390: loss 6.7706, time 121.66ms
iter 214400: loss 6.1634, time 121.75ms
iter 214410: loss 5.8409, time 121.87ms
iter 214420: loss 7.0241, time 122.37ms
iter 214430: loss 7.0847, time 121.82ms
iter 214440: loss 5.9482, time 121.88ms
iter 214450: loss 6.2194, time 121.73ms
iter 214460: loss 5.9673, time 121.41ms
iter 214470: loss 6.6649, time 121.79ms
iter 214480: loss 6.1413, time 121.79ms
iter 214490: loss 5.4339, time 121.64ms
step 214500: train loss 5.8180, val loss 5.7745
saving checkpoint to out-shakespeare-char
iter 214500: loss 6.9147, time 2897.85ms
iter 214510: loss 5.9973, time 121.81ms
iter 214520: loss 6.2012, time 121.83ms
iter 214530: loss 6.7305, time 121.43ms
iter 214540: loss 6.3994, time 121.67ms
iter 214550: loss 5.8406, time 121.65ms
iter 214560: loss 7.0302, time 121.68ms
iter 214570: loss 5.8657, time 121.71ms
iter 214580: loss 6.2469, time 121.47ms
iter 214590: loss 6.4077, time 121.54ms
iter 214600: loss 5.6408, time 121.43ms
iter 214610: loss 6.6108, time 121.45ms
iter 214620: loss 5.9274, time 122.02ms
iter 214630: loss 6.2657, time 121.61ms
iter 214640: loss 6.7745, time 121.94ms
iter 214650: loss 6.6093, time 121.56ms
iter 214660: loss 6.7860, time 122.07ms
iter 214670: loss 6.6782, time 121.51ms
iter 214680: loss 6.5187, time 121.52ms
iter 214690: loss 6.2861, time 121.26ms
iter 214700: loss 6.1331, time 121.72ms
iter 214710: loss 6.7363, time 121.84ms
iter 214720: loss 6.8791, time 121.57ms
iter 214730: loss 6.8913, time 121.53ms
iter 214740: loss 5.9644, time 121.79ms
step 214750: train loss 5.8029, val loss 5.8334
saving checkpoint to out-shakespeare-char
iter 214750: loss 6.4538, time 2883.99ms
iter 214760: loss 6.5707, time 121.83ms
iter 214770: loss 7.9912, time 125.56ms
iter 214780: loss 6.0788, time 125.66ms
iter 214790: loss 6.6921, time 125.51ms
iter 214800: loss 6.6456, time 125.67ms
iter 214810: loss 6.6531, time 126.06ms
iter 214820: loss 5.9045, time 125.79ms
iter 214830: loss 7.0724, time 125.85ms
iter 214840: loss 7.1236, time 128.29ms
iter 214850: loss 6.0662, time 126.24ms
iter 214860: loss 6.2410, time 124.96ms
iter 214870: loss 7.3026, time 126.09ms
iter 214880: loss 5.2462, time 125.90ms
iter 214890: loss 5.6330, time 125.69ms
iter 214900: loss 6.1989, time 125.82ms
iter 214910: loss 6.3714, time 125.51ms
iter 214920: loss 7.1947, time 125.80ms
iter 214930: loss 6.1583, time 125.84ms
iter 214940: loss 6.4312, time 125.99ms
iter 214950: loss 6.4016, time 128.86ms
iter 214960: loss 6.0611, time 125.40ms
iter 214970: loss 6.0065, time 125.89ms
iter 214980: loss 6.8478, time 125.60ms
iter 214990: loss 6.0342, time 127.55ms
step 215000: train loss 5.7856, val loss 5.7908
saving checkpoint to out-shakespeare-char
iter 215000: loss 6.5571, time 2871.70ms
iter 215010: loss 6.6415, time 126.11ms
iter 215020: loss 6.4940, time 125.20ms
iter 215030: loss 5.3369, time 125.25ms
iter 215040: loss 6.0292, time 126.73ms
iter 215050: loss 6.1302, time 128.31ms
iter 215060: loss 6.4875, time 125.31ms
iter 215070: loss 5.9919, time 125.12ms
iter 215080: loss 6.6563, time 125.61ms
iter 215090: loss 5.3941, time 125.81ms
iter 215100: loss 5.9854, time 125.44ms
iter 215110: loss 6.1768, time 125.30ms
iter 215120: loss 6.5446, time 124.78ms
iter 215130: loss 5.9001, time 125.21ms
iter 215140: loss 5.4561, time 124.24ms
iter 215150: loss 6.4183, time 125.30ms
iter 215160: loss 6.2570, time 128.30ms
iter 215170: loss 6.3518, time 125.43ms
iter 215180: loss 6.0509, time 125.31ms
iter 215190: loss 6.4503, time 125.47ms
iter 215200: loss 6.8642, time 125.59ms
iter 215210: loss 5.7527, time 125.25ms
iter 215220: loss 5.7695, time 125.85ms
iter 215230: loss 6.9312, time 125.47ms
iter 215240: loss 5.8767, time 125.59ms
step 215250: train loss 5.8252, val loss 5.8096
saving checkpoint to out-shakespeare-char
iter 215250: loss 6.5360, time 2899.99ms
iter 215260: loss 6.3127, time 128.94ms
iter 215270: loss 6.4437, time 126.02ms
iter 215280: loss 6.6410, time 125.63ms
iter 215290: loss 6.4790, time 125.56ms
iter 215300: loss 5.9749, time 125.38ms
iter 215310: loss 5.6022, time 125.54ms
iter 215320: loss 5.9768, time 125.82ms
iter 215330: loss 5.9383, time 125.58ms
iter 215340: loss 7.0893, time 125.29ms
iter 215350: loss 6.8102, time 125.03ms
iter 215360: loss 5.2871, time 125.52ms
iter 215370: loss 5.6620, time 128.11ms
iter 215380: loss 6.1947, time 125.33ms
iter 215390: loss 6.1450, time 125.20ms
iter 215400: loss 6.5809, time 125.62ms
iter 215410: loss 6.3010, time 125.19ms
iter 215420: loss 5.7612, time 125.28ms
iter 215430: loss 6.6138, time 125.18ms
iter 215440: loss 6.8241, time 125.19ms
iter 215450: loss 6.4456, time 125.52ms
iter 215460: loss 5.8188, time 124.63ms
iter 215470: loss 6.2515, time 125.30ms
iter 215480: loss 5.8123, time 128.13ms
iter 215490: loss 5.5907, time 125.15ms
step 215500: train loss 5.8342, val loss 5.8265
saving checkpoint to out-shakespeare-char
iter 215500: loss 6.3497, time 2905.53ms
iter 215510: loss 6.3247, time 125.42ms
iter 215520: loss 6.3956, time 125.34ms
iter 215530: loss 6.5576, time 125.18ms
iter 215540: loss 6.0926, time 125.71ms
iter 215550: loss 6.3578, time 126.00ms
iter 215560: loss 6.7136, time 124.32ms
iter 215570: loss 5.9383, time 124.96ms
iter 215580: loss 6.5680, time 125.40ms
iter 215590: loss 6.4253, time 125.11ms
iter 215600: loss 5.8808, time 125.26ms
iter 215610: loss 6.7900, time 128.12ms
iter 215620: loss 6.6321, time 125.13ms
iter 215630: loss 6.2350, time 125.35ms
iter 215640: loss 6.8542, time 124.76ms
iter 215650: loss 6.5335, time 125.06ms
iter 215660: loss 6.8107, time 125.27ms
iter 215670: loss 6.4854, time 125.70ms
iter 215680: loss 6.0324, time 124.65ms
iter 215690: loss 6.2382, time 125.80ms
iter 215700: loss 6.6625, time 125.61ms
iter 215710: loss 6.2916, time 126.08ms
iter 215720: loss 6.2397, time 125.53ms
iter 215730: loss 6.4041, time 125.73ms
iter 215740: loss 5.7810, time 125.79ms
step 215750: train loss 5.7705, val loss 5.7091
saving checkpoint to out-shakespeare-char
iter 215750: loss 6.0662, time 2861.31ms
iter 215760: loss 7.1476, time 126.22ms
iter 215770: loss 7.1347, time 125.81ms
iter 215780: loss 6.6663, time 125.02ms
iter 215790: loss 6.4013, time 125.75ms
iter 215800: loss 5.8488, time 127.57ms
iter 215810: loss 5.7167, time 126.00ms
iter 215820: loss 6.2255, time 125.73ms
iter 215830: loss 7.1372, time 126.01ms
iter 215840: loss 5.8785, time 126.14ms
iter 215850: loss 6.2827, time 126.13ms
iter 215860: loss 6.5375, time 128.82ms
iter 215870: loss 6.8103, time 125.80ms
iter 215880: loss 5.8047, time 125.79ms
iter 215890: loss 6.2181, time 126.84ms
iter 215900: loss 5.9605, time 125.39ms
iter 215910: loss 6.0259, time 125.47ms
iter 215920: loss 5.9101, time 126.28ms
iter 215930: loss 6.3709, time 125.57ms
iter 215940: loss 6.9320, time 126.05ms
iter 215950: loss 6.0907, time 126.66ms
iter 215960: loss 6.7594, time 125.43ms
iter 215970: loss 6.5184, time 125.59ms
iter 215980: loss 6.5763, time 125.65ms
iter 215990: loss 5.9731, time 125.67ms
step 216000: train loss 5.8365, val loss 5.8916
saving checkpoint to out-shakespeare-char
iter 216000: loss 6.0858, time 2886.09ms
iter 216010: loss 6.5824, time 126.39ms
iter 216020: loss 6.3286, time 124.79ms
iter 216030: loss 5.9861, time 125.99ms
iter 216040: loss 5.6719, time 125.60ms
iter 216050: loss 5.8273, time 125.26ms
iter 216060: loss 5.9936, time 125.55ms
iter 216070: loss 7.0536, time 128.22ms
iter 216080: loss 5.9724, time 125.74ms
iter 216090: loss 6.3886, time 121.83ms
iter 216100: loss 6.0587, time 121.66ms
iter 216110: loss 6.3210, time 121.74ms
iter 216120: loss 6.2053, time 121.72ms
iter 216130: loss 6.6344, time 121.86ms
iter 216140: loss 7.0999, time 121.92ms
iter 216150: loss 6.7187, time 121.83ms
iter 216160: loss 6.6860, time 121.44ms
iter 216170: loss 5.8980, time 121.91ms
iter 216180: loss 6.7231, time 121.63ms
iter 216190: loss 6.0943, time 121.82ms
iter 216200: loss 6.2123, time 121.94ms
iter 216210: loss 6.4662, time 122.96ms
iter 216220: loss 6.2282, time 121.97ms
iter 216230: loss 6.5838, time 121.87ms
iter 216240: loss 6.5569, time 121.69ms
step 216250: train loss 5.8172, val loss 5.7932
saving checkpoint to out-shakespeare-char
iter 216250: loss 6.1697, time 2893.79ms
iter 216260: loss 6.3683, time 120.54ms
iter 216270: loss 6.7900, time 121.25ms
iter 216280: loss 6.2344, time 120.93ms
iter 216290: loss 5.5328, time 120.65ms
iter 216300: loss 6.2827, time 121.46ms
iter 216310: loss 6.7284, time 121.33ms
iter 216320: loss 6.4966, time 121.48ms
iter 216330: loss 5.7558, time 121.02ms
iter 216340: loss 6.5361, time 121.05ms
iter 216350: loss 6.2767, time 121.48ms
iter 216360: loss 5.7099, time 121.62ms
iter 216370: loss 5.3748, time 121.81ms
iter 216380: loss 6.7109, time 121.47ms
iter 216390: loss 5.9915, time 121.53ms
iter 216400: loss 6.9172, time 121.59ms
iter 216410: loss 6.3811, time 121.58ms
iter 216420: loss 5.6280, time 121.53ms
iter 216430: loss 6.7181, time 121.57ms
iter 216440: loss 6.1205, time 121.21ms
iter 216450: loss 5.8683, time 121.45ms
iter 216460: loss 6.4512, time 121.94ms
iter 216470: loss 5.7633, time 121.57ms
iter 216480: loss 6.0204, time 121.50ms
iter 216490: loss 6.1232, time 121.33ms
step 216500: train loss 5.7978, val loss 5.8082
saving checkpoint to out-shakespeare-char
iter 216500: loss 6.1600, time 2882.49ms
iter 216510: loss 5.1367, time 121.61ms
iter 216520: loss 5.9183, time 122.82ms
iter 216530: loss 5.8983, time 120.87ms
iter 216540: loss 7.3348, time 121.29ms
iter 216550: loss 6.1907, time 121.31ms
iter 216560: loss 5.9431, time 121.19ms
iter 216570: loss 6.6296, time 121.04ms
iter 216580: loss 6.1896, time 121.64ms
iter 216590: loss 5.3947, time 121.56ms
iter 216600: loss 6.4740, time 122.34ms
iter 216610: loss 5.6768, time 121.64ms
iter 216620: loss 6.5901, time 121.72ms
iter 216630: loss 6.1240, time 121.56ms
iter 216640: loss 6.6411, time 121.56ms
iter 216650: loss 6.2677, time 120.82ms
iter 216660: loss 5.6619, time 120.97ms
iter 216670: loss 6.7067, time 121.65ms
iter 216680: loss 6.7055, time 121.46ms
iter 216690: loss 6.7007, time 121.65ms
iter 216700: loss 6.4062, time 121.62ms
iter 216710: loss 6.2071, time 121.82ms
iter 216720: loss 6.3025, time 121.74ms
iter 216730: loss 5.9089, time 121.73ms
iter 216740: loss 5.4572, time 120.88ms
step 216750: train loss 5.7332, val loss 5.8030
saving checkpoint to out-shakespeare-char
iter 216750: loss 5.9998, time 2897.39ms
iter 216760: loss 6.5214, time 125.99ms
iter 216770: loss 6.7778, time 125.69ms
iter 216780: loss 6.6332, time 125.29ms
iter 216790: loss 5.8652, time 125.60ms
iter 216800: loss 6.4323, time 125.85ms
iter 216810: loss 6.4175, time 126.20ms
iter 216820: loss 6.2337, time 125.79ms
iter 216830: loss 6.6052, time 125.67ms
iter 216840: loss 6.2877, time 124.59ms
iter 216850: loss 6.4023, time 125.57ms
iter 216860: loss 6.3872, time 125.38ms
iter 216870: loss 6.0341, time 125.85ms
iter 216880: loss 6.8691, time 125.76ms
iter 216890: loss 5.9322, time 124.29ms
iter 216900: loss 6.2182, time 125.02ms
iter 216910: loss 5.2859, time 125.62ms
iter 216920: loss 6.8862, time 125.16ms
iter 216930: loss 6.1580, time 124.33ms
iter 216940: loss 4.8450, time 128.31ms
iter 216950: loss 5.9495, time 125.45ms
iter 216960: loss 6.7023, time 124.21ms
iter 216970: loss 6.1141, time 125.63ms
iter 216980: loss 6.0472, time 125.63ms
iter 216990: loss 6.1646, time 125.81ms
step 217000: train loss 5.7845, val loss 5.7885
saving checkpoint to out-shakespeare-char
iter 217000: loss 6.4986, time 2881.11ms
iter 217010: loss 6.7198, time 125.04ms
iter 217020: loss 5.9648, time 125.58ms
iter 217030: loss 6.4711, time 124.56ms
iter 217040: loss 5.9936, time 125.64ms
iter 217050: loss 6.4909, time 125.34ms
iter 217060: loss 5.8431, time 125.49ms
iter 217070: loss 6.1304, time 128.41ms
iter 217080: loss 6.6419, time 125.44ms
iter 217090: loss 6.3738, time 125.73ms
iter 217100: loss 7.2007, time 124.80ms
iter 217110: loss 6.5178, time 124.92ms
iter 217120: loss 6.6579, time 125.30ms
iter 217130: loss 6.3724, time 125.53ms
iter 217140: loss 6.5577, time 124.43ms
iter 217150: loss 5.8850, time 125.77ms
iter 217160: loss 6.9062, time 125.42ms
iter 217170: loss 6.1115, time 125.71ms
iter 217180: loss 7.0952, time 128.38ms
iter 217190: loss 6.2887, time 125.18ms
iter 217200: loss 5.8851, time 125.58ms
iter 217210: loss 6.6892, time 124.94ms
iter 217220: loss 6.9798, time 125.85ms
iter 217230: loss 5.7754, time 125.43ms
iter 217240: loss 5.6946, time 125.55ms
step 217250: train loss 5.7903, val loss 5.7903
saving checkpoint to out-shakespeare-char
iter 217250: loss 7.1185, time 2887.80ms
iter 217260: loss 7.1539, time 126.00ms
iter 217270: loss 6.4858, time 124.99ms
iter 217280: loss 6.0917, time 125.26ms
iter 217290: loss 6.7449, time 125.40ms
iter 217300: loss 5.7017, time 125.04ms
iter 217310: loss 5.8674, time 124.57ms
iter 217320: loss 6.2760, time 125.44ms
iter 217330: loss 6.7150, time 125.39ms
iter 217340: loss 6.5949, time 125.27ms
iter 217350: loss 6.0934, time 127.57ms
iter 217360: loss 6.6752, time 125.51ms
iter 217370: loss 5.7095, time 125.39ms
iter 217380: loss 5.7650, time 125.97ms
iter 217390: loss 5.4974, time 128.38ms
iter 217400: loss 5.5538, time 125.14ms
iter 217410: loss 6.3820, time 125.47ms
iter 217420: loss 6.3491, time 125.80ms
iter 217430: loss 6.0244, time 125.33ms
iter 217440: loss 6.7346, time 124.77ms
iter 217450: loss 6.2984, time 126.09ms
iter 217460: loss 6.9157, time 128.29ms
iter 217470: loss 6.3495, time 127.91ms
iter 217480: loss 5.1538, time 125.67ms
iter 217490: loss 7.0395, time 125.84ms
step 217500: train loss 5.7496, val loss 5.7650
saving checkpoint to out-shakespeare-char
iter 217500: loss 5.5400, time 2888.12ms
iter 217510: loss 6.2837, time 126.23ms
iter 217520: loss 6.0314, time 125.66ms
iter 217530: loss 6.9573, time 128.80ms
iter 217540: loss 6.9214, time 126.15ms
iter 217550: loss 7.0656, time 125.58ms
iter 217560: loss 6.5989, time 124.81ms
iter 217570: loss 6.0576, time 125.64ms
iter 217580: loss 6.5312, time 125.78ms
iter 217590: loss 6.0887, time 125.20ms
iter 217600: loss 5.7256, time 128.62ms
iter 217610: loss 5.7309, time 125.88ms
iter 217620: loss 6.5872, time 125.47ms
iter 217630: loss 6.3356, time 124.75ms
iter 217640: loss 5.8406, time 128.21ms
iter 217650: loss 5.7904, time 125.05ms
iter 217660: loss 5.5391, time 125.29ms
iter 217670: loss 6.3226, time 124.76ms
iter 217680: loss 5.7855, time 125.42ms
iter 217690: loss 6.3550, time 125.82ms
iter 217700: loss 6.2888, time 125.52ms
iter 217710: loss 6.0069, time 124.39ms
iter 217720: loss 7.1700, time 125.42ms
iter 217730: loss 6.2485, time 125.33ms
iter 217740: loss 6.5429, time 126.07ms
step 217750: train loss 5.7897, val loss 5.7742
saving checkpoint to out-shakespeare-char
iter 217750: loss 5.7779, time 2879.30ms
iter 217760: loss 5.7543, time 126.03ms
iter 217770: loss 6.2729, time 125.76ms
iter 217780: loss 5.6332, time 128.53ms
iter 217790: loss 7.1167, time 125.60ms
iter 217800: loss 6.4639, time 125.81ms
iter 217810: loss 6.6179, time 125.74ms
iter 217820: loss 5.9879, time 125.67ms
iter 217830: loss 6.3598, time 125.53ms
iter 217840: loss 6.2287, time 125.49ms
iter 217850: loss 6.0131, time 125.25ms
iter 217860: loss 5.4039, time 125.48ms
iter 217870: loss 5.8251, time 125.42ms
iter 217880: loss 6.5985, time 125.32ms
iter 217890: loss 6.4583, time 127.87ms
iter 217900: loss 6.3904, time 125.06ms
iter 217910: loss 6.6738, time 125.67ms
iter 217920: loss 6.0066, time 125.23ms
iter 217930: loss 6.7444, time 125.69ms
iter 217940: loss 5.7071, time 125.62ms
iter 217950: loss 6.3948, time 125.64ms
iter 217960: loss 5.5964, time 125.95ms
iter 217970: loss 6.2126, time 125.82ms
iter 217980: loss 5.7822, time 125.71ms
iter 217990: loss 6.0423, time 125.54ms
step 218000: train loss 5.8208, val loss 5.7662
saving checkpoint to out-shakespeare-char
iter 218000: loss 5.7391, time 2871.40ms
iter 218010: loss 6.8626, time 122.04ms
iter 218020: loss 6.3995, time 121.89ms
iter 218030: loss 6.3950, time 121.81ms
iter 218040: loss 5.8719, time 121.55ms
iter 218050: loss 6.4236, time 121.84ms
iter 218060: loss 6.2162, time 122.26ms
iter 218070: loss 6.8140, time 121.51ms
iter 218080: loss 6.1564, time 122.00ms
iter 218090: loss 6.6375, time 121.60ms
iter 218100: loss 6.4326, time 121.95ms
iter 218110: loss 5.6613, time 122.02ms
iter 218120: loss 5.8042, time 121.91ms
iter 218130: loss 6.5751, time 121.96ms
iter 218140: loss 6.5199, time 121.90ms
iter 218150: loss 5.9850, time 122.01ms
iter 218160: loss 6.2189, time 121.96ms
iter 218170: loss 7.1000, time 121.92ms
iter 218180: loss 5.4667, time 121.97ms
iter 218190: loss 6.5402, time 121.78ms
iter 218200: loss 5.8561, time 122.29ms
iter 218210: loss 5.9287, time 121.84ms
iter 218220: loss 5.7176, time 121.34ms
iter 218230: loss 7.2160, time 121.88ms
iter 218240: loss 5.6692, time 121.98ms
step 218250: train loss 5.8041, val loss 5.8083
saving checkpoint to out-shakespeare-char
iter 218250: loss 6.6173, time 2908.03ms
iter 218260: loss 6.6505, time 125.58ms
iter 218270: loss 7.1519, time 124.89ms
iter 218280: loss 6.2726, time 124.95ms
iter 218290: loss 6.6867, time 128.20ms
iter 218300: loss 6.7208, time 125.73ms
iter 218310: loss 6.7519, time 125.38ms
iter 218320: loss 6.7732, time 125.43ms
iter 218330: loss 7.3657, time 125.30ms
iter 218340: loss 6.2387, time 125.55ms
iter 218350: loss 5.9028, time 125.24ms
iter 218360: loss 6.6973, time 125.45ms
iter 218370: loss 7.1031, time 128.36ms
iter 218380: loss 6.3336, time 125.40ms
iter 218390: loss 6.4863, time 126.04ms
iter 218400: loss 6.2414, time 125.72ms
iter 218410: loss 5.9212, time 128.35ms
iter 218420: loss 5.4665, time 125.34ms
iter 218430: loss 6.8486, time 125.45ms
iter 218440: loss 6.4086, time 125.38ms
iter 218450: loss 6.2589, time 125.46ms
iter 218460: loss 6.1221, time 125.56ms
iter 218470: loss 6.8445, time 126.50ms
iter 218480: loss 6.1843, time 125.38ms
iter 218490: loss 6.1258, time 125.20ms
step 218500: train loss 5.7873, val loss 5.8056
saving checkpoint to out-shakespeare-char
iter 218500: loss 6.9411, time 2893.19ms
iter 218510: loss 6.3282, time 125.12ms
iter 218520: loss 5.9347, time 125.17ms
iter 218530: loss 6.5202, time 124.90ms
iter 218540: loss 5.6366, time 128.32ms
iter 218550: loss 6.5186, time 125.07ms
iter 218560: loss 6.2323, time 124.59ms
iter 218570: loss 6.2762, time 125.83ms
iter 218580: loss 6.7902, time 125.86ms
iter 218590: loss 6.3453, time 125.98ms
iter 218600: loss 6.3300, time 126.06ms
iter 218610: loss 5.7935, time 126.03ms
iter 218620: loss 6.3198, time 125.04ms
iter 218630: loss 6.8282, time 125.64ms
iter 218640: loss 6.5943, time 126.47ms
iter 218650: loss 6.1605, time 129.93ms
iter 218660: loss 6.4230, time 128.08ms
iter 218670: loss 6.5146, time 127.84ms
iter 218680: loss 5.6785, time 125.30ms
iter 218690: loss 6.3591, time 126.98ms
iter 218700: loss 6.0803, time 125.10ms
iter 218710: loss 6.6857, time 121.46ms
iter 218720: loss 6.2933, time 121.74ms
iter 218730: loss 6.1321, time 121.66ms
iter 218740: loss 6.3064, time 122.18ms
step 218750: train loss 5.7520, val loss 5.7690
saving checkpoint to out-shakespeare-char
iter 218750: loss 6.4803, time 2883.99ms
iter 218760: loss 6.4059, time 121.64ms
iter 218770: loss 6.2448, time 121.23ms
iter 218780: loss 5.7874, time 121.34ms
iter 218790: loss 5.8740, time 121.62ms
iter 218800: loss 6.9575, time 121.34ms
iter 218810: loss 6.1661, time 121.68ms
iter 218820: loss 6.4691, time 121.32ms
iter 218830: loss 6.5215, time 122.75ms
iter 218840: loss 6.8705, time 121.41ms
iter 218850: loss 5.6161, time 121.49ms
iter 218860: loss 6.6252, time 121.28ms
iter 218870: loss 6.2925, time 123.12ms
iter 218880: loss 6.8993, time 121.37ms
iter 218890: loss 6.2642, time 119.98ms
iter 218900: loss 6.5271, time 121.88ms
iter 218910: loss 6.6314, time 122.00ms
iter 218920: loss 6.3417, time 121.47ms
iter 218930: loss 6.4909, time 121.52ms
iter 218940: loss 6.2746, time 122.29ms
iter 218950: loss 6.4572, time 121.91ms
iter 218960: loss 5.6881, time 121.80ms
iter 218970: loss 5.4457, time 121.61ms
iter 218980: loss 6.2775, time 121.86ms
iter 218990: loss 6.5318, time 122.11ms
step 219000: train loss 5.8131, val loss 5.7887
saving checkpoint to out-shakespeare-char
iter 219000: loss 6.9652, time 2897.77ms
iter 219010: loss 6.2427, time 121.83ms
iter 219020: loss 6.5730, time 121.89ms
iter 219030: loss 6.7587, time 121.94ms
iter 219040: loss 5.7573, time 122.30ms
iter 219050: loss 5.8563, time 122.35ms
iter 219060: loss 6.5272, time 121.79ms
iter 219070: loss 5.9510, time 122.20ms
iter 219080: loss 5.7249, time 121.88ms
iter 219090: loss 6.2114, time 122.95ms
iter 219100: loss 5.7484, time 121.77ms
iter 219110: loss 6.4510, time 121.33ms
iter 219120: loss 6.0465, time 122.16ms
iter 219130: loss 7.0700, time 122.77ms
iter 219140: loss 6.8652, time 122.32ms
iter 219150: loss 5.9540, time 121.75ms
iter 219160: loss 5.1357, time 121.74ms
iter 219170: loss 6.9425, time 121.71ms
iter 219180: loss 5.9108, time 121.84ms
iter 219190: loss 6.3075, time 121.80ms
iter 219200: loss 6.5444, time 122.99ms
iter 219210: loss 5.9559, time 121.95ms
iter 219220: loss 6.4209, time 121.77ms
iter 219230: loss 5.9066, time 122.34ms
iter 219240: loss 5.5371, time 122.12ms
step 219250: train loss 5.8058, val loss 5.7562
saving checkpoint to out-shakespeare-char
iter 219250: loss 6.3058, time 2888.16ms
iter 219260: loss 6.0244, time 122.07ms
iter 219270: loss 6.4845, time 123.06ms
iter 219280: loss 6.4741, time 121.35ms
iter 219290: loss 5.8517, time 122.49ms
iter 219300: loss 7.1721, time 121.73ms
iter 219310: loss 5.7158, time 122.74ms
iter 219320: loss 6.2495, time 121.46ms
iter 219330: loss 6.8689, time 122.87ms
iter 219340: loss 6.1912, time 121.61ms
iter 219350: loss 6.1780, time 121.64ms
iter 219360: loss 6.0070, time 121.52ms
iter 219370: loss 6.4472, time 122.70ms
iter 219380: loss 5.6272, time 121.51ms
iter 219390: loss 5.8636, time 122.75ms
iter 219400: loss 6.3045, time 119.90ms
iter 219410: loss 6.5106, time 120.77ms
iter 219420: loss 6.1683, time 119.74ms
iter 219430: loss 6.5336, time 120.80ms
iter 219440: loss 6.2366, time 121.06ms
iter 219450: loss 6.0822, time 121.79ms
iter 219460: loss 6.3521, time 121.31ms
iter 219470: loss 6.1884, time 123.00ms
iter 219480: loss 6.6230, time 121.37ms
iter 219490: loss 5.6407, time 122.72ms
step 219500: train loss 5.7920, val loss 5.7886
saving checkpoint to out-shakespeare-char
iter 219500: loss 6.1876, time 2891.32ms
iter 219510: loss 5.7557, time 121.91ms
iter 219520: loss 5.6118, time 121.65ms
iter 219530: loss 6.1017, time 121.79ms
iter 219540: loss 6.5753, time 121.88ms
iter 219550: loss 5.6232, time 121.56ms
iter 219560: loss 6.2936, time 122.15ms
iter 219570: loss 5.5424, time 121.65ms
iter 219580: loss 6.3248, time 124.53ms
iter 219590: loss 5.4569, time 122.11ms
iter 219600: loss 6.3903, time 124.02ms
iter 219610: loss 6.1452, time 121.04ms
iter 219620: loss 5.5306, time 124.10ms
iter 219630: loss 5.6975, time 121.08ms
iter 219640: loss 6.1686, time 124.15ms
iter 219650: loss 7.0506, time 121.06ms
iter 219660: loss 6.5497, time 124.41ms
iter 219670: loss 6.7667, time 122.61ms
iter 219680: loss 6.1258, time 124.72ms
iter 219690: loss 6.4650, time 121.07ms
iter 219700: loss 6.3595, time 121.78ms
iter 219710: loss 6.6851, time 121.87ms
iter 219720: loss 6.2737, time 120.96ms
iter 219730: loss 5.8541, time 125.73ms
iter 219740: loss 6.0933, time 124.67ms
step 219750: train loss 5.7772, val loss 5.7314
saving checkpoint to out-shakespeare-char
iter 219750: loss 6.7345, time 2894.28ms
iter 219760: loss 5.4944, time 125.18ms
iter 219770: loss 6.3737, time 126.20ms
iter 219780: loss 6.5375, time 126.09ms
iter 219790: loss 6.8841, time 125.97ms
iter 219800: loss 6.1447, time 126.09ms
iter 219810: loss 6.7575, time 128.60ms
iter 219820: loss 6.4277, time 125.79ms
iter 219830: loss 6.1994, time 126.01ms
iter 219840: loss 6.4609, time 125.09ms
iter 219850: loss 6.4225, time 125.25ms
iter 219860: loss 5.8021, time 125.19ms
iter 219870: loss 5.9376, time 125.45ms
iter 219880: loss 6.1793, time 128.48ms
iter 219890: loss 5.8440, time 124.94ms
iter 219900: loss 6.7741, time 124.88ms
iter 219910: loss 6.2352, time 124.95ms
iter 219920: loss 6.4953, time 124.74ms
iter 219930: loss 6.5919, time 124.86ms
iter 219940: loss 6.8033, time 125.26ms
iter 219950: loss 6.6631, time 125.08ms
iter 219960: loss 5.8929, time 124.67ms
iter 219970: loss 6.8181, time 125.43ms
iter 219980: loss 6.3635, time 125.41ms
iter 219990: loss 6.7561, time 128.48ms
step 220000: train loss 5.7655, val loss 5.7452
saving checkpoint to out-shakespeare-char
iter 220000: loss 6.3836, time 2885.93ms
iter 220010: loss 6.1781, time 128.43ms
iter 220020: loss 5.8891, time 125.87ms
iter 220030: loss 6.1988, time 125.13ms
iter 220040: loss 6.4351, time 125.22ms
iter 220050: loss 5.8797, time 125.21ms
iter 220060: loss 6.3327, time 125.05ms
iter 220070: loss 6.6615, time 125.43ms
iter 220080: loss 7.1662, time 125.01ms
iter 220090: loss 6.4523, time 125.17ms
iter 220100: loss 6.3760, time 125.02ms
iter 220110: loss 6.2485, time 125.18ms
iter 220120: loss 6.3912, time 128.20ms
iter 220130: loss 6.9027, time 126.49ms
iter 220140: loss 6.1260, time 125.58ms
iter 220150: loss 6.4188, time 124.99ms
iter 220160: loss 6.1444, time 125.28ms
iter 220170: loss 5.5841, time 124.82ms
iter 220180: loss 5.3334, time 125.94ms
iter 220190: loss 6.4951, time 127.81ms
iter 220200: loss 6.0714, time 125.93ms
iter 220210: loss 6.7710, time 126.39ms
iter 220220: loss 6.3304, time 125.60ms
iter 220230: loss 5.9796, time 125.55ms
iter 220240: loss 6.0566, time 125.57ms
step 220250: train loss 5.7534, val loss 5.7930
saving checkpoint to out-shakespeare-char
iter 220250: loss 5.9991, time 2886.24ms
iter 220260: loss 6.9805, time 121.74ms
iter 220270: loss 6.0238, time 124.69ms
iter 220280: loss 6.3107, time 122.34ms
iter 220290: loss 5.8892, time 124.43ms
iter 220300: loss 6.5530, time 121.99ms
iter 220310: loss 6.8577, time 124.64ms
iter 220320: loss 6.1043, time 121.56ms
iter 220330: loss 6.1937, time 124.58ms
iter 220340: loss 6.2396, time 121.66ms
iter 220350: loss 6.3958, time 124.49ms
iter 220360: loss 6.3909, time 120.87ms
iter 220370: loss 6.0843, time 124.12ms
iter 220380: loss 5.4750, time 121.92ms
iter 220390: loss 6.7426, time 124.47ms
iter 220400: loss 6.4133, time 121.14ms
iter 220410: loss 6.6356, time 124.27ms
iter 220420: loss 6.3125, time 121.53ms
iter 220430: loss 6.6271, time 125.04ms
iter 220440: loss 6.8039, time 122.26ms
iter 220450: loss 6.4150, time 124.40ms
iter 220460: loss 6.5204, time 121.44ms
iter 220470: loss 6.0932, time 124.56ms
iter 220480: loss 6.4946, time 121.93ms
iter 220490: loss 5.7926, time 124.35ms
step 220500: train loss 5.7557, val loss 5.7507
saving checkpoint to out-shakespeare-char
iter 220500: loss 6.6315, time 2883.81ms
iter 220510: loss 7.0193, time 121.94ms
iter 220520: loss 6.3208, time 122.76ms
iter 220530: loss 5.9024, time 121.89ms
iter 220540: loss 6.3574, time 122.67ms
iter 220550: loss 6.9458, time 121.56ms
iter 220560: loss 6.1729, time 122.76ms
iter 220570: loss 6.2379, time 121.85ms
iter 220580: loss 5.6556, time 122.66ms
iter 220590: loss 6.3879, time 121.35ms
iter 220600: loss 6.2663, time 122.57ms
iter 220610: loss 6.9092, time 121.93ms
iter 220620: loss 7.0256, time 123.31ms
iter 220630: loss 6.3821, time 122.12ms
iter 220640: loss 7.0098, time 123.09ms
iter 220650: loss 6.3521, time 121.76ms
iter 220660: loss 6.2198, time 123.14ms
iter 220670: loss 5.8698, time 121.61ms
iter 220680: loss 5.6796, time 122.67ms
iter 220690: loss 6.4091, time 122.09ms
iter 220700: loss 6.0957, time 122.49ms
iter 220710: loss 6.0839, time 121.66ms
iter 220720: loss 6.7068, time 122.85ms
iter 220730: loss 6.0948, time 121.94ms
iter 220740: loss 6.0366, time 122.67ms
step 220750: train loss 5.7897, val loss 5.8006
saving checkpoint to out-shakespeare-char
iter 220750: loss 5.8631, time 2882.49ms
iter 220760: loss 6.9319, time 121.92ms
iter 220770: loss 5.9036, time 121.56ms
iter 220780: loss 6.5012, time 121.69ms
iter 220790: loss 6.3348, time 122.54ms
iter 220800: loss 6.3645, time 121.37ms
iter 220810: loss 6.3544, time 121.38ms
iter 220820: loss 6.0049, time 121.82ms
iter 220830: loss 5.4582, time 122.85ms
iter 220840: loss 6.0033, time 121.75ms
iter 220850: loss 6.7588, time 121.39ms
iter 220860: loss 6.0533, time 121.83ms
iter 220870: loss 6.0502, time 121.93ms
iter 220880: loss 6.2604, time 121.86ms
iter 220890: loss 6.3972, time 121.87ms
iter 220900: loss 6.2599, time 121.58ms
iter 220910: loss 6.6871, time 123.18ms
iter 220920: loss 6.7218, time 121.70ms
iter 220930: loss 6.8474, time 121.72ms
iter 220940: loss 5.4719, time 121.78ms
iter 220950: loss 6.4977, time 121.82ms
iter 220960: loss 6.0150, time 121.01ms
iter 220970: loss 6.3660, time 121.80ms
iter 220980: loss 6.7316, time 121.85ms
iter 220990: loss 6.5668, time 121.69ms
step 221000: train loss 5.8188, val loss 5.7314
saving checkpoint to out-shakespeare-char
iter 221000: loss 5.9273, time 2904.17ms
iter 221010: loss 5.7119, time 121.81ms
iter 221020: loss 6.2951, time 125.61ms
iter 221030: loss 6.4683, time 121.86ms
iter 221040: loss 7.0748, time 123.83ms
iter 221050: loss 6.0458, time 121.95ms
iter 221060: loss 6.1049, time 124.98ms
iter 221070: loss 5.7651, time 121.66ms
iter 221080: loss 5.7126, time 123.73ms
iter 221090: loss 6.2631, time 121.93ms
iter 221100: loss 5.6670, time 124.77ms
iter 221110: loss 7.0677, time 121.82ms
iter 221120: loss 6.2907, time 124.81ms
iter 221130: loss 5.3741, time 122.01ms
iter 221140: loss 6.9128, time 124.48ms
iter 221150: loss 6.8799, time 121.76ms
iter 221160: loss 6.0705, time 125.14ms
iter 221170: loss 7.7295, time 122.32ms
iter 221180: loss 5.6206, time 124.80ms
iter 221190: loss 6.9568, time 122.06ms
iter 221200: loss 6.3253, time 125.08ms
iter 221210: loss 6.2581, time 122.00ms
iter 221220: loss 6.1091, time 124.82ms
iter 221230: loss 6.2980, time 121.99ms
iter 221240: loss 6.1393, time 124.85ms
step 221250: train loss 5.7995, val loss 5.8024
saving checkpoint to out-shakespeare-char
iter 221250: loss 6.5839, time 2897.32ms
iter 221260: loss 6.6853, time 121.99ms
iter 221270: loss 5.4050, time 121.86ms
iter 221280: loss 5.8534, time 121.64ms
iter 221290: loss 5.7619, time 123.32ms
iter 221300: loss 7.1917, time 121.67ms
iter 221310: loss 6.0764, time 122.90ms
iter 221320: loss 6.6875, time 121.57ms
iter 221330: loss 6.4755, time 121.69ms
iter 221340: loss 6.1114, time 121.74ms
iter 221350: loss 6.3183, time 122.20ms
iter 221360: loss 6.1932, time 121.72ms
iter 221370: loss 6.6343, time 121.97ms
iter 221380: loss 6.3108, time 121.54ms
iter 221390: loss 6.1926, time 121.71ms
iter 221400: loss 6.2048, time 122.08ms
iter 221410: loss 6.3798, time 121.79ms
iter 221420: loss 6.8681, time 121.67ms
iter 221430: loss 6.1289, time 121.86ms
iter 221440: loss 6.1893, time 122.02ms
iter 221450: loss 6.6334, time 122.00ms
iter 221460: loss 7.3281, time 121.86ms
iter 221470: loss 5.8342, time 121.61ms
iter 221480: loss 6.6308, time 121.38ms
iter 221490: loss 6.2508, time 120.92ms
step 221500: train loss 5.7946, val loss 5.8249
saving checkpoint to out-shakespeare-char
iter 221500: loss 6.7183, time 2896.71ms
iter 221510: loss 6.3203, time 122.08ms
iter 221520: loss 6.8357, time 121.85ms
iter 221530: loss 5.9463, time 123.16ms
iter 221540: loss 6.6005, time 121.91ms
iter 221550: loss 6.3172, time 123.04ms
iter 221560: loss 6.4058, time 121.89ms
iter 221570: loss 5.9182, time 123.02ms
iter 221580: loss 6.5210, time 121.94ms
iter 221590: loss 6.6743, time 122.75ms
iter 221600: loss 6.0685, time 120.99ms
iter 221610: loss 6.7656, time 122.28ms
iter 221620: loss 6.1258, time 121.76ms
iter 221630: loss 5.3008, time 122.85ms
iter 221640: loss 6.4091, time 121.72ms
iter 221650: loss 5.7042, time 122.83ms
iter 221660: loss 6.0161, time 121.81ms
iter 221670: loss 6.0784, time 122.89ms
iter 221680: loss 5.7283, time 121.81ms
iter 221690: loss 6.2067, time 122.47ms
iter 221700: loss 6.1324, time 121.74ms
iter 221710: loss 6.1175, time 122.71ms
iter 221720: loss 6.2098, time 121.61ms
iter 221730: loss 6.4677, time 122.78ms
iter 221740: loss 6.4873, time 121.83ms
step 221750: train loss 5.8176, val loss 5.8128
saving checkpoint to out-shakespeare-char
iter 221750: loss 5.9087, time 2894.44ms
iter 221760: loss 6.9022, time 121.88ms
iter 221770: loss 5.9647, time 122.00ms
iter 221780: loss 6.1413, time 121.86ms
iter 221790: loss 5.9832, time 122.09ms
iter 221800: loss 6.4408, time 121.84ms
iter 221810: loss 6.7630, time 121.99ms
iter 221820: loss 6.9732, time 121.73ms
iter 221830: loss 5.8136, time 122.05ms
iter 221840: loss 6.1228, time 122.11ms
iter 221850: loss 6.4080, time 122.06ms
iter 221860: loss 6.4168, time 121.79ms
iter 221870: loss 5.5492, time 121.98ms
iter 221880: loss 6.5807, time 121.71ms
iter 221890: loss 6.6134, time 122.04ms
iter 221900: loss 6.9028, time 121.76ms
iter 221910: loss 7.0453, time 122.12ms
iter 221920: loss 6.1015, time 120.81ms
iter 221930: loss 7.1143, time 121.75ms
iter 221940: loss 6.3977, time 121.78ms
iter 221950: loss 5.7384, time 122.45ms
iter 221960: loss 6.3690, time 121.69ms
iter 221970: loss 6.2649, time 122.80ms
iter 221980: loss 6.1277, time 122.84ms
iter 221990: loss 6.5109, time 121.00ms
step 222000: train loss 5.8012, val loss 5.8265
saving checkpoint to out-shakespeare-char
iter 222000: loss 5.9457, time 2897.02ms
iter 222010: loss 5.5761, time 122.62ms
iter 222020: loss 6.1638, time 124.95ms
iter 222030: loss 6.8852, time 121.66ms
iter 222040: loss 6.1603, time 124.73ms
iter 222050: loss 5.9308, time 122.03ms
iter 222060: loss 6.4700, time 124.17ms
iter 222070: loss 6.4456, time 121.62ms
iter 222080: loss 5.9441, time 124.53ms
iter 222090: loss 6.0228, time 121.00ms
iter 222100: loss 6.6117, time 124.14ms
iter 222110: loss 6.3513, time 121.71ms
iter 222120: loss 6.0805, time 124.61ms
iter 222130: loss 6.0661, time 121.87ms
iter 222140: loss 6.2389, time 124.85ms
iter 222150: loss 5.8687, time 121.69ms
iter 222160: loss 6.3001, time 124.46ms
iter 222170: loss 6.1728, time 121.59ms
iter 222180: loss 5.9087, time 124.84ms
iter 222190: loss 6.3153, time 121.92ms
iter 222200: loss 6.5616, time 124.81ms
iter 222210: loss 5.7669, time 121.14ms
iter 222220: loss 5.9867, time 123.74ms
iter 222230: loss 6.2062, time 121.92ms
iter 222240: loss 6.4792, time 124.48ms
step 222250: train loss 5.7813, val loss 5.8010
saving checkpoint to out-shakespeare-char
iter 222250: loss 6.0610, time 2905.45ms
iter 222260: loss 5.8588, time 121.48ms
iter 222270: loss 6.3643, time 121.44ms
iter 222280: loss 5.7565, time 121.34ms
iter 222290: loss 6.2420, time 123.57ms
iter 222300: loss 6.7968, time 121.68ms
iter 222310: loss 6.2422, time 121.72ms
iter 222320: loss 6.7962, time 121.55ms
iter 222330: loss 6.0697, time 121.54ms
iter 222340: loss 5.9693, time 122.21ms
iter 222350: loss 6.1821, time 121.59ms
iter 222360: loss 6.5335, time 121.01ms
iter 222370: loss 5.7149, time 121.26ms
iter 222380: loss 5.8828, time 124.48ms
iter 222390: loss 6.0784, time 121.72ms
iter 222400: loss 5.5003, time 122.47ms
iter 222410: loss 5.6860, time 121.64ms
iter 222420: loss 6.1240, time 122.29ms
iter 222430: loss 6.2366, time 121.37ms
iter 222440: loss 6.2977, time 122.89ms
iter 222450: loss 5.7614, time 122.06ms
iter 222460: loss 6.0626, time 122.88ms
iter 222470: loss 6.8336, time 121.38ms
iter 222480: loss 5.7103, time 123.05ms
iter 222490: loss 6.2069, time 121.37ms
step 222500: train loss 5.8164, val loss 5.8083
saving checkpoint to out-shakespeare-char
iter 222500: loss 5.9883, time 2900.54ms
iter 222510: loss 6.2158, time 121.31ms
iter 222520: loss 6.0081, time 124.36ms
iter 222530: loss 5.9569, time 121.23ms
iter 222540: loss 6.7814, time 124.61ms
iter 222550: loss 7.4950, time 120.95ms
iter 222560: loss 6.2461, time 122.10ms
iter 222570: loss 6.5918, time 121.33ms
iter 222580: loss 5.7857, time 123.18ms
iter 222590: loss 5.4664, time 121.74ms
iter 222600: loss 5.5993, time 122.13ms
iter 222610: loss 6.1950, time 121.09ms
iter 222620: loss 6.9854, time 121.97ms
iter 222630: loss 6.5232, time 121.47ms
iter 222640: loss 6.2564, time 122.66ms
iter 222650: loss 6.4026, time 121.02ms
iter 222660: loss 6.2927, time 122.60ms
iter 222670: loss 6.4267, time 120.85ms
iter 222680: loss 6.5944, time 122.26ms
iter 222690: loss 5.9721, time 121.44ms
iter 222700: loss 6.4625, time 122.41ms
iter 222710: loss 6.3962, time 121.20ms
iter 222720: loss 6.1931, time 123.28ms
iter 222730: loss 6.1547, time 122.11ms
iter 222740: loss 5.2984, time 123.80ms
step 222750: train loss 5.7642, val loss 5.7555
saving checkpoint to out-shakespeare-char
iter 222750: loss 6.2853, time 2888.03ms
iter 222760: loss 6.3616, time 122.20ms
iter 222770: loss 6.0093, time 121.75ms
iter 222780: loss 6.2647, time 121.86ms
iter 222790: loss 6.4909, time 121.70ms
iter 222800: loss 6.3677, time 121.58ms
iter 222810: loss 6.0778, time 121.73ms
iter 222820: loss 6.3960, time 121.77ms
iter 222830: loss 5.9052, time 121.89ms
iter 222840: loss 5.9600, time 121.64ms
iter 222850: loss 6.3424, time 122.23ms
iter 222860: loss 6.2133, time 121.85ms
iter 222870: loss 6.0573, time 122.06ms
iter 222880: loss 5.5976, time 122.31ms
iter 222890: loss 6.0806, time 121.68ms
iter 222900: loss 6.6721, time 121.87ms
iter 222910: loss 6.0454, time 121.66ms
iter 222920: loss 6.3263, time 121.70ms
iter 222930: loss 6.5953, time 121.77ms
iter 222940: loss 6.4474, time 122.12ms
iter 222950: loss 5.9455, time 121.70ms
iter 222960: loss 6.1105, time 121.83ms
iter 222970: loss 6.1947, time 122.31ms
iter 222980: loss 6.3585, time 121.73ms
iter 222990: loss 6.0942, time 120.75ms
step 223000: train loss 5.7684, val loss 5.7553
saving checkpoint to out-shakespeare-char
iter 223000: loss 6.0952, time 2888.09ms
iter 223010: loss 6.5906, time 122.00ms
iter 223020: loss 6.9486, time 121.80ms
iter 223030: loss 5.7669, time 121.62ms
iter 223040: loss 5.8060, time 121.75ms
iter 223050: loss 6.5415, time 120.35ms
iter 223060: loss 6.1133, time 121.69ms
iter 223070: loss 5.7829, time 120.19ms
iter 223080: loss 6.7150, time 121.19ms
iter 223090: loss 6.2316, time 121.12ms
iter 223100: loss 6.8791, time 121.85ms
iter 223110: loss 5.6189, time 121.74ms
iter 223120: loss 6.3035, time 121.79ms
iter 223130: loss 6.0983, time 121.66ms
iter 223140: loss 5.8234, time 121.71ms
iter 223150: loss 6.4311, time 120.50ms
iter 223160: loss 5.7981, time 120.39ms
iter 223170: loss 6.0012, time 121.17ms
iter 223180: loss 6.1544, time 120.99ms
iter 223190: loss 6.4334, time 121.97ms
iter 223200: loss 6.3537, time 121.97ms
iter 223210: loss 6.0770, time 121.79ms
iter 223220: loss 5.7665, time 121.76ms
iter 223230: loss 5.9666, time 121.70ms
iter 223240: loss 6.2431, time 121.77ms
step 223250: train loss 5.6932, val loss 5.7653
saving checkpoint to out-shakespeare-char
iter 223250: loss 6.1913, time 2898.47ms
iter 223260: loss 6.4115, time 121.78ms
iter 223270: loss 5.8491, time 125.03ms
iter 223280: loss 5.9072, time 121.93ms
iter 223290: loss 6.1402, time 125.14ms
iter 223300: loss 6.2105, time 120.96ms
iter 223310: loss 6.5699, time 124.90ms
iter 223320: loss 6.4010, time 122.08ms
iter 223330: loss 5.7872, time 124.56ms
iter 223340: loss 5.4447, time 121.94ms
iter 223350: loss 6.1101, time 123.66ms
iter 223360: loss 5.9569, time 121.78ms
iter 223370: loss 5.8512, time 124.86ms
iter 223380: loss 6.5527, time 121.68ms
iter 223390: loss 6.0914, time 124.59ms
iter 223400: loss 5.8898, time 121.87ms
iter 223410: loss 6.3386, time 124.52ms
iter 223420: loss 6.5840, time 121.61ms
iter 223430: loss 6.2032, time 124.44ms
iter 223440: loss 6.2300, time 121.64ms
iter 223450: loss 6.0308, time 124.19ms
iter 223460: loss 5.4935, time 121.66ms
iter 223470: loss 6.0447, time 124.65ms
iter 223480: loss 6.3561, time 121.61ms
iter 223490: loss 6.0270, time 124.44ms
step 223500: train loss 5.7822, val loss 5.7983
saving checkpoint to out-shakespeare-char
iter 223500: loss 6.2173, time 2887.46ms
iter 223510: loss 6.2126, time 121.89ms
iter 223520: loss 6.2544, time 122.85ms
iter 223530: loss 6.0711, time 121.70ms
iter 223540: loss 6.2844, time 122.88ms
iter 223550: loss 6.4447, time 121.51ms
iter 223560: loss 6.3073, time 122.59ms
iter 223570: loss 6.3849, time 121.91ms
iter 223580: loss 5.3758, time 122.87ms
iter 223590: loss 6.6301, time 121.60ms
iter 223600: loss 6.1199, time 122.76ms
iter 223610: loss 6.3600, time 121.74ms
iter 223620: loss 6.7076, time 122.89ms
iter 223630: loss 6.2018, time 121.70ms
iter 223640: loss 6.5794, time 122.32ms
iter 223650: loss 6.1092, time 121.46ms
iter 223660: loss 6.4186, time 122.15ms
iter 223670: loss 6.2526, time 121.05ms
iter 223680: loss 6.1910, time 123.20ms
iter 223690: loss 6.2371, time 121.68ms
iter 223700: loss 6.3108, time 122.76ms
iter 223710: loss 6.2964, time 121.70ms
iter 223720: loss 6.6025, time 122.79ms
iter 223730: loss 6.1171, time 121.70ms
iter 223740: loss 6.0672, time 122.61ms
step 223750: train loss 5.7596, val loss 5.7359
saving checkpoint to out-shakespeare-char
iter 223750: loss 6.2243, time 2903.41ms
iter 223760: loss 6.4582, time 125.65ms
iter 223770: loss 7.1672, time 124.57ms
iter 223780: loss 6.6951, time 124.88ms
iter 223790: loss 6.8496, time 124.90ms
iter 223800: loss 5.9470, time 123.32ms
iter 223810: loss 6.4201, time 127.60ms
iter 223820: loss 6.3731, time 124.66ms
iter 223830: loss 6.3306, time 124.93ms
iter 223840: loss 5.6763, time 124.41ms
iter 223850: loss 7.1187, time 120.32ms
iter 223860: loss 6.0715, time 121.71ms
iter 223870: loss 5.8446, time 121.89ms
iter 223880: loss 6.6262, time 121.83ms
iter 223890: loss 6.1082, time 121.71ms
iter 223900: loss 5.8267, time 121.79ms
iter 223910: loss 6.5266, time 121.99ms
iter 223920: loss 5.9499, time 121.29ms
iter 223930: loss 6.5301, time 120.74ms
iter 223940: loss 6.4921, time 121.62ms
iter 223950: loss 6.6279, time 121.73ms
iter 223960: loss 7.2579, time 121.45ms
iter 223970: loss 5.7525, time 121.81ms
iter 223980: loss 7.2451, time 121.10ms
iter 223990: loss 6.6113, time 121.33ms
step 224000: train loss 5.7888, val loss 5.7941
saving checkpoint to out-shakespeare-char
iter 224000: loss 6.3081, time 2897.58ms
iter 224010: loss 5.6936, time 121.93ms
iter 224020: loss 6.0697, time 124.46ms
iter 224030: loss 6.7272, time 122.19ms
iter 224040: loss 5.4836, time 124.78ms
iter 224050: loss 5.6620, time 121.90ms
iter 224060: loss 5.6514, time 124.87ms
iter 224070: loss 5.6498, time 122.01ms
iter 224080: loss 6.4538, time 124.80ms
iter 224090: loss 5.8190, time 122.02ms
iter 224100: loss 5.9649, time 124.75ms
iter 224110: loss 6.0904, time 122.03ms
iter 224120: loss 6.3969, time 124.88ms
iter 224130: loss 6.1444, time 120.72ms
iter 224140: loss 6.7211, time 124.58ms
iter 224150: loss 6.6224, time 121.99ms
iter 224160: loss 6.2067, time 124.93ms
iter 224170: loss 5.6801, time 121.93ms
iter 224180: loss 5.5693, time 125.07ms
iter 224190: loss 6.7121, time 121.98ms
iter 224200: loss 6.8516, time 124.37ms
iter 224210: loss 5.9139, time 122.05ms
iter 224220: loss 5.5242, time 125.28ms
iter 224230: loss 6.0996, time 121.23ms
iter 224240: loss 6.4976, time 122.03ms
step 224250: train loss 5.7570, val loss 5.7961
saving checkpoint to out-shakespeare-char
iter 224250: loss 5.7667, time 2918.45ms
iter 224260: loss 6.0174, time 125.09ms
iter 224270: loss 5.8835, time 121.73ms
iter 224280: loss 5.0468, time 124.87ms
iter 224290: loss 6.0871, time 121.94ms
iter 224300: loss 5.5436, time 124.94ms
iter 224310: loss 5.6877, time 122.20ms
iter 224320: loss 6.1061, time 125.04ms
iter 224330: loss 6.5199, time 121.87ms
iter 224340: loss 6.6276, time 124.42ms
iter 224350: loss 6.0168, time 122.00ms
iter 224360: loss 5.7988, time 124.84ms
iter 224370: loss 6.5343, time 121.94ms
iter 224380: loss 7.0827, time 124.65ms
iter 224390: loss 6.6844, time 122.03ms
iter 224400: loss 6.0618, time 124.79ms
iter 224410: loss 6.5213, time 121.89ms
iter 224420: loss 7.0979, time 124.84ms
iter 224430: loss 6.2171, time 122.02ms
iter 224440: loss 5.9527, time 125.56ms
iter 224450: loss 6.3750, time 120.87ms
iter 224460: loss 6.4420, time 124.84ms
iter 224470: loss 5.3427, time 122.10ms
iter 224480: loss 5.7250, time 124.95ms
iter 224490: loss 5.3766, time 122.00ms
step 224500: train loss 5.7547, val loss 5.7257
saving checkpoint to out-shakespeare-char
iter 224500: loss 6.2151, time 2903.16ms
iter 224510: loss 6.3960, time 122.01ms
iter 224520: loss 6.4665, time 123.36ms
iter 224530: loss 6.9164, time 121.36ms
iter 224540: loss 6.3330, time 123.06ms
iter 224550: loss 6.6144, time 121.97ms
iter 224560: loss 5.8111, time 123.46ms
iter 224570: loss 6.7979, time 121.96ms
iter 224580: loss 6.7754, time 122.13ms
iter 224590: loss 6.1354, time 121.99ms
iter 224600: loss 6.3118, time 123.36ms
iter 224610: loss 6.4405, time 121.86ms
iter 224620: loss 5.9490, time 123.00ms
iter 224630: loss 6.4450, time 122.00ms
iter 224640: loss 6.5517, time 123.27ms
iter 224650: loss 5.7994, time 121.90ms
iter 224660: loss 5.6990, time 123.09ms
iter 224670: loss 6.7098, time 122.01ms
iter 224680: loss 6.0727, time 123.05ms
iter 224690: loss 5.8837, time 122.27ms
iter 224700: loss 7.1960, time 123.07ms
iter 224710: loss 6.0688, time 122.36ms
iter 224720: loss 6.8892, time 123.09ms
iter 224730: loss 6.3289, time 121.98ms
iter 224740: loss 6.1015, time 123.18ms
step 224750: train loss 5.7529, val loss 5.7226
saving checkpoint to out-shakespeare-char
iter 224750: loss 6.5255, time 2893.66ms
iter 224760: loss 6.2798, time 121.79ms
iter 224770: loss 6.2434, time 122.69ms
iter 224780: loss 6.4906, time 121.81ms
iter 224790: loss 6.4076, time 122.26ms
iter 224800: loss 5.8755, time 121.18ms
iter 224810: loss 6.1181, time 122.19ms
iter 224820: loss 6.5830, time 121.60ms
iter 224830: loss 6.2510, time 122.85ms
iter 224840: loss 5.9276, time 121.75ms
iter 224850: loss 6.3582, time 122.90ms
iter 224860: loss 6.2047, time 121.53ms
iter 224870: loss 6.5487, time 122.81ms
iter 224880: loss 5.9038, time 121.89ms
iter 224890: loss 6.6034, time 121.42ms
iter 224900: loss 6.5131, time 121.66ms
iter 224910: loss 6.5512, time 121.61ms
iter 224920: loss 7.7955, time 120.93ms
iter 224930: loss 6.5633, time 121.63ms
iter 224940: loss 6.4572, time 121.87ms
iter 224950: loss 6.1234, time 121.63ms
iter 224960: loss 5.5304, time 121.79ms
iter 224970: loss 6.1946, time 121.69ms
iter 224980: loss 6.3708, time 121.81ms
iter 224990: loss 6.3887, time 121.73ms
step 225000: train loss 5.7333, val loss 5.7851
saving checkpoint to out-shakespeare-char
iter 225000: loss 5.8910, time 2891.04ms
iter 225010: loss 6.0003, time 122.77ms
iter 225020: loss 5.1383, time 121.92ms
iter 225030: loss 6.2602, time 122.04ms
iter 225040: loss 6.6439, time 121.77ms
iter 225050: loss 5.7231, time 122.82ms
iter 225060: loss 6.0209, time 121.89ms
iter 225070: loss 6.9240, time 124.03ms
iter 225080: loss 6.6533, time 122.03ms
iter 225090: loss 5.6366, time 122.45ms
iter 225100: loss 6.0690, time 121.75ms
iter 225110: loss 6.2869, time 120.74ms
iter 225120: loss 6.0685, time 122.81ms
iter 225130: loss 6.0259, time 122.19ms
iter 225140: loss 6.0485, time 119.13ms
iter 225150: loss 6.1857, time 123.78ms
iter 225160: loss 6.0953, time 120.31ms
iter 225170: loss 7.1569, time 124.60ms
iter 225180: loss 6.6551, time 121.15ms
iter 225190: loss 5.5105, time 124.33ms
iter 225200: loss 6.1081, time 121.73ms
iter 225210: loss 6.1205, time 124.73ms
iter 225220: loss 6.4198, time 121.77ms
iter 225230: loss 5.9532, time 124.57ms
iter 225240: loss 6.6032, time 121.76ms
step 225250: train loss 5.7774, val loss 5.7672
saving checkpoint to out-shakespeare-char
iter 225250: loss 6.1965, time 2897.92ms
iter 225260: loss 6.4163, time 121.04ms
iter 225270: loss 6.4193, time 121.61ms
iter 225280: loss 6.3354, time 121.69ms
iter 225290: loss 6.5598, time 122.87ms
iter 225300: loss 6.5687, time 121.12ms
iter 225310: loss 6.4925, time 122.78ms
iter 225320: loss 6.3719, time 121.77ms
iter 225330: loss 6.4558, time 121.53ms
iter 225340: loss 6.5845, time 122.01ms
iter 225350: loss 7.2359, time 122.82ms
iter 225360: loss 6.1847, time 121.77ms
iter 225370: loss 6.3115, time 122.98ms
iter 225380: loss 6.5327, time 121.90ms
iter 225390: loss 5.9575, time 123.07ms
iter 225400: loss 6.6252, time 120.62ms
iter 225410: loss 6.8184, time 122.97ms
iter 225420: loss 6.2816, time 121.55ms
iter 225430: loss 6.9214, time 122.56ms
iter 225440: loss 6.5086, time 121.37ms
iter 225450: loss 6.3140, time 122.85ms
iter 225460: loss 5.9878, time 121.82ms
iter 225470: loss 6.2336, time 123.01ms
iter 225480: loss 6.1168, time 121.79ms
iter 225490: loss 6.1732, time 122.74ms
step 225500: train loss 5.7589, val loss 5.7762
saving checkpoint to out-shakespeare-char
iter 225500: loss 6.6268, time 2904.01ms
iter 225510: loss 6.3092, time 126.22ms
iter 225520: loss 6.3405, time 126.25ms
iter 225530: loss 6.2922, time 125.86ms
iter 225540: loss 6.2814, time 125.96ms
iter 225550: loss 6.5185, time 126.61ms
iter 225560: loss 6.9318, time 125.15ms
iter 225570: loss 5.5316, time 126.18ms
iter 225580: loss 5.5802, time 123.99ms
iter 225590: loss 6.8033, time 120.17ms
iter 225600: loss 6.0422, time 123.79ms
iter 225610: loss 6.4850, time 120.48ms
iter 225620: loss 6.2376, time 123.21ms
iter 225630: loss 6.1005, time 120.20ms
iter 225640: loss 7.1353, time 122.85ms
iter 225650: loss 5.6187, time 121.18ms
iter 225660: loss 6.9426, time 122.96ms
iter 225670: loss 6.0067, time 120.98ms
iter 225680: loss 6.2278, time 123.33ms
iter 225690: loss 5.6935, time 120.26ms
iter 225700: loss 6.3942, time 123.20ms
iter 225710: loss 6.3529, time 120.04ms
iter 225720: loss 5.6391, time 122.94ms
iter 225730: loss 5.6745, time 119.77ms
iter 225740: loss 6.6390, time 122.84ms
step 225750: train loss 5.7645, val loss 5.7599
saving checkpoint to out-shakespeare-char
iter 225750: loss 6.7935, time 2896.28ms
iter 225760: loss 6.2860, time 119.82ms
iter 225770: loss 5.8044, time 120.94ms
iter 225780: loss 6.2864, time 119.77ms
iter 225790: loss 5.7340, time 120.79ms
iter 225800: loss 6.1906, time 119.87ms
iter 225810: loss 6.1358, time 121.34ms
iter 225820: loss 6.9603, time 119.60ms
iter 225830: loss 6.1414, time 120.96ms
iter 225840: loss 6.5067, time 119.77ms
iter 225850: loss 6.9169, time 120.79ms
iter 225860: loss 6.2956, time 120.77ms
iter 225870: loss 6.0061, time 120.96ms
iter 225880: loss 6.1653, time 119.75ms
iter 225890: loss 6.6970, time 121.11ms
iter 225900: loss 6.1423, time 120.22ms
iter 225910: loss 5.4071, time 121.21ms
iter 225920: loss 6.5064, time 119.86ms
iter 225930: loss 6.2763, time 121.13ms
iter 225940: loss 6.4808, time 119.98ms
iter 225950: loss 7.3405, time 120.98ms
iter 225960: loss 6.5719, time 119.82ms
iter 225970: loss 6.2438, time 121.02ms
iter 225980: loss 6.0324, time 119.86ms
iter 225990: loss 6.1935, time 120.92ms
step 226000: train loss 5.7729, val loss 5.8189
saving checkpoint to out-shakespeare-char
iter 226000: loss 6.3838, time 2897.42ms
iter 226010: loss 6.3217, time 119.92ms
iter 226020: loss 5.7629, time 120.78ms
iter 226030: loss 5.9551, time 119.92ms
iter 226040: loss 6.1460, time 119.74ms
iter 226050: loss 6.4773, time 119.76ms
iter 226060: loss 6.1924, time 119.63ms
iter 226070: loss 6.4279, time 119.65ms
iter 226080: loss 6.3617, time 119.65ms
iter 226090: loss 5.6521, time 119.66ms
iter 226100: loss 5.9841, time 120.18ms
iter 226110: loss 6.6605, time 119.66ms
iter 226120: loss 6.2787, time 119.54ms
iter 226130: loss 6.7247, time 119.68ms
iter 226140: loss 6.6755, time 119.67ms
iter 226150: loss 6.4512, time 119.91ms
iter 226160: loss 6.4733, time 119.45ms
iter 226170: loss 6.3069, time 119.71ms
iter 226180: loss 6.1541, time 119.47ms
iter 226190: loss 6.2440, time 119.83ms
iter 226200: loss 5.8519, time 119.43ms
iter 226210: loss 6.3379, time 119.71ms
iter 226220: loss 6.2006, time 120.87ms
iter 226230: loss 7.0297, time 119.90ms
iter 226240: loss 5.9836, time 119.52ms
step 226250: train loss 5.8100, val loss 5.7624
saving checkpoint to out-shakespeare-char
iter 226250: loss 5.7759, time 2876.85ms
iter 226260: loss 6.0834, time 125.22ms
iter 226270: loss 6.3298, time 125.56ms
iter 226280: loss 6.2834, time 124.68ms
iter 226290: loss 6.2199, time 125.78ms
iter 226300: loss 6.1726, time 125.61ms
iter 226310: loss 6.3777, time 125.74ms
iter 226320: loss 5.8331, time 125.43ms
iter 226330: loss 6.4419, time 125.17ms
iter 226340: loss 6.0923, time 125.15ms
iter 226350: loss 5.8744, time 124.69ms
iter 226360: loss 6.0266, time 125.54ms
iter 226370: loss 6.7386, time 124.33ms
iter 226380: loss 5.9379, time 128.75ms
iter 226390: loss 6.0508, time 124.10ms
iter 226400: loss 6.6583, time 126.27ms
iter 226410: loss 5.6804, time 126.14ms
iter 226420: loss 6.9685, time 126.38ms
iter 226430: loss 6.7015, time 122.23ms
iter 226440: loss 6.6366, time 124.64ms
iter 226450: loss 5.8578, time 122.05ms
iter 226460: loss 6.0895, time 124.70ms
iter 226470: loss 6.9207, time 121.10ms
iter 226480: loss 6.4935, time 124.70ms
iter 226490: loss 6.0192, time 121.91ms
step 226500: train loss 5.7482, val loss 5.7499
saving checkpoint to out-shakespeare-char
iter 226500: loss 6.2839, time 2900.70ms
iter 226510: loss 6.6733, time 121.82ms
iter 226520: loss 6.3356, time 122.02ms
iter 226530: loss 6.3332, time 121.86ms
iter 226540: loss 6.0249, time 121.77ms
iter 226550: loss 5.7994, time 121.83ms
iter 226560: loss 6.4509, time 121.92ms
iter 226570: loss 5.9526, time 121.69ms
iter 226580: loss 6.0821, time 121.54ms
iter 226590: loss 6.2988, time 121.94ms
iter 226600: loss 6.1653, time 121.63ms
iter 226610: loss 6.7394, time 121.20ms
iter 226620: loss 5.8291, time 121.77ms
iter 226630: loss 5.9450, time 121.67ms
iter 226640: loss 6.2260, time 121.70ms
iter 226650: loss 6.3378, time 121.78ms
iter 226660: loss 7.2213, time 121.08ms
iter 226670: loss 5.7396, time 121.77ms
iter 226680: loss 6.8673, time 123.05ms
iter 226690: loss 6.3165, time 121.45ms
iter 226700: loss 6.0777, time 123.09ms
iter 226710: loss 6.2611, time 121.81ms
iter 226720: loss 6.1866, time 121.22ms
iter 226730: loss 6.7204, time 121.70ms
iter 226740: loss 6.0512, time 121.34ms
step 226750: train loss 5.7798, val loss 5.7690
saving checkpoint to out-shakespeare-char
iter 226750: loss 5.8954, time 2900.38ms
iter 226760: loss 6.5193, time 123.16ms
iter 226770: loss 6.1927, time 122.07ms
iter 226780: loss 6.3125, time 123.16ms
iter 226790: loss 6.0693, time 121.91ms
iter 226800: loss 6.6705, time 122.20ms
iter 226810: loss 6.6810, time 121.93ms
iter 226820: loss 6.5276, time 122.94ms
iter 226830: loss 6.2929, time 121.91ms
iter 226840: loss 7.1406, time 123.15ms
iter 226850: loss 6.4887, time 121.32ms
iter 226860: loss 5.7794, time 123.05ms
iter 226870: loss 6.2578, time 121.66ms
iter 226880: loss 5.8226, time 122.84ms
iter 226890: loss 6.6600, time 121.89ms
iter 226900: loss 6.7455, time 122.76ms
iter 226910: loss 5.7842, time 121.85ms
iter 226920: loss 6.5379, time 123.31ms
iter 226930: loss 5.6107, time 121.79ms
iter 226940: loss 6.7020, time 122.79ms
iter 226950: loss 6.6773, time 121.86ms
iter 226960: loss 5.7836, time 121.80ms
iter 226970: loss 6.5787, time 121.77ms
iter 226980: loss 6.1298, time 120.88ms
iter 226990: loss 5.7044, time 121.57ms
step 227000: train loss 5.7631, val loss 5.7432
saving checkpoint to out-shakespeare-char
iter 227000: loss 5.8970, time 2901.98ms
iter 227010: loss 6.2330, time 121.60ms
iter 227020: loss 5.7657, time 123.48ms
iter 227030: loss 6.0734, time 122.83ms
iter 227040: loss 5.8470, time 121.01ms
iter 227050: loss 5.8863, time 122.70ms
iter 227060: loss 5.5230, time 121.72ms
iter 227070: loss 6.3216, time 121.58ms
iter 227080: loss 5.6664, time 121.65ms
iter 227090: loss 5.8600, time 121.82ms
iter 227100: loss 6.1878, time 121.77ms
iter 227110: loss 6.3654, time 121.59ms
iter 227120: loss 6.1615, time 121.69ms
iter 227130: loss 6.4426, time 121.72ms
iter 227140: loss 6.2127, time 122.03ms
iter 227150: loss 5.9209, time 121.57ms
iter 227160: loss 6.3017, time 121.91ms
iter 227170: loss 5.2916, time 120.71ms
iter 227180: loss 6.0229, time 121.77ms
iter 227190: loss 6.9565, time 120.88ms
iter 227200: loss 6.5429, time 121.77ms
iter 227210: loss 5.9187, time 121.73ms
iter 227220: loss 5.9975, time 121.76ms
iter 227230: loss 6.6281, time 121.77ms
iter 227240: loss 6.2425, time 121.63ms
step 227250: train loss 5.7425, val loss 5.7298
saving checkpoint to out-shakespeare-char
iter 227250: loss 6.0493, time 2906.62ms
iter 227260: loss 6.5036, time 121.93ms
iter 227270: loss 6.2691, time 122.05ms
iter 227280: loss 7.1196, time 121.66ms
iter 227290: loss 6.0549, time 121.84ms
iter 227300: loss 5.9625, time 121.98ms
iter 227310: loss 6.0099, time 121.69ms
iter 227320: loss 5.8105, time 121.89ms
iter 227330: loss 6.6362, time 121.50ms
iter 227340: loss 6.2149, time 121.75ms
iter 227350: loss 6.9594, time 121.64ms
iter 227360: loss 7.1012, time 121.49ms
iter 227370: loss 6.0875, time 121.43ms
iter 227380: loss 6.4819, time 120.10ms
iter 227390: loss 5.8341, time 121.25ms
iter 227400: loss 6.4481, time 121.61ms
iter 227410: loss 6.7066, time 121.71ms
iter 227420: loss 6.2849, time 121.72ms
iter 227430: loss 6.0903, time 121.61ms
iter 227440: loss 6.5637, time 121.69ms
iter 227450: loss 5.8056, time 121.66ms
iter 227460: loss 7.0830, time 121.86ms
iter 227470: loss 6.1591, time 121.51ms
iter 227480: loss 5.8392, time 121.67ms
iter 227490: loss 6.0850, time 121.75ms
step 227500: train loss 5.7660, val loss 5.7411
saving checkpoint to out-shakespeare-char
iter 227500: loss 5.2813, time 2892.95ms
iter 227510: loss 6.5488, time 121.99ms
iter 227520: loss 6.8841, time 120.75ms
iter 227530: loss 6.0510, time 122.78ms
iter 227540: loss 6.2514, time 121.56ms
iter 227550: loss 6.7790, time 122.79ms
iter 227560: loss 7.2913, time 121.70ms
iter 227570: loss 5.8131, time 122.86ms
iter 227580: loss 6.1578, time 121.55ms
iter 227590: loss 6.8079, time 122.61ms
iter 227600: loss 6.2199, time 121.77ms
iter 227610: loss 6.6973, time 122.64ms
iter 227620: loss 5.9573, time 121.74ms
iter 227630: loss 6.5357, time 122.70ms
iter 227640: loss 6.8803, time 121.63ms
iter 227650: loss 6.0216, time 122.74ms
iter 227660: loss 6.2478, time 120.72ms
iter 227670: loss 5.9632, time 122.33ms
iter 227680: loss 6.7922, time 121.61ms
iter 227690: loss 5.8863, time 122.69ms
iter 227700: loss 6.3082, time 121.77ms
iter 227710: loss 5.7681, time 122.74ms
iter 227720: loss 5.8812, time 121.75ms
iter 227730: loss 5.9038, time 122.82ms
iter 227740: loss 6.5180, time 121.77ms
step 227750: train loss 5.7552, val loss 5.7681
saving checkpoint to out-shakespeare-char
iter 227750: loss 6.8042, time 2886.70ms
iter 227760: loss 6.1631, time 122.09ms
iter 227770: loss 5.4695, time 122.78ms
iter 227780: loss 6.0838, time 121.72ms
iter 227790: loss 6.5073, time 121.96ms
iter 227800: loss 6.3307, time 120.91ms
iter 227810: loss 6.7457, time 122.31ms
iter 227820: loss 5.6719, time 122.58ms
iter 227830: loss 6.2739, time 122.88ms
iter 227840: loss 6.8571, time 121.82ms
iter 227850: loss 6.6521, time 122.79ms
iter 227860: loss 6.5152, time 121.70ms
iter 227870: loss 5.7218, time 122.68ms
iter 227880: loss 6.4890, time 121.79ms
iter 227890: loss 5.9182, time 122.71ms
iter 227900: loss 6.3421, time 121.59ms
iter 227910: loss 6.8063, time 122.76ms
iter 227920: loss 6.7135, time 121.73ms
iter 227930: loss 6.4463, time 122.74ms
iter 227940: loss 6.3457, time 121.68ms
iter 227950: loss 6.5109, time 122.05ms
iter 227960: loss 5.5823, time 121.13ms
iter 227970: loss 6.1631, time 122.77ms
iter 227980: loss 6.3157, time 121.95ms
iter 227990: loss 6.3876, time 123.28ms
step 228000: train loss 5.7879, val loss 5.7487
saving checkpoint to out-shakespeare-char
iter 228000: loss 6.0691, time 2896.27ms
iter 228010: loss 5.4712, time 122.29ms
iter 228020: loss 6.2218, time 122.48ms
iter 228030: loss 5.8124, time 122.03ms
iter 228040: loss 5.3472, time 121.99ms
iter 228050: loss 6.5675, time 121.71ms
iter 228060: loss 6.2759, time 121.76ms
iter 228070: loss 5.5856, time 122.80ms
iter 228080: loss 6.6879, time 121.33ms
iter 228090: loss 7.4830, time 122.14ms
iter 228100: loss 5.8063, time 121.97ms
iter 228110: loss 6.6084, time 122.58ms
iter 228120: loss 6.4979, time 122.20ms
iter 228130: loss 7.1894, time 122.68ms
iter 228140: loss 6.4234, time 121.64ms
iter 228150: loss 6.4965, time 122.37ms
iter 228160: loss 6.2053, time 121.45ms
iter 228170: loss 5.6210, time 122.56ms
iter 228180: loss 6.4826, time 120.95ms
iter 228190: loss 6.3922, time 122.29ms
iter 228200: loss 6.0741, time 121.70ms
iter 228210: loss 6.4265, time 122.57ms
iter 228220: loss 6.3452, time 122.16ms
iter 228230: loss 6.1317, time 123.10ms
iter 228240: loss 6.4696, time 120.93ms
step 228250: train loss 5.7500, val loss 5.7574
saving checkpoint to out-shakespeare-char
iter 228250: loss 6.6030, time 2887.03ms
iter 228260: loss 6.4087, time 126.07ms
iter 228270: loss 6.1383, time 125.38ms
iter 228280: loss 6.7133, time 128.14ms
iter 228290: loss 6.9556, time 124.37ms
iter 228300: loss 6.2103, time 125.14ms
iter 228310: loss 5.9711, time 125.17ms
iter 228320: loss 6.4647, time 124.70ms
iter 228330: loss 6.0397, time 125.37ms
iter 228340: loss 6.3528, time 125.28ms
iter 228350: loss 5.2750, time 124.81ms
iter 228360: loss 6.5733, time 125.40ms
iter 228370: loss 5.8812, time 126.39ms
iter 228380: loss 6.7694, time 125.45ms
iter 228390: loss 5.1856, time 123.96ms
iter 228400: loss 5.3705, time 124.74ms
iter 228410: loss 6.1795, time 125.54ms
iter 228420: loss 6.5701, time 125.55ms
iter 228430: loss 6.5809, time 123.92ms
iter 228440: loss 6.4409, time 125.59ms
iter 228450: loss 5.7002, time 125.43ms
iter 228460: loss 6.5834, time 125.52ms
iter 228470: loss 5.6349, time 128.16ms
iter 228480: loss 5.8917, time 125.25ms
iter 228490: loss 5.5571, time 125.32ms
step 228500: train loss 5.7893, val loss 5.7502
saving checkpoint to out-shakespeare-char
iter 228500: loss 5.9384, time 2883.13ms
iter 228510: loss 5.8396, time 125.68ms
iter 228520: loss 6.1629, time 125.79ms
iter 228530: loss 6.6210, time 128.05ms
iter 228540: loss 5.6565, time 125.49ms
iter 228550: loss 5.7219, time 125.04ms
iter 228560: loss 5.9825, time 123.43ms
iter 228570: loss 6.0537, time 124.92ms
iter 228580: loss 6.5512, time 125.49ms
iter 228590: loss 5.8629, time 125.40ms
iter 228600: loss 5.4710, time 127.79ms
iter 228610: loss 6.4308, time 124.89ms
iter 228620: loss 6.2642, time 125.16ms
iter 228630: loss 6.6062, time 124.53ms
iter 228640: loss 6.3190, time 125.05ms
iter 228650: loss 6.0557, time 124.88ms
iter 228660: loss 6.2935, time 125.36ms
iter 228670: loss 6.7393, time 124.41ms
iter 228680: loss 6.4222, time 125.19ms
iter 228690: loss 5.9738, time 125.14ms
iter 228700: loss 5.5284, time 124.80ms
iter 228710: loss 6.1557, time 124.83ms
iter 228720: loss 6.4804, time 124.81ms
iter 228730: loss 6.4913, time 124.36ms
iter 228740: loss 6.0602, time 126.90ms
step 228750: train loss 5.8215, val loss 5.7870
saving checkpoint to out-shakespeare-char
iter 228750: loss 6.6696, time 2902.00ms
iter 228760: loss 6.4485, time 125.13ms
iter 228770: loss 6.5811, time 124.59ms
iter 228780: loss 6.1588, time 125.22ms
iter 228790: loss 6.7678, time 124.94ms
iter 228800: loss 6.0358, time 125.06ms
iter 228810: loss 6.7628, time 124.89ms
iter 228820: loss 6.8087, time 125.11ms
iter 228830: loss 7.0215, time 125.17ms
iter 228840: loss 6.5582, time 125.30ms
iter 228850: loss 6.7802, time 127.74ms
iter 228860: loss 6.3595, time 125.02ms
iter 228870: loss 5.8396, time 125.10ms
iter 228880: loss 6.1825, time 125.10ms
iter 228890: loss 6.1102, time 124.15ms
iter 228900: loss 6.4544, time 125.06ms
iter 228910: loss 5.7499, time 125.04ms
iter 228920: loss 5.9189, time 125.12ms
iter 228930: loss 5.9440, time 123.76ms
iter 228940: loss 7.6967, time 125.05ms
iter 228950: loss 5.8621, time 125.46ms
iter 228960: loss 6.1251, time 124.88ms
iter 228970: loss 6.1176, time 124.23ms
iter 228980: loss 6.8925, time 125.09ms
iter 228990: loss 6.6993, time 125.16ms
step 229000: train loss 5.7461, val loss 5.7064
saving checkpoint to out-shakespeare-char
iter 229000: loss 6.1797, time 2876.04ms
iter 229010: loss 6.4481, time 125.18ms
iter 229020: loss 5.7783, time 125.39ms
iter 229030: loss 6.3213, time 127.95ms
iter 229040: loss 6.4429, time 125.34ms
iter 229050: loss 6.3372, time 124.98ms
iter 229060: loss 6.3538, time 125.17ms
iter 229070: loss 5.5473, time 125.06ms
iter 229080: loss 7.0123, time 125.11ms
iter 229090: loss 5.3546, time 125.21ms
iter 229100: loss 6.4539, time 127.94ms
iter 229110: loss 5.8177, time 125.12ms
iter 229120: loss 5.9051, time 124.89ms
iter 229130: loss 6.3982, time 125.23ms
iter 229140: loss 5.9653, time 125.08ms
iter 229150: loss 6.4694, time 125.10ms
iter 229160: loss 5.7364, time 125.82ms
iter 229170: loss 6.6032, time 125.92ms
iter 229180: loss 5.7742, time 126.07ms
iter 229190: loss 6.4952, time 125.85ms
iter 229200: loss 6.2612, time 125.97ms
iter 229210: loss 7.0594, time 125.75ms
iter 229220: loss 5.9944, time 125.58ms
iter 229230: loss 5.4574, time 125.70ms
iter 229240: loss 6.1037, time 125.35ms
step 229250: train loss 5.7482, val loss 5.7498
saving checkpoint to out-shakespeare-char
iter 229250: loss 5.8298, time 2886.27ms
iter 229260: loss 6.2866, time 124.75ms
iter 229270: loss 6.5903, time 125.49ms
iter 229280: loss 6.0404, time 125.13ms
iter 229290: loss 6.3280, time 125.29ms
iter 229300: loss 5.5554, time 128.04ms
iter 229310: loss 6.6041, time 125.50ms
iter 229320: loss 6.4582, time 125.51ms
iter 229330: loss 6.4481, time 125.58ms
iter 229340: loss 6.8340, time 124.13ms
iter 229350: loss 6.8737, time 124.97ms
iter 229360: loss 5.8018, time 125.04ms
iter 229370: loss 6.4463, time 124.93ms
iter 229380: loss 6.3193, time 125.27ms
iter 229390: loss 5.4877, time 125.07ms
iter 229400: loss 6.0834, time 125.96ms
iter 229410: loss 6.2552, time 128.36ms
iter 229420: loss 6.5193, time 125.53ms
iter 229430: loss 5.4492, time 126.37ms
iter 229440: loss 6.3297, time 126.15ms
iter 229450: loss 5.8667, time 126.16ms
iter 229460: loss 6.0276, time 126.24ms
iter 229470: loss 5.1662, time 125.90ms
iter 229480: loss 6.4319, time 125.70ms
iter 229490: loss 6.2685, time 125.98ms
step 229500: train loss 5.7964, val loss 5.7685
saving checkpoint to out-shakespeare-char
iter 229500: loss 5.8239, time 2898.74ms
iter 229510: loss 6.0854, time 125.58ms
iter 229520: loss 6.1801, time 125.86ms
iter 229530: loss 6.0073, time 125.96ms
iter 229540: loss 5.3588, time 125.95ms
iter 229550: loss 6.6579, time 128.44ms
iter 229560: loss 5.6792, time 125.72ms
iter 229570: loss 5.5882, time 125.70ms
iter 229580: loss 6.3448, time 125.90ms
iter 229590: loss 6.6286, time 125.44ms
iter 229600: loss 6.1402, time 124.66ms
iter 229610: loss 6.4775, time 125.11ms
iter 229620: loss 6.4364, time 125.73ms
iter 229630: loss 6.1579, time 127.98ms
iter 229640: loss 5.3651, time 124.11ms
iter 229650: loss 6.5382, time 125.36ms
iter 229660: loss 6.1607, time 125.34ms
iter 229670: loss 6.0000, time 127.98ms
iter 229680: loss 5.9232, time 125.25ms
iter 229690: loss 5.6264, time 125.20ms
iter 229700: loss 6.3355, time 126.50ms
iter 229710: loss 6.3404, time 127.82ms
iter 229720: loss 6.2479, time 125.06ms
iter 229730: loss 6.5453, time 125.34ms
iter 229740: loss 5.9282, time 125.69ms
step 229750: train loss 5.7272, val loss 5.7394
saving checkpoint to out-shakespeare-char
iter 229750: loss 6.2435, time 2900.06ms
iter 229760: loss 6.0721, time 124.57ms
iter 229770: loss 5.9237, time 125.44ms
iter 229780: loss 6.8273, time 125.52ms
iter 229790: loss 6.2537, time 125.03ms
iter 229800: loss 5.7005, time 124.50ms
iter 229810: loss 6.6298, time 125.11ms
iter 229820: loss 5.9841, time 125.12ms
iter 229830: loss 6.2034, time 125.59ms
iter 229840: loss 6.3567, time 125.34ms
iter 229850: loss 5.8000, time 127.21ms
iter 229860: loss 5.1721, time 125.07ms
iter 229870: loss 6.6760, time 125.20ms
iter 229880: loss 5.5668, time 125.20ms
iter 229890: loss 6.6262, time 125.39ms
iter 229900: loss 6.3838, time 125.41ms
iter 229910: loss 5.8211, time 125.31ms
iter 229920: loss 6.3857, time 125.74ms
iter 229930: loss 6.3697, time 125.20ms
iter 229940: loss 6.5473, time 125.13ms
iter 229950: loss 5.9900, time 125.02ms
iter 229960: loss 6.6517, time 128.06ms
iter 229970: loss 5.9245, time 125.33ms
iter 229980: loss 5.9565, time 128.00ms
iter 229990: loss 6.3708, time 125.99ms
step 230000: train loss 5.7646, val loss 5.8197
saving checkpoint to out-shakespeare-char
iter 230000: loss 5.9002, time 2899.87ms
iter 230010: loss 6.3191, time 123.39ms
iter 230020: loss 6.5791, time 124.59ms
iter 230030: loss 6.2786, time 124.92ms
iter 230040: loss 6.6530, time 125.67ms
iter 230050: loss 6.2360, time 125.46ms
iter 230060: loss 6.6699, time 124.34ms
iter 230070: loss 6.5224, time 125.04ms
iter 230080: loss 6.2352, time 125.20ms
iter 230090: loss 6.2133, time 125.30ms
iter 230100: loss 5.5219, time 125.22ms
iter 230110: loss 5.8815, time 125.44ms
iter 230120: loss 6.1179, time 125.17ms
iter 230130: loss 6.2708, time 128.05ms
iter 230140: loss 6.7586, time 125.79ms
iter 230150: loss 5.0171, time 125.35ms
iter 230160: loss 6.0213, time 124.90ms
iter 230170: loss 6.3396, time 129.06ms
iter 230180: loss 6.2659, time 127.55ms
iter 230190: loss 6.1425, time 125.46ms
iter 230200: loss 6.0413, time 125.10ms
iter 230210: loss 6.1252, time 124.87ms
iter 230220: loss 6.7191, time 124.43ms
iter 230230: loss 6.3534, time 125.30ms
iter 230240: loss 6.4529, time 127.67ms
step 230250: train loss 5.7314, val loss 5.7776
saving checkpoint to out-shakespeare-char
iter 230250: loss 5.9921, time 2903.35ms
iter 230260: loss 6.1678, time 125.01ms
iter 230270: loss 5.5321, time 126.58ms
iter 230280: loss 6.4475, time 125.00ms
iter 230290: loss 7.0086, time 125.58ms
iter 230300: loss 6.0210, time 125.54ms
iter 230310: loss 6.3458, time 125.20ms
iter 230320: loss 5.9248, time 125.56ms
iter 230330: loss 6.0897, time 125.89ms
iter 230340: loss 6.1303, time 125.87ms
iter 230350: loss 5.9635, time 125.77ms
iter 230360: loss 6.4069, time 125.55ms
iter 230370: loss 6.5438, time 126.11ms
iter 230380: loss 6.2313, time 126.14ms
iter 230390: loss 5.9522, time 128.59ms
iter 230400: loss 5.7203, time 125.37ms
iter 230410: loss 5.7832, time 125.39ms
iter 230420: loss 5.9808, time 125.57ms
iter 230430: loss 6.6830, time 127.75ms
iter 230440: loss 6.4141, time 126.03ms
iter 230450: loss 6.4525, time 125.48ms
iter 230460: loss 6.8496, time 126.09ms
iter 230470: loss 6.0951, time 126.03ms
iter 230480: loss 6.7847, time 126.29ms
iter 230490: loss 6.1943, time 125.64ms
step 230500: train loss 5.7600, val loss 5.7366
saving checkpoint to out-shakespeare-char
iter 230500: loss 6.2768, time 2869.08ms
iter 230510: loss 6.1205, time 125.61ms
iter 230520: loss 6.1597, time 124.98ms
iter 230530: loss 6.0133, time 128.71ms
iter 230540: loss 6.7063, time 125.74ms
iter 230550: loss 6.3368, time 125.04ms
iter 230560: loss 5.5200, time 125.82ms
iter 230570: loss 6.2292, time 125.95ms
iter 230580: loss 6.1720, time 125.41ms
iter 230590: loss 6.0343, time 125.79ms
iter 230600: loss 6.6689, time 125.09ms
iter 230610: loss 7.1197, time 125.53ms
iter 230620: loss 6.2360, time 125.94ms
iter 230630: loss 5.9172, time 125.80ms
iter 230640: loss 6.1045, time 128.10ms
iter 230650: loss 6.6753, time 125.56ms
iter 230660: loss 6.3448, time 125.61ms
iter 230670: loss 5.9745, time 125.56ms
iter 230680: loss 5.9551, time 124.97ms
iter 230690: loss 6.3932, time 124.90ms
iter 230700: loss 6.3930, time 124.92ms
iter 230710: loss 6.7516, time 125.10ms
iter 230720: loss 6.0251, time 125.18ms
iter 230730: loss 5.9756, time 128.07ms
iter 230740: loss 5.7374, time 126.14ms
step 230750: train loss 5.7475, val loss 5.7630
saving checkpoint to out-shakespeare-char
iter 230750: loss 5.9930, time 2880.83ms
iter 230760: loss 6.7953, time 123.38ms
iter 230770: loss 5.8997, time 121.83ms
iter 230780: loss 6.4052, time 123.17ms
iter 230790: loss 6.9040, time 122.06ms
iter 230800: loss 5.8239, time 123.43ms
iter 230810: loss 5.9431, time 121.77ms
iter 230820: loss 6.0521, time 122.54ms
iter 230830: loss 6.5129, time 121.69ms
iter 230840: loss 6.9699, time 122.97ms
iter 230850: loss 6.5651, time 121.54ms
iter 230860: loss 7.2778, time 123.49ms
iter 230870: loss 6.5670, time 122.29ms
iter 230880: loss 5.9725, time 123.14ms
iter 230890: loss 6.6452, time 120.96ms
iter 230900: loss 6.0424, time 123.42ms
iter 230910: loss 6.2435, time 121.63ms
iter 230920: loss 5.7501, time 123.44ms
iter 230930: loss 6.3651, time 121.74ms
iter 230940: loss 6.0232, time 123.00ms
iter 230950: loss 6.1175, time 122.66ms
iter 230960: loss 5.4370, time 123.09ms
iter 230970: loss 6.6574, time 121.94ms
iter 230980: loss 5.9007, time 123.02ms
iter 230990: loss 5.9316, time 121.47ms
step 231000: train loss 5.7816, val loss 5.7974
saving checkpoint to out-shakespeare-char
iter 231000: loss 7.0193, time 2901.52ms
iter 231010: loss 6.5826, time 125.59ms
iter 231020: loss 6.0692, time 125.29ms
iter 231030: loss 6.2338, time 125.05ms
iter 231040: loss 6.6055, time 125.35ms
iter 231050: loss 6.4292, time 121.60ms
iter 231060: loss 6.3014, time 121.56ms
iter 231070: loss 6.7853, time 120.64ms
iter 231080: loss 6.9470, time 120.89ms
iter 231090: loss 6.3504, time 121.57ms
iter 231100: loss 6.4309, time 121.62ms
iter 231110: loss 6.0756, time 121.47ms
iter 231120: loss 5.8025, time 121.85ms
iter 231130: loss 6.8354, time 121.39ms
iter 231140: loss 5.6837, time 121.10ms
iter 231150: loss 5.8040, time 121.57ms
iter 231160: loss 6.1362, time 121.16ms
iter 231170: loss 7.1021, time 121.59ms
iter 231180: loss 6.1408, time 121.92ms
iter 231190: loss 6.0682, time 121.60ms
iter 231200: loss 6.1851, time 121.78ms
iter 231210: loss 6.4199, time 121.61ms
iter 231220: loss 5.6487, time 121.49ms
iter 231230: loss 5.6213, time 121.46ms
iter 231240: loss 5.3322, time 121.17ms
step 231250: train loss 5.7629, val loss 5.7498
saving checkpoint to out-shakespeare-char
iter 231250: loss 5.5815, time 2900.19ms
iter 231260: loss 5.8850, time 121.50ms
iter 231270: loss 5.7104, time 122.30ms
iter 231280: loss 6.0024, time 121.74ms
iter 231290: loss 6.1518, time 122.59ms
iter 231300: loss 5.9736, time 121.63ms
iter 231310: loss 6.3031, time 122.93ms
iter 231320: loss 6.0720, time 121.67ms
iter 231330: loss 6.4153, time 122.74ms
iter 231340: loss 5.9007, time 120.58ms
iter 231350: loss 6.1900, time 122.55ms
iter 231360: loss 6.1768, time 121.84ms
iter 231370: loss 6.9203, time 122.91ms
iter 231380: loss 6.8108, time 121.58ms
iter 231390: loss 5.8218, time 122.67ms
iter 231400: loss 6.3078, time 120.90ms
iter 231410: loss 5.8056, time 122.61ms
iter 231420: loss 6.1655, time 121.93ms
iter 231430: loss 6.6167, time 123.10ms
iter 231440: loss 6.9911, time 122.07ms
iter 231450: loss 6.2308, time 121.23ms
iter 231460: loss 6.2951, time 120.27ms
iter 231470: loss 6.6936, time 121.90ms
iter 231480: loss 6.1590, time 119.96ms
iter 231490: loss 6.2309, time 121.26ms
step 231500: train loss 5.7657, val loss 5.7450
saving checkpoint to out-shakespeare-char
iter 231500: loss 5.8246, time 2900.97ms
iter 231510: loss 6.4783, time 123.19ms
iter 231520: loss 6.4595, time 120.01ms
iter 231530: loss 6.2832, time 123.00ms
iter 231540: loss 6.0869, time 120.97ms
iter 231550: loss 6.1807, time 123.67ms
iter 231560: loss 6.1714, time 120.71ms
iter 231570: loss 6.6622, time 122.69ms
iter 231580: loss 5.9303, time 120.11ms
iter 231590: loss 6.3170, time 123.54ms
iter 231600: loss 6.1198, time 120.45ms
iter 231610: loss 6.3572, time 123.82ms
iter 231620: loss 6.0439, time 119.94ms
iter 231630: loss 6.5867, time 122.86ms
iter 231640: loss 6.1834, time 119.88ms
iter 231650: loss 6.1500, time 122.79ms
iter 231660: loss 6.4462, time 120.25ms
iter 231670: loss 5.9203, time 122.84ms
iter 231680: loss 5.7744, time 120.03ms
iter 231690: loss 6.2524, time 122.83ms
iter 231700: loss 5.4058, time 119.95ms
iter 231710: loss 6.4105, time 122.94ms
iter 231720: loss 6.2904, time 120.98ms
iter 231730: loss 5.4607, time 122.77ms
iter 231740: loss 6.8931, time 119.67ms
step 231750: train loss 5.7650, val loss 5.7783
saving checkpoint to out-shakespeare-char
iter 231750: loss 6.0083, time 2900.69ms
iter 231760: loss 6.5655, time 121.16ms
iter 231770: loss 5.5816, time 121.13ms
iter 231780: loss 5.6667, time 121.56ms
iter 231790: loss 5.6178, time 119.67ms
iter 231800: loss 5.9510, time 121.34ms
iter 231810: loss 5.4605, time 121.72ms
iter 231820: loss 6.1698, time 121.38ms
iter 231830: loss 6.0035, time 120.81ms
iter 231840: loss 5.7696, time 121.98ms
iter 231850: loss 6.2420, time 121.70ms
iter 231860: loss 5.9403, time 121.79ms
iter 231870: loss 5.9771, time 121.86ms
iter 231880: loss 6.8229, time 121.86ms
iter 231890: loss 7.0538, time 121.83ms
iter 231900: loss 6.5968, time 121.85ms
iter 231910: loss 6.3180, time 121.67ms
iter 231920: loss 6.1899, time 121.91ms
iter 231930: loss 5.9744, time 121.89ms
iter 231940: loss 6.5255, time 121.64ms
iter 231950: loss 6.1552, time 121.56ms
iter 231960: loss 6.5325, time 121.59ms
iter 231970: loss 6.5255, time 121.77ms
iter 231980: loss 6.1764, time 121.65ms
iter 231990: loss 6.3499, time 121.91ms
step 232000: train loss 5.7761, val loss 5.7747
saving checkpoint to out-shakespeare-char
iter 232000: loss 6.9091, time 2887.92ms
iter 232010: loss 6.1623, time 120.15ms
iter 232020: loss 5.6463, time 119.71ms
iter 232030: loss 5.9157, time 119.70ms
iter 232040: loss 6.6010, time 120.94ms
iter 232050: loss 6.8295, time 119.78ms
iter 232060: loss 6.1118, time 120.17ms
iter 232070: loss 6.6117, time 119.83ms
iter 232080: loss 6.2762, time 120.64ms
iter 232090: loss 5.5055, time 120.99ms
iter 232100: loss 6.1553, time 119.94ms
iter 232110: loss 6.0651, time 119.88ms
iter 232120: loss 6.9939, time 119.35ms
iter 232130: loss 5.9097, time 121.55ms
iter 232140: loss 6.1588, time 121.54ms
iter 232150: loss 5.8579, time 121.79ms
iter 232160: loss 6.3490, time 122.06ms
iter 232170: loss 6.1103, time 121.50ms
iter 232180: loss 6.2970, time 121.57ms
iter 232190: loss 6.3057, time 121.46ms
iter 232200: loss 5.9124, time 121.69ms
iter 232210: loss 5.4731, time 121.32ms
iter 232220: loss 6.0456, time 121.60ms
iter 232230: loss 6.1622, time 121.48ms
iter 232240: loss 6.5013, time 121.50ms
step 232250: train loss 5.7977, val loss 5.7593
saving checkpoint to out-shakespeare-char
iter 232250: loss 6.3584, time 2904.67ms
iter 232260: loss 6.2169, time 121.77ms
iter 232270: loss 6.1842, time 121.61ms
iter 232280: loss 6.6393, time 121.44ms
iter 232290: loss 5.9175, time 121.72ms
iter 232300: loss 5.8462, time 121.58ms
iter 232310: loss 5.8756, time 120.99ms
iter 232320: loss 6.6691, time 121.46ms
iter 232330: loss 7.1147, time 120.79ms
iter 232340: loss 5.7511, time 121.43ms
iter 232350: loss 6.0068, time 121.64ms
iter 232360: loss 6.4765, time 122.01ms
iter 232370: loss 5.7166, time 121.68ms
iter 232380: loss 6.1115, time 121.32ms
iter 232390: loss 6.2903, time 121.68ms
iter 232400: loss 6.4994, time 121.51ms
iter 232410: loss 5.1216, time 121.62ms
iter 232420: loss 6.0883, time 121.50ms
iter 232430: loss 6.1090, time 121.48ms
iter 232440: loss 5.7959, time 121.52ms
iter 232450: loss 6.2243, time 121.73ms
iter 232460: loss 6.2563, time 121.41ms
iter 232470: loss 6.6124, time 121.59ms
iter 232480: loss 6.7032, time 120.54ms
iter 232490: loss 6.0772, time 121.50ms
step 232500: train loss 5.7596, val loss 5.7421
saving checkpoint to out-shakespeare-char
iter 232500: loss 6.0267, time 2899.31ms
iter 232510: loss 5.5651, time 121.70ms
iter 232520: loss 6.0113, time 121.58ms
iter 232530: loss 7.0938, time 121.69ms
iter 232540: loss 5.5335, time 122.00ms
iter 232550: loss 6.6366, time 121.29ms
iter 232560: loss 6.6770, time 121.97ms
iter 232570: loss 6.6218, time 121.50ms
iter 232580: loss 5.5191, time 121.69ms
iter 232590: loss 6.3887, time 121.34ms
iter 232600: loss 6.3977, time 121.65ms
iter 232610: loss 5.6473, time 121.45ms
iter 232620: loss 6.4006, time 121.67ms
iter 232630: loss 5.6297, time 121.61ms
iter 232640: loss 5.6866, time 121.62ms
iter 232650: loss 6.7472, time 120.64ms
iter 232660: loss 5.1286, time 121.51ms
iter 232670: loss 6.5585, time 121.54ms
iter 232680: loss 6.6453, time 122.07ms
iter 232690: loss 6.3361, time 121.43ms
iter 232700: loss 6.0698, time 121.54ms
iter 232710: loss 6.3400, time 121.63ms
iter 232720: loss 6.1561, time 122.14ms
iter 232730: loss 6.1818, time 122.19ms
iter 232740: loss 6.1276, time 121.73ms
step 232750: train loss 5.7702, val loss 5.7589
saving checkpoint to out-shakespeare-char
iter 232750: loss 6.5409, time 2900.19ms
iter 232760: loss 6.4616, time 121.80ms
iter 232770: loss 6.6592, time 123.08ms
iter 232780: loss 6.5173, time 120.18ms
iter 232790: loss 5.6498, time 122.92ms
iter 232800: loss 6.8394, time 121.69ms
iter 232810: loss 5.8223, time 123.58ms
iter 232820: loss 6.4483, time 121.64ms
iter 232830: loss 6.1308, time 122.98ms
iter 232840: loss 6.2252, time 121.61ms
iter 232850: loss 6.3095, time 122.96ms
iter 232860: loss 6.4864, time 121.79ms
iter 232870: loss 6.2548, time 123.24ms
iter 232880: loss 5.9687, time 121.47ms
iter 232890: loss 5.7252, time 122.99ms
iter 232900: loss 6.2495, time 121.56ms
iter 232910: loss 6.3482, time 122.95ms
iter 232920: loss 6.6637, time 121.51ms
iter 232930: loss 5.9489, time 123.20ms
iter 232940: loss 6.0115, time 121.96ms
iter 232950: loss 6.6277, time 122.85ms
iter 232960: loss 6.1134, time 121.62ms
iter 232970: loss 5.7194, time 122.34ms
iter 232980: loss 5.6229, time 120.92ms
iter 232990: loss 6.5768, time 122.34ms
step 233000: train loss 5.6961, val loss 5.7889
saving checkpoint to out-shakespeare-char
iter 233000: loss 6.7528, time 2904.68ms
iter 233010: loss 6.4626, time 121.53ms
iter 233020: loss 6.8298, time 121.60ms
iter 233030: loss 5.8207, time 121.77ms
iter 233040: loss 6.6430, time 121.71ms
iter 233050: loss 6.2874, time 121.77ms
iter 233060: loss 5.9026, time 121.48ms
iter 233070: loss 6.4955, time 123.25ms
iter 233080: loss 6.3804, time 121.30ms
iter 233090: loss 6.9314, time 121.58ms
iter 233100: loss 5.1709, time 121.61ms
iter 233110: loss 6.5383, time 121.52ms
iter 233120: loss 6.3443, time 121.28ms
iter 233130: loss 6.3684, time 120.79ms
iter 233140: loss 6.1548, time 121.43ms
iter 233150: loss 6.5347, time 121.47ms
iter 233160: loss 6.3402, time 121.38ms
iter 233170: loss 5.4787, time 121.36ms
iter 233180: loss 5.7526, time 121.65ms
iter 233190: loss 5.9259, time 121.80ms
iter 233200: loss 5.9753, time 121.70ms
iter 233210: loss 6.1818, time 121.89ms
iter 233220: loss 6.5830, time 121.68ms
iter 233230: loss 6.5950, time 121.51ms
iter 233240: loss 7.1284, time 121.40ms
step 233250: train loss 5.7464, val loss 5.7537
saving checkpoint to out-shakespeare-char
iter 233250: loss 6.4370, time 2894.78ms
iter 233260: loss 6.4744, time 124.70ms
iter 233270: loss 6.4441, time 121.52ms
iter 233280: loss 6.5471, time 124.49ms
iter 233290: loss 6.7158, time 121.82ms
iter 233300: loss 5.9751, time 124.56ms
iter 233310: loss 5.6501, time 121.45ms
iter 233320: loss 6.8549, time 124.65ms
iter 233330: loss 5.9134, time 121.43ms
iter 233340: loss 6.7079, time 124.53ms
iter 233350: loss 5.7551, time 121.50ms
iter 233360: loss 6.1030, time 124.72ms
iter 233370: loss 6.1141, time 121.54ms
iter 233380: loss 6.6234, time 125.00ms
iter 233390: loss 5.1347, time 121.53ms
iter 233400: loss 5.9904, time 124.66ms
iter 233410: loss 5.6827, time 121.37ms
iter 233420: loss 5.9189, time 124.58ms
iter 233430: loss 6.3037, time 121.46ms
iter 233440: loss 5.6613, time 124.59ms
iter 233450: loss 5.9676, time 121.42ms
iter 233460: loss 6.3811, time 124.81ms
iter 233470: loss 5.3519, time 121.54ms
iter 233480: loss 6.4840, time 124.54ms
iter 233490: loss 6.4956, time 122.02ms
step 233500: train loss 5.7254, val loss 5.7546
saving checkpoint to out-shakespeare-char
iter 233500: loss 5.8939, time 2906.41ms
iter 233510: loss 6.2254, time 123.55ms
iter 233520: loss 6.0188, time 123.06ms
iter 233530: loss 6.4817, time 122.94ms
iter 233540: loss 5.8998, time 121.55ms
iter 233550: loss 6.4550, time 123.19ms
iter 233560: loss 5.8611, time 122.07ms
iter 233570: loss 6.1772, time 122.75ms
iter 233580: loss 6.5872, time 121.47ms
iter 233590: loss 6.3472, time 123.31ms
iter 233600: loss 6.2779, time 121.30ms
iter 233610: loss 5.9366, time 123.06ms
iter 233620: loss 6.4815, time 122.84ms
iter 233630: loss 6.0070, time 122.90ms
iter 233640: loss 5.9885, time 121.61ms
iter 233650: loss 5.8265, time 122.93ms
iter 233660: loss 5.6964, time 121.81ms
iter 233670: loss 5.9432, time 123.02ms
iter 233680: loss 6.0207, time 121.49ms
iter 233690: loss 6.5485, time 122.53ms
iter 233700: loss 6.7189, time 121.11ms
iter 233710: loss 6.3010, time 123.03ms
iter 233720: loss 6.3580, time 121.59ms
iter 233730: loss 6.2250, time 122.77ms
iter 233740: loss 6.3375, time 121.87ms
step 233750: train loss 5.7434, val loss 5.7169
saving checkpoint to out-shakespeare-char
iter 233750: loss 6.6281, time 2898.44ms
iter 233760: loss 6.0800, time 121.72ms
iter 233770: loss 6.4113, time 121.94ms
iter 233780: loss 6.8442, time 121.78ms
iter 233790: loss 6.4911, time 121.71ms
iter 233800: loss 6.2306, time 122.08ms
iter 233810: loss 6.1279, time 121.57ms
iter 233820: loss 6.7062, time 121.91ms
iter 233830: loss 5.8312, time 121.65ms
iter 233840: loss 6.1140, time 121.68ms
iter 233850: loss 6.1250, time 121.58ms
iter 233860: loss 6.3366, time 121.73ms
iter 233870: loss 6.3510, time 121.42ms
iter 233880: loss 5.6550, time 121.37ms
iter 233890: loss 6.1385, time 121.54ms
iter 233900: loss 5.6209, time 121.71ms
iter 233910: loss 5.8332, time 121.72ms
iter 233920: loss 5.9309, time 121.88ms
iter 233930: loss 6.1144, time 120.98ms
iter 233940: loss 6.0020, time 121.93ms
iter 233950: loss 6.3732, time 121.46ms
iter 233960: loss 5.9375, time 121.29ms
iter 233970: loss 5.5033, time 121.56ms
iter 233980: loss 6.1683, time 121.85ms
iter 233990: loss 7.5520, time 121.69ms
step 234000: train loss 5.7708, val loss 5.7060
saving checkpoint to out-shakespeare-char
iter 234000: loss 6.3920, time 2906.31ms
iter 234010: loss 5.9799, time 121.66ms
iter 234020: loss 6.3765, time 121.57ms
iter 234030: loss 6.6935, time 122.36ms
iter 234040: loss 6.1619, time 121.49ms
iter 234050: loss 6.1600, time 121.53ms
iter 234060: loss 5.6758, time 122.24ms
iter 234070: loss 6.8616, time 124.29ms
iter 234080: loss 6.6315, time 121.98ms
iter 234090: loss 6.6103, time 124.81ms
iter 234100: loss 5.8552, time 121.63ms
iter 234110: loss 6.0655, time 124.70ms
iter 234120: loss 6.4056, time 121.55ms
iter 234130: loss 6.5541, time 124.59ms
iter 234140: loss 6.7261, time 121.60ms
iter 234150: loss 6.3016, time 125.38ms
iter 234160: loss 6.3839, time 121.53ms
iter 234170: loss 6.3789, time 124.68ms
iter 234180: loss 5.8142, time 121.55ms
iter 234190: loss 5.9754, time 124.56ms
iter 234200: loss 6.3307, time 121.49ms
iter 234210: loss 5.9837, time 124.63ms
iter 234220: loss 6.3718, time 121.79ms
iter 234230: loss 6.3004, time 124.54ms
iter 234240: loss 5.5725, time 121.51ms
step 234250: train loss 5.7504, val loss 5.6937
saving checkpoint to out-shakespeare-char
iter 234250: loss 6.0348, time 2902.73ms
iter 234260: loss 6.0168, time 121.31ms
iter 234270: loss 6.2370, time 121.25ms
iter 234280: loss 6.0823, time 121.91ms
iter 234290: loss 6.4314, time 121.59ms
iter 234300: loss 7.0587, time 121.40ms
iter 234310: loss 5.7953, time 121.55ms
iter 234320: loss 5.9984, time 121.58ms
iter 234330: loss 6.1570, time 121.43ms
iter 234340: loss 6.5747, time 121.87ms
iter 234350: loss 5.9798, time 121.49ms
iter 234360: loss 6.1691, time 121.03ms
iter 234370: loss 6.3912, time 121.13ms
iter 234380: loss 6.5082, time 121.75ms
iter 234390: loss 6.0737, time 121.72ms
iter 234400: loss 5.8599, time 121.70ms
iter 234410: loss 5.8823, time 121.05ms
iter 234420: loss 5.2584, time 121.63ms
iter 234430: loss 5.7938, time 121.46ms
iter 234440: loss 6.0323, time 121.72ms
iter 234450: loss 6.3405, time 121.78ms
iter 234460: loss 5.6467, time 121.81ms
iter 234470: loss 6.9389, time 121.76ms
iter 234480: loss 6.3388, time 121.76ms
iter 234490: loss 6.9690, time 121.58ms
step 234500: train loss 5.7118, val loss 5.7759
saving checkpoint to out-shakespeare-char
iter 234500: loss 6.5796, time 2902.24ms
iter 234510: loss 6.6295, time 125.91ms
iter 234520: loss 6.0617, time 129.01ms
iter 234530: loss 5.8792, time 125.45ms
iter 234540: loss 5.6726, time 125.67ms
iter 234550: loss 5.3476, time 125.88ms
iter 234560: loss 6.5208, time 125.71ms
iter 234570: loss 5.8950, time 125.66ms
iter 234580: loss 6.4404, time 126.32ms
iter 234590: loss 5.3138, time 125.78ms
iter 234600: loss 5.9701, time 125.87ms
iter 234610: loss 5.8908, time 125.59ms
iter 234620: loss 6.2087, time 125.82ms
iter 234630: loss 6.0108, time 128.52ms
iter 234640: loss 6.5471, time 125.67ms
iter 234650: loss 6.5136, time 125.54ms
iter 234660: loss 6.3828, time 126.37ms
iter 234670: loss 6.6188, time 124.93ms
iter 234680: loss 6.0263, time 125.59ms
iter 234690: loss 5.5485, time 125.66ms
iter 234700: loss 6.1509, time 124.77ms
iter 234710: loss 6.7727, time 127.15ms
iter 234720: loss 6.5979, time 126.60ms
iter 234730: loss 6.2169, time 125.64ms
iter 234740: loss 5.5945, time 128.67ms
step 234750: train loss 5.6368, val loss 5.7704
saving checkpoint to out-shakespeare-char
iter 234750: loss 6.0740, time 2882.00ms
iter 234760: loss 5.6591, time 125.17ms
iter 234770: loss 6.3905, time 121.72ms
iter 234780: loss 6.1200, time 124.16ms
iter 234790: loss 7.3199, time 121.60ms
iter 234800: loss 7.1679, time 124.50ms
iter 234810: loss 5.7985, time 121.55ms
iter 234820: loss 6.3135, time 124.77ms
iter 234830: loss 5.8876, time 121.54ms
iter 234840: loss 6.0932, time 124.45ms
iter 234850: loss 5.4327, time 122.04ms
iter 234860: loss 6.1040, time 125.08ms
iter 234870: loss 5.4159, time 121.60ms
iter 234880: loss 6.1692, time 124.69ms
iter 234890: loss 5.5081, time 121.64ms
iter 234900: loss 5.8602, time 124.63ms
iter 234910: loss 5.4969, time 121.75ms
iter 234920: loss 6.0100, time 124.71ms
iter 234930: loss 6.1339, time 121.16ms
iter 234940: loss 6.0704, time 124.68ms
iter 234950: loss 6.5502, time 121.60ms
iter 234960: loss 5.5940, time 124.73ms
iter 234970: loss 6.3490, time 121.64ms
iter 234980: loss 6.6023, time 124.59ms
iter 234990: loss 6.1427, time 121.58ms
step 235000: train loss 5.7440, val loss 5.7461
saving checkpoint to out-shakespeare-char
iter 235000: loss 5.9390, time 2893.79ms
iter 235010: loss 5.8103, time 121.59ms
iter 235020: loss 5.9384, time 120.93ms
iter 235030: loss 5.9605, time 121.84ms
iter 235040: loss 6.2551, time 121.64ms
iter 235050: loss 6.1556, time 122.83ms
iter 235060: loss 6.9422, time 121.77ms
iter 235070: loss 7.4972, time 121.83ms
iter 235080: loss 5.6264, time 122.28ms
iter 235090: loss 5.7478, time 121.70ms
iter 235100: loss 5.7235, time 121.63ms
iter 235110: loss 5.8458, time 121.90ms
iter 235120: loss 5.9514, time 121.73ms
iter 235130: loss 5.4571, time 121.59ms
iter 235140: loss 6.0692, time 120.65ms
iter 235150: loss 5.3550, time 121.80ms
iter 235160: loss 6.4971, time 121.73ms
iter 235170: loss 6.5228, time 122.10ms
iter 235180: loss 5.8575, time 120.78ms
iter 235190: loss 6.6958, time 120.91ms
iter 235200: loss 6.2914, time 121.64ms
iter 235210: loss 6.1542, time 121.78ms
iter 235220: loss 6.3096, time 121.65ms
iter 235230: loss 6.2911, time 121.82ms
iter 235240: loss 5.9105, time 121.70ms
step 235250: train loss 5.7094, val loss 5.7248
saving checkpoint to out-shakespeare-char
iter 235250: loss 6.5862, time 2911.91ms
iter 235260: loss 6.2502, time 121.50ms
iter 235270: loss 5.8036, time 121.72ms
iter 235280: loss 7.4190, time 121.48ms
iter 235290: loss 5.7533, time 121.63ms
iter 235300: loss 5.9421, time 120.85ms
iter 235310: loss 5.9687, time 120.41ms
iter 235320: loss 5.9434, time 121.65ms
iter 235330: loss 6.0233, time 120.90ms
iter 235340: loss 6.4608, time 121.50ms
iter 235350: loss 6.9848, time 121.76ms
iter 235360: loss 6.4114, time 121.54ms
iter 235370: loss 6.0849, time 121.70ms
iter 235380: loss 5.5542, time 121.42ms
iter 235390: loss 5.4603, time 121.30ms
iter 235400: loss 5.3098, time 121.43ms
iter 235410: loss 5.4681, time 121.64ms
iter 235420: loss 6.4793, time 121.48ms
iter 235430: loss 6.1102, time 121.68ms
iter 235440: loss 6.0025, time 121.82ms
iter 235450: loss 4.9707, time 121.46ms
iter 235460: loss 5.6455, time 120.79ms
iter 235470: loss 6.2363, time 121.42ms
iter 235480: loss 5.7340, time 121.42ms
iter 235490: loss 6.5689, time 121.76ms
step 235500: train loss 5.7718, val loss 5.7324
saving checkpoint to out-shakespeare-char
iter 235500: loss 6.1815, time 2909.90ms
iter 235510: loss 5.5055, time 125.78ms
iter 235520: loss 5.8585, time 127.02ms
iter 235530: loss 5.7466, time 126.25ms
iter 235540: loss 6.6535, time 128.60ms
iter 235550: loss 6.1778, time 125.84ms
iter 235560: loss 6.4035, time 126.32ms
iter 235570: loss 5.9413, time 125.34ms
iter 235580: loss 6.5961, time 126.02ms
iter 235590: loss 6.9657, time 125.67ms
iter 235600: loss 6.0887, time 125.78ms
iter 235610: loss 6.4498, time 126.33ms
iter 235620: loss 5.8359, time 125.58ms
iter 235630: loss 6.3885, time 125.56ms
iter 235640: loss 7.1713, time 125.80ms
iter 235650: loss 6.7169, time 125.66ms
iter 235660: loss 5.8211, time 126.47ms
iter 235670: loss 6.7587, time 125.55ms
iter 235680: loss 6.2833, time 126.04ms
iter 235690: loss 6.2432, time 126.03ms
iter 235700: loss 5.9618, time 125.24ms
iter 235710: loss 6.5022, time 125.81ms
iter 235720: loss 6.2351, time 125.07ms
iter 235730: loss 6.9825, time 125.71ms
iter 235740: loss 5.5498, time 125.90ms
step 235750: train loss 5.7477, val loss 5.7318
saving checkpoint to out-shakespeare-char
iter 235750: loss 6.7558, time 2870.51ms
iter 235760: loss 6.0750, time 124.88ms
iter 235770: loss 5.6774, time 121.76ms
iter 235780: loss 6.9744, time 124.75ms
iter 235790: loss 7.2372, time 121.84ms
iter 235800: loss 6.2953, time 124.74ms
iter 235810: loss 6.1982, time 122.82ms
iter 235820: loss 6.6787, time 124.91ms
iter 235830: loss 6.3563, time 121.80ms
iter 235840: loss 5.8244, time 124.88ms
iter 235850: loss 6.5171, time 121.58ms
iter 235860: loss 5.9630, time 124.36ms
iter 235870: loss 5.4620, time 122.18ms
iter 235880: loss 5.4617, time 124.51ms
iter 235890: loss 6.0835, time 121.86ms
iter 235900: loss 5.7871, time 124.73ms
iter 235910: loss 6.6318, time 121.72ms
iter 235920: loss 6.6834, time 124.77ms
iter 235930: loss 6.0996, time 121.61ms
iter 235940: loss 6.0134, time 124.53ms
iter 235950: loss 6.1926, time 121.80ms
iter 235960: loss 5.8373, time 124.98ms
iter 235970: loss 6.6322, time 121.64ms
iter 235980: loss 6.7954, time 124.62ms
iter 235990: loss 6.3741, time 121.70ms
step 236000: train loss 5.6849, val loss 5.7550
saving checkpoint to out-shakespeare-char
iter 236000: loss 6.0268, time 2912.31ms
iter 236010: loss 6.4017, time 127.50ms
iter 236020: loss 6.1573, time 126.07ms
iter 236030: loss 6.1533, time 125.89ms
iter 236040: loss 6.3700, time 125.61ms
iter 236050: loss 6.3443, time 125.80ms
iter 236060: loss 6.3445, time 125.53ms
iter 236070: loss 5.7576, time 125.89ms
iter 236080: loss 6.0261, time 125.75ms
iter 236090: loss 6.9763, time 124.88ms
iter 236100: loss 6.1141, time 125.16ms
iter 236110: loss 5.5733, time 125.54ms
iter 236120: loss 6.0970, time 125.58ms
iter 236130: loss 6.4133, time 125.58ms
iter 236140: loss 5.8689, time 128.71ms
iter 236150: loss 5.7545, time 125.70ms
iter 236160: loss 6.8291, time 126.31ms
iter 236170: loss 5.7727, time 125.83ms
iter 236180: loss 5.8916, time 130.09ms
iter 236190: loss 6.4638, time 125.82ms
iter 236200: loss 6.1352, time 125.78ms
iter 236210: loss 6.0713, time 126.17ms
iter 236220: loss 5.9495, time 125.84ms
iter 236230: loss 6.6550, time 125.45ms
iter 236240: loss 6.6646, time 125.49ms
step 236250: train loss 5.7682, val loss 5.7227
saving checkpoint to out-shakespeare-char
iter 236250: loss 6.0973, time 2910.50ms
iter 236260: loss 6.2336, time 121.95ms
iter 236270: loss 6.1474, time 121.70ms
iter 236280: loss 6.0050, time 121.60ms
iter 236290: loss 5.5825, time 122.88ms
iter 236300: loss 6.2147, time 121.86ms
iter 236310: loss 6.7616, time 122.43ms
iter 236320: loss 7.0096, time 120.83ms
iter 236330: loss 6.6076, time 120.80ms
iter 236340: loss 6.2940, time 121.72ms
iter 236350: loss 6.7719, time 121.83ms
iter 236360: loss 6.3211, time 120.70ms
iter 236370: loss 6.1465, time 121.65ms
iter 236380: loss 6.0351, time 121.83ms
iter 236390: loss 5.4553, time 121.43ms
iter 236400: loss 6.0033, time 120.89ms
iter 236410: loss 6.2827, time 121.68ms
iter 236420: loss 5.8586, time 122.54ms
iter 236430: loss 6.5025, time 121.99ms
iter 236440: loss 6.4089, time 122.07ms
iter 236450: loss 6.3446, time 121.97ms
iter 236460: loss 5.9208, time 121.60ms
iter 236470: loss 6.9690, time 121.59ms
iter 236480: loss 5.7002, time 121.59ms
iter 236490: loss 6.0145, time 121.91ms
step 236500: train loss 5.7435, val loss 5.7286
saving checkpoint to out-shakespeare-char
iter 236500: loss 5.6702, time 2890.02ms
iter 236510: loss 6.2077, time 122.92ms
iter 236520: loss 5.7121, time 122.05ms
iter 236530: loss 6.3140, time 122.90ms
iter 236540: loss 6.3926, time 122.23ms
iter 236550: loss 6.9896, time 123.25ms
iter 236560: loss 6.6425, time 122.16ms
iter 236570: loss 6.2413, time 124.16ms
iter 236580: loss 6.2148, time 121.94ms
iter 236590: loss 6.1152, time 123.71ms
iter 236600: loss 5.5774, time 121.09ms
iter 236610: loss 6.3702, time 123.29ms
iter 236620: loss 5.2505, time 121.19ms
iter 236630: loss 6.0527, time 122.73ms
iter 236640: loss 5.9074, time 119.85ms
iter 236650: loss 5.7522, time 121.45ms
iter 236660: loss 6.2376, time 120.92ms
iter 236670: loss 5.9892, time 121.22ms
iter 236680: loss 5.5009, time 120.08ms
iter 236690: loss 5.6477, time 121.17ms
iter 236700: loss 6.1570, time 120.08ms
iter 236710: loss 5.7965, time 123.32ms
iter 236720: loss 6.5789, time 121.97ms
iter 236730: loss 5.5047, time 122.86ms
iter 236740: loss 6.2664, time 122.08ms
step 236750: train loss 5.6822, val loss 5.7625
saving checkpoint to out-shakespeare-char
iter 236750: loss 6.4119, time 2890.27ms
iter 236760: loss 5.3581, time 122.98ms
iter 236770: loss 5.5341, time 122.04ms
iter 236780: loss 6.1094, time 121.75ms
iter 236790: loss 5.7469, time 122.07ms
iter 236800: loss 6.6067, time 123.48ms
iter 236810: loss 5.8386, time 121.77ms
iter 236820: loss 6.7228, time 123.43ms
iter 236830: loss 5.9064, time 122.08ms
iter 236840: loss 6.8171, time 122.74ms
iter 236850: loss 6.3433, time 121.88ms
iter 236860: loss 6.2334, time 123.02ms
iter 236870: loss 5.9382, time 122.33ms
iter 236880: loss 6.3200, time 123.04ms
iter 236890: loss 5.9839, time 122.87ms
iter 236900: loss 5.6006, time 122.97ms
iter 236910: loss 6.1764, time 122.03ms
iter 236920: loss 5.8444, time 122.02ms
iter 236930: loss 4.8653, time 121.88ms
iter 236940: loss 5.8240, time 123.33ms
iter 236950: loss 6.9085, time 119.53ms
iter 236960: loss 6.2251, time 122.94ms
iter 236970: loss 6.6970, time 121.99ms
iter 236980: loss 6.4029, time 122.79ms
iter 236990: loss 6.8068, time 122.08ms
step 237000: train loss 5.7590, val loss 5.7238
saving checkpoint to out-shakespeare-char
iter 237000: loss 6.5593, time 2882.82ms
iter 237010: loss 5.7492, time 120.84ms
iter 237020: loss 6.4541, time 122.33ms
iter 237030: loss 5.9572, time 121.79ms
iter 237040: loss 5.4548, time 121.63ms
iter 237050: loss 5.4811, time 122.53ms
iter 237060: loss 5.9727, time 121.93ms
iter 237070: loss 6.0708, time 121.90ms
iter 237080: loss 6.2755, time 121.94ms
iter 237090: loss 6.2580, time 121.51ms
iter 237100: loss 6.0421, time 121.26ms
iter 237110: loss 6.5763, time 121.98ms
iter 237120: loss 6.0934, time 121.96ms
iter 237130: loss 6.2088, time 122.62ms
iter 237140: loss 5.9625, time 121.95ms
iter 237150: loss 6.4009, time 122.04ms
iter 237160: loss 5.9088, time 121.86ms
iter 237170: loss 5.9603, time 121.77ms
iter 237180: loss 5.4059, time 121.73ms
iter 237190: loss 5.8687, time 121.85ms
iter 237200: loss 6.3922, time 121.72ms
iter 237210: loss 6.5417, time 121.88ms
iter 237220: loss 6.3467, time 121.91ms
iter 237230: loss 6.5488, time 121.78ms
iter 237240: loss 6.4593, time 121.07ms
step 237250: train loss 5.7457, val loss 5.7623
saving checkpoint to out-shakespeare-char
iter 237250: loss 5.8424, time 2885.56ms
iter 237260: loss 5.5831, time 121.79ms
iter 237270: loss 5.5511, time 124.32ms
iter 237280: loss 5.9505, time 121.81ms
iter 237290: loss 6.2122, time 121.84ms
iter 237300: loss 6.6770, time 121.69ms
iter 237310: loss 6.5575, time 123.30ms
iter 237320: loss 5.6451, time 121.57ms
iter 237330: loss 5.8833, time 122.31ms
iter 237340: loss 5.7524, time 121.86ms
iter 237350: loss 6.8532, time 123.33ms
iter 237360: loss 6.2614, time 121.98ms
iter 237370: loss 6.2028, time 121.77ms
iter 237380: loss 5.7874, time 121.97ms
iter 237390: loss 6.5608, time 122.12ms
iter 237400: loss 5.7519, time 121.01ms
iter 237410: loss 6.6689, time 123.32ms
iter 237420: loss 5.8514, time 121.28ms
iter 237430: loss 6.1847, time 123.56ms
iter 237440: loss 5.9104, time 122.27ms
iter 237450: loss 6.7096, time 123.37ms
iter 237460: loss 6.0296, time 122.08ms
iter 237470: loss 5.9760, time 123.19ms
iter 237480: loss 6.4902, time 120.49ms
iter 237490: loss 6.3006, time 121.55ms
step 237500: train loss 5.7284, val loss 5.7237
saving checkpoint to out-shakespeare-char
iter 237500: loss 6.2128, time 2890.13ms
iter 237510: loss 5.8030, time 120.92ms
iter 237520: loss 6.6828, time 121.75ms
iter 237530: loss 5.9433, time 121.55ms
iter 237540: loss 5.7690, time 121.64ms
iter 237550: loss 6.0349, time 121.63ms
iter 237560: loss 5.7476, time 121.58ms
iter 237570: loss 6.4590, time 121.59ms
iter 237580: loss 5.8871, time 121.44ms
iter 237590: loss 6.6674, time 121.49ms
iter 237600: loss 6.2949, time 121.20ms
iter 237610: loss 6.7480, time 120.33ms
iter 237620: loss 6.0941, time 120.47ms
iter 237630: loss 6.3666, time 122.32ms
iter 237640: loss 5.7551, time 121.38ms
iter 237650: loss 6.6371, time 121.03ms
iter 237660: loss 5.7123, time 121.01ms
iter 237670: loss 6.5136, time 121.59ms
iter 237680: loss 6.0851, time 120.93ms
iter 237690: loss 6.7413, time 121.59ms
iter 237700: loss 6.0142, time 120.85ms
iter 237710: loss 6.6128, time 121.66ms
iter 237720: loss 6.3744, time 121.60ms
iter 237730: loss 6.2986, time 119.93ms
iter 237740: loss 5.7220, time 120.34ms
step 237750: train loss 5.7547, val loss 5.8108
saving checkpoint to out-shakespeare-char
iter 237750: loss 6.1951, time 2886.15ms
iter 237760: loss 5.8795, time 121.40ms
iter 237770: loss 6.0509, time 121.20ms
iter 237780: loss 6.3997, time 121.63ms
iter 237790: loss 6.4490, time 121.61ms
iter 237800: loss 6.7582, time 121.65ms
iter 237810: loss 7.1602, time 121.58ms
iter 237820: loss 6.3714, time 121.61ms
iter 237830: loss 6.3889, time 121.60ms
iter 237840: loss 5.9507, time 122.03ms
iter 237850: loss 6.5475, time 121.63ms
iter 237860: loss 5.7827, time 122.01ms
iter 237870: loss 6.3386, time 121.77ms
iter 237880: loss 6.2795, time 122.74ms
iter 237890: loss 6.1815, time 122.56ms
iter 237900: loss 6.5983, time 122.36ms
iter 237910: loss 6.5763, time 122.04ms
iter 237920: loss 6.5325, time 122.92ms
iter 237930: loss 5.1814, time 122.53ms
iter 237940: loss 6.3297, time 122.61ms
iter 237950: loss 6.1920, time 121.72ms
iter 237960: loss 6.0663, time 122.94ms
iter 237970: loss 6.7108, time 122.44ms
iter 237980: loss 6.6433, time 123.23ms
iter 237990: loss 5.7109, time 121.72ms
step 238000: train loss 5.7362, val loss 5.7485
saving checkpoint to out-shakespeare-char
iter 238000: loss 6.8648, time 2908.63ms
iter 238010: loss 6.0904, time 125.79ms
iter 238020: loss 6.0119, time 126.17ms
iter 238030: loss 6.1823, time 125.98ms
iter 238040: loss 6.0348, time 128.57ms
iter 238050: loss 5.8578, time 126.17ms
iter 238060: loss 5.5237, time 124.54ms
iter 238070: loss 6.4902, time 126.24ms
iter 238080: loss 6.0590, time 126.33ms
iter 238090: loss 6.5213, time 122.11ms
iter 238100: loss 5.9473, time 124.11ms
iter 238110: loss 6.2900, time 121.25ms
iter 238120: loss 6.7877, time 125.09ms
iter 238130: loss 5.9807, time 122.03ms
iter 238140: loss 6.3821, time 124.84ms
iter 238150: loss 6.2297, time 122.01ms
iter 238160: loss 6.9476, time 125.53ms
iter 238170: loss 6.3299, time 121.36ms
iter 238180: loss 6.3016, time 123.92ms
iter 238190: loss 5.6579, time 122.04ms
iter 238200: loss 6.1783, time 125.16ms
iter 238210: loss 6.1578, time 122.91ms
iter 238220: loss 6.0201, time 124.74ms
iter 238230: loss 6.0710, time 122.47ms
iter 238240: loss 6.3768, time 124.63ms
step 238250: train loss 5.7019, val loss 5.7411
saving checkpoint to out-shakespeare-char
iter 238250: loss 6.7582, time 2892.05ms
iter 238260: loss 5.6307, time 120.69ms
iter 238270: loss 6.1683, time 120.52ms
iter 238280: loss 6.3147, time 120.98ms
iter 238290: loss 6.3835, time 121.29ms
iter 238300: loss 6.4277, time 121.44ms
iter 238310: loss 6.5571, time 121.65ms
iter 238320: loss 6.9776, time 122.06ms
iter 238330: loss 6.0639, time 121.68ms
iter 238340: loss 6.0901, time 121.05ms
iter 238350: loss 6.7455, time 122.30ms
iter 238360: loss 5.6345, time 121.72ms
iter 238370: loss 5.3785, time 121.76ms
iter 238380: loss 6.1019, time 122.03ms
iter 238390: loss 6.1702, time 122.36ms
iter 238400: loss 6.6023, time 121.81ms
iter 238410: loss 5.5254, time 122.73ms
iter 238420: loss 5.6496, time 121.90ms
iter 238430: loss 5.5129, time 122.13ms
iter 238440: loss 6.2432, time 121.42ms
iter 238450: loss 5.2501, time 122.69ms
iter 238460: loss 5.8382, time 119.67ms
iter 238470: loss 6.3721, time 120.73ms
iter 238480: loss 6.5836, time 121.21ms
iter 238490: loss 6.0317, time 122.86ms
step 238500: train loss 5.7160, val loss 5.7539
saving checkpoint to out-shakespeare-char
iter 238500: loss 5.8320, time 2891.25ms
iter 238510: loss 5.8560, time 119.34ms
iter 238520: loss 6.1212, time 121.58ms
iter 238530: loss 5.5729, time 121.37ms
iter 238540: loss 5.7876, time 122.08ms
iter 238550: loss 6.3834, time 121.11ms
iter 238560: loss 6.2159, time 122.23ms
iter 238570: loss 6.2056, time 122.02ms
iter 238580: loss 6.3542, time 120.35ms
iter 238590: loss 6.3392, time 120.55ms
iter 238600: loss 5.5455, time 121.31ms
iter 238610: loss 6.6334, time 121.27ms
iter 238620: loss 6.0757, time 121.26ms
iter 238630: loss 6.0822, time 120.44ms
iter 238640: loss 6.2688, time 121.30ms
iter 238650: loss 6.6681, time 119.79ms
iter 238660: loss 6.8373, time 121.62ms
iter 238670: loss 6.4938, time 121.39ms
iter 238680: loss 6.4579, time 118.70ms
iter 238690: loss 6.2644, time 121.23ms
iter 238700: loss 5.9643, time 120.90ms
iter 238710: loss 7.0238, time 119.30ms
iter 238720: loss 7.1150, time 120.88ms
iter 238730: loss 6.2804, time 120.25ms
iter 238740: loss 6.4178, time 120.00ms
step 238750: train loss 5.7376, val loss 5.6746
saving checkpoint to out-shakespeare-char
iter 238750: loss 6.0086, time 2883.56ms
iter 238760: loss 6.4932, time 123.19ms
iter 238770: loss 6.8801, time 122.01ms
iter 238780: loss 6.2917, time 122.92ms
iter 238790: loss 6.3100, time 122.01ms
iter 238800: loss 5.6339, time 123.12ms
iter 238810: loss 6.1086, time 121.99ms
iter 238820: loss 6.7227, time 123.09ms
iter 238830: loss 6.1399, time 121.92ms
iter 238840: loss 6.9124, time 123.40ms
iter 238850: loss 6.6447, time 121.90ms
iter 238860: loss 6.3109, time 123.20ms
iter 238870: loss 6.6516, time 121.87ms
iter 238880: loss 4.9847, time 123.01ms
iter 238890: loss 5.7313, time 120.79ms
iter 238900: loss 5.5378, time 123.45ms
iter 238910: loss 5.6754, time 121.99ms
iter 238920: loss 6.5609, time 123.19ms
iter 238930: loss 6.5203, time 121.90ms
iter 238940: loss 5.8780, time 123.69ms
iter 238950: loss 5.7857, time 122.19ms
iter 238960: loss 5.5701, time 123.12ms
iter 238970: loss 6.0062, time 122.38ms
iter 238980: loss 5.9517, time 123.12ms
iter 238990: loss 5.8773, time 122.05ms
step 239000: train loss 5.7142, val loss 5.7318
saving checkpoint to out-shakespeare-char
iter 239000: loss 5.8930, time 2893.93ms
iter 239010: loss 6.0512, time 121.65ms
iter 239020: loss 6.1758, time 124.59ms
iter 239030: loss 6.0442, time 120.82ms
iter 239040: loss 6.1762, time 124.42ms
iter 239050: loss 6.6440, time 121.59ms
iter 239060: loss 5.8634, time 124.40ms
iter 239070: loss 6.3207, time 121.64ms
iter 239080: loss 6.1909, time 123.89ms
iter 239090: loss 5.8369, time 121.19ms
iter 239100: loss 5.8741, time 124.53ms
iter 239110: loss 6.0045, time 121.54ms
iter 239120: loss 6.7280, time 123.54ms
iter 239130: loss 6.1354, time 120.73ms
iter 239140: loss 6.4503, time 124.44ms
iter 239150: loss 6.2687, time 121.23ms
iter 239160: loss 6.2279, time 124.38ms
iter 239170: loss 5.8514, time 121.52ms
iter 239180: loss 5.8552, time 124.61ms
iter 239190: loss 6.1926, time 122.06ms
iter 239200: loss 6.3255, time 124.22ms
iter 239210: loss 6.9429, time 121.11ms
iter 239220: loss 6.2540, time 123.27ms
iter 239230: loss 6.3064, time 121.34ms
iter 239240: loss 6.0366, time 124.19ms
step 239250: train loss 5.7661, val loss 5.7541
saving checkpoint to out-shakespeare-char
iter 239250: loss 6.5024, time 2893.05ms
iter 239260: loss 6.3572, time 121.69ms
iter 239270: loss 5.9383, time 124.21ms
iter 239280: loss 5.7222, time 122.09ms
iter 239290: loss 6.0984, time 124.82ms
iter 239300: loss 5.7152, time 121.52ms
iter 239310: loss 6.4832, time 124.26ms
iter 239320: loss 6.5883, time 120.38ms
iter 239330: loss 6.1045, time 123.41ms
iter 239340: loss 6.4620, time 121.77ms
iter 239350: loss 6.5447, time 124.57ms
iter 239360: loss 5.7889, time 121.49ms
iter 239370: loss 7.0195, time 124.91ms
iter 239380: loss 6.7885, time 121.67ms
iter 239390: loss 6.3564, time 123.74ms
iter 239400: loss 6.5099, time 121.15ms
iter 239410: loss 6.1518, time 123.90ms
iter 239420: loss 6.1765, time 121.06ms
iter 239430: loss 6.4405, time 121.71ms
iter 239440: loss 5.9056, time 120.91ms
iter 239450: loss 6.0001, time 121.67ms
iter 239460: loss 6.4698, time 121.38ms
iter 239470: loss 6.0283, time 121.10ms
iter 239480: loss 6.5361, time 121.50ms
iter 239490: loss 5.7246, time 121.57ms
step 239500: train loss 5.6785, val loss 5.7309
saving checkpoint to out-shakespeare-char
iter 239500: loss 5.9662, time 2903.66ms
iter 239510: loss 5.3237, time 121.61ms
iter 239520: loss 5.7767, time 121.84ms
iter 239530: loss 6.4238, time 121.74ms
iter 239540: loss 6.1174, time 122.55ms
iter 239550: loss 6.0595, time 121.51ms
iter 239560: loss 6.2981, time 121.61ms
iter 239570: loss 6.4271, time 121.90ms
iter 239580: loss 5.7775, time 122.65ms
iter 239590: loss 5.6779, time 121.56ms
iter 239600: loss 5.8533, time 122.67ms
iter 239610: loss 6.4234, time 121.55ms
iter 239620: loss 6.2248, time 123.15ms
iter 239630: loss 5.8833, time 121.60ms
iter 239640: loss 6.6528, time 122.65ms
iter 239650: loss 6.0001, time 122.13ms
iter 239660: loss 6.0172, time 122.61ms
iter 239670: loss 6.1585, time 121.42ms
iter 239680: loss 5.8555, time 122.48ms
iter 239690: loss 6.0160, time 121.87ms
iter 239700: loss 6.1255, time 123.09ms
iter 239710: loss 6.2263, time 121.89ms
iter 239720: loss 6.6263, time 122.52ms
iter 239730: loss 6.8007, time 121.55ms
iter 239740: loss 6.5659, time 121.31ms
step 239750: train loss 5.7455, val loss 5.7404
saving checkpoint to out-shakespeare-char
iter 239750: loss 6.6215, time 2872.71ms
iter 239760: loss 6.7911, time 120.63ms
iter 239770: loss 6.3506, time 120.13ms
iter 239780: loss 6.4792, time 120.91ms
iter 239790: loss 6.7203, time 121.44ms
iter 239800: loss 6.4565, time 120.89ms
iter 239810: loss 6.5396, time 121.38ms
iter 239820: loss 6.8832, time 121.42ms
iter 239830: loss 6.0044, time 121.20ms
iter 239840: loss 5.5695, time 121.31ms
iter 239850: loss 5.7267, time 121.20ms
iter 239860: loss 5.7498, time 121.38ms
iter 239870: loss 5.8902, time 121.00ms
iter 239880: loss 5.9656, time 121.39ms
iter 239890: loss 6.1808, time 121.02ms
iter 239900: loss 6.0095, time 120.69ms
iter 239910: loss 5.9270, time 119.87ms
iter 239920: loss 6.0728, time 120.74ms
iter 239930: loss 6.8340, time 120.98ms
iter 239940: loss 5.9629, time 121.55ms
iter 239950: loss 5.7259, time 121.31ms
iter 239960: loss 5.7314, time 121.28ms
iter 239970: loss 6.2399, time 121.22ms
iter 239980: loss 5.9689, time 121.87ms
iter 239990: loss 6.5116, time 121.96ms
step 240000: train loss 5.7276, val loss 5.7658
saving checkpoint to out-shakespeare-char
iter 240000: loss 6.1424, time 2889.71ms
iter 240010: loss 6.1811, time 125.73ms
iter 240020: loss 5.7298, time 124.10ms
iter 240030: loss 5.8655, time 124.34ms
iter 240040: loss 6.1349, time 125.66ms
iter 240050: loss 6.7952, time 125.23ms
iter 240060: loss 6.7318, time 124.76ms
iter 240070: loss 5.6955, time 125.69ms
iter 240080: loss 6.5079, time 125.20ms
iter 240090: loss 6.7254, time 123.99ms
iter 240100: loss 6.6249, time 127.91ms
iter 240110: loss 6.2199, time 124.83ms
iter 240120: loss 5.8054, time 125.31ms
iter 240130: loss 6.6860, time 124.51ms
iter 240140: loss 6.0059, time 124.38ms
iter 240150: loss 5.6576, time 125.40ms
iter 240160: loss 6.0156, time 124.78ms
iter 240170: loss 5.7174, time 123.78ms
iter 240180: loss 5.9874, time 125.03ms
iter 240190: loss 6.3353, time 125.14ms
iter 240200: loss 5.4340, time 124.22ms
iter 240210: loss 6.2081, time 127.13ms
iter 240220: loss 6.1273, time 125.05ms
iter 240230: loss 5.2021, time 125.32ms
iter 240240: loss 6.7805, time 124.82ms
step 240250: train loss 5.7360, val loss 5.7260
saving checkpoint to out-shakespeare-char
iter 240250: loss 6.4603, time 2890.99ms
iter 240260: loss 5.8670, time 125.35ms
iter 240270: loss 5.5395, time 125.36ms
iter 240280: loss 5.3151, time 125.78ms
iter 240290: loss 6.3329, time 125.65ms
iter 240300: loss 6.1548, time 125.71ms
iter 240310: loss 6.2022, time 125.83ms
iter 240320: loss 5.6742, time 125.75ms
iter 240330: loss 5.7398, time 125.88ms
iter 240340: loss 5.5989, time 125.45ms
iter 240350: loss 5.3739, time 125.90ms
iter 240360: loss 6.2133, time 125.58ms
iter 240370: loss 5.5200, time 125.93ms
iter 240380: loss 6.2130, time 125.83ms
iter 240390: loss 7.2461, time 128.40ms
iter 240400: loss 5.6892, time 125.48ms
iter 240410: loss 6.6867, time 125.49ms
iter 240420: loss 5.6183, time 125.69ms
iter 240430: loss 6.3743, time 125.75ms
iter 240440: loss 5.9952, time 125.88ms
iter 240450: loss 5.8350, time 126.22ms
iter 240460: loss 6.3111, time 125.93ms
iter 240470: loss 6.1041, time 125.77ms
iter 240480: loss 6.3157, time 125.54ms
iter 240490: loss 6.6642, time 125.66ms
step 240500: train loss 5.7369, val loss 5.7382
saving checkpoint to out-shakespeare-char
iter 240500: loss 6.5991, time 2894.48ms
iter 240510: loss 6.4645, time 126.27ms
iter 240520: loss 6.6886, time 125.93ms
iter 240530: loss 7.0046, time 126.00ms
iter 240540: loss 5.9409, time 125.37ms
iter 240550: loss 6.2763, time 125.97ms
iter 240560: loss 6.5413, time 128.45ms
iter 240570: loss 5.9245, time 125.14ms
iter 240580: loss 5.4679, time 125.67ms
iter 240590: loss 6.8104, time 125.73ms
iter 240600: loss 5.7495, time 125.73ms
iter 240610: loss 5.8772, time 125.78ms
iter 240620: loss 6.3719, time 125.52ms
iter 240630: loss 6.0675, time 125.37ms
iter 240640: loss 6.4805, time 126.09ms
iter 240650: loss 6.8938, time 125.76ms
iter 240660: loss 6.6696, time 125.62ms
iter 240670: loss 5.7977, time 128.61ms
iter 240680: loss 6.1836, time 125.79ms
iter 240690: loss 6.6174, time 125.78ms
iter 240700: loss 6.5287, time 126.14ms
iter 240710: loss 5.8861, time 125.96ms
iter 240720: loss 6.8468, time 125.80ms
iter 240730: loss 6.3624, time 125.49ms
iter 240740: loss 6.4361, time 123.02ms
step 240750: train loss 5.7346, val loss 5.7087
saving checkpoint to out-shakespeare-char
iter 240750: loss 5.6547, time 2847.85ms
iter 240760: loss 5.5954, time 122.68ms
iter 240770: loss 5.8113, time 119.84ms
iter 240780: loss 6.1834, time 123.14ms
iter 240790: loss 6.6332, time 120.07ms
iter 240800: loss 6.4852, time 123.30ms
iter 240810: loss 6.9226, time 121.78ms
iter 240820: loss 6.2354, time 124.56ms
iter 240830: loss 6.6406, time 121.70ms
iter 240840: loss 5.7770, time 123.99ms
iter 240850: loss 6.0529, time 120.18ms
iter 240860: loss 5.8305, time 123.93ms
iter 240870: loss 6.1916, time 121.60ms
iter 240880: loss 6.3175, time 124.55ms
iter 240890: loss 5.9560, time 121.84ms
iter 240900: loss 5.8292, time 124.79ms
iter 240910: loss 7.0373, time 121.58ms
iter 240920: loss 6.1008, time 124.52ms
iter 240930: loss 5.8239, time 121.29ms
iter 240940: loss 6.5631, time 123.60ms
iter 240950: loss 6.2490, time 121.53ms
iter 240960: loss 6.0010, time 124.94ms
iter 240970: loss 5.9690, time 121.60ms
iter 240980: loss 5.7738, time 124.58ms
iter 240990: loss 6.2983, time 122.12ms
step 241000: train loss 5.7651, val loss 5.7343
saving checkpoint to out-shakespeare-char
iter 241000: loss 5.8269, time 2895.72ms
iter 241010: loss 5.2877, time 121.99ms
iter 241020: loss 5.9711, time 121.87ms
iter 241030: loss 6.5814, time 121.00ms
iter 241040: loss 5.9850, time 120.40ms
iter 241050: loss 5.8562, time 121.79ms
iter 241060: loss 5.7440, time 121.93ms
iter 241070: loss 5.9259, time 121.72ms
iter 241080: loss 5.4093, time 121.81ms
iter 241090: loss 5.7500, time 121.71ms
iter 241100: loss 5.9073, time 121.64ms
iter 241110: loss 5.4920, time 121.67ms
iter 241120: loss 6.1519, time 120.68ms
iter 241130: loss 5.7592, time 120.40ms
iter 241140: loss 7.2089, time 121.69ms
iter 241150: loss 6.4704, time 121.63ms
iter 241160: loss 5.7870, time 121.64ms
iter 241170: loss 6.2483, time 121.45ms
iter 241180: loss 6.6279, time 121.63ms
iter 241190: loss 6.0144, time 121.75ms
iter 241200: loss 6.0904, time 122.29ms
iter 241210: loss 6.1357, time 120.93ms
iter 241220: loss 6.0591, time 120.75ms
iter 241230: loss 6.4906, time 121.61ms
iter 241240: loss 6.4809, time 121.84ms
step 241250: train loss 5.7639, val loss 5.7307
saving checkpoint to out-shakespeare-char
iter 241250: loss 6.4992, time 2887.20ms
iter 241260: loss 6.7085, time 121.64ms
iter 241270: loss 6.5251, time 121.28ms
iter 241280: loss 5.6969, time 122.02ms
iter 241290: loss 6.2455, time 121.47ms
iter 241300: loss 6.6510, time 121.63ms
iter 241310: loss 6.7396, time 121.34ms
iter 241320: loss 6.2471, time 121.37ms
iter 241330: loss 6.2035, time 121.58ms
iter 241340: loss 6.0214, time 121.43ms
iter 241350: loss 6.0732, time 121.40ms
iter 241360: loss 7.2495, time 121.25ms
iter 241370: loss 6.2551, time 121.76ms
iter 241380: loss 6.8749, time 121.47ms
iter 241390: loss 6.6055, time 121.72ms
iter 241400: loss 6.0940, time 121.32ms
iter 241410: loss 6.1329, time 121.32ms
iter 241420: loss 6.4006, time 126.84ms
iter 241430: loss 6.0713, time 125.65ms
iter 241440: loss 5.7990, time 125.44ms
iter 241450: loss 5.8082, time 125.41ms
iter 241460: loss 6.6152, time 125.59ms
iter 241470: loss 6.5855, time 126.16ms
iter 241480: loss 5.8929, time 126.90ms
iter 241490: loss 4.9697, time 125.63ms
step 241500: train loss 5.7322, val loss 5.7075
saving checkpoint to out-shakespeare-char
iter 241500: loss 6.9956, time 2881.27ms
iter 241510: loss 5.7597, time 121.94ms
iter 241520: loss 5.6529, time 122.05ms
iter 241530: loss 6.1323, time 121.53ms
iter 241540: loss 6.2622, time 121.53ms
iter 241550: loss 6.3096, time 121.49ms
iter 241560: loss 6.5467, time 121.50ms
iter 241570: loss 5.4770, time 121.46ms
iter 241580: loss 6.7916, time 121.48ms
iter 241590: loss 5.4541, time 121.44ms
iter 241600: loss 6.4047, time 121.48ms
iter 241610: loss 6.8302, time 121.53ms
iter 241620: loss 6.2452, time 121.59ms
iter 241630: loss 6.1853, time 121.46ms
iter 241640: loss 5.8973, time 121.19ms
iter 241650: loss 5.8392, time 121.57ms
iter 241660: loss 6.7041, time 121.53ms
iter 241670: loss 6.3682, time 121.24ms
iter 241680: loss 6.3963, time 122.06ms
iter 241690: loss 5.8441, time 121.36ms
iter 241700: loss 6.5526, time 121.97ms
iter 241710: loss 5.9226, time 121.52ms
iter 241720: loss 6.3596, time 121.32ms
iter 241730: loss 5.8138, time 122.19ms
iter 241740: loss 5.8311, time 121.74ms
step 241750: train loss 5.7549, val loss 5.7286
saving checkpoint to out-shakespeare-char
iter 241750: loss 6.7364, time 2885.84ms
iter 241760: loss 6.4719, time 124.68ms
iter 241770: loss 6.2592, time 125.80ms
iter 241780: loss 5.9874, time 127.33ms
iter 241790: loss 6.6760, time 125.93ms
iter 241800: loss 6.2454, time 124.69ms
iter 241810: loss 5.9883, time 125.63ms
iter 241820: loss 6.6557, time 124.56ms
iter 241830: loss 5.9715, time 124.96ms
iter 241840: loss 5.9072, time 124.51ms
iter 241850: loss 6.0215, time 125.35ms
iter 241860: loss 6.2737, time 124.80ms
iter 241870: loss 6.8006, time 125.64ms
iter 241880: loss 6.2441, time 124.62ms
iter 241890: loss 5.7845, time 128.16ms
iter 241900: loss 4.9446, time 124.60ms
iter 241910: loss 5.8893, time 125.30ms
iter 241920: loss 6.0009, time 124.61ms
iter 241930: loss 6.1237, time 127.61ms
iter 241940: loss 6.6576, time 124.31ms
iter 241950: loss 6.5280, time 125.21ms
iter 241960: loss 5.7975, time 123.49ms
iter 241970: loss 5.9012, time 124.60ms
iter 241980: loss 6.4174, time 123.94ms
iter 241990: loss 5.9690, time 125.45ms
step 242000: train loss 5.6778, val loss 5.7259
saving checkpoint to out-shakespeare-char
iter 242000: loss 5.9738, time 2894.84ms
iter 242010: loss 6.1607, time 122.93ms
iter 242020: loss 6.8204, time 121.17ms
iter 242030: loss 6.5875, time 122.45ms
iter 242040: loss 5.8132, time 121.74ms
iter 242050: loss 6.3228, time 122.87ms
iter 242060: loss 6.7891, time 121.59ms
iter 242070: loss 5.6503, time 122.84ms
iter 242080: loss 6.9930, time 120.63ms
iter 242090: loss 6.4503, time 121.51ms
iter 242100: loss 7.1473, time 120.86ms
iter 242110: loss 6.0197, time 122.93ms
iter 242120: loss 6.5002, time 122.18ms
iter 242130: loss 6.0020, time 122.16ms
iter 242140: loss 6.6219, time 121.34ms
iter 242150: loss 6.6477, time 122.58ms
iter 242160: loss 5.3934, time 121.83ms
iter 242170: loss 6.4195, time 123.39ms
iter 242180: loss 6.2487, time 121.74ms
iter 242190: loss 5.8994, time 122.48ms
iter 242200: loss 6.2805, time 120.80ms
iter 242210: loss 5.9984, time 121.94ms
iter 242220: loss 5.9280, time 121.86ms
iter 242230: loss 6.3403, time 122.88ms
iter 242240: loss 6.1388, time 121.65ms
step 242250: train loss 5.7552, val loss 5.7872
saving checkpoint to out-shakespeare-char
iter 242250: loss 6.5267, time 2891.97ms
iter 242260: loss 5.9860, time 121.58ms
iter 242270: loss 5.7349, time 121.71ms
iter 242280: loss 6.3340, time 121.50ms
iter 242290: loss 6.1750, time 121.67ms
iter 242300: loss 6.0321, time 121.57ms
iter 242310: loss 5.9691, time 121.57ms
iter 242320: loss 5.9201, time 121.58ms
iter 242330: loss 6.8715, time 121.57ms
iter 242340: loss 6.1560, time 121.55ms
iter 242350: loss 7.0955, time 121.54ms
iter 242360: loss 5.4573, time 121.70ms
iter 242370: loss 6.6082, time 121.37ms
iter 242380: loss 6.9743, time 121.93ms
iter 242390: loss 6.0938, time 121.76ms
iter 242400: loss 7.0935, time 121.65ms
iter 242410: loss 6.7813, time 121.75ms
iter 242420: loss 6.6544, time 121.69ms
iter 242430: loss 6.9167, time 121.47ms
iter 242440: loss 5.8716, time 122.05ms
iter 242450: loss 5.4017, time 120.77ms
iter 242460: loss 5.7532, time 121.56ms
iter 242470: loss 5.3680, time 121.52ms
iter 242480: loss 6.5410, time 121.53ms
iter 242490: loss 6.0853, time 121.50ms
step 242500: train loss 5.8195, val loss 5.7145
saving checkpoint to out-shakespeare-char
iter 242500: loss 6.5119, time 2894.77ms
iter 242510: loss 6.2678, time 121.99ms
iter 242520: loss 6.3057, time 122.13ms
iter 242530: loss 5.8810, time 121.80ms
iter 242540: loss 6.7820, time 121.65ms
iter 242550: loss 6.5400, time 120.65ms
iter 242560: loss 5.2967, time 121.88ms
iter 242570: loss 5.8313, time 121.78ms
iter 242580: loss 6.2832, time 121.68ms
iter 242590: loss 5.9802, time 121.72ms
iter 242600: loss 6.1752, time 121.85ms
iter 242610: loss 5.9951, time 121.83ms
iter 242620: loss 6.4531, time 120.81ms
iter 242630: loss 5.8385, time 121.66ms
iter 242640: loss 6.1836, time 121.76ms
iter 242650: loss 6.0592, time 120.76ms
iter 242660: loss 6.1102, time 121.83ms
iter 242670: loss 6.6367, time 121.78ms
iter 242680: loss 6.1472, time 121.68ms
iter 242690: loss 6.1847, time 120.82ms
iter 242700: loss 6.0929, time 121.61ms
iter 242710: loss 5.4680, time 120.67ms
iter 242720: loss 6.5185, time 121.40ms
iter 242730: loss 6.1007, time 121.66ms
iter 242740: loss 5.8598, time 121.42ms
step 242750: train loss 5.7759, val loss 5.7642
saving checkpoint to out-shakespeare-char
iter 242750: loss 6.7020, time 2909.64ms
iter 242760: loss 5.4638, time 126.07ms
iter 242770: loss 7.0213, time 128.44ms
iter 242780: loss 5.8462, time 125.96ms
iter 242790: loss 6.2678, time 126.20ms
iter 242800: loss 6.6681, time 127.01ms
iter 242810: loss 5.5749, time 125.93ms
iter 242820: loss 6.7371, time 125.21ms
iter 242830: loss 6.6632, time 125.71ms
iter 242840: loss 7.0154, time 124.93ms
iter 242850: loss 5.6420, time 125.48ms
iter 242860: loss 5.9012, time 125.85ms
iter 242870: loss 6.3743, time 126.01ms
iter 242880: loss 5.8745, time 129.00ms
iter 242890: loss 6.0929, time 125.81ms
iter 242900: loss 6.4351, time 125.54ms
iter 242910: loss 5.7870, time 125.77ms
iter 242920: loss 6.3150, time 121.37ms
iter 242930: loss 5.7459, time 121.12ms
iter 242940: loss 6.4194, time 121.29ms
iter 242950: loss 5.3800, time 121.78ms
iter 242960: loss 6.2741, time 121.60ms
iter 242970: loss 6.0707, time 121.31ms
iter 242980: loss 6.0306, time 121.45ms
iter 242990: loss 6.6120, time 121.30ms
step 243000: train loss 5.7188, val loss 5.7670
saving checkpoint to out-shakespeare-char
iter 243000: loss 6.0925, time 2877.33ms
iter 243010: loss 6.3342, time 125.49ms
iter 243020: loss 5.7211, time 124.66ms
iter 243030: loss 5.9337, time 125.32ms
iter 243040: loss 5.7153, time 125.70ms
iter 243050: loss 5.4110, time 124.99ms
iter 243060: loss 6.5839, time 125.86ms
iter 243070: loss 7.0800, time 122.01ms
iter 243080: loss 6.2829, time 124.66ms
iter 243090: loss 5.7911, time 122.12ms
iter 243100: loss 6.0906, time 124.51ms
iter 243110: loss 6.0983, time 121.93ms
iter 243120: loss 5.9100, time 124.28ms
iter 243130: loss 6.0951, time 121.92ms
iter 243140: loss 6.6609, time 124.68ms
iter 243150: loss 5.7778, time 122.23ms
iter 243160: loss 6.1364, time 124.76ms
iter 243170: loss 6.3936, time 121.81ms
iter 243180: loss 5.8787, time 125.51ms
iter 243190: loss 5.6110, time 122.26ms
iter 243200: loss 6.2899, time 124.77ms
iter 243210: loss 6.6435, time 121.90ms
iter 243220: loss 6.4218, time 124.57ms
iter 243230: loss 6.0169, time 121.93ms
iter 243240: loss 5.4229, time 124.70ms
step 243250: train loss 5.7162, val loss 5.7124
saving checkpoint to out-shakespeare-char
iter 243250: loss 6.0852, time 2904.74ms
iter 243260: loss 6.9926, time 124.84ms
iter 243270: loss 6.0149, time 125.23ms
iter 243280: loss 5.6603, time 124.69ms
iter 243290: loss 5.8688, time 125.86ms
iter 243300: loss 6.1795, time 125.19ms
iter 243310: loss 5.8917, time 126.02ms
iter 243320: loss 6.0353, time 125.12ms
iter 243330: loss 6.8337, time 126.01ms
iter 243340: loss 6.3210, time 126.00ms
iter 243350: loss 6.3565, time 124.85ms
iter 243360: loss 6.0581, time 128.75ms
iter 243370: loss 6.4375, time 126.06ms
iter 243380: loss 6.8444, time 124.89ms
iter 243390: loss 7.6287, time 125.79ms
iter 243400: loss 6.2259, time 125.21ms
iter 243410: loss 5.8642, time 125.84ms
iter 243420: loss 6.0909, time 124.64ms
iter 243430: loss 6.8426, time 125.86ms
iter 243440: loss 6.0419, time 124.62ms
iter 243450: loss 6.3261, time 125.47ms
iter 243460: loss 6.2863, time 125.11ms
iter 243470: loss 6.1699, time 124.65ms
iter 243480: loss 6.0402, time 125.69ms
iter 243490: loss 6.4186, time 123.97ms
step 243500: train loss 5.7202, val loss 5.7530
saving checkpoint to out-shakespeare-char
iter 243500: loss 6.5581, time 2892.03ms
iter 243510: loss 5.5615, time 126.21ms
iter 243520: loss 6.3223, time 125.98ms
iter 243530: loss 5.8571, time 129.37ms
iter 243540: loss 5.5439, time 126.10ms
iter 243550: loss 6.2521, time 126.18ms
iter 243560: loss 5.5560, time 125.38ms
iter 243570: loss 6.6046, time 129.41ms
iter 243580: loss 6.1404, time 126.06ms
iter 243590: loss 6.0542, time 126.30ms
iter 243600: loss 6.3709, time 125.53ms
iter 243610: loss 6.1880, time 125.76ms
iter 243620: loss 5.6231, time 125.68ms
iter 243630: loss 6.6109, time 125.72ms
iter 243640: loss 6.3016, time 125.67ms
iter 243650: loss 5.6700, time 125.60ms
iter 243660: loss 6.5527, time 125.82ms
iter 243670: loss 6.1509, time 125.98ms
iter 243680: loss 6.6154, time 128.64ms
iter 243690: loss 5.9254, time 125.72ms
iter 243700: loss 5.9654, time 125.77ms
iter 243710: loss 6.0329, time 126.23ms
iter 243720: loss 6.0358, time 125.64ms
iter 243730: loss 6.6313, time 125.86ms
iter 243740: loss 6.3941, time 125.92ms
step 243750: train loss 5.7192, val loss 5.7350
saving checkpoint to out-shakespeare-char
iter 243750: loss 6.0952, time 2867.80ms
iter 243760: loss 6.1515, time 125.64ms
iter 243770: loss 6.3494, time 124.90ms
iter 243780: loss 5.5191, time 125.64ms
iter 243790: loss 5.8896, time 125.70ms
iter 243800: loss 6.9500, time 125.71ms
iter 243810: loss 6.3235, time 125.31ms
iter 243820: loss 6.9444, time 126.62ms
iter 243830: loss 6.2111, time 126.19ms
iter 243840: loss 6.5635, time 125.80ms
iter 243850: loss 6.4654, time 128.44ms
iter 243860: loss 6.6763, time 125.47ms
iter 243870: loss 6.6760, time 125.69ms
iter 243880: loss 6.0185, time 126.00ms
iter 243890: loss 5.8662, time 125.68ms
iter 243900: loss 7.0346, time 125.82ms
iter 243910: loss 7.0765, time 126.48ms
iter 243920: loss 6.2331, time 128.56ms
iter 243930: loss 6.0430, time 125.49ms
iter 243940: loss 6.3460, time 125.90ms
iter 243950: loss 6.1545, time 124.95ms
iter 243960: loss 5.9784, time 125.89ms
iter 243970: loss 5.8921, time 125.48ms
iter 243980: loss 5.5444, time 125.82ms
iter 243990: loss 6.2118, time 124.86ms
step 244000: train loss 5.7341, val loss 5.7366
saving checkpoint to out-shakespeare-char
iter 244000: loss 6.0071, time 2876.44ms
iter 244010: loss 5.7293, time 125.39ms
iter 244020: loss 6.2968, time 124.75ms
iter 244030: loss 5.8430, time 125.24ms
iter 244040: loss 5.9653, time 125.24ms
iter 244050: loss 6.0829, time 125.56ms
iter 244060: loss 6.0690, time 128.51ms
iter 244070: loss 7.1240, time 124.81ms
iter 244080: loss 5.4910, time 125.16ms
iter 244090: loss 5.9305, time 125.64ms
iter 244100: loss 6.5994, time 125.16ms
iter 244110: loss 5.6702, time 125.51ms
iter 244120: loss 6.6523, time 125.43ms
iter 244130: loss 6.2667, time 124.98ms
iter 244140: loss 6.4329, time 125.32ms
iter 244150: loss 6.4795, time 125.71ms
iter 244160: loss 6.4181, time 125.54ms
iter 244170: loss 6.5104, time 125.10ms
iter 244180: loss 5.9425, time 125.07ms
iter 244190: loss 6.9199, time 125.13ms
iter 244200: loss 5.1086, time 125.65ms
iter 244210: loss 5.6962, time 125.20ms
iter 244220: loss 5.5537, time 125.42ms
iter 244230: loss 6.1089, time 125.27ms
iter 244240: loss 5.8670, time 125.18ms
step 244250: train loss 5.7068, val loss 5.7138
saving checkpoint to out-shakespeare-char
iter 244250: loss 6.6757, time 2888.99ms
iter 244260: loss 7.1655, time 124.97ms
iter 244270: loss 6.7029, time 124.56ms
iter 244280: loss 5.6132, time 124.50ms
iter 244290: loss 5.5835, time 124.94ms
iter 244300: loss 6.8266, time 123.76ms
iter 244310: loss 6.1970, time 125.26ms
iter 244320: loss 5.9587, time 125.68ms
iter 244330: loss 6.7326, time 125.29ms
iter 244340: loss 6.5633, time 123.10ms
iter 244350: loss 5.9653, time 125.29ms
iter 244360: loss 7.2609, time 125.04ms
iter 244370: loss 6.2016, time 125.41ms
iter 244380: loss 6.0592, time 128.09ms
iter 244390: loss 6.7553, time 125.44ms
iter 244400: loss 6.2979, time 125.27ms
iter 244410: loss 6.3105, time 123.75ms
iter 244420: loss 6.8604, time 125.36ms
iter 244430: loss 6.0343, time 125.75ms
iter 244440: loss 5.6622, time 125.32ms
iter 244450: loss 5.4078, time 123.89ms
iter 244460: loss 6.7329, time 125.16ms
iter 244470: loss 5.9706, time 125.48ms
iter 244480: loss 5.5561, time 125.58ms
iter 244490: loss 6.3687, time 128.25ms
step 244500: train loss 5.7530, val loss 5.7479
saving checkpoint to out-shakespeare-char
iter 244500: loss 6.1025, time 2883.16ms
iter 244510: loss 6.4389, time 125.81ms
iter 244520: loss 6.0336, time 125.58ms
iter 244530: loss 6.2665, time 125.69ms
iter 244540: loss 6.1245, time 125.40ms
iter 244550: loss 6.1124, time 128.31ms
iter 244560: loss 6.1214, time 124.93ms
iter 244570: loss 5.9730, time 125.79ms
iter 244580: loss 5.7037, time 125.73ms
iter 244590: loss 6.2093, time 126.08ms
iter 244600: loss 6.6306, time 126.03ms
iter 244610: loss 6.8305, time 125.73ms
iter 244620: loss 6.6957, time 126.42ms
iter 244630: loss 6.8926, time 125.69ms
iter 244640: loss 6.4564, time 125.86ms
iter 244650: loss 6.3839, time 125.97ms
iter 244660: loss 6.2711, time 128.60ms
iter 244670: loss 5.6613, time 125.58ms
iter 244680: loss 5.0238, time 125.73ms
iter 244690: loss 6.1987, time 125.66ms
iter 244700: loss 5.5228, time 128.66ms
iter 244710: loss 5.8255, time 125.63ms
iter 244720: loss 6.0737, time 125.80ms
iter 244730: loss 6.0411, time 125.91ms
iter 244740: loss 5.3450, time 124.46ms
step 244750: train loss 5.7742, val loss 5.7104
saving checkpoint to out-shakespeare-char
iter 244750: loss 6.0357, time 2896.21ms
iter 244760: loss 5.9396, time 128.21ms
iter 244770: loss 6.4633, time 125.36ms
iter 244780: loss 5.6488, time 125.33ms
iter 244790: loss 6.2961, time 125.45ms
iter 244800: loss 6.4368, time 125.26ms
iter 244810: loss 6.5101, time 125.15ms
iter 244820: loss 6.6249, time 125.64ms
iter 244830: loss 5.9228, time 125.50ms
iter 244840: loss 5.9694, time 125.38ms
iter 244850: loss 6.1402, time 125.79ms
iter 244860: loss 5.8152, time 125.86ms
iter 244870: loss 5.8136, time 128.79ms
iter 244880: loss 6.2217, time 125.51ms
iter 244890: loss 5.8642, time 125.42ms
iter 244900: loss 6.4312, time 125.86ms
iter 244910: loss 6.5400, time 128.48ms
iter 244920: loss 6.8676, time 125.64ms
iter 244930: loss 6.3035, time 125.50ms
iter 244940: loss 5.6077, time 125.33ms
iter 244950: loss 5.8777, time 125.23ms
iter 244960: loss 6.2467, time 125.59ms
iter 244970: loss 5.8518, time 125.96ms
iter 244980: loss 5.8479, time 125.63ms
iter 244990: loss 6.3352, time 125.26ms
step 245000: train loss 5.6971, val loss 5.7127
saving checkpoint to out-shakespeare-char
iter 245000: loss 5.0875, time 2903.54ms
iter 245010: loss 5.7454, time 125.94ms
iter 245020: loss 6.0000, time 125.75ms
iter 245030: loss 6.2130, time 125.95ms
iter 245040: loss 6.0854, time 128.64ms
iter 245050: loss 5.5265, time 125.64ms
iter 245060: loss 6.0608, time 125.72ms
iter 245070: loss 6.8000, time 125.10ms
iter 245080: loss 6.9409, time 125.94ms
iter 245090: loss 6.5828, time 125.95ms
iter 245100: loss 6.8110, time 125.66ms
iter 245110: loss 6.0205, time 126.29ms
iter 245120: loss 6.4712, time 125.26ms
iter 245130: loss 5.7230, time 125.46ms
iter 245140: loss 6.1580, time 125.41ms
iter 245150: loss 5.7989, time 125.78ms
iter 245160: loss 6.6445, time 128.30ms
iter 245170: loss 6.4359, time 125.28ms
iter 245180: loss 6.1590, time 125.23ms
iter 245190: loss 6.2170, time 125.64ms
iter 245200: loss 5.1854, time 125.91ms
iter 245210: loss 5.8058, time 125.36ms
iter 245220: loss 6.0944, time 121.57ms
iter 245230: loss 5.7223, time 121.93ms
iter 245240: loss 6.9461, time 122.35ms
step 245250: train loss 5.7274, val loss 5.7515
saving checkpoint to out-shakespeare-char
iter 245250: loss 5.9423, time 2892.34ms
iter 245260: loss 5.9285, time 125.13ms
iter 245270: loss 6.0442, time 125.62ms
iter 245280: loss 5.3363, time 125.77ms
iter 245290: loss 6.3632, time 125.21ms
iter 245300: loss 6.1422, time 128.55ms
iter 245310: loss 6.5298, time 125.89ms
iter 245320: loss 6.2487, time 125.83ms
iter 245330: loss 5.7342, time 128.93ms
iter 245340: loss 5.7975, time 125.67ms
iter 245350: loss 6.5399, time 125.61ms
iter 245360: loss 7.0797, time 125.58ms
iter 245370: loss 5.6312, time 125.63ms
iter 245380: loss 5.5382, time 125.85ms
iter 245390: loss 6.0867, time 125.77ms
iter 245400: loss 5.9250, time 125.50ms
iter 245410: loss 6.6368, time 126.21ms
iter 245420: loss 5.9412, time 125.79ms
iter 245430: loss 6.0593, time 125.40ms
iter 245440: loss 5.6895, time 125.65ms
iter 245450: loss 6.1155, time 125.69ms
iter 245460: loss 5.5559, time 125.53ms
iter 245470: loss 6.0434, time 126.18ms
iter 245480: loss 5.7734, time 125.97ms
iter 245490: loss 5.5904, time 125.99ms
step 245500: train loss 5.7075, val loss 5.7368
saving checkpoint to out-shakespeare-char
iter 245500: loss 6.4290, time 2859.36ms
iter 245510: loss 5.6323, time 126.08ms
iter 245520: loss 5.6663, time 125.58ms
iter 245530: loss 6.6721, time 125.70ms
iter 245540: loss 5.5094, time 125.67ms
iter 245550: loss 6.0194, time 125.78ms
iter 245560: loss 5.6952, time 129.31ms
iter 245570: loss 6.4409, time 125.88ms
iter 245580: loss 6.3635, time 125.21ms
iter 245590: loss 6.6984, time 125.25ms
iter 245600: loss 5.9839, time 124.95ms
iter 245610: loss 6.2401, time 125.00ms
iter 245620: loss 6.2853, time 119.92ms
iter 245630: loss 6.5406, time 120.06ms
iter 245640: loss 6.0330, time 120.02ms
iter 245650: loss 6.0051, time 120.28ms
iter 245660: loss 5.3231, time 119.80ms
iter 245670: loss 6.6075, time 120.04ms
iter 245680: loss 5.7374, time 121.06ms
iter 245690: loss 5.9174, time 119.94ms
iter 245700: loss 5.7928, time 120.78ms
iter 245710: loss 6.2201, time 120.58ms
iter 245720: loss 6.0295, time 119.80ms
iter 245730: loss 5.9477, time 120.91ms
iter 245740: loss 6.0371, time 120.69ms
step 245750: train loss 5.6404, val loss 5.7360
saving checkpoint to out-shakespeare-char
iter 245750: loss 6.1165, time 2887.70ms
iter 245760: loss 5.8423, time 120.14ms
iter 245770: loss 6.3382, time 119.71ms
iter 245780: loss 6.1975, time 126.04ms
iter 245790: loss 6.1764, time 129.11ms
iter 245800: loss 6.3146, time 125.91ms
iter 245810: loss 5.6748, time 125.45ms
iter 245820: loss 6.0164, time 125.42ms
iter 245830: loss 5.5859, time 125.69ms
iter 245840: loss 6.2898, time 125.65ms
iter 245850: loss 6.6261, time 125.70ms
iter 245860: loss 6.8119, time 128.23ms
iter 245870: loss 6.2443, time 125.07ms
iter 245880: loss 6.6275, time 125.21ms
iter 245890: loss 6.5914, time 125.41ms
iter 245900: loss 6.2625, time 128.63ms
iter 245910: loss 6.4743, time 125.09ms
iter 245920: loss 6.5178, time 125.55ms
iter 245930: loss 7.0723, time 125.13ms
iter 245940: loss 6.0425, time 125.43ms
iter 245950: loss 5.6412, time 125.75ms
iter 245960: loss 5.8299, time 126.14ms
iter 245970: loss 5.9903, time 128.93ms
iter 245980: loss 6.3937, time 125.42ms
iter 245990: loss 6.4544, time 125.30ms
step 246000: train loss 5.7036, val loss 5.7101
saving checkpoint to out-shakespeare-char
iter 246000: loss 5.6610, time 2897.36ms
iter 246010: loss 5.8559, time 122.04ms
iter 246020: loss 6.0674, time 124.59ms
iter 246030: loss 6.1931, time 121.84ms
iter 246040: loss 6.1303, time 124.55ms
iter 246050: loss 6.4359, time 121.67ms
iter 246060: loss 5.3760, time 124.55ms
iter 246070: loss 6.5202, time 121.72ms
iter 246080: loss 6.3348, time 124.16ms
iter 246090: loss 6.3597, time 121.67ms
iter 246100: loss 5.9352, time 124.76ms
iter 246110: loss 6.0018, time 122.01ms
iter 246120: loss 5.4612, time 124.59ms
iter 246130: loss 5.9168, time 121.65ms
iter 246140: loss 5.2043, time 124.53ms
iter 246150: loss 6.1639, time 122.81ms
iter 246160: loss 6.2048, time 124.84ms
iter 246170: loss 6.3847, time 121.62ms
iter 246180: loss 6.3594, time 124.35ms
iter 246190: loss 6.1325, time 122.03ms
iter 246200: loss 6.3401, time 124.69ms
iter 246210: loss 6.6525, time 121.90ms
iter 246220: loss 5.4907, time 124.49ms
iter 246230: loss 6.7269, time 121.75ms
iter 246240: loss 6.7642, time 124.46ms
step 246250: train loss 5.7408, val loss 5.7118
saving checkpoint to out-shakespeare-char
iter 246250: loss 6.0728, time 2890.29ms
iter 246260: loss 5.9687, time 121.06ms
iter 246270: loss 6.4499, time 122.81ms
iter 246280: loss 6.0599, time 122.38ms
iter 246290: loss 6.5343, time 122.88ms
iter 246300: loss 6.3727, time 121.47ms
iter 246310: loss 5.7795, time 122.78ms
iter 246320: loss 6.0870, time 121.81ms
iter 246330: loss 5.8383, time 123.17ms
iter 246340: loss 6.7838, time 121.84ms
iter 246350: loss 6.1102, time 124.01ms
iter 246360: loss 6.4013, time 121.79ms
iter 246370: loss 6.4084, time 123.06ms
iter 246380: loss 6.0417, time 121.88ms
iter 246390: loss 5.7322, time 123.30ms
iter 246400: loss 6.0623, time 121.92ms
iter 246410: loss 6.3463, time 123.04ms
iter 246420: loss 5.6082, time 121.46ms
iter 246430: loss 6.0981, time 122.36ms
iter 246440: loss 5.4322, time 121.76ms
iter 246450: loss 6.6616, time 122.16ms
iter 246460: loss 6.4085, time 121.88ms
iter 246470: loss 5.6962, time 123.15ms
iter 246480: loss 7.2121, time 121.82ms
iter 246490: loss 5.9462, time 123.07ms
step 246500: train loss 5.7017, val loss 5.7570
saving checkpoint to out-shakespeare-char
iter 246500: loss 6.1932, time 2913.21ms
iter 246510: loss 5.9884, time 122.06ms
iter 246520: loss 6.9024, time 121.71ms
iter 246530: loss 5.9020, time 123.42ms
iter 246540: loss 5.5986, time 121.85ms
iter 246550: loss 6.3760, time 123.38ms
iter 246560: loss 6.6659, time 121.98ms
iter 246570: loss 6.2234, time 123.41ms
iter 246580: loss 5.7918, time 122.44ms
iter 246590: loss 5.8597, time 123.05ms
iter 246600: loss 6.3902, time 121.88ms
iter 246610: loss 5.8548, time 122.25ms
iter 246620: loss 6.1573, time 121.93ms
iter 246630: loss 5.8300, time 123.43ms
iter 246640: loss 5.8193, time 122.09ms
iter 246650: loss 5.9915, time 123.32ms
iter 246660: loss 6.3139, time 121.93ms
iter 246670: loss 6.4252, time 123.41ms
iter 246680: loss 6.7914, time 121.83ms
iter 246690: loss 6.1567, time 123.13ms
iter 246700: loss 6.0145, time 122.00ms
iter 246710: loss 5.7856, time 123.21ms
iter 246720: loss 6.5383, time 121.83ms
iter 246730: loss 6.1324, time 123.32ms
iter 246740: loss 6.2276, time 121.78ms
step 246750: train loss 5.7353, val loss 5.7114
saving checkpoint to out-shakespeare-char
iter 246750: loss 6.4776, time 2897.52ms
iter 246760: loss 5.9500, time 121.67ms
iter 246770: loss 5.1190, time 120.58ms
iter 246780: loss 6.6128, time 121.50ms
iter 246790: loss 5.6877, time 121.56ms
iter 246800: loss 6.6965, time 121.89ms
iter 246810: loss 6.9242, time 121.66ms
iter 246820: loss 5.6822, time 121.49ms
iter 246830: loss 6.4699, time 120.76ms
iter 246840: loss 6.3658, time 121.62ms
iter 246850: loss 6.7551, time 121.71ms
iter 246860: loss 6.8386, time 121.30ms
iter 246870: loss 5.7556, time 121.52ms
iter 246880: loss 6.1207, time 121.38ms
iter 246890: loss 6.2617, time 121.78ms
iter 246900: loss 6.3769, time 120.77ms
iter 246910: loss 6.0506, time 121.59ms
iter 246920: loss 6.7661, time 122.69ms
iter 246930: loss 6.0566, time 121.58ms
iter 246940: loss 6.5020, time 121.59ms
iter 246950: loss 5.6168, time 121.50ms
iter 246960: loss 6.7894, time 121.51ms
iter 246970: loss 5.5869, time 122.84ms
iter 246980: loss 6.7754, time 121.66ms
iter 246990: loss 6.4964, time 121.46ms
step 247000: train loss 5.7216, val loss 5.7214
saving checkpoint to out-shakespeare-char
iter 247000: loss 6.0895, time 2891.45ms
iter 247010: loss 5.9378, time 121.76ms
iter 247020: loss 6.6047, time 124.51ms
iter 247030: loss 6.4590, time 121.80ms
iter 247040: loss 6.7788, time 125.01ms
iter 247050: loss 6.2289, time 121.73ms
iter 247060: loss 6.3024, time 124.51ms
iter 247070: loss 6.1682, time 122.01ms
iter 247080: loss 5.6972, time 124.71ms
iter 247090: loss 5.6301, time 122.12ms
iter 247100: loss 6.3270, time 124.41ms
iter 247110: loss 5.9361, time 121.80ms
iter 247120: loss 6.8180, time 124.57ms
iter 247130: loss 6.3933, time 121.87ms
iter 247140: loss 6.5148, time 124.96ms
iter 247150: loss 6.5921, time 121.47ms
iter 247160: loss 6.0393, time 124.57ms
iter 247170: loss 6.0757, time 121.87ms
iter 247180: loss 6.1637, time 124.62ms
iter 247190: loss 6.1022, time 121.89ms
iter 247200: loss 6.4172, time 124.24ms
iter 247210: loss 5.9831, time 121.87ms
iter 247220: loss 6.1984, time 124.85ms
iter 247230: loss 5.9286, time 121.83ms
iter 247240: loss 5.6771, time 124.54ms
step 247250: train loss 5.7429, val loss 5.7212
saving checkpoint to out-shakespeare-char
iter 247250: loss 6.2034, time 2893.37ms
iter 247260: loss 5.8418, time 121.69ms
iter 247270: loss 6.1532, time 124.31ms
iter 247280: loss 5.9839, time 121.59ms
iter 247290: loss 6.7227, time 124.56ms
iter 247300: loss 6.4335, time 121.67ms
iter 247310: loss 6.0990, time 124.54ms
iter 247320: loss 6.2246, time 121.64ms
iter 247330: loss 6.2478, time 124.54ms
iter 247340: loss 5.4502, time 121.73ms
iter 247350: loss 6.4953, time 124.52ms
iter 247360: loss 5.9783, time 122.60ms
iter 247370: loss 6.4318, time 124.40ms
iter 247380: loss 5.4027, time 121.65ms
iter 247390: loss 6.7171, time 124.27ms
iter 247400: loss 5.9559, time 121.71ms
iter 247410: loss 6.6016, time 124.37ms
iter 247420: loss 5.8625, time 121.60ms
iter 247430: loss 6.2320, time 124.39ms
iter 247440: loss 6.2056, time 121.64ms
iter 247450: loss 5.7989, time 124.44ms
iter 247460: loss 5.9523, time 121.22ms
iter 247470: loss 6.0680, time 122.14ms
iter 247480: loss 5.4389, time 121.61ms
iter 247490: loss 5.8267, time 122.93ms
step 247500: train loss 5.7155, val loss 5.7258
saving checkpoint to out-shakespeare-char
iter 247500: loss 6.0298, time 2893.96ms
iter 247510: loss 6.1879, time 121.65ms
iter 247520: loss 5.9227, time 121.74ms
iter 247530: loss 5.8704, time 121.64ms
iter 247540: loss 5.4807, time 121.78ms
iter 247550: loss 6.1334, time 121.50ms
iter 247560: loss 6.0085, time 121.72ms
iter 247570: loss 6.5095, time 121.79ms
iter 247580: loss 5.4245, time 121.09ms
iter 247590: loss 6.4442, time 121.63ms
iter 247600: loss 5.6740, time 125.74ms
iter 247610: loss 6.2031, time 125.36ms
iter 247620: loss 6.5690, time 125.59ms
iter 247630: loss 6.1369, time 122.09ms
iter 247640: loss 6.5254, time 123.27ms
iter 247650: loss 6.6160, time 121.89ms
iter 247660: loss 6.3692, time 122.99ms
iter 247670: loss 6.4829, time 121.64ms
iter 247680: loss 5.4011, time 123.00ms
iter 247690: loss 6.9974, time 121.93ms
iter 247700: loss 6.4809, time 123.17ms
iter 247710: loss 5.6088, time 121.73ms
iter 247720: loss 6.3967, time 123.33ms
iter 247730: loss 5.6967, time 122.20ms
iter 247740: loss 5.9812, time 122.81ms
step 247750: train loss 5.6806, val loss 5.7114
saving checkpoint to out-shakespeare-char
iter 247750: loss 5.7457, time 2895.28ms
iter 247760: loss 5.4724, time 121.44ms
iter 247770: loss 5.8399, time 121.59ms
iter 247780: loss 6.8176, time 121.71ms
iter 247790: loss 5.9667, time 121.49ms
iter 247800: loss 6.7998, time 121.64ms
iter 247810: loss 6.5167, time 121.44ms
iter 247820: loss 5.2769, time 121.75ms
iter 247830: loss 6.4718, time 122.77ms
iter 247840: loss 5.9444, time 121.58ms
iter 247850: loss 5.6695, time 121.66ms
iter 247860: loss 5.7334, time 121.65ms
iter 247870: loss 6.3180, time 121.55ms
iter 247880: loss 5.9853, time 122.87ms
iter 247890: loss 5.8441, time 121.61ms
iter 247900: loss 6.7830, time 121.91ms
iter 247910: loss 5.2708, time 121.68ms
iter 247920: loss 6.4928, time 121.88ms
iter 247930: loss 6.2263, time 121.53ms
iter 247940: loss 5.7513, time 121.94ms
iter 247950: loss 5.8638, time 121.27ms
iter 247960: loss 5.7853, time 121.05ms
iter 247970: loss 6.5260, time 121.77ms
iter 247980: loss 5.7328, time 121.12ms
iter 247990: loss 6.1616, time 121.58ms
step 248000: train loss 5.7264, val loss 5.7521
saving checkpoint to out-shakespeare-char
iter 248000: loss 6.1476, time 2905.03ms
iter 248010: loss 6.8195, time 125.85ms
iter 248020: loss 6.3488, time 125.77ms
iter 248030: loss 6.1046, time 125.51ms
iter 248040: loss 5.6915, time 125.50ms
iter 248050: loss 6.4786, time 125.31ms
iter 248060: loss 5.8746, time 125.39ms
iter 248070: loss 5.5096, time 125.83ms
iter 248080: loss 5.6134, time 125.83ms
iter 248090: loss 5.6518, time 126.13ms
iter 248100: loss 6.1580, time 128.65ms
iter 248110: loss 5.9304, time 125.88ms
iter 248120: loss 5.7917, time 125.14ms
iter 248130: loss 5.9918, time 125.89ms
iter 248140: loss 5.4838, time 125.64ms
iter 248150: loss 5.8001, time 125.82ms
iter 248160: loss 5.4724, time 125.75ms
iter 248170: loss 6.1669, time 125.59ms
iter 248180: loss 6.3170, time 125.70ms
iter 248190: loss 6.1139, time 125.83ms
iter 248200: loss 6.1871, time 125.52ms
iter 248210: loss 5.8946, time 128.53ms
iter 248220: loss 5.8265, time 125.55ms
iter 248230: loss 5.4986, time 125.52ms
iter 248240: loss 6.8876, time 125.77ms
step 248250: train loss 5.7559, val loss 5.7558
saving checkpoint to out-shakespeare-char
iter 248250: loss 5.9438, time 2896.16ms
iter 248260: loss 6.4657, time 125.70ms
iter 248270: loss 6.3415, time 128.23ms
iter 248280: loss 6.4022, time 125.74ms
iter 248290: loss 5.8902, time 125.46ms
iter 248300: loss 6.1827, time 125.64ms
iter 248310: loss 5.9072, time 125.95ms
iter 248320: loss 5.3912, time 125.48ms
iter 248330: loss 5.8071, time 125.84ms
iter 248340: loss 6.3706, time 126.02ms
iter 248350: loss 5.8550, time 125.51ms
iter 248360: loss 6.0392, time 125.92ms
iter 248370: loss 6.1233, time 125.47ms
iter 248380: loss 6.3994, time 128.31ms
iter 248390: loss 6.6468, time 125.57ms
iter 248400: loss 6.1959, time 125.46ms
iter 248410: loss 5.7823, time 125.52ms
iter 248420: loss 6.4909, time 125.45ms
iter 248430: loss 6.3253, time 125.63ms
iter 248440: loss 6.3750, time 125.64ms
iter 248450: loss 6.2207, time 124.75ms
iter 248460: loss 5.9204, time 125.66ms
iter 248470: loss 6.3893, time 125.60ms
iter 248480: loss 6.2633, time 125.46ms
iter 248490: loss 6.0303, time 128.39ms
step 248500: train loss 5.7157, val loss 5.7357
saving checkpoint to out-shakespeare-char
iter 248500: loss 6.3245, time 2894.11ms
iter 248510: loss 6.4419, time 128.47ms
iter 248520: loss 5.6851, time 125.01ms
iter 248530: loss 6.1107, time 125.92ms
iter 248540: loss 6.2912, time 124.44ms
iter 248550: loss 6.2711, time 125.59ms
iter 248560: loss 6.2002, time 126.03ms
iter 248570: loss 5.9994, time 125.64ms
iter 248580: loss 6.4333, time 125.53ms
iter 248590: loss 6.0740, time 125.85ms
iter 248600: loss 5.6223, time 125.83ms
iter 248610: loss 6.7383, time 125.67ms
iter 248620: loss 6.4069, time 128.03ms
iter 248630: loss 6.4133, time 124.97ms
iter 248640: loss 6.3834, time 125.59ms
iter 248650: loss 5.6498, time 125.69ms
iter 248660: loss 5.8716, time 125.40ms
iter 248670: loss 6.1414, time 125.51ms
iter 248680: loss 5.5728, time 125.63ms
iter 248690: loss 5.6254, time 125.61ms
iter 248700: loss 6.3006, time 126.01ms
iter 248710: loss 5.9278, time 125.77ms
iter 248720: loss 6.3035, time 126.04ms
iter 248730: loss 6.7765, time 128.78ms
iter 248740: loss 6.0085, time 125.81ms
step 248750: train loss 5.7356, val loss 5.7666
saving checkpoint to out-shakespeare-char
iter 248750: loss 5.8209, time 2903.72ms
iter 248760: loss 6.0567, time 124.70ms
iter 248770: loss 6.3577, time 125.58ms
iter 248780: loss 6.4870, time 125.88ms
iter 248790: loss 6.0396, time 125.85ms
iter 248800: loss 7.1326, time 124.68ms
iter 248810: loss 6.7695, time 125.67ms
iter 248820: loss 6.0007, time 125.20ms
iter 248830: loss 6.1339, time 125.78ms
iter 248840: loss 5.7274, time 128.54ms
iter 248850: loss 6.7912, time 124.41ms
iter 248860: loss 6.7547, time 125.73ms
iter 248870: loss 6.1833, time 125.50ms
iter 248880: loss 5.9462, time 126.95ms
iter 248890: loss 5.6181, time 126.17ms
iter 248900: loss 5.4967, time 125.65ms
iter 248910: loss 6.7779, time 125.50ms
iter 248920: loss 6.1355, time 125.85ms
iter 248930: loss 6.2328, time 125.66ms
iter 248940: loss 5.7714, time 125.93ms
iter 248950: loss 6.6640, time 128.49ms
iter 248960: loss 6.0578, time 125.11ms
iter 248970: loss 6.2012, time 126.16ms
iter 248980: loss 6.6635, time 125.88ms
iter 248990: loss 6.5198, time 128.66ms
step 249000: train loss 5.7345, val loss 5.6996
saving checkpoint to out-shakespeare-char
iter 249000: loss 6.9019, time 2898.23ms
iter 249010: loss 6.2243, time 126.15ms
iter 249020: loss 6.5706, time 129.01ms
iter 249030: loss 5.9106, time 126.07ms
iter 249040: loss 5.2433, time 126.34ms
iter 249050: loss 6.8114, time 126.20ms
iter 249060: loss 6.0534, time 125.98ms
iter 249070: loss 5.6482, time 126.64ms
iter 249080: loss 6.3437, time 126.42ms
iter 249090: loss 5.8229, time 126.14ms
iter 249100: loss 5.6704, time 126.51ms
iter 249110: loss 6.6571, time 126.57ms
iter 249120: loss 6.7409, time 126.09ms
iter 249130: loss 5.7927, time 126.87ms
iter 249140: loss 5.8464, time 125.83ms
iter 249150: loss 6.2642, time 126.12ms
iter 249160: loss 6.3919, time 125.54ms
iter 249170: loss 6.1143, time 127.82ms
iter 249180: loss 6.7180, time 125.57ms
iter 249190: loss 6.5754, time 125.95ms
iter 249200: loss 5.8464, time 123.70ms
iter 249210: loss 5.7312, time 121.85ms
iter 249220: loss 5.9894, time 123.66ms
iter 249230: loss 5.8571, time 121.43ms
iter 249240: loss 5.9080, time 122.18ms
step 249250: train loss 5.7385, val loss 5.7429
saving checkpoint to out-shakespeare-char
iter 249250: loss 5.7981, time 2904.66ms
iter 249260: loss 6.5195, time 125.68ms
iter 249270: loss 6.0090, time 126.12ms
iter 249280: loss 5.9310, time 125.76ms
iter 249290: loss 6.6123, time 125.90ms
iter 249300: loss 6.3842, time 129.06ms
iter 249310: loss 6.6558, time 125.58ms
iter 249320: loss 6.2119, time 126.33ms
iter 249330: loss 5.7457, time 126.65ms
iter 249340: loss 5.8677, time 128.62ms
iter 249350: loss 6.0063, time 125.87ms
iter 249360: loss 5.9259, time 125.95ms
iter 249370: loss 6.5946, time 125.43ms
iter 249380: loss 6.5234, time 125.87ms
iter 249390: loss 6.4328, time 125.77ms
iter 249400: loss 6.3313, time 124.74ms
iter 249410: loss 5.7654, time 124.77ms
iter 249420: loss 6.1027, time 124.93ms
iter 249430: loss 6.4251, time 125.25ms
iter 249440: loss 6.2320, time 125.02ms
iter 249450: loss 6.1223, time 126.91ms
iter 249460: loss 6.5348, time 125.08ms
iter 249470: loss 5.9560, time 125.24ms
iter 249480: loss 5.2134, time 125.05ms
iter 249490: loss 5.7991, time 125.02ms
step 249500: train loss 5.6762, val loss 5.6806
saving checkpoint to out-shakespeare-char
iter 249500: loss 6.1628, time 2873.46ms
iter 249510: loss 6.3201, time 125.27ms
iter 249520: loss 6.2194, time 125.38ms
iter 249530: loss 5.3004, time 125.53ms
iter 249540: loss 6.4032, time 125.42ms
iter 249550: loss 6.4779, time 125.56ms
iter 249560: loss 5.6658, time 125.63ms
iter 249570: loss 5.9151, time 125.63ms
iter 249580: loss 6.0823, time 125.60ms
iter 249590: loss 6.3823, time 125.54ms
iter 249600: loss 6.2816, time 125.77ms
iter 249610: loss 6.0231, time 126.14ms
iter 249620: loss 6.2190, time 126.77ms
iter 249630: loss 6.1986, time 125.12ms
iter 249640: loss 6.7769, time 125.14ms
iter 249650: loss 6.0266, time 124.92ms
iter 249660: loss 5.8141, time 136.06ms
iter 249670: loss 6.3027, time 125.92ms
iter 249680: loss 6.4261, time 126.04ms
iter 249690: loss 6.2529, time 126.97ms
iter 249700: loss 6.0017, time 126.38ms
iter 249710: loss 5.7311, time 125.22ms
iter 249720: loss 6.8299, time 125.80ms
iter 249730: loss 6.2169, time 126.06ms
iter 249740: loss 5.8963, time 125.81ms
step 249750: train loss 5.7076, val loss 5.7026
saving checkpoint to out-shakespeare-char
iter 249750: loss 5.6141, time 2867.44ms
iter 249760: loss 5.6848, time 125.41ms
iter 249770: loss 5.9549, time 125.08ms
iter 249780: loss 6.6592, time 124.89ms
iter 249790: loss 6.3331, time 125.03ms
iter 249800: loss 6.1983, time 124.95ms
iter 249810: loss 6.2799, time 125.18ms
iter 249820: loss 5.7723, time 128.49ms
iter 249830: loss 6.4135, time 125.21ms
iter 249840: loss 5.8292, time 125.18ms
iter 249850: loss 6.8862, time 125.09ms
iter 249860: loss 6.0791, time 125.22ms
iter 249870: loss 6.3236, time 124.96ms
iter 249880: loss 6.7008, time 124.98ms
iter 249890: loss 6.6477, time 125.22ms
iter 249900: loss 6.9697, time 125.18ms
iter 249910: loss 6.0577, time 125.38ms
iter 249920: loss 6.2530, time 125.61ms
iter 249930: loss 6.5611, time 128.17ms
iter 249940: loss 5.2248, time 125.30ms
iter 249950: loss 6.4017, time 125.05ms
iter 249960: loss 5.8895, time 125.46ms
iter 249970: loss 6.1289, time 124.60ms
iter 249980: loss 5.9884, time 125.00ms
iter 249990: loss 5.8069, time 124.93ms
step 250000: train loss 5.7160, val loss 5.7643
saving checkpoint to out-shakespeare-char
iter 250000: loss 5.6139, time 2900.14ms
iter 250010: loss 5.4835, time 125.42ms
iter 250020: loss 6.2053, time 125.57ms
iter 250030: loss 6.0570, time 126.11ms
iter 250040: loss 6.3946, time 125.28ms
iter 250050: loss 6.1415, time 124.97ms
iter 250060: loss 6.2533, time 125.29ms
iter 250070: loss 5.7409, time 128.04ms
iter 250080: loss 6.1841, time 125.42ms
iter 250090: loss 6.2237, time 125.12ms
iter 250100: loss 6.4053, time 125.22ms
iter 250110: loss 6.2524, time 125.24ms
iter 250120: loss 5.8423, time 125.04ms
iter 250130: loss 6.2252, time 125.21ms
iter 250140: loss 6.2409, time 125.20ms
iter 250150: loss 5.4646, time 124.55ms
iter 250160: loss 6.7952, time 125.67ms
iter 250170: loss 5.8319, time 125.67ms
iter 250180: loss 5.9611, time 128.28ms
iter 250190: loss 6.8150, time 125.57ms
iter 250200: loss 6.3284, time 125.58ms
iter 250210: loss 6.4717, time 125.93ms
iter 250220: loss 6.3108, time 125.48ms
iter 250230: loss 6.2254, time 125.72ms
iter 250240: loss 5.9427, time 125.65ms
step 250250: train loss 5.7209, val loss 5.7275
saving checkpoint to out-shakespeare-char
iter 250250: loss 6.0162, time 2885.48ms
iter 250260: loss 6.3299, time 125.59ms
iter 250270: loss 6.5042, time 125.24ms
iter 250280: loss 6.6177, time 125.15ms
iter 250290: loss 5.8832, time 125.36ms
iter 250300: loss 6.4994, time 125.91ms
iter 250310: loss 5.2306, time 121.37ms
iter 250320: loss 6.4755, time 121.38ms
iter 250330: loss 5.8679, time 121.71ms
iter 250340: loss 6.1165, time 121.43ms
iter 250350: loss 6.6406, time 120.75ms
iter 250360: loss 5.2633, time 121.25ms
iter 250370: loss 5.8661, time 121.49ms
iter 250380: loss 5.9381, time 121.29ms
iter 250390: loss 5.9844, time 121.59ms
iter 250400: loss 5.7514, time 121.53ms
iter 250410: loss 6.0823, time 122.10ms
iter 250420: loss 5.9551, time 121.56ms
iter 250430: loss 6.1961, time 121.53ms
iter 250440: loss 6.0231, time 121.53ms
iter 250450: loss 6.1836, time 121.67ms
iter 250460: loss 6.1257, time 121.25ms
iter 250470: loss 6.2496, time 121.48ms
iter 250480: loss 6.2460, time 120.87ms
iter 250490: loss 5.9177, time 121.50ms
step 250500: train loss 5.7329, val loss 5.7345
saving checkpoint to out-shakespeare-char
iter 250500: loss 6.1446, time 2882.02ms
iter 250510: loss 6.3775, time 121.94ms
iter 250520: loss 5.2964, time 124.38ms
iter 250530: loss 6.6584, time 121.27ms
iter 250540: loss 5.6824, time 124.67ms
iter 250550: loss 5.7659, time 121.64ms
iter 250560: loss 5.8937, time 124.53ms
iter 250570: loss 6.3864, time 122.04ms
iter 250580: loss 6.4211, time 123.45ms
iter 250590: loss 5.6660, time 120.72ms
iter 250600: loss 5.4797, time 124.58ms
iter 250610: loss 6.6279, time 121.50ms
iter 250620: loss 6.1458, time 124.59ms
iter 250630: loss 6.0524, time 121.49ms
iter 250640: loss 5.9415, time 124.67ms
iter 250650: loss 6.6516, time 121.72ms
iter 250660: loss 5.8271, time 124.56ms
iter 250670: loss 6.3960, time 121.66ms
iter 250680: loss 6.2850, time 123.86ms
iter 250690: loss 6.5992, time 122.62ms
iter 250700: loss 6.2129, time 124.45ms
iter 250710: loss 6.2203, time 121.36ms
iter 250720: loss 6.1847, time 124.51ms
iter 250730: loss 5.8678, time 121.69ms
iter 250740: loss 6.3137, time 124.78ms
step 250750: train loss 5.6565, val loss 5.7447
saving checkpoint to out-shakespeare-char
iter 250750: loss 6.2790, time 2896.92ms
iter 250760: loss 6.7948, time 121.57ms
iter 250770: loss 6.2202, time 124.56ms
iter 250780: loss 6.0313, time 121.58ms
iter 250790: loss 6.0231, time 124.36ms
iter 250800: loss 5.6086, time 121.64ms
iter 250810: loss 7.3100, time 124.39ms
iter 250820: loss 5.6967, time 121.85ms
iter 250830: loss 6.2053, time 124.52ms
iter 250840: loss 5.5783, time 121.57ms
iter 250850: loss 6.6712, time 124.38ms
iter 250860: loss 6.5740, time 122.00ms
iter 250870: loss 5.8412, time 124.47ms
iter 250880: loss 6.2558, time 121.76ms
iter 250890: loss 6.0840, time 124.11ms
iter 250900: loss 6.0923, time 121.64ms
iter 250910: loss 6.4767, time 124.39ms
iter 250920: loss 6.5477, time 121.69ms
iter 250930: loss 6.3747, time 124.48ms
iter 250940: loss 6.1028, time 121.65ms
iter 250950: loss 6.5942, time 123.98ms
iter 250960: loss 6.4430, time 120.48ms
iter 250970: loss 6.6936, time 124.35ms
iter 250980: loss 6.0469, time 121.49ms
iter 250990: loss 6.5892, time 124.31ms
step 251000: train loss 5.7437, val loss 5.7407
saving checkpoint to out-shakespeare-char
iter 251000: loss 5.9300, time 2886.97ms
iter 251010: loss 6.0853, time 125.50ms
iter 251020: loss 5.5211, time 128.04ms
iter 251030: loss 6.3018, time 125.57ms
iter 251040: loss 5.9687, time 128.49ms
iter 251050: loss 6.2380, time 125.19ms
iter 251060: loss 5.9605, time 125.72ms
iter 251070: loss 6.9779, time 125.49ms
iter 251080: loss 6.4562, time 128.80ms
iter 251090: loss 6.6736, time 125.76ms
iter 251100: loss 5.8812, time 125.55ms
iter 251110: loss 6.0512, time 125.28ms
iter 251120: loss 6.5977, time 124.22ms
iter 251130: loss 6.4268, time 125.15ms
iter 251140: loss 6.5737, time 125.68ms
iter 251150: loss 6.4799, time 125.32ms
iter 251160: loss 5.5825, time 128.58ms
iter 251170: loss 5.7959, time 125.31ms
iter 251180: loss 5.8042, time 125.21ms
iter 251190: loss 6.4137, time 125.49ms
iter 251200: loss 5.9870, time 125.29ms
iter 251210: loss 6.3083, time 125.59ms
iter 251220: loss 5.7393, time 125.65ms
iter 251230: loss 6.1223, time 125.55ms
iter 251240: loss 6.6707, time 125.67ms
step 251250: train loss 5.7383, val loss 5.7425
saving checkpoint to out-shakespeare-char
iter 251250: loss 5.9824, time 2901.10ms
iter 251260: loss 5.9356, time 125.39ms
iter 251270: loss 6.0383, time 125.07ms
iter 251280: loss 6.6360, time 125.55ms
iter 251290: loss 5.6662, time 128.21ms
iter 251300: loss 5.2692, time 125.26ms
iter 251310: loss 5.6706, time 125.28ms
iter 251320: loss 6.2561, time 125.33ms
iter 251330: loss 6.2922, time 125.84ms
iter 251340: loss 5.7474, time 125.59ms
iter 251350: loss 6.1410, time 125.59ms
iter 251360: loss 6.7186, time 125.47ms
iter 251370: loss 6.8326, time 125.66ms
iter 251380: loss 6.1650, time 125.73ms
iter 251390: loss 6.3490, time 125.80ms
iter 251400: loss 6.1274, time 124.37ms
iter 251410: loss 6.1696, time 125.30ms
iter 251420: loss 6.0407, time 125.39ms
iter 251430: loss 6.1582, time 125.38ms
iter 251440: loss 6.2218, time 128.04ms
iter 251450: loss 6.0654, time 125.38ms
iter 251460: loss 6.1168, time 125.81ms
iter 251470: loss 6.3291, time 125.84ms
iter 251480: loss 6.1591, time 124.51ms
iter 251490: loss 5.8129, time 126.94ms
step 251500: train loss 5.7215, val loss 5.7620
saving checkpoint to out-shakespeare-char
iter 251500: loss 6.2755, time 2897.93ms
iter 251510: loss 6.8221, time 125.91ms
iter 251520: loss 5.9347, time 126.93ms
iter 251530: loss 6.1158, time 126.10ms
iter 251540: loss 5.7155, time 125.51ms
iter 251550: loss 5.6347, time 125.32ms
iter 251560: loss 6.8626, time 125.44ms
iter 251570: loss 6.0289, time 125.22ms
iter 251580: loss 6.0221, time 124.55ms
iter 251590: loss 6.1792, time 124.70ms
iter 251600: loss 6.2085, time 125.68ms
iter 251610: loss 5.8825, time 125.39ms
iter 251620: loss 6.4351, time 125.64ms
iter 251630: loss 6.0306, time 125.44ms
iter 251640: loss 6.5845, time 125.44ms
iter 251650: loss 6.1787, time 128.38ms
iter 251660: loss 6.5115, time 125.23ms
iter 251670: loss 5.8814, time 125.85ms
iter 251680: loss 5.8039, time 125.85ms
iter 251690: loss 6.1905, time 125.42ms
iter 251700: loss 5.8589, time 125.19ms
iter 251710: loss 6.3316, time 126.08ms
iter 251720: loss 6.3379, time 125.60ms
iter 251730: loss 5.4496, time 125.82ms
iter 251740: loss 6.0752, time 125.46ms
step 251750: train loss 5.7990, val loss 5.7345
saving checkpoint to out-shakespeare-char
iter 251750: loss 5.9475, time 2905.20ms
iter 251760: loss 5.7529, time 125.33ms
iter 251770: loss 5.8722, time 124.64ms
iter 251780: loss 5.7059, time 125.24ms
iter 251790: loss 6.5661, time 125.32ms
iter 251800: loss 6.0356, time 125.29ms
iter 251810: loss 6.3703, time 128.22ms
iter 251820: loss 5.9474, time 125.03ms
iter 251830: loss 5.4690, time 125.44ms
iter 251840: loss 6.1229, time 124.35ms
iter 251850: loss 6.5443, time 128.23ms
iter 251860: loss 5.9497, time 125.24ms
iter 251870: loss 6.1389, time 125.54ms
iter 251880: loss 5.7984, time 123.72ms
iter 251890: loss 6.0493, time 124.72ms
iter 251900: loss 5.9480, time 123.37ms
iter 251910: loss 5.9087, time 125.49ms
iter 251920: loss 5.8912, time 125.17ms
iter 251930: loss 6.1061, time 125.65ms
iter 251940: loss 5.5559, time 125.22ms
iter 251950: loss 5.9763, time 128.14ms
iter 251960: loss 6.7605, time 125.22ms
iter 251970: loss 6.1355, time 125.26ms
iter 251980: loss 6.6883, time 125.22ms
iter 251990: loss 6.2998, time 125.29ms
step 252000: train loss 5.7057, val loss 5.7277
saving checkpoint to out-shakespeare-char
iter 252000: loss 6.4560, time 2887.87ms
iter 252010: loss 6.6136, time 125.23ms
iter 252020: loss 5.6134, time 124.91ms
iter 252030: loss 5.2269, time 128.18ms
iter 252040: loss 6.3833, time 125.24ms
iter 252050: loss 5.6073, time 125.24ms
iter 252060: loss 6.2449, time 125.86ms
iter 252070: loss 6.6095, time 125.26ms
iter 252080: loss 6.2677, time 126.81ms
iter 252090: loss 6.7754, time 125.97ms
iter 252100: loss 6.3234, time 125.25ms
iter 252110: loss 6.5459, time 125.24ms
iter 252120: loss 5.7024, time 125.42ms
iter 252130: loss 5.5617, time 124.75ms
iter 252140: loss 5.7491, time 128.25ms
iter 252150: loss 6.4793, time 125.68ms
iter 252160: loss 6.2905, time 125.84ms
iter 252170: loss 6.2281, time 125.78ms
iter 252180: loss 6.6534, time 125.57ms
iter 252190: loss 5.3816, time 125.57ms
iter 252200: loss 5.8557, time 125.72ms
iter 252210: loss 6.5097, time 125.31ms
iter 252220: loss 6.0077, time 125.78ms
iter 252230: loss 6.5120, time 125.58ms
iter 252240: loss 6.2405, time 125.59ms
step 252250: train loss 5.7082, val loss 5.7038
saving checkpoint to out-shakespeare-char
iter 252250: loss 5.7772, time 2905.20ms
iter 252260: loss 5.7689, time 125.41ms
iter 252270: loss 6.1416, time 125.24ms
iter 252280: loss 6.0058, time 125.33ms
iter 252290: loss 6.1674, time 125.10ms
iter 252300: loss 6.4623, time 125.44ms
iter 252310: loss 6.4761, time 125.27ms
iter 252320: loss 5.9337, time 125.24ms
iter 252330: loss 5.9550, time 125.33ms
iter 252340: loss 5.8449, time 127.99ms
iter 252350: loss 6.4421, time 125.17ms
iter 252360: loss 6.3408, time 125.55ms
iter 252370: loss 5.6855, time 125.86ms
iter 252380: loss 6.7459, time 124.57ms
iter 252390: loss 5.8870, time 125.61ms
iter 252400: loss 5.5991, time 125.40ms
iter 252410: loss 6.1444, time 125.52ms
iter 252420: loss 6.8766, time 125.47ms
iter 252430: loss 6.5966, time 125.23ms
iter 252440: loss 5.3728, time 125.86ms
iter 252450: loss 5.0393, time 124.32ms
iter 252460: loss 6.8990, time 125.46ms
iter 252470: loss 6.0951, time 125.44ms
iter 252480: loss 5.8256, time 125.44ms
iter 252490: loss 6.8849, time 128.55ms
step 252500: train loss 5.7412, val loss 5.6947
saving checkpoint to out-shakespeare-char
iter 252500: loss 6.3558, time 2888.48ms
iter 252510: loss 5.2190, time 125.62ms
iter 252520: loss 6.4864, time 122.81ms
iter 252530: loss 6.2222, time 120.89ms
iter 252540: loss 6.0147, time 123.94ms
iter 252550: loss 5.8648, time 120.49ms
iter 252560: loss 6.3925, time 121.45ms
iter 252570: loss 5.9778, time 121.65ms
iter 252580: loss 6.6387, time 121.66ms
iter 252590: loss 6.2902, time 121.92ms
iter 252600: loss 5.9721, time 121.74ms
iter 252610: loss 6.3457, time 121.21ms
iter 252620: loss 6.1552, time 121.90ms
iter 252630: loss 6.2259, time 121.64ms
iter 252640: loss 6.1917, time 122.36ms
iter 252650: loss 6.9312, time 121.53ms
iter 252660: loss 6.2395, time 121.45ms
iter 252670: loss 6.0838, time 121.68ms
iter 252680: loss 5.8776, time 120.85ms
iter 252690: loss 6.2514, time 122.81ms
iter 252700: loss 6.2744, time 120.94ms
iter 252710: loss 5.7415, time 121.21ms
iter 252720: loss 5.3493, time 120.77ms
iter 252730: loss 6.1264, time 120.91ms
iter 252740: loss 5.9727, time 121.71ms
step 252750: train loss 5.6811, val loss 5.6821
saving checkpoint to out-shakespeare-char
iter 252750: loss 6.0111, time 2897.89ms
iter 252760: loss 6.1040, time 124.76ms
iter 252770: loss 6.0984, time 121.75ms
iter 252780: loss 5.8740, time 124.50ms
iter 252790: loss 6.6481, time 122.06ms
iter 252800: loss 6.0516, time 123.86ms
iter 252810: loss 6.3050, time 121.66ms
iter 252820: loss 5.6861, time 124.08ms
iter 252830: loss 5.9317, time 121.72ms
iter 252840: loss 6.2989, time 124.44ms
iter 252850: loss 5.3468, time 121.70ms
iter 252860: loss 5.7063, time 124.50ms
iter 252870: loss 6.3784, time 122.10ms
iter 252880: loss 5.2063, time 124.39ms
iter 252890: loss 5.7926, time 122.05ms
iter 252900: loss 6.0079, time 124.02ms
iter 252910: loss 5.7339, time 121.62ms
iter 252920: loss 6.1780, time 124.42ms
iter 252930: loss 6.1601, time 121.65ms
iter 252940: loss 5.5823, time 124.54ms
iter 252950: loss 6.9293, time 121.68ms
iter 252960: loss 5.7465, time 124.40ms
iter 252970: loss 5.7539, time 121.17ms
iter 252980: loss 6.2982, time 124.92ms
iter 252990: loss 6.4475, time 122.50ms
step 253000: train loss 5.6973, val loss 5.7237
saving checkpoint to out-shakespeare-char
iter 253000: loss 6.9169, time 2896.82ms
iter 253010: loss 6.4239, time 122.92ms
iter 253020: loss 6.9454, time 121.40ms
iter 253030: loss 5.2875, time 122.57ms
iter 253040: loss 6.2750, time 121.46ms
iter 253050: loss 5.9282, time 122.68ms
iter 253060: loss 5.8259, time 121.67ms
iter 253070: loss 5.8030, time 122.63ms
iter 253080: loss 6.6496, time 121.40ms
iter 253090: loss 6.2976, time 122.42ms
iter 253100: loss 6.2695, time 121.57ms
iter 253110: loss 5.9532, time 122.33ms
iter 253120: loss 5.7477, time 121.51ms
iter 253130: loss 5.9991, time 123.29ms
iter 253140: loss 5.8362, time 121.67ms
iter 253150: loss 6.3853, time 123.12ms
iter 253160: loss 6.6975, time 121.61ms
iter 253170: loss 6.6982, time 122.72ms
iter 253180: loss 6.1238, time 121.72ms
iter 253190: loss 6.3420, time 122.42ms
iter 253200: loss 6.1750, time 120.89ms
iter 253210: loss 5.9368, time 122.73ms
iter 253220: loss 6.4194, time 122.05ms
iter 253230: loss 6.6699, time 122.73ms
iter 253240: loss 6.1420, time 121.92ms
step 253250: train loss 5.7046, val loss 5.7675
saving checkpoint to out-shakespeare-char
iter 253250: loss 5.7300, time 2899.30ms
iter 253260: loss 6.3550, time 121.77ms
iter 253270: loss 5.9952, time 121.93ms
iter 253280: loss 5.8493, time 122.04ms
iter 253290: loss 6.6802, time 121.63ms
iter 253300: loss 6.5107, time 121.06ms
iter 253310: loss 6.9214, time 121.75ms
iter 253320: loss 6.7440, time 121.42ms
iter 253330: loss 6.3815, time 121.52ms
iter 253340: loss 5.8320, time 121.93ms
iter 253350: loss 6.0670, time 121.81ms
iter 253360: loss 6.0153, time 121.61ms
iter 253370: loss 6.9727, time 121.45ms
iter 253380: loss 6.5539, time 122.02ms
iter 253390: loss 6.4783, time 121.93ms
iter 253400: loss 6.1699, time 121.58ms
iter 253410: loss 6.2864, time 121.77ms
iter 253420: loss 6.1525, time 121.51ms
iter 253430: loss 6.3475, time 121.53ms
iter 253440: loss 5.6646, time 121.44ms
iter 253450: loss 6.3389, time 121.48ms
iter 253460: loss 5.4858, time 121.60ms
iter 253470: loss 5.9087, time 121.60ms
iter 253480: loss 5.9514, time 121.24ms
iter 253490: loss 5.6025, time 121.54ms
step 253500: train loss 5.7595, val loss 5.7244
saving checkpoint to out-shakespeare-char
iter 253500: loss 5.4394, time 2886.39ms
iter 253510: loss 6.8162, time 121.67ms
iter 253520: loss 5.8015, time 122.83ms
iter 253530: loss 6.3503, time 121.62ms
iter 253540: loss 6.1634, time 122.57ms
iter 253550: loss 5.8844, time 121.89ms
iter 253560: loss 7.2550, time 122.87ms
iter 253570: loss 5.9308, time 121.72ms
iter 253580: loss 6.3152, time 122.72ms
iter 253590: loss 6.7010, time 121.51ms
iter 253600: loss 5.9957, time 122.74ms
iter 253610: loss 5.5503, time 121.78ms
iter 253620: loss 5.9715, time 122.73ms
iter 253630: loss 5.9510, time 122.18ms
iter 253640: loss 6.0769, time 122.95ms
iter 253650: loss 6.6061, time 121.76ms
iter 253660: loss 6.0268, time 122.95ms
iter 253670: loss 6.0846, time 121.70ms
iter 253680: loss 6.0923, time 122.76ms
iter 253690: loss 6.2064, time 121.55ms
iter 253700: loss 5.6506, time 122.71ms
iter 253710: loss 7.1762, time 121.54ms
iter 253720: loss 5.8525, time 122.60ms
iter 253730: loss 5.7998, time 121.43ms
iter 253740: loss 6.3533, time 123.76ms
step 253750: train loss 5.7350, val loss 5.7037
saving checkpoint to out-shakespeare-char
iter 253750: loss 6.5227, time 2890.03ms
iter 253760: loss 6.1559, time 124.74ms
iter 253770: loss 6.5295, time 122.22ms
iter 253780: loss 6.6666, time 124.64ms
iter 253790: loss 5.4869, time 121.46ms
iter 253800: loss 6.0499, time 124.42ms
iter 253810: loss 6.2819, time 121.61ms
iter 253820: loss 5.9531, time 124.17ms
iter 253830: loss 5.7643, time 121.78ms
iter 253840: loss 7.2134, time 124.32ms
iter 253850: loss 6.1617, time 121.43ms
iter 253860: loss 6.0225, time 124.33ms
iter 253870: loss 5.4441, time 121.43ms
iter 253880: loss 6.0579, time 124.14ms
iter 253890: loss 6.5650, time 121.52ms
iter 253900: loss 6.0279, time 124.19ms
iter 253910: loss 5.8254, time 121.28ms
iter 253920: loss 6.3688, time 124.24ms
iter 253930: loss 6.5026, time 121.49ms
iter 253940: loss 6.5292, time 123.55ms
iter 253950: loss 6.8314, time 121.40ms
iter 253960: loss 6.6874, time 124.20ms
iter 253970: loss 6.1825, time 121.63ms
iter 253980: loss 5.9660, time 124.23ms
iter 253990: loss 5.7989, time 121.54ms
step 254000: train loss 5.7323, val loss 5.7365
saving checkpoint to out-shakespeare-char
iter 254000: loss 6.3860, time 2890.14ms
iter 254010: loss 7.0517, time 121.46ms
iter 254020: loss 6.2791, time 120.52ms
iter 254030: loss 5.4818, time 122.64ms
iter 254040: loss 6.5147, time 121.43ms
iter 254050: loss 6.9644, time 122.46ms
iter 254060: loss 5.9019, time 121.52ms
iter 254070: loss 5.6488, time 122.46ms
iter 254080: loss 6.6785, time 121.92ms
iter 254090: loss 6.1295, time 121.36ms
iter 254100: loss 5.5586, time 120.85ms
iter 254110: loss 6.1849, time 121.46ms
iter 254120: loss 6.1340, time 121.51ms
iter 254130: loss 6.2629, time 121.53ms
iter 254140: loss 5.8445, time 122.07ms
iter 254150: loss 7.0521, time 120.51ms
iter 254160: loss 5.4887, time 121.47ms
iter 254170: loss 6.0355, time 121.31ms
iter 254180: loss 5.6246, time 121.62ms
iter 254190: loss 5.9849, time 121.79ms
iter 254200: loss 6.2298, time 121.77ms
iter 254210: loss 6.0538, time 121.23ms
iter 254220: loss 5.5550, time 121.48ms
iter 254230: loss 6.2427, time 121.44ms
iter 254240: loss 6.7999, time 121.52ms
step 254250: train loss 5.7073, val loss 5.6849
saving checkpoint to out-shakespeare-char
iter 254250: loss 7.1507, time 2889.07ms
iter 254260: loss 6.1702, time 121.30ms
iter 254270: loss 5.8547, time 124.37ms
iter 254280: loss 6.3015, time 121.63ms
iter 254290: loss 5.1753, time 124.23ms
iter 254300: loss 6.3454, time 121.63ms
iter 254310: loss 5.8344, time 124.43ms
iter 254320: loss 6.0101, time 121.46ms
iter 254330: loss 5.5334, time 121.33ms
iter 254340: loss 5.9053, time 121.23ms
iter 254350: loss 5.8972, time 121.51ms
iter 254360: loss 6.5164, time 122.15ms
iter 254370: loss 5.7104, time 121.53ms
iter 254380: loss 5.7825, time 122.56ms
iter 254390: loss 5.4767, time 121.30ms
iter 254400: loss 5.4454, time 121.30ms
iter 254410: loss 5.5697, time 121.43ms
iter 254420: loss 6.2655, time 121.45ms
iter 254430: loss 6.4744, time 121.65ms
iter 254440: loss 6.1814, time 121.86ms
iter 254450: loss 6.3282, time 121.39ms
iter 254460: loss 5.8466, time 121.91ms
iter 254470: loss 6.3097, time 121.37ms
iter 254480: loss 6.1849, time 121.55ms
iter 254490: loss 6.0182, time 121.72ms
step 254500: train loss 5.7573, val loss 5.7153
saving checkpoint to out-shakespeare-char
iter 254500: loss 6.2465, time 2876.26ms
iter 254510: loss 5.9351, time 121.61ms
iter 254520: loss 5.9358, time 121.89ms
iter 254530: loss 6.4586, time 121.66ms
iter 254540: loss 6.5885, time 122.95ms
iter 254550: loss 6.5922, time 121.66ms
iter 254560: loss 6.2249, time 121.84ms
iter 254570: loss 6.3654, time 121.96ms
iter 254580: loss 5.9875, time 122.01ms
iter 254590: loss 5.4739, time 121.80ms
iter 254600: loss 6.7050, time 121.85ms
iter 254610: loss 7.0318, time 121.99ms
iter 254620: loss 6.5397, time 122.16ms
iter 254630: loss 6.2355, time 121.52ms
iter 254640: loss 6.6467, time 121.77ms
iter 254650: loss 5.8008, time 121.71ms
iter 254660: loss 6.3484, time 122.16ms
iter 254670: loss 6.3910, time 121.64ms
iter 254680: loss 6.8059, time 121.90ms
iter 254690: loss 6.7130, time 122.34ms
iter 254700: loss 6.2119, time 121.80ms
iter 254710: loss 6.3525, time 121.79ms
iter 254720: loss 5.5557, time 121.12ms
iter 254730: loss 7.2149, time 121.66ms
iter 254740: loss 5.9466, time 122.00ms
step 254750: train loss 5.7436, val loss 5.6572
saving checkpoint to out-shakespeare-char
iter 254750: loss 6.9197, time 2895.94ms
iter 254760: loss 6.2354, time 121.55ms
iter 254770: loss 6.7498, time 120.44ms
iter 254780: loss 5.8439, time 119.71ms
iter 254790: loss 5.6608, time 121.39ms
iter 254800: loss 6.3875, time 121.88ms
iter 254810: loss 6.5273, time 121.19ms
iter 254820: loss 6.5144, time 120.42ms
iter 254830: loss 6.0609, time 121.88ms
iter 254840: loss 6.0351, time 121.32ms
iter 254850: loss 5.8626, time 121.19ms
iter 254860: loss 7.1317, time 121.37ms
iter 254870: loss 6.0115, time 121.36ms
iter 254880: loss 6.4349, time 121.38ms
iter 254890: loss 6.4828, time 121.15ms
iter 254900: loss 6.7839, time 121.39ms
iter 254910: loss 5.7436, time 121.35ms
iter 254920: loss 6.5999, time 121.97ms
iter 254930: loss 6.3772, time 122.22ms
iter 254940: loss 6.5255, time 122.26ms
iter 254950: loss 6.4789, time 121.49ms
iter 254960: loss 6.2803, time 121.60ms
iter 254970: loss 6.4491, time 121.35ms
iter 254980: loss 6.1794, time 121.46ms
iter 254990: loss 6.1570, time 121.52ms
step 255000: train loss 5.6637, val loss 5.7227
saving checkpoint to out-shakespeare-char
iter 255000: loss 5.8447, time 2885.35ms
iter 255010: loss 6.3753, time 120.94ms
iter 255020: loss 5.8155, time 121.90ms
iter 255030: loss 5.7331, time 120.74ms
iter 255040: loss 5.6229, time 121.52ms
iter 255050: loss 6.0963, time 121.90ms
iter 255060: loss 6.0956, time 121.56ms
iter 255070: loss 6.5794, time 121.59ms
iter 255080: loss 5.9226, time 121.45ms
iter 255090: loss 5.9630, time 121.51ms
iter 255100: loss 6.0956, time 121.66ms
iter 255110: loss 6.5013, time 121.58ms
iter 255120: loss 6.1825, time 121.28ms
iter 255130: loss 6.6259, time 121.61ms
iter 255140: loss 6.6398, time 121.50ms
iter 255150: loss 5.6315, time 121.92ms
iter 255160: loss 6.4209, time 121.59ms
iter 255170: loss 5.5182, time 121.58ms
iter 255180: loss 6.5875, time 121.68ms
iter 255190: loss 5.6798, time 121.60ms
iter 255200: loss 6.5116, time 121.89ms
iter 255210: loss 5.7850, time 121.73ms
iter 255220: loss 5.9528, time 121.54ms
iter 255230: loss 5.9102, time 122.50ms
iter 255240: loss 5.6042, time 121.77ms
step 255250: train loss 5.6974, val loss 5.7277
saving checkpoint to out-shakespeare-char
iter 255250: loss 5.8799, time 2906.14ms
iter 255260: loss 6.2091, time 122.87ms
iter 255270: loss 5.7137, time 121.78ms
iter 255280: loss 5.4243, time 122.59ms
iter 255290: loss 5.4394, time 121.68ms
iter 255300: loss 5.9820, time 122.69ms
iter 255310: loss 5.9673, time 121.74ms
iter 255320: loss 6.8548, time 122.24ms
iter 255330: loss 6.4322, time 121.78ms
iter 255340: loss 6.7883, time 122.73ms
iter 255350: loss 6.7437, time 121.58ms
iter 255360: loss 6.1993, time 122.68ms
iter 255370: loss 6.5605, time 121.50ms
iter 255380: loss 5.2594, time 122.54ms
iter 255390: loss 5.9610, time 121.94ms
iter 255400: loss 5.8901, time 122.73ms
iter 255410: loss 6.0028, time 121.56ms
iter 255420: loss 6.5411, time 122.71ms
iter 255430: loss 6.1428, time 121.55ms
iter 255440: loss 5.9881, time 122.68ms
iter 255450: loss 6.5179, time 121.45ms
iter 255460: loss 5.8459, time 122.73ms
iter 255470: loss 6.7208, time 121.51ms
iter 255480: loss 5.9853, time 122.53ms
iter 255490: loss 6.4553, time 121.49ms
step 255500: train loss 5.7193, val loss 5.6823
saving checkpoint to out-shakespeare-char
iter 255500: loss 5.9930, time 2890.34ms
iter 255510: loss 6.3566, time 121.27ms
iter 255520: loss 6.1363, time 121.48ms
iter 255530: loss 5.8341, time 122.57ms
iter 255540: loss 6.0263, time 121.28ms
iter 255550: loss 6.7224, time 121.22ms
iter 255560: loss 6.6652, time 121.55ms
iter 255570: loss 5.9583, time 122.51ms
iter 255580: loss 6.8840, time 121.56ms
iter 255590: loss 7.0261, time 121.16ms
iter 255600: loss 6.7074, time 121.57ms
iter 255610: loss 5.5230, time 121.46ms
iter 255620: loss 6.1222, time 121.47ms
iter 255630: loss 5.7240, time 121.47ms
iter 255640: loss 6.0735, time 121.47ms
iter 255650: loss 5.2249, time 121.43ms
iter 255660: loss 6.3605, time 121.51ms
iter 255670: loss 6.2292, time 121.45ms
iter 255680: loss 6.3483, time 121.48ms
iter 255690: loss 5.9279, time 121.49ms
iter 255700: loss 6.6200, time 121.71ms
iter 255710: loss 5.5688, time 122.00ms
iter 255720: loss 6.1184, time 121.54ms
iter 255730: loss 6.0802, time 121.74ms
iter 255740: loss 6.1511, time 121.71ms
step 255750: train loss 5.7191, val loss 5.7536
saving checkpoint to out-shakespeare-char
iter 255750: loss 6.1464, time 2890.04ms
iter 255760: loss 6.2364, time 120.71ms
iter 255770: loss 6.2530, time 121.36ms
iter 255780: loss 6.3712, time 121.29ms
iter 255790: loss 6.0614, time 121.47ms
iter 255800: loss 6.0376, time 121.47ms
iter 255810: loss 6.2844, time 120.94ms
iter 255820: loss 5.7957, time 121.42ms
iter 255830: loss 5.9827, time 121.38ms
iter 255840: loss 5.9800, time 121.38ms
iter 255850: loss 6.4810, time 121.18ms
iter 255860: loss 6.3296, time 121.40ms
iter 255870: loss 6.4993, time 120.79ms
iter 255880: loss 5.7251, time 121.31ms
iter 255890: loss 5.5440, time 121.30ms
iter 255900: loss 5.2225, time 121.39ms
iter 255910: loss 6.3118, time 121.58ms
iter 255920: loss 6.4566, time 121.30ms
iter 255930: loss 6.1542, time 121.46ms
iter 255940: loss 6.7008, time 121.20ms
iter 255950: loss 6.9495, time 121.97ms
iter 255960: loss 5.5111, time 121.56ms
iter 255970: loss 6.2030, time 121.29ms
iter 255980: loss 5.8910, time 121.30ms
iter 255990: loss 5.9126, time 121.30ms
step 256000: train loss 5.6796, val loss 5.6606
saving checkpoint to out-shakespeare-char
iter 256000: loss 5.6914, time 2885.03ms
iter 256010: loss 5.9831, time 122.23ms
iter 256020: loss 5.9459, time 121.86ms
iter 256030: loss 5.5192, time 123.27ms
iter 256040: loss 6.1377, time 121.71ms
iter 256050: loss 6.1311, time 122.13ms
iter 256060: loss 6.0895, time 120.92ms
iter 256070: loss 5.9815, time 122.93ms
iter 256080: loss 5.8241, time 121.91ms
iter 256090: loss 6.3774, time 122.89ms
iter 256100: loss 5.8810, time 121.32ms
iter 256110: loss 6.4903, time 122.56ms
iter 256120: loss 5.7511, time 121.46ms
iter 256130: loss 6.0742, time 123.12ms
iter 256140: loss 6.1193, time 121.94ms
iter 256150: loss 5.8709, time 122.99ms
iter 256160: loss 5.7046, time 121.74ms
iter 256170: loss 6.0033, time 123.31ms
iter 256180: loss 5.7922, time 121.64ms
iter 256190: loss 7.0149, time 123.44ms
iter 256200: loss 6.3255, time 122.12ms
iter 256210: loss 6.9209, time 123.35ms
iter 256220: loss 6.1071, time 122.24ms
iter 256230: loss 5.9715, time 122.98ms
iter 256240: loss 5.9428, time 121.76ms
step 256250: train loss 5.7130, val loss 5.7243
saving checkpoint to out-shakespeare-char
iter 256250: loss 6.1339, time 2901.19ms
iter 256260: loss 6.2596, time 121.85ms
iter 256270: loss 6.0830, time 121.72ms
iter 256280: loss 6.0013, time 121.69ms
iter 256290: loss 5.8235, time 121.67ms
iter 256300: loss 6.1935, time 121.99ms
iter 256310: loss 5.7878, time 122.14ms
iter 256320: loss 6.2211, time 121.72ms
iter 256330: loss 5.8729, time 121.83ms
iter 256340: loss 5.9981, time 121.62ms
iter 256350: loss 6.4819, time 121.92ms
iter 256360: loss 6.4364, time 121.95ms
iter 256370: loss 6.2053, time 121.98ms
iter 256380: loss 6.1799, time 125.54ms
iter 256390: loss 6.6072, time 125.98ms
iter 256400: loss 5.9340, time 125.99ms
iter 256410: loss 5.9154, time 125.61ms
iter 256420: loss 5.4685, time 120.64ms
iter 256430: loss 6.1655, time 120.82ms
iter 256440: loss 6.2392, time 121.22ms
iter 256450: loss 6.3022, time 121.49ms
iter 256460: loss 6.3563, time 122.42ms
iter 256470: loss 5.3455, time 121.28ms
iter 256480: loss 6.3796, time 121.31ms
iter 256490: loss 6.1074, time 121.43ms
step 256500: train loss 5.7225, val loss 5.6799
saving checkpoint to out-shakespeare-char
iter 256500: loss 5.8383, time 2888.48ms
iter 256510: loss 5.7406, time 121.68ms
iter 256520: loss 5.8388, time 124.79ms
iter 256530: loss 6.4232, time 122.02ms
iter 256540: loss 5.7140, time 124.11ms
iter 256550: loss 6.3860, time 121.88ms
iter 256560: loss 6.5853, time 123.91ms
iter 256570: loss 6.0265, time 121.74ms
iter 256580: loss 6.7134, time 124.89ms
iter 256590: loss 6.1641, time 121.04ms
iter 256600: loss 5.8880, time 125.02ms
iter 256610: loss 5.8596, time 121.42ms
iter 256620: loss 6.6141, time 124.24ms
iter 256630: loss 6.0058, time 122.36ms
iter 256640: loss 6.7534, time 124.73ms
iter 256650: loss 6.3654, time 121.75ms
iter 256660: loss 5.9321, time 124.48ms
iter 256670: loss 5.4182, time 122.18ms
iter 256680: loss 6.5919, time 124.77ms
iter 256690: loss 6.2197, time 122.06ms
iter 256700: loss 6.6455, time 124.96ms
iter 256710: loss 5.8422, time 122.05ms
iter 256720: loss 6.4543, time 124.94ms
iter 256730: loss 5.5539, time 121.85ms
iter 256740: loss 6.4329, time 124.90ms
step 256750: train loss 5.6531, val loss 5.7228
saving checkpoint to out-shakespeare-char
iter 256750: loss 5.2270, time 2894.68ms
iter 256760: loss 6.1473, time 124.07ms
iter 256770: loss 6.2230, time 124.21ms
iter 256780: loss 7.0116, time 125.11ms
iter 256790: loss 6.3361, time 125.25ms
iter 256800: loss 6.1836, time 126.93ms
iter 256810: loss 6.9658, time 126.22ms
iter 256820: loss 6.4360, time 125.27ms
iter 256830: loss 6.1876, time 122.53ms
iter 256840: loss 6.5706, time 120.35ms
iter 256850: loss 6.1859, time 121.79ms
iter 256860: loss 6.2365, time 121.64ms
iter 256870: loss 5.9016, time 122.75ms
iter 256880: loss 6.3737, time 121.22ms
iter 256890: loss 6.0750, time 122.69ms
iter 256900: loss 5.8081, time 121.42ms
iter 256910: loss 5.9391, time 122.60ms
iter 256920: loss 5.6282, time 122.54ms
iter 256930: loss 6.3441, time 122.65ms
iter 256940: loss 6.3499, time 121.55ms
iter 256950: loss 6.1902, time 121.62ms
iter 256960: loss 5.8117, time 121.43ms
iter 256970: loss 6.7009, time 121.38ms
iter 256980: loss 5.7392, time 121.95ms
iter 256990: loss 5.9135, time 121.43ms
step 257000: train loss 5.7197, val loss 5.6971
saving checkpoint to out-shakespeare-char
iter 257000: loss 6.1628, time 2889.01ms
iter 257010: loss 6.4782, time 121.01ms
iter 257020: loss 5.3535, time 121.14ms
iter 257030: loss 6.6991, time 122.15ms
iter 257040: loss 6.3026, time 122.31ms
iter 257050: loss 5.4175, time 121.49ms
iter 257060: loss 5.7768, time 121.12ms
iter 257070: loss 5.9518, time 121.30ms
iter 257080: loss 5.6936, time 121.25ms
iter 257090: loss 5.6739, time 121.40ms
iter 257100: loss 6.4735, time 122.21ms
iter 257110: loss 6.1910, time 121.48ms
iter 257120: loss 6.0739, time 120.43ms
iter 257130: loss 6.0374, time 121.49ms
iter 257140: loss 6.1154, time 121.80ms
iter 257150: loss 6.2608, time 121.34ms
iter 257160: loss 6.0520, time 121.12ms
iter 257170: loss 5.9852, time 121.46ms
iter 257180: loss 5.7652, time 121.55ms
iter 257190: loss 6.2293, time 121.48ms
iter 257200: loss 5.2513, time 121.26ms
iter 257210: loss 6.5363, time 121.40ms
iter 257220: loss 5.7181, time 121.34ms
iter 257230: loss 6.8237, time 121.47ms
iter 257240: loss 6.3594, time 121.45ms
step 257250: train loss 5.6575, val loss 5.6949
saving checkpoint to out-shakespeare-char
iter 257250: loss 6.1457, time 2885.34ms
iter 257260: loss 6.1102, time 121.61ms
iter 257270: loss 6.4584, time 121.42ms
iter 257280: loss 5.7515, time 122.46ms
iter 257290: loss 6.5226, time 121.51ms
iter 257300: loss 5.7806, time 123.66ms
iter 257310: loss 6.1245, time 121.44ms
iter 257320: loss 6.0600, time 122.45ms
iter 257330: loss 6.1305, time 121.43ms
iter 257340: loss 6.5380, time 122.59ms
iter 257350: loss 6.6895, time 121.48ms
iter 257360: loss 5.7262, time 121.57ms
iter 257370: loss 6.4072, time 121.42ms
iter 257380: loss 5.9446, time 123.08ms
iter 257390: loss 5.8661, time 121.57ms
iter 257400: loss 6.3020, time 122.77ms
iter 257410: loss 6.0479, time 121.38ms
iter 257420: loss 6.5890, time 122.57ms
iter 257430: loss 5.8869, time 121.49ms
iter 257440: loss 6.0218, time 122.69ms
iter 257450: loss 5.9582, time 121.50ms
iter 257460: loss 5.8860, time 122.52ms
iter 257470: loss 6.0054, time 121.55ms
iter 257480: loss 6.4267, time 122.50ms
iter 257490: loss 5.9367, time 121.36ms
step 257500: train loss 5.7039, val loss 5.7475
saving checkpoint to out-shakespeare-char
iter 257500: loss 5.5109, time 2896.51ms
iter 257510: loss 7.0660, time 121.36ms
iter 257520: loss 6.4232, time 124.01ms
iter 257530: loss 6.0724, time 121.44ms
iter 257540: loss 5.8523, time 124.24ms
iter 257550: loss 5.6160, time 121.44ms
iter 257560: loss 6.1786, time 124.19ms
iter 257570: loss 5.7712, time 122.34ms
iter 257580: loss 6.0250, time 124.35ms
iter 257590: loss 6.1614, time 121.26ms
iter 257600: loss 6.1128, time 124.17ms
iter 257610: loss 5.9365, time 121.46ms
iter 257620: loss 5.8459, time 124.49ms
iter 257630: loss 5.8015, time 121.46ms
iter 257640: loss 6.7096, time 124.62ms
iter 257650: loss 6.2715, time 121.21ms
iter 257660: loss 5.3914, time 125.04ms
iter 257670: loss 6.5678, time 121.34ms
iter 257680: loss 6.5619, time 124.34ms
iter 257690: loss 5.9427, time 121.79ms
iter 257700: loss 6.1597, time 124.21ms
iter 257710: loss 5.9843, time 121.57ms
iter 257720: loss 6.5585, time 124.30ms
iter 257730: loss 6.1052, time 121.39ms
iter 257740: loss 6.0288, time 124.52ms
step 257750: train loss 5.6774, val loss 5.6616
saving checkpoint to out-shakespeare-char
iter 257750: loss 6.0432, time 2890.26ms
iter 257760: loss 6.1480, time 121.40ms
iter 257770: loss 5.6608, time 121.74ms
iter 257780: loss 6.0318, time 121.91ms
iter 257790: loss 6.4258, time 121.69ms
iter 257800: loss 6.1828, time 121.88ms
iter 257810: loss 6.2673, time 121.64ms
iter 257820: loss 5.7536, time 121.87ms
iter 257830: loss 6.2526, time 121.24ms
iter 257840: loss 5.8515, time 121.91ms
iter 257850: loss 5.3773, time 121.52ms
iter 257860: loss 6.0035, time 121.31ms
iter 257870: loss 5.1889, time 121.78ms
iter 257880: loss 6.3210, time 121.88ms
iter 257890: loss 6.3005, time 121.93ms
iter 257900: loss 5.8324, time 121.80ms
iter 257910: loss 7.2444, time 121.62ms
iter 257920: loss 6.3152, time 122.12ms
iter 257930: loss 5.4149, time 121.78ms
iter 257940: loss 6.3140, time 122.01ms
iter 257950: loss 6.1834, time 121.72ms
iter 257960: loss 6.6012, time 122.43ms
iter 257970: loss 5.3749, time 121.57ms
iter 257980: loss 6.2712, time 121.97ms
iter 257990: loss 5.7707, time 122.16ms
step 258000: train loss 5.6892, val loss 5.7153
saving checkpoint to out-shakespeare-char
iter 258000: loss 6.7792, time 2900.35ms
iter 258010: loss 6.1060, time 122.96ms
iter 258020: loss 5.5725, time 121.64ms
iter 258030: loss 6.3822, time 121.69ms
iter 258040: loss 5.6617, time 121.54ms
iter 258050: loss 6.0698, time 121.62ms
iter 258060: loss 6.2041, time 121.47ms
iter 258070: loss 6.3762, time 121.51ms
iter 258080: loss 5.6037, time 121.62ms
iter 258090: loss 5.4454, time 121.85ms
iter 258100: loss 5.9315, time 121.78ms
iter 258110: loss 6.2917, time 121.63ms
iter 258120: loss 5.8358, time 121.00ms
iter 258130: loss 5.9683, time 121.72ms
iter 258140: loss 5.8743, time 121.85ms
iter 258150: loss 5.7081, time 121.90ms
iter 258160: loss 5.8903, time 121.73ms
iter 258170: loss 5.5729, time 121.71ms
iter 258180: loss 6.0424, time 121.97ms
iter 258190: loss 6.6172, time 121.22ms
iter 258200: loss 6.4409, time 121.68ms
iter 258210: loss 6.2916, time 121.99ms
iter 258220: loss 5.9059, time 121.68ms
iter 258230: loss 5.9381, time 121.60ms
iter 258240: loss 5.4895, time 121.53ms
step 258250: train loss 5.7367, val loss 5.7009
saving checkpoint to out-shakespeare-char
iter 258250: loss 6.4541, time 2891.18ms
iter 258260: loss 5.7102, time 121.48ms
iter 258270: loss 5.5508, time 121.41ms
iter 258280: loss 6.2686, time 120.39ms
iter 258290: loss 6.0957, time 119.95ms
iter 258300: loss 5.9407, time 121.08ms
iter 258310: loss 5.7458, time 122.63ms
iter 258320: loss 6.0863, time 121.65ms
iter 258330: loss 5.7259, time 121.48ms
iter 258340: loss 6.2835, time 121.54ms
iter 258350: loss 6.5552, time 121.30ms
iter 258360: loss 5.3580, time 121.68ms
iter 258370: loss 5.3786, time 121.47ms
iter 258380: loss 6.5055, time 121.58ms
iter 258390: loss 5.6416, time 121.58ms
iter 258400: loss 5.8295, time 121.61ms
iter 258410: loss 5.4194, time 121.46ms
iter 258420: loss 5.6961, time 122.63ms
iter 258430: loss 6.3283, time 121.52ms
iter 258440: loss 6.0935, time 121.65ms
iter 258450: loss 6.1406, time 122.75ms
iter 258460: loss 6.1773, time 121.76ms
iter 258470: loss 6.6350, time 121.73ms
iter 258480: loss 6.1003, time 121.99ms
iter 258490: loss 6.1570, time 121.81ms
step 258500: train loss 5.6939, val loss 5.7033
saving checkpoint to out-shakespeare-char
iter 258500: loss 5.6375, time 2925.95ms
iter 258510: loss 6.7538, time 120.01ms
iter 258520: loss 5.8918, time 124.02ms
iter 258530: loss 6.5326, time 119.76ms
iter 258540: loss 6.3707, time 122.81ms
iter 258550: loss 5.7161, time 119.99ms
iter 258560: loss 6.0799, time 122.91ms
iter 258570: loss 6.2738, time 120.14ms
iter 258580: loss 5.7653, time 122.96ms
iter 258590: loss 5.3372, time 120.02ms
iter 258600: loss 6.2995, time 122.89ms
iter 258610: loss 5.9920, time 119.65ms
iter 258620: loss 5.5400, time 123.01ms
iter 258630: loss 6.6002, time 121.63ms
iter 258640: loss 6.3797, time 123.86ms
iter 258650: loss 5.4294, time 119.65ms
iter 258660: loss 5.3028, time 122.10ms
iter 258670: loss 6.4499, time 119.40ms
iter 258680: loss 6.4007, time 122.22ms
iter 258690: loss 5.3891, time 119.51ms
iter 258700: loss 6.6241, time 122.43ms
iter 258710: loss 5.7028, time 119.58ms
iter 258720: loss 6.2689, time 122.43ms
iter 258730: loss 6.3708, time 119.42ms
iter 258740: loss 6.6743, time 122.36ms
step 258750: train loss 5.6805, val loss 5.7078
saving checkpoint to out-shakespeare-char
iter 258750: loss 5.5831, time 2885.51ms
iter 258760: loss 6.4362, time 127.21ms
iter 258770: loss 5.7249, time 125.85ms
iter 258780: loss 6.2724, time 126.21ms
iter 258790: loss 6.5001, time 125.70ms
iter 258800: loss 6.3758, time 128.58ms
iter 258810: loss 6.4613, time 125.91ms
iter 258820: loss 5.8991, time 125.66ms
iter 258830: loss 5.9743, time 125.80ms
iter 258840: loss 5.5390, time 126.04ms
iter 258850: loss 5.8533, time 125.41ms
iter 258860: loss 5.9655, time 125.27ms
iter 258870: loss 6.5495, time 125.43ms
iter 258880: loss 5.5575, time 125.19ms
iter 258890: loss 6.2930, time 125.19ms
iter 258900: loss 6.1666, time 125.23ms
iter 258910: loss 5.9739, time 124.59ms
iter 258920: loss 5.7654, time 128.56ms
iter 258930: loss 5.9906, time 125.35ms
iter 258940: loss 5.2273, time 125.31ms
iter 258950: loss 5.9491, time 125.74ms
iter 258960: loss 6.2490, time 125.32ms
iter 258970: loss 5.8539, time 125.07ms
iter 258980: loss 6.3785, time 125.33ms
iter 258990: loss 5.6747, time 126.08ms
step 259000: train loss 5.7036, val loss 5.7478
saving checkpoint to out-shakespeare-char
iter 259000: loss 5.9970, time 2892.36ms
iter 259010: loss 6.5903, time 120.34ms
iter 259020: loss 6.1050, time 120.80ms
iter 259030: loss 6.1868, time 121.38ms
iter 259040: loss 5.6627, time 121.60ms
iter 259050: loss 6.8923, time 122.27ms
iter 259060: loss 5.7972, time 120.76ms
iter 259070: loss 6.2115, time 121.55ms
iter 259080: loss 6.6198, time 121.38ms
iter 259090: loss 6.8477, time 119.12ms
iter 259100: loss 6.2438, time 121.89ms
iter 259110: loss 5.4047, time 121.43ms
iter 259120: loss 6.7398, time 121.52ms
iter 259130: loss 5.8532, time 121.52ms
iter 259140: loss 6.4417, time 121.39ms
iter 259150: loss 6.0309, time 122.38ms
iter 259160: loss 6.0179, time 121.51ms
iter 259170: loss 5.9629, time 120.74ms
iter 259180: loss 6.4551, time 121.72ms
iter 259190: loss 5.7634, time 121.61ms
iter 259200: loss 6.1129, time 122.25ms
iter 259210: loss 6.0514, time 121.51ms
iter 259220: loss 6.0813, time 121.74ms
iter 259230: loss 6.4970, time 120.91ms
iter 259240: loss 5.8998, time 120.15ms
step 259250: train loss 5.6694, val loss 5.7732
saving checkpoint to out-shakespeare-char
iter 259250: loss 6.7091, time 2894.63ms
iter 259260: loss 6.4580, time 124.42ms
iter 259270: loss 6.0424, time 121.59ms
iter 259280: loss 6.7735, time 124.28ms
iter 259290: loss 5.6690, time 121.60ms
iter 259300: loss 6.2932, time 124.29ms
iter 259310: loss 5.8199, time 120.83ms
iter 259320: loss 6.1897, time 124.37ms
iter 259330: loss 5.7661, time 121.48ms
iter 259340: loss 6.1243, time 124.49ms
iter 259350: loss 6.2503, time 121.57ms
iter 259360: loss 6.3872, time 124.34ms
iter 259370: loss 6.7459, time 120.39ms
iter 259380: loss 6.4237, time 124.83ms
iter 259390: loss 5.9879, time 121.54ms
iter 259400: loss 6.5064, time 124.39ms
iter 259410: loss 5.9083, time 121.32ms
iter 259420: loss 5.8537, time 124.54ms
iter 259430: loss 6.2002, time 121.61ms
iter 259440: loss 5.6016, time 124.70ms
iter 259450: loss 5.8199, time 121.58ms
iter 259460: loss 6.2269, time 124.44ms
iter 259470: loss 6.1223, time 121.54ms
iter 259480: loss 5.9844, time 124.42ms
iter 259490: loss 6.0208, time 121.37ms
step 259500: train loss 5.6887, val loss 5.6705
saving checkpoint to out-shakespeare-char
iter 259500: loss 6.2352, time 2893.22ms
iter 259510: loss 6.2106, time 125.90ms
iter 259520: loss 6.1150, time 125.02ms
iter 259530: loss 6.3129, time 125.14ms
iter 259540: loss 5.7255, time 125.61ms
iter 259550: loss 6.2226, time 125.59ms
iter 259560: loss 5.2734, time 125.58ms
iter 259570: loss 6.5340, time 124.80ms
iter 259580: loss 6.1877, time 125.52ms
iter 259590: loss 5.6145, time 125.80ms
iter 259600: loss 6.2505, time 128.40ms
iter 259610: loss 5.9683, time 125.75ms
iter 259620: loss 6.3340, time 124.98ms
iter 259630: loss 6.4714, time 126.41ms
iter 259640: loss 6.3014, time 125.86ms
iter 259650: loss 5.8472, time 125.91ms
iter 259660: loss 6.5330, time 125.64ms
iter 259670: loss 5.6419, time 125.42ms
iter 259680: loss 6.2721, time 125.71ms
iter 259690: loss 6.0914, time 125.62ms
iter 259700: loss 5.7578, time 125.44ms
iter 259710: loss 6.1223, time 125.66ms
iter 259720: loss 5.8743, time 125.62ms
iter 259730: loss 5.9579, time 125.72ms
iter 259740: loss 5.9475, time 125.32ms
step 259750: train loss 5.6776, val loss 5.6729
saving checkpoint to out-shakespeare-char
iter 259750: loss 5.3237, time 2898.51ms
iter 259760: loss 6.0906, time 124.83ms
iter 259770: loss 5.8202, time 125.20ms
iter 259780: loss 6.0617, time 124.01ms
iter 259790: loss 6.6032, time 124.66ms
iter 259800: loss 5.9763, time 125.01ms
iter 259810: loss 5.8255, time 125.22ms
iter 259820: loss 5.8798, time 125.90ms
iter 259830: loss 6.8275, time 124.79ms
iter 259840: loss 5.3920, time 124.81ms
iter 259850: loss 5.8225, time 124.37ms
iter 259860: loss 6.0133, time 125.46ms
iter 259870: loss 6.2912, time 125.01ms
iter 259880: loss 6.3197, time 125.62ms
iter 259890: loss 5.7437, time 124.91ms
iter 259900: loss 5.3123, time 128.23ms
iter 259910: loss 6.4868, time 124.86ms
iter 259920: loss 6.1551, time 125.02ms
iter 259930: loss 6.8174, time 124.78ms
iter 259940: loss 6.8936, time 124.06ms
iter 259950: loss 6.5258, time 124.82ms
iter 259960: loss 6.2568, time 124.84ms
iter 259970: loss 6.7417, time 124.62ms
iter 259980: loss 6.6178, time 124.54ms
iter 259990: loss 5.6324, time 124.90ms
step 260000: train loss 5.6826, val loss 5.7042
saving checkpoint to out-shakespeare-char
iter 260000: loss 6.7842, time 2899.44ms
iter 260010: loss 6.9228, time 125.18ms
iter 260020: loss 5.5872, time 125.34ms
iter 260030: loss 6.4956, time 124.64ms
iter 260040: loss 6.3449, time 124.56ms
iter 260050: loss 5.9217, time 125.07ms
iter 260060: loss 6.2027, time 125.42ms
iter 260070: loss 6.3159, time 125.58ms
iter 260080: loss 6.1339, time 125.27ms
iter 260090: loss 5.4567, time 125.33ms
iter 260100: loss 6.1676, time 125.36ms
iter 260110: loss 6.1425, time 125.33ms
iter 260120: loss 6.1467, time 125.77ms
iter 260130: loss 6.0031, time 128.23ms
iter 260140: loss 5.9231, time 125.29ms
iter 260150: loss 6.0264, time 124.45ms
iter 260160: loss 6.1400, time 125.44ms
iter 260170: loss 6.2448, time 125.57ms
iter 260180: loss 5.5848, time 125.39ms
iter 260190: loss 6.2609, time 126.75ms
iter 260200: loss 6.2889, time 125.19ms
iter 260210: loss 5.8594, time 125.02ms
iter 260220: loss 6.3552, time 125.00ms
iter 260230: loss 5.6606, time 126.30ms
iter 260240: loss 6.9285, time 125.32ms
step 260250: train loss 5.7219, val loss 5.6808
saving checkpoint to out-shakespeare-char
iter 260250: loss 5.7598, time 2914.93ms
iter 260260: loss 7.0081, time 125.42ms
iter 260270: loss 6.5349, time 124.93ms
iter 260280: loss 6.0292, time 125.93ms
iter 260290: loss 5.9506, time 125.39ms
iter 260300: loss 6.2286, time 124.93ms
iter 260310: loss 6.3603, time 124.97ms
iter 260320: loss 6.7063, time 125.50ms
iter 260330: loss 6.7641, time 127.96ms
iter 260340: loss 6.1376, time 125.02ms
iter 260350: loss 6.1652, time 125.72ms
iter 260360: loss 5.6854, time 124.91ms
iter 260370: loss 5.8905, time 124.92ms
iter 260380: loss 6.2442, time 124.79ms
iter 260390: loss 5.9802, time 124.48ms
iter 260400: loss 5.5855, time 125.26ms
iter 260410: loss 6.3502, time 124.52ms
iter 260420: loss 6.3576, time 128.63ms
iter 260430: loss 5.7837, time 124.88ms
iter 260440: loss 6.5532, time 124.64ms
iter 260450: loss 5.9260, time 125.68ms
iter 260460: loss 6.2982, time 127.99ms
iter 260470: loss 6.8627, time 125.17ms
iter 260480: loss 5.7964, time 125.49ms
iter 260490: loss 5.7209, time 126.31ms
step 260500: train loss 5.6883, val loss 5.6859
saving checkpoint to out-shakespeare-char
iter 260500: loss 6.7786, time 2914.44ms
iter 260510: loss 6.3926, time 125.00ms
iter 260520: loss 6.4989, time 126.30ms
iter 260530: loss 5.7223, time 124.86ms
iter 260540: loss 6.7499, time 125.04ms
iter 260550: loss 6.2340, time 124.94ms
iter 260560: loss 6.5977, time 124.89ms
iter 260570: loss 7.0415, time 125.13ms
iter 260580: loss 5.5905, time 124.69ms
iter 260590: loss 6.4119, time 126.16ms
iter 260600: loss 5.6691, time 124.85ms
iter 260610: loss 6.8085, time 124.73ms
iter 260620: loss 6.6615, time 125.37ms
iter 260630: loss 5.8916, time 124.53ms
iter 260640: loss 6.4491, time 124.75ms
iter 260650: loss 6.4086, time 126.52ms
iter 260660: loss 5.9751, time 124.07ms
iter 260670: loss 6.5614, time 125.04ms
iter 260680: loss 5.9999, time 125.03ms
iter 260690: loss 6.0228, time 125.11ms
iter 260700: loss 5.7530, time 126.94ms
iter 260710: loss 5.7762, time 125.00ms
iter 260720: loss 5.1595, time 124.97ms
iter 260730: loss 6.1308, time 125.46ms
iter 260740: loss 6.4299, time 124.20ms
step 260750: train loss 5.6882, val loss 5.7052
saving checkpoint to out-shakespeare-char
iter 260750: loss 5.7054, time 2894.22ms
iter 260760: loss 6.4318, time 127.06ms
iter 260770: loss 6.6946, time 125.34ms
iter 260780: loss 6.0461, time 125.30ms
iter 260790: loss 6.3941, time 125.86ms
iter 260800: loss 5.5306, time 126.75ms
iter 260810: loss 6.7848, time 125.15ms
iter 260820: loss 6.1655, time 124.90ms
iter 260830: loss 6.7716, time 125.37ms
iter 260840: loss 6.5282, time 128.81ms
iter 260850: loss 5.4426, time 125.75ms
iter 260860: loss 5.6513, time 125.79ms
iter 260870: loss 5.6397, time 126.82ms
iter 260880: loss 5.9964, time 125.53ms
iter 260890: loss 6.1631, time 125.60ms
iter 260900: loss 5.7333, time 125.48ms
iter 260910: loss 6.2666, time 128.39ms
iter 260920: loss 6.0899, time 124.77ms
iter 260930: loss 5.6799, time 124.75ms
iter 260940: loss 5.1853, time 126.18ms
iter 260950: loss 6.4825, time 125.32ms
iter 260960: loss 6.0784, time 124.69ms
iter 260970: loss 7.1116, time 125.41ms
iter 260980: loss 6.2688, time 125.60ms
iter 260990: loss 6.0026, time 125.34ms
step 261000: train loss 5.7350, val loss 5.7514
saving checkpoint to out-shakespeare-char
iter 261000: loss 5.9998, time 2885.62ms
iter 261010: loss 5.4541, time 125.37ms
iter 261020: loss 5.7661, time 125.64ms
iter 261030: loss 6.5766, time 125.10ms
iter 261040: loss 5.7762, time 125.42ms
iter 261050: loss 6.2445, time 125.74ms
iter 261060: loss 6.5938, time 125.72ms
iter 261070: loss 5.8196, time 124.72ms
iter 261080: loss 5.7739, time 128.53ms
iter 261090: loss 6.7116, time 125.56ms
iter 261100: loss 5.2291, time 124.91ms
iter 261110: loss 6.4144, time 124.47ms
iter 261120: loss 5.8278, time 124.96ms
iter 261130: loss 5.9956, time 125.36ms
iter 261140: loss 6.3235, time 125.36ms
iter 261150: loss 7.0446, time 125.18ms
iter 261160: loss 7.3517, time 125.55ms
iter 261170: loss 6.3033, time 125.47ms
iter 261180: loss 6.1936, time 125.55ms
iter 261190: loss 6.1520, time 124.41ms
iter 261200: loss 5.8407, time 125.03ms
iter 261210: loss 6.5730, time 125.36ms
iter 261220: loss 6.1247, time 125.46ms
iter 261230: loss 5.9924, time 125.40ms
iter 261240: loss 5.8140, time 125.79ms
step 261250: train loss 5.6778, val loss 5.7124
saving checkpoint to out-shakespeare-char
iter 261250: loss 6.1718, time 2857.19ms
iter 261260: loss 6.2656, time 125.82ms
iter 261270: loss 5.1674, time 125.91ms
iter 261280: loss 6.3891, time 125.54ms
iter 261290: loss 5.8012, time 126.04ms
iter 261300: loss 5.4647, time 124.73ms
iter 261310: loss 5.4916, time 125.44ms
iter 261320: loss 6.5010, time 125.46ms
iter 261330: loss 5.2397, time 125.29ms
iter 261340: loss 6.1360, time 126.14ms
iter 261350: loss 6.4303, time 125.47ms
iter 261360: loss 5.9797, time 125.38ms
iter 261370: loss 5.7470, time 125.56ms
iter 261380: loss 6.4025, time 125.88ms
iter 261390: loss 6.1468, time 125.83ms
iter 261400: loss 5.7178, time 125.71ms
iter 261410: loss 6.8414, time 125.84ms
iter 261420: loss 5.7714, time 127.87ms
iter 261430: loss 6.4794, time 125.49ms
iter 261440: loss 6.2543, time 125.42ms
iter 261450: loss 6.8331, time 126.06ms
iter 261460: loss 5.7814, time 125.38ms
iter 261470: loss 5.7321, time 125.67ms
iter 261480: loss 6.0336, time 125.73ms
iter 261490: loss 5.9049, time 125.84ms
step 261500: train loss 5.7374, val loss 5.7011
saving checkpoint to out-shakespeare-char
iter 261500: loss 5.3197, time 2847.53ms
iter 261510: loss 5.8163, time 124.48ms
iter 261520: loss 6.0801, time 124.08ms
iter 261530: loss 5.8429, time 125.49ms
iter 261540: loss 6.1152, time 125.81ms
iter 261550: loss 5.8596, time 125.44ms
iter 261560: loss 6.5830, time 128.60ms
iter 261570: loss 6.0177, time 125.38ms
iter 261580: loss 6.3634, time 125.43ms
iter 261590: loss 6.0818, time 126.22ms
iter 261600: loss 6.3278, time 125.66ms
iter 261610: loss 6.0424, time 125.68ms
iter 261620: loss 5.9314, time 125.46ms
iter 261630: loss 5.7440, time 125.31ms
iter 261640: loss 5.8010, time 125.33ms
iter 261650: loss 6.4924, time 125.15ms
iter 261660: loss 5.5791, time 125.51ms
iter 261670: loss 6.4340, time 128.36ms
iter 261680: loss 5.8258, time 125.70ms
iter 261690: loss 6.4951, time 124.82ms
iter 261700: loss 5.9341, time 125.46ms
iter 261710: loss 6.0217, time 128.74ms
iter 261720: loss 6.1629, time 125.53ms
iter 261730: loss 6.0720, time 125.51ms
iter 261740: loss 5.5966, time 126.30ms
step 261750: train loss 5.6835, val loss 5.7777
saving checkpoint to out-shakespeare-char
iter 261750: loss 5.5381, time 2876.56ms
iter 261760: loss 6.1956, time 124.96ms
iter 261770: loss 6.4675, time 125.20ms
iter 261780: loss 6.9846, time 124.80ms
iter 261790: loss 5.6095, time 124.87ms
iter 261800: loss 5.8694, time 125.41ms
iter 261810: loss 6.1272, time 124.93ms
iter 261820: loss 5.5932, time 124.67ms
iter 261830: loss 6.6457, time 125.21ms
iter 261840: loss 5.9219, time 124.51ms
iter 261850: loss 6.4533, time 127.61ms
iter 261860: loss 6.2135, time 124.97ms
iter 261870: loss 6.0814, time 124.91ms
iter 261880: loss 6.5062, time 125.03ms
iter 261890: loss 5.4342, time 124.70ms
iter 261900: loss 6.5430, time 124.49ms
iter 261910: loss 5.2781, time 124.80ms
iter 261920: loss 5.9220, time 124.77ms
iter 261930: loss 5.9963, time 125.39ms
iter 261940: loss 6.5640, time 124.66ms
iter 261950: loss 5.4852, time 124.84ms
iter 261960: loss 6.8180, time 124.85ms
iter 261970: loss 6.0930, time 124.89ms
iter 261980: loss 6.2883, time 124.73ms
iter 261990: loss 5.9212, time 124.94ms
step 262000: train loss 5.7142, val loss 5.7120
saving checkpoint to out-shakespeare-char
iter 262000: loss 5.9824, time 2875.98ms
iter 262010: loss 6.1591, time 126.02ms
iter 262020: loss 6.0560, time 126.13ms
iter 262030: loss 6.1276, time 125.82ms
iter 262040: loss 6.0801, time 125.67ms
iter 262050: loss 5.0010, time 125.30ms
iter 262060: loss 5.2885, time 126.16ms
iter 262070: loss 4.9673, time 125.92ms
iter 262080: loss 7.2051, time 125.56ms
iter 262090: loss 5.9940, time 125.77ms
iter 262100: loss 5.6866, time 129.16ms
iter 262110: loss 6.0984, time 126.08ms
iter 262120: loss 6.1451, time 125.90ms
iter 262130: loss 6.1798, time 126.07ms
iter 262140: loss 6.6731, time 125.98ms
iter 262150: loss 6.1301, time 126.40ms
iter 262160: loss 6.5845, time 125.97ms
iter 262170: loss 5.8904, time 124.87ms
iter 262180: loss 6.3100, time 125.72ms
iter 262190: loss 5.6686, time 125.85ms
iter 262200: loss 6.4763, time 123.78ms
iter 262210: loss 6.8227, time 121.70ms
iter 262220: loss 5.9759, time 123.35ms
iter 262230: loss 5.9149, time 122.17ms
iter 262240: loss 6.0478, time 122.73ms
step 262250: train loss 5.6955, val loss 5.6988
saving checkpoint to out-shakespeare-char
iter 262250: loss 6.1068, time 2893.98ms
iter 262260: loss 5.2090, time 122.06ms
iter 262270: loss 6.4576, time 123.33ms
iter 262280: loss 6.6401, time 121.79ms
iter 262290: loss 5.9876, time 122.60ms
iter 262300: loss 5.9673, time 121.62ms
iter 262310: loss 5.7389, time 122.25ms
iter 262320: loss 6.1454, time 121.40ms
iter 262330: loss 5.7675, time 121.40ms
iter 262340: loss 5.6219, time 121.54ms
iter 262350: loss 6.8521, time 121.62ms
iter 262360: loss 7.0048, time 122.01ms
iter 262370: loss 6.2496, time 121.77ms
iter 262380: loss 6.0459, time 121.48ms
iter 262390: loss 5.5740, time 121.62ms
iter 262400: loss 5.9712, time 121.59ms
iter 262410: loss 5.9000, time 125.81ms
iter 262420: loss 6.0583, time 125.57ms
iter 262430: loss 6.7959, time 128.60ms
iter 262440: loss 6.4387, time 126.87ms
iter 262450: loss 5.4317, time 124.73ms
iter 262460: loss 6.2273, time 125.68ms
iter 262470: loss 6.6581, time 128.33ms
iter 262480: loss 5.4973, time 124.98ms
iter 262490: loss 5.6381, time 125.67ms
step 262500: train loss 5.6894, val loss 5.7218
saving checkpoint to out-shakespeare-char
iter 262500: loss 6.5825, time 2903.92ms
iter 262510: loss 6.2259, time 126.27ms
iter 262520: loss 5.6719, time 126.55ms
iter 262530: loss 5.8383, time 126.02ms
iter 262540: loss 6.0785, time 125.67ms
iter 262550: loss 6.5959, time 125.92ms
iter 262560: loss 6.2032, time 126.29ms
iter 262570: loss 6.6144, time 128.74ms
iter 262580: loss 6.4198, time 125.86ms
iter 262590: loss 6.1504, time 125.92ms
iter 262600: loss 6.0267, time 125.89ms
iter 262610: loss 5.6946, time 125.83ms
iter 262620: loss 6.3719, time 125.72ms
iter 262630: loss 6.5945, time 125.33ms
iter 262640: loss 5.4483, time 125.80ms
iter 262650: loss 5.3605, time 125.41ms
iter 262660: loss 6.0334, time 124.52ms
iter 262670: loss 6.2380, time 125.43ms
iter 262680: loss 5.7054, time 128.41ms
iter 262690: loss 6.8487, time 125.81ms
iter 262700: loss 6.2593, time 125.80ms
iter 262710: loss 6.4969, time 126.15ms
iter 262720: loss 6.6478, time 125.10ms
iter 262730: loss 5.9191, time 125.33ms
iter 262740: loss 6.0357, time 125.98ms
step 262750: train loss 5.6977, val loss 5.6970
saving checkpoint to out-shakespeare-char
iter 262750: loss 5.8310, time 2891.73ms
iter 262760: loss 6.1933, time 124.55ms
iter 262770: loss 5.8828, time 125.92ms
iter 262780: loss 6.6848, time 128.44ms
iter 262790: loss 6.2333, time 125.85ms
iter 262800: loss 5.6086, time 125.78ms
iter 262810: loss 6.0119, time 126.63ms
iter 262820: loss 6.9097, time 125.43ms
iter 262830: loss 6.1270, time 126.01ms
iter 262840: loss 5.9652, time 125.65ms
iter 262850: loss 6.2721, time 125.50ms
iter 262860: loss 6.5268, time 125.86ms
iter 262870: loss 6.2553, time 125.68ms
iter 262880: loss 6.1706, time 124.76ms
iter 262890: loss 6.5746, time 127.98ms
iter 262900: loss 6.6375, time 125.46ms
iter 262910: loss 5.7426, time 125.56ms
iter 262920: loss 6.7783, time 124.80ms
iter 262930: loss 7.3384, time 125.17ms
iter 262940: loss 6.7014, time 125.51ms
iter 262950: loss 6.2947, time 125.61ms
iter 262960: loss 5.9395, time 125.45ms
iter 262970: loss 6.1813, time 125.53ms
iter 262980: loss 6.3192, time 125.59ms
iter 262990: loss 6.1193, time 126.15ms
step 263000: train loss 5.6892, val loss 5.7171
saving checkpoint to out-shakespeare-char
iter 263000: loss 6.0450, time 2912.44ms
iter 263010: loss 6.2317, time 125.85ms
iter 263020: loss 6.4352, time 126.69ms
iter 263030: loss 6.9155, time 125.44ms
iter 263040: loss 6.2213, time 126.03ms
iter 263050: loss 6.4163, time 128.76ms
iter 263060: loss 6.5787, time 129.33ms
iter 263070: loss 6.4970, time 126.22ms
iter 263080: loss 5.8380, time 126.24ms
iter 263090: loss 5.5306, time 126.91ms
iter 263100: loss 6.1369, time 125.60ms
iter 263110: loss 5.9576, time 126.12ms
iter 263120: loss 6.2225, time 125.96ms
iter 263130: loss 6.3420, time 125.70ms
iter 263140: loss 6.0272, time 125.84ms
iter 263150: loss 5.8753, time 125.23ms
iter 263160: loss 5.5679, time 125.21ms
iter 263170: loss 5.7914, time 127.75ms
iter 263180: loss 5.7147, time 125.11ms
iter 263190: loss 5.6685, time 125.20ms
iter 263200: loss 6.0257, time 124.34ms
iter 263210: loss 5.7835, time 127.91ms
iter 263220: loss 5.8871, time 124.23ms
iter 263230: loss 5.7107, time 124.90ms
iter 263240: loss 6.4246, time 125.18ms
step 263250: train loss 5.6984, val loss 5.6917
saving checkpoint to out-shakespeare-char
iter 263250: loss 4.9958, time 2878.37ms
iter 263260: loss 6.2139, time 122.07ms
iter 263270: loss 5.6010, time 121.24ms
iter 263280: loss 5.8947, time 124.26ms
iter 263290: loss 6.8781, time 121.92ms
iter 263300: loss 6.2917, time 123.59ms
iter 263310: loss 6.2027, time 122.15ms
iter 263320: loss 6.1462, time 123.24ms
iter 263330: loss 6.9206, time 122.27ms
iter 263340: loss 6.4771, time 122.46ms
iter 263350: loss 6.2961, time 122.08ms
iter 263360: loss 5.5671, time 121.26ms
iter 263370: loss 6.0580, time 121.81ms
iter 263380: loss 5.7675, time 123.64ms
iter 263390: loss 6.0500, time 121.51ms
iter 263400: loss 5.7540, time 122.83ms
iter 263410: loss 6.1839, time 121.90ms
iter 263420: loss 6.7438, time 122.35ms
iter 263430: loss 6.2439, time 121.79ms
iter 263440: loss 6.4159, time 123.06ms
iter 263450: loss 5.5367, time 122.43ms
iter 263460: loss 6.0331, time 123.94ms
iter 263470: loss 6.1244, time 121.84ms
iter 263480: loss 5.2857, time 122.79ms
iter 263490: loss 6.0188, time 121.88ms
step 263500: train loss 5.6764, val loss 5.6861
saving checkpoint to out-shakespeare-char
iter 263500: loss 5.6610, time 2891.77ms
iter 263510: loss 5.3243, time 125.08ms
iter 263520: loss 6.5692, time 125.73ms
iter 263530: loss 5.9079, time 125.34ms
iter 263540: loss 6.2301, time 125.02ms
iter 263550: loss 6.4566, time 126.22ms
iter 263560: loss 6.3623, time 125.33ms
iter 263570: loss 6.0886, time 128.40ms
iter 263580: loss 6.2974, time 125.53ms
iter 263590: loss 6.2320, time 126.21ms
iter 263600: loss 6.3275, time 126.15ms
iter 263610: loss 5.7667, time 128.97ms
iter 263620: loss 5.7032, time 124.84ms
iter 263630: loss 5.6458, time 125.68ms
iter 263640: loss 6.9324, time 125.81ms
iter 263650: loss 5.9542, time 128.11ms
iter 263660: loss 5.5614, time 124.72ms
iter 263670: loss 6.8562, time 125.85ms
iter 263680: loss 5.7859, time 124.28ms
iter 263690: loss 6.6465, time 125.90ms
iter 263700: loss 6.7596, time 125.82ms
iter 263710: loss 6.1641, time 125.85ms
iter 263720: loss 6.1525, time 125.07ms
iter 263730: loss 6.0549, time 125.61ms
iter 263740: loss 5.5832, time 126.00ms
step 263750: train loss 5.6959, val loss 5.7078
saving checkpoint to out-shakespeare-char
iter 263750: loss 5.8715, time 2896.65ms
iter 263760: loss 6.3314, time 125.84ms
iter 263770: loss 6.4695, time 125.63ms
iter 263780: loss 6.0822, time 125.68ms
iter 263790: loss 6.4925, time 124.91ms
iter 263800: loss 5.8578, time 124.79ms
iter 263810: loss 6.2953, time 125.80ms
iter 263820: loss 5.8073, time 126.04ms
iter 263830: loss 5.8905, time 126.02ms
iter 263840: loss 6.1267, time 125.07ms
iter 263850: loss 5.9350, time 125.66ms
iter 263860: loss 6.4171, time 125.54ms
iter 263870: loss 5.8945, time 125.56ms
iter 263880: loss 6.5669, time 125.75ms
iter 263890: loss 5.4873, time 125.75ms
iter 263900: loss 7.1750, time 128.82ms
iter 263910: loss 5.7291, time 126.10ms
iter 263920: loss 5.8276, time 124.86ms
iter 263930: loss 6.0898, time 126.07ms
iter 263940: loss 5.9799, time 128.32ms
iter 263950: loss 6.4605, time 125.51ms
iter 263960: loss 5.9064, time 125.60ms
iter 263970: loss 6.1300, time 125.91ms
iter 263980: loss 5.9681, time 126.09ms
iter 263990: loss 5.6063, time 125.77ms
step 264000: train loss 5.6848, val loss 5.6857
saving checkpoint to out-shakespeare-char
iter 264000: loss 5.8361, time 2860.48ms
iter 264010: loss 5.6599, time 125.84ms
iter 264020: loss 5.5096, time 126.11ms
iter 264030: loss 6.4261, time 129.41ms
iter 264040: loss 6.4656, time 125.58ms
iter 264050: loss 6.1429, time 125.54ms
iter 264060: loss 5.7330, time 125.57ms
iter 264070: loss 5.9451, time 125.40ms
iter 264080: loss 6.7867, time 125.42ms
iter 264090: loss 5.7863, time 125.92ms
iter 264100: loss 6.2750, time 128.37ms
iter 264110: loss 5.9787, time 125.89ms
iter 264120: loss 6.5584, time 125.89ms
iter 264130: loss 6.1713, time 125.69ms
iter 264140: loss 5.9531, time 128.62ms
iter 264150: loss 6.0031, time 125.79ms
iter 264160: loss 6.5350, time 125.91ms
iter 264170: loss 6.2308, time 125.33ms
iter 264180: loss 5.7113, time 125.40ms
iter 264190: loss 6.6842, time 125.77ms
iter 264200: loss 6.0289, time 126.00ms
iter 264210: loss 5.9142, time 128.76ms
iter 264220: loss 5.9745, time 124.64ms
iter 264230: loss 5.6424, time 125.78ms
iter 264240: loss 6.1014, time 125.90ms
step 264250: train loss 5.7116, val loss 5.7009
saving checkpoint to out-shakespeare-char
iter 264250: loss 6.1322, time 2892.00ms
iter 264260: loss 6.6406, time 125.04ms
iter 264270: loss 6.3746, time 125.76ms
iter 264280: loss 6.9461, time 125.04ms
iter 264290: loss 6.0800, time 126.05ms
iter 264300: loss 6.8128, time 125.75ms
iter 264310: loss 6.1121, time 126.16ms
iter 264320: loss 6.1850, time 125.62ms
iter 264330: loss 6.2901, time 125.72ms
iter 264340: loss 6.1321, time 126.02ms
iter 264350: loss 6.1427, time 128.21ms
iter 264360: loss 5.5702, time 125.53ms
iter 264370: loss 5.9097, time 125.91ms
iter 264380: loss 6.3425, time 125.32ms
iter 264390: loss 6.7865, time 128.27ms
iter 264400: loss 6.3398, time 125.42ms
iter 264410: loss 6.6138, time 126.05ms
iter 264420: loss 6.9293, time 127.05ms
iter 264430: loss 5.5637, time 125.84ms
iter 264440: loss 5.6917, time 126.09ms
iter 264450: loss 5.6609, time 125.85ms
iter 264460: loss 5.4120, time 125.35ms
iter 264470: loss 6.4173, time 125.76ms
iter 264480: loss 6.0336, time 125.84ms
iter 264490: loss 5.4963, time 125.29ms
step 264500: train loss 5.6886, val loss 5.6925
saving checkpoint to out-shakespeare-char
iter 264500: loss 5.6926, time 2883.53ms
iter 264510: loss 6.2502, time 126.27ms
iter 264520: loss 6.2920, time 125.06ms
iter 264530: loss 5.3966, time 126.37ms
iter 264540: loss 6.2990, time 125.99ms
iter 264550: loss 5.9399, time 125.57ms
iter 264560: loss 6.5595, time 125.77ms
iter 264570: loss 6.6454, time 125.37ms
iter 264580: loss 6.1173, time 126.04ms
iter 264590: loss 6.6038, time 125.67ms
iter 264600: loss 4.7295, time 125.01ms
iter 264610: loss 5.6316, time 125.31ms
iter 264620: loss 6.3772, time 125.44ms
iter 264630: loss 5.9896, time 125.70ms
iter 264640: loss 6.6602, time 128.05ms
iter 264650: loss 6.6275, time 124.50ms
iter 264660: loss 6.5064, time 125.87ms
iter 264670: loss 6.0512, time 125.48ms
iter 264680: loss 6.3959, time 125.79ms
iter 264690: loss 6.1407, time 125.65ms
iter 264700: loss 6.0736, time 125.65ms
iter 264710: loss 5.8185, time 125.28ms
iter 264720: loss 6.2342, time 124.91ms
iter 264730: loss 6.7927, time 125.71ms
iter 264740: loss 6.0094, time 126.27ms
step 264750: train loss 5.6584, val loss 5.6819
saving checkpoint to out-shakespeare-char
iter 264750: loss 6.6600, time 2881.65ms
iter 264760: loss 7.0845, time 125.33ms
iter 264770: loss 5.9122, time 125.75ms
iter 264780: loss 6.4818, time 125.72ms
iter 264790: loss 5.5777, time 126.03ms
iter 264800: loss 6.4189, time 125.62ms
iter 264810: loss 6.2404, time 125.64ms
iter 264820: loss 7.0119, time 125.69ms
iter 264830: loss 6.8213, time 125.74ms
iter 264840: loss 6.2419, time 125.75ms
iter 264850: loss 6.2163, time 127.95ms
iter 264860: loss 5.6899, time 125.41ms
iter 264870: loss 5.4732, time 125.73ms
iter 264880: loss 5.2276, time 125.19ms
iter 264890: loss 5.9067, time 125.44ms
iter 264900: loss 6.1920, time 124.90ms
iter 264910: loss 5.8632, time 125.73ms
iter 264920: loss 5.6031, time 125.64ms
iter 264930: loss 5.9549, time 125.38ms
iter 264940: loss 6.2610, time 125.68ms
iter 264950: loss 5.8636, time 125.47ms
iter 264960: loss 5.9346, time 125.46ms
iter 264970: loss 6.0789, time 125.52ms
iter 264980: loss 6.4160, time 125.12ms
iter 264990: loss 5.7456, time 125.52ms
step 265000: train loss 5.6493, val loss 5.6838
saving checkpoint to out-shakespeare-char
iter 265000: loss 6.2495, time 2884.81ms
iter 265010: loss 6.5457, time 124.90ms
iter 265020: loss 6.1528, time 121.68ms
iter 265030: loss 5.6539, time 124.34ms
iter 265040: loss 6.1515, time 121.62ms
iter 265050: loss 5.9489, time 124.67ms
iter 265060: loss 5.8133, time 121.50ms
iter 265070: loss 6.5254, time 124.35ms
iter 265080: loss 5.8449, time 121.60ms
iter 265090: loss 5.8269, time 123.88ms
iter 265100: loss 6.1315, time 120.69ms
iter 265110: loss 5.5486, time 124.44ms
iter 265120: loss 5.7233, time 121.08ms
iter 265130: loss 5.4876, time 124.32ms
iter 265140: loss 5.7312, time 122.00ms
iter 265150: loss 5.5648, time 124.77ms
iter 265160: loss 5.8064, time 121.69ms
iter 265170: loss 6.6449, time 124.56ms
iter 265180: loss 6.2162, time 121.55ms
iter 265190: loss 6.4197, time 124.34ms
iter 265200: loss 5.6939, time 122.71ms
iter 265210: loss 5.9268, time 122.09ms
iter 265220: loss 6.0071, time 120.77ms
iter 265230: loss 5.6500, time 122.98ms
iter 265240: loss 6.3284, time 122.70ms
step 265250: train loss 5.6823, val loss 5.6968
saving checkpoint to out-shakespeare-char
iter 265250: loss 6.5213, time 2895.87ms
iter 265260: loss 5.3252, time 122.04ms
iter 265270: loss 6.5075, time 123.12ms
iter 265280: loss 6.6078, time 122.13ms
iter 265290: loss 6.1611, time 123.32ms
iter 265300: loss 6.6994, time 121.07ms
iter 265310: loss 6.0273, time 123.16ms
iter 265320: loss 6.0775, time 121.86ms
iter 265330: loss 5.8248, time 122.92ms
iter 265340: loss 6.1545, time 122.01ms
iter 265350: loss 5.8579, time 123.30ms
iter 265360: loss 6.1888, time 122.18ms
iter 265370: loss 6.3378, time 123.30ms
iter 265380: loss 5.0263, time 122.02ms
iter 265390: loss 6.8743, time 123.30ms
iter 265400: loss 5.4844, time 121.83ms
iter 265410: loss 6.4006, time 122.85ms
iter 265420: loss 6.4622, time 121.91ms
iter 265430: loss 5.9852, time 123.92ms
iter 265440: loss 6.8071, time 121.96ms
iter 265450: loss 6.6951, time 124.00ms
iter 265460: loss 6.0592, time 121.17ms
iter 265470: loss 5.8654, time 122.70ms
iter 265480: loss 5.9574, time 121.39ms
iter 265490: loss 6.5053, time 122.59ms
step 265500: train loss 5.6975, val loss 5.7257
saving checkpoint to out-shakespeare-char
iter 265500: loss 6.0463, time 2890.46ms
iter 265510: loss 6.0185, time 121.51ms
iter 265520: loss 6.3304, time 121.18ms
iter 265530: loss 6.3521, time 121.39ms
iter 265540: loss 6.0764, time 121.27ms
iter 265550: loss 5.5932, time 121.50ms
iter 265560: loss 5.2601, time 121.20ms
iter 265570: loss 6.3158, time 121.43ms
iter 265580: loss 6.3819, time 121.33ms
iter 265590: loss 6.4521, time 121.26ms
iter 265600: loss 5.7815, time 122.11ms
iter 265610: loss 6.3507, time 121.49ms
iter 265620: loss 5.3669, time 121.64ms
iter 265630: loss 6.5888, time 121.40ms
iter 265640: loss 6.1130, time 121.40ms
iter 265650: loss 6.2264, time 121.41ms
iter 265660: loss 5.9350, time 122.56ms
iter 265670: loss 5.7782, time 121.49ms
iter 265680: loss 6.1881, time 121.39ms
iter 265690: loss 6.6455, time 122.00ms
iter 265700: loss 6.1756, time 120.66ms
iter 265710: loss 6.5297, time 121.51ms
iter 265720: loss 6.0852, time 121.45ms
iter 265730: loss 5.7980, time 121.53ms
iter 265740: loss 4.9717, time 122.03ms
step 265750: train loss 5.6223, val loss 5.6606
saving checkpoint to out-shakespeare-char
iter 265750: loss 6.2125, time 2900.73ms
iter 265760: loss 6.9090, time 123.37ms
iter 265770: loss 6.0223, time 122.24ms
iter 265780: loss 5.5131, time 123.39ms
iter 265790: loss 6.0945, time 121.86ms
iter 265800: loss 5.3124, time 123.17ms
iter 265810: loss 5.7800, time 121.90ms
iter 265820: loss 6.4722, time 123.19ms
iter 265830: loss 6.2421, time 121.91ms
iter 265840: loss 5.7714, time 123.14ms
iter 265850: loss 5.7368, time 122.08ms
iter 265860: loss 6.2228, time 123.25ms
iter 265870: loss 5.3853, time 121.91ms
iter 265880: loss 6.1150, time 122.64ms
iter 265890: loss 5.5781, time 121.16ms
iter 265900: loss 6.2311, time 122.63ms
iter 265910: loss 6.5108, time 121.68ms
iter 265920: loss 6.3853, time 122.57ms
iter 265930: loss 5.2536, time 121.54ms
iter 265940: loss 6.0865, time 122.94ms
iter 265950: loss 6.3946, time 121.92ms
iter 265960: loss 5.5002, time 121.68ms
iter 265970: loss 6.3920, time 122.02ms
iter 265980: loss 5.5924, time 121.42ms
iter 265990: loss 6.6384, time 121.96ms
step 266000: train loss 5.7022, val loss 5.6347
saving checkpoint to out-shakespeare-char
iter 266000: loss 5.7684, time 2903.73ms
iter 266010: loss 5.9757, time 125.53ms
iter 266020: loss 6.0337, time 124.97ms
iter 266030: loss 5.9715, time 127.91ms
iter 266040: loss 5.8819, time 124.95ms
iter 266050: loss 5.9786, time 125.20ms
iter 266060: loss 6.2715, time 123.26ms
iter 266070: loss 6.6382, time 124.92ms
iter 266080: loss 5.5357, time 125.41ms
iter 266090: loss 5.8225, time 125.41ms
iter 266100: loss 5.3459, time 123.62ms
iter 266110: loss 6.6442, time 125.34ms
iter 266120: loss 6.2246, time 125.58ms
iter 266130: loss 6.3334, time 125.30ms
iter 266140: loss 5.9138, time 128.32ms
iter 266150: loss 6.6504, time 125.40ms
iter 266160: loss 5.8174, time 125.55ms
iter 266170: loss 5.8393, time 124.28ms
iter 266180: loss 6.3588, time 124.90ms
iter 266190: loss 5.9483, time 124.15ms
iter 266200: loss 5.5435, time 125.68ms
iter 266210: loss 5.8765, time 123.80ms
iter 266220: loss 5.6905, time 125.78ms
iter 266230: loss 6.5641, time 125.52ms
iter 266240: loss 6.6504, time 125.56ms
step 266250: train loss 5.6806, val loss 5.6677
saving checkpoint to out-shakespeare-char
iter 266250: loss 5.2485, time 2883.19ms
iter 266260: loss 5.6912, time 125.52ms
iter 266270: loss 6.5917, time 125.52ms
iter 266280: loss 6.6363, time 124.32ms
iter 266290: loss 6.6434, time 125.06ms
iter 266300: loss 5.6858, time 125.20ms
iter 266310: loss 5.8866, time 125.43ms
iter 266320: loss 6.1835, time 125.49ms
iter 266330: loss 5.8983, time 125.43ms
iter 266340: loss 6.5104, time 124.32ms
iter 266350: loss 6.4834, time 125.19ms
iter 266360: loss 5.9730, time 124.89ms
iter 266370: loss 5.9839, time 125.44ms
iter 266380: loss 6.3991, time 128.12ms
iter 266390: loss 6.1916, time 125.05ms
iter 266400: loss 6.6714, time 124.24ms
iter 266410: loss 6.4483, time 125.24ms
iter 266420: loss 5.9047, time 125.55ms
iter 266430: loss 5.7827, time 125.41ms
iter 266440: loss 6.1780, time 125.40ms
iter 266450: loss 5.6157, time 125.33ms
iter 266460: loss 6.0169, time 125.48ms
iter 266470: loss 6.5550, time 125.64ms
iter 266480: loss 5.8120, time 125.61ms
iter 266490: loss 5.7528, time 128.23ms
step 266500: train loss 5.6853, val loss 5.6729
saving checkpoint to out-shakespeare-char
iter 266500: loss 6.6409, time 2898.39ms
iter 266510: loss 5.8742, time 124.35ms
iter 266520: loss 5.7529, time 128.20ms
iter 266530: loss 5.8693, time 125.71ms
iter 266540: loss 5.7850, time 125.23ms
iter 266550: loss 6.0419, time 125.35ms
iter 266560: loss 6.6068, time 125.44ms
iter 266570: loss 6.1506, time 125.38ms
iter 266580: loss 6.3970, time 125.43ms
iter 266590: loss 6.1633, time 125.68ms
iter 266600: loss 6.5558, time 125.11ms
iter 266610: loss 5.5330, time 125.39ms
iter 266620: loss 6.2446, time 125.58ms
iter 266630: loss 6.1950, time 128.22ms
iter 266640: loss 6.1218, time 124.89ms
iter 266650: loss 5.9824, time 125.70ms
iter 266660: loss 6.5028, time 125.41ms
iter 266670: loss 5.7703, time 125.89ms
iter 266680: loss 5.7673, time 125.89ms
iter 266690: loss 6.1483, time 125.64ms
iter 266700: loss 6.1881, time 125.18ms
iter 266710: loss 6.3679, time 125.42ms
iter 266720: loss 6.0672, time 125.41ms
iter 266730: loss 5.7215, time 125.34ms
iter 266740: loss 5.3379, time 127.63ms
step 266750: train loss 5.6776, val loss 5.6644
saving checkpoint to out-shakespeare-char
iter 266750: loss 5.7361, time 2884.16ms
iter 266760: loss 5.9581, time 125.60ms
iter 266770: loss 6.4008, time 125.27ms
iter 266780: loss 6.2634, time 125.59ms
iter 266790: loss 6.8693, time 125.32ms
iter 266800: loss 6.5588, time 128.28ms
iter 266810: loss 6.5081, time 125.75ms
iter 266820: loss 5.8645, time 125.57ms
iter 266830: loss 6.1730, time 125.35ms
iter 266840: loss 5.9724, time 125.60ms
iter 266850: loss 5.7870, time 125.50ms
iter 266860: loss 5.7418, time 125.46ms
iter 266870: loss 6.1126, time 125.77ms
iter 266880: loss 6.0909, time 125.49ms
iter 266890: loss 5.2883, time 125.54ms
iter 266900: loss 5.9367, time 125.59ms
iter 266910: loss 5.8585, time 126.52ms
iter 266920: loss 6.7134, time 125.55ms
iter 266930: loss 5.9365, time 125.63ms
iter 266940: loss 6.8294, time 124.89ms
iter 266950: loss 5.1575, time 124.90ms
iter 266960: loss 6.6962, time 125.53ms
iter 266970: loss 6.2220, time 124.18ms
iter 266980: loss 6.2054, time 126.74ms
iter 266990: loss 6.0218, time 125.78ms
step 267000: train loss 5.7033, val loss 5.6366
saving checkpoint to out-shakespeare-char
iter 267000: loss 5.9158, time 2887.29ms
iter 267010: loss 5.9654, time 124.99ms
iter 267020: loss 6.1271, time 129.36ms
iter 267030: loss 6.2956, time 125.67ms
iter 267040: loss 5.6569, time 125.49ms
iter 267050: loss 6.3789, time 125.62ms
iter 267060: loss 5.8526, time 125.69ms
iter 267070: loss 6.3014, time 125.72ms
iter 267080: loss 6.0812, time 125.54ms
iter 267090: loss 6.0571, time 125.60ms
iter 267100: loss 5.8828, time 125.91ms
iter 267110: loss 6.1803, time 125.64ms
iter 267120: loss 6.1654, time 126.08ms
iter 267130: loss 5.6920, time 125.30ms
iter 267140: loss 6.3247, time 125.07ms
iter 267150: loss 6.2566, time 125.20ms
iter 267160: loss 5.7365, time 125.18ms
iter 267170: loss 6.9715, time 125.66ms
iter 267180: loss 6.4133, time 125.69ms
iter 267190: loss 6.3570, time 125.14ms
iter 267200: loss 6.4268, time 128.95ms
iter 267210: loss 6.3742, time 125.37ms
iter 267220: loss 6.1085, time 125.19ms
iter 267230: loss 6.5298, time 124.69ms
iter 267240: loss 6.0055, time 125.49ms
step 267250: train loss 5.6738, val loss 5.7066
saving checkpoint to out-shakespeare-char
iter 267250: loss 7.2893, time 2896.26ms
iter 267260: loss 5.9103, time 124.70ms
iter 267270: loss 6.3785, time 125.25ms
iter 267280: loss 6.8844, time 125.39ms
iter 267290: loss 5.8204, time 128.36ms
iter 267300: loss 6.0172, time 124.71ms
iter 267310: loss 5.5131, time 125.30ms
iter 267320: loss 6.0209, time 125.49ms
iter 267330: loss 5.8005, time 124.92ms
iter 267340: loss 5.3551, time 125.38ms
iter 267350: loss 5.7030, time 125.29ms
iter 267360: loss 6.2295, time 125.05ms
iter 267370: loss 6.3232, time 124.23ms
iter 267380: loss 5.5877, time 125.28ms
iter 267390: loss 6.2407, time 125.83ms
iter 267400: loss 5.9221, time 128.03ms
iter 267410: loss 6.0744, time 125.74ms
iter 267420: loss 5.8312, time 125.77ms
iter 267430: loss 6.1079, time 125.65ms
iter 267440: loss 6.1770, time 125.59ms
iter 267450: loss 6.1176, time 126.81ms
iter 267460: loss 5.9518, time 126.19ms
iter 267470: loss 5.7192, time 127.49ms
iter 267480: loss 5.8608, time 125.58ms
iter 267490: loss 5.9315, time 125.75ms
step 267500: train loss 5.7340, val loss 5.6785
saving checkpoint to out-shakespeare-char
iter 267500: loss 6.4911, time 2898.11ms
iter 267510: loss 5.5576, time 124.51ms
iter 267520: loss 6.3267, time 125.22ms
iter 267530: loss 6.1906, time 125.95ms
iter 267540: loss 5.6023, time 125.44ms
iter 267550: loss 5.9573, time 125.67ms
iter 267560: loss 6.3729, time 125.67ms
iter 267570: loss 6.1304, time 125.17ms
iter 267580: loss 6.4907, time 125.07ms
iter 267590: loss 5.6833, time 125.38ms
iter 267600: loss 6.2909, time 125.80ms
iter 267610: loss 5.4176, time 127.87ms
iter 267620: loss 6.1198, time 125.27ms
iter 267630: loss 5.8553, time 125.22ms
iter 267640: loss 6.2188, time 125.34ms
iter 267650: loss 6.1359, time 125.15ms
iter 267660: loss 5.9646, time 125.30ms
iter 267670: loss 5.2434, time 125.24ms
iter 267680: loss 6.5524, time 124.95ms
iter 267690: loss 6.5982, time 125.06ms
iter 267700: loss 5.8791, time 125.40ms
iter 267710: loss 5.9028, time 126.98ms
iter 267720: loss 5.9307, time 129.86ms
iter 267730: loss 6.6248, time 125.47ms
iter 267740: loss 6.4337, time 126.89ms
step 267750: train loss 5.7254, val loss 5.7091
saving checkpoint to out-shakespeare-char
iter 267750: loss 6.5228, time 2902.17ms
iter 267760: loss 5.7081, time 125.76ms
iter 267770: loss 6.0563, time 122.32ms
iter 267780: loss 5.1190, time 121.59ms
iter 267790: loss 5.7428, time 122.02ms
iter 267800: loss 6.6684, time 121.50ms
iter 267810: loss 6.0336, time 121.76ms
iter 267820: loss 5.5961, time 121.82ms
iter 267830: loss 6.2319, time 121.67ms
iter 267840: loss 5.8638, time 121.58ms
iter 267850: loss 5.6970, time 121.69ms
iter 267860: loss 6.3297, time 123.33ms
iter 267870: loss 5.2962, time 123.49ms
iter 267880: loss 6.2742, time 121.76ms
iter 267890: loss 6.1536, time 123.10ms
iter 267900: loss 5.7822, time 125.92ms
iter 267910: loss 6.3038, time 126.27ms
iter 267920: loss 6.2652, time 126.95ms
iter 267930: loss 5.6797, time 126.88ms
iter 267940: loss 6.3950, time 124.23ms
iter 267950: loss 5.7131, time 127.39ms
iter 267960: loss 6.5268, time 128.68ms
iter 267970: loss 5.7135, time 125.11ms
iter 267980: loss 6.2036, time 126.25ms
iter 267990: loss 6.6428, time 125.86ms
step 268000: train loss 5.6895, val loss 5.7107
saving checkpoint to out-shakespeare-char
iter 268000: loss 6.4427, time 2874.65ms
iter 268010: loss 6.2026, time 121.82ms
iter 268020: loss 6.3484, time 122.14ms
iter 268030: loss 6.1106, time 121.64ms
iter 268040: loss 5.5992, time 122.33ms
iter 268050: loss 5.9764, time 121.45ms
iter 268060: loss 5.7983, time 121.99ms
iter 268070: loss 6.1662, time 122.01ms
iter 268080: loss 6.4727, time 121.85ms
iter 268090: loss 5.6998, time 121.83ms
iter 268100: loss 5.6866, time 122.05ms
iter 268110: loss 5.2716, time 121.88ms
iter 268120: loss 5.7297, time 122.08ms
iter 268130: loss 5.9909, time 122.16ms
iter 268140: loss 5.7965, time 121.92ms
iter 268150: loss 6.6510, time 121.11ms
iter 268160: loss 5.9197, time 121.24ms
iter 268170: loss 6.4686, time 121.88ms
iter 268180: loss 6.1309, time 121.86ms
iter 268190: loss 5.5097, time 121.65ms
iter 268200: loss 6.8730, time 122.60ms
iter 268210: loss 5.6968, time 121.88ms
iter 268220: loss 6.1362, time 121.89ms
iter 268230: loss 5.8718, time 121.78ms
iter 268240: loss 5.4970, time 121.74ms
step 268250: train loss 5.6623, val loss 5.7058
saving checkpoint to out-shakespeare-char
iter 268250: loss 6.0319, time 2897.06ms
iter 268260: loss 6.0223, time 122.60ms
iter 268270: loss 5.9540, time 121.89ms
iter 268280: loss 5.3932, time 122.17ms
iter 268290: loss 5.5125, time 123.58ms
iter 268300: loss 5.9099, time 122.34ms
iter 268310: loss 6.0520, time 121.81ms
iter 268320: loss 6.0181, time 121.78ms
iter 268330: loss 6.0017, time 121.82ms
iter 268340: loss 6.1983, time 121.86ms
iter 268350: loss 5.6522, time 121.80ms
iter 268360: loss 6.2495, time 121.83ms
iter 268370: loss 6.5819, time 121.92ms
iter 268380: loss 6.2681, time 121.89ms
iter 268390: loss 6.2542, time 125.91ms
iter 268400: loss 6.8590, time 122.08ms
iter 268410: loss 6.2694, time 123.55ms
iter 268420: loss 6.5196, time 120.94ms
iter 268430: loss 6.3097, time 122.45ms
iter 268440: loss 6.0579, time 121.77ms
iter 268450: loss 5.9825, time 121.78ms
iter 268460: loss 6.7733, time 122.09ms
iter 268470: loss 5.4310, time 121.77ms
iter 268480: loss 4.9115, time 121.31ms
iter 268490: loss 6.4382, time 122.08ms
step 268500: train loss 5.6929, val loss 5.7248
saving checkpoint to out-shakespeare-char
iter 268500: loss 5.9680, time 2892.97ms
iter 268510: loss 6.3601, time 122.35ms
iter 268520: loss 5.8174, time 123.29ms
iter 268530: loss 6.3785, time 122.09ms
iter 268540: loss 5.9881, time 123.52ms
iter 268550: loss 5.6544, time 121.31ms
iter 268560: loss 6.3744, time 123.31ms
iter 268570: loss 6.4672, time 121.93ms
iter 268580: loss 6.1845, time 123.38ms
iter 268590: loss 6.2673, time 121.98ms
iter 268600: loss 5.8833, time 122.95ms
iter 268610: loss 6.4442, time 121.78ms
iter 268620: loss 6.7641, time 122.91ms
iter 268630: loss 6.2681, time 121.83ms
iter 268640: loss 6.1835, time 123.40ms
iter 268650: loss 5.6884, time 121.94ms
iter 268660: loss 5.8135, time 123.03ms
iter 268670: loss 6.0291, time 121.24ms
iter 268680: loss 5.6895, time 123.01ms
iter 268690: loss 6.0845, time 122.03ms
iter 268700: loss 6.6646, time 123.29ms
iter 268710: loss 6.0860, time 122.96ms
iter 268720: loss 6.3215, time 123.37ms
iter 268730: loss 5.8661, time 121.65ms
iter 268740: loss 6.7329, time 123.06ms
step 268750: train loss 5.6453, val loss 5.6432
saving checkpoint to out-shakespeare-char
iter 268750: loss 6.2035, time 2887.82ms
iter 268760: loss 6.5839, time 121.79ms
iter 268770: loss 6.0275, time 121.45ms
iter 268780: loss 5.7935, time 122.03ms
iter 268790: loss 5.8630, time 121.78ms
iter 268800: loss 6.0743, time 122.16ms
iter 268810: loss 6.4574, time 121.80ms
iter 268820: loss 6.0947, time 121.78ms
iter 268830: loss 5.2859, time 122.88ms
iter 268840: loss 5.8674, time 121.95ms
iter 268850: loss 6.3064, time 122.30ms
iter 268860: loss 5.6049, time 121.90ms
iter 268870: loss 6.0002, time 120.98ms
iter 268880: loss 6.0082, time 121.90ms
iter 268890: loss 6.0000, time 121.77ms
iter 268900: loss 6.0818, time 122.18ms
iter 268910: loss 6.1276, time 121.57ms
iter 268920: loss 5.3205, time 121.98ms
iter 268930: loss 5.7150, time 122.12ms
iter 268940: loss 5.7706, time 122.06ms
iter 268950: loss 5.8319, time 122.15ms
iter 268960: loss 5.4149, time 121.68ms
iter 268970: loss 5.1912, time 121.74ms
iter 268980: loss 5.8655, time 121.28ms
iter 268990: loss 5.8970, time 121.81ms
step 269000: train loss 5.7008, val loss 5.6453
saving checkpoint to out-shakespeare-char
iter 269000: loss 6.6667, time 2891.81ms
iter 269010: loss 5.8277, time 124.52ms
iter 269020: loss 6.5700, time 121.62ms
iter 269030: loss 5.6315, time 122.70ms
iter 269040: loss 6.0543, time 124.53ms
iter 269050: loss 6.0541, time 121.52ms
iter 269060: loss 6.0435, time 124.56ms
iter 269070: loss 6.4548, time 122.83ms
iter 269080: loss 6.7998, time 119.80ms
iter 269090: loss 6.1365, time 122.30ms
iter 269100: loss 6.5974, time 121.72ms
iter 269110: loss 4.9992, time 121.67ms
iter 269120: loss 5.9507, time 120.94ms
iter 269130: loss 6.0416, time 121.46ms
iter 269140: loss 5.9302, time 121.89ms
iter 269150: loss 5.9303, time 121.75ms
iter 269160: loss 6.1019, time 121.49ms
iter 269170: loss 6.4394, time 121.52ms
iter 269180: loss 6.0500, time 121.51ms
iter 269190: loss 6.5577, time 121.77ms
iter 269200: loss 6.4798, time 121.71ms
iter 269210: loss 7.1348, time 121.77ms
iter 269220: loss 6.1835, time 121.85ms
iter 269230: loss 6.3001, time 121.56ms
iter 269240: loss 6.2204, time 122.61ms
step 269250: train loss 5.6362, val loss 5.6914
saving checkpoint to out-shakespeare-char
iter 269250: loss 6.1629, time 2900.63ms
iter 269260: loss 5.6724, time 122.22ms
iter 269270: loss 6.0823, time 121.56ms
iter 269280: loss 6.5581, time 121.60ms
iter 269290: loss 6.2867, time 121.20ms
iter 269300: loss 6.2194, time 121.35ms
iter 269310: loss 6.4906, time 122.59ms
iter 269320: loss 6.0484, time 121.54ms
iter 269330: loss 6.2674, time 121.56ms
iter 269340: loss 6.4198, time 121.38ms
iter 269350: loss 6.3251, time 121.54ms
iter 269360: loss 6.3750, time 121.82ms
iter 269370: loss 6.4771, time 123.18ms
iter 269380: loss 5.5308, time 121.83ms
iter 269390: loss 5.8861, time 121.59ms
iter 269400: loss 5.5942, time 121.90ms
iter 269410: loss 6.4509, time 121.88ms
iter 269420: loss 5.7230, time 121.84ms
iter 269430: loss 5.7878, time 121.62ms
iter 269440: loss 5.7489, time 121.72ms
iter 269450: loss 6.0259, time 121.79ms
iter 269460: loss 6.5916, time 121.78ms
iter 269470: loss 6.3153, time 121.47ms
iter 269480: loss 5.7611, time 122.12ms
iter 269490: loss 6.3514, time 121.62ms
step 269500: train loss 5.6714, val loss 5.6552
saving checkpoint to out-shakespeare-char
iter 269500: loss 6.0245, time 2897.42ms
iter 269510: loss 6.1801, time 121.93ms
iter 269520: loss 6.2074, time 122.09ms
iter 269530: loss 5.5575, time 120.00ms
iter 269540: loss 5.4616, time 119.97ms
iter 269550: loss 6.0220, time 119.88ms
iter 269560: loss 6.0352, time 119.75ms
iter 269570: loss 5.6448, time 119.77ms
iter 269580: loss 6.6533, time 120.04ms
iter 269590: loss 6.2179, time 119.85ms
iter 269600: loss 6.2795, time 119.58ms
iter 269610: loss 6.2259, time 121.08ms
iter 269620: loss 5.9617, time 121.37ms
iter 269630: loss 6.0466, time 121.44ms
iter 269640: loss 6.0924, time 122.09ms
iter 269650: loss 5.7420, time 121.54ms
iter 269660: loss 5.7183, time 122.25ms
iter 269670: loss 5.8723, time 121.78ms
iter 269680: loss 6.3488, time 121.34ms
iter 269690: loss 5.8660, time 121.65ms
iter 269700: loss 6.1651, time 120.87ms
iter 269710: loss 6.2342, time 122.11ms
iter 269720: loss 5.3829, time 121.72ms
iter 269730: loss 5.7085, time 121.57ms
iter 269740: loss 6.4289, time 119.40ms
step 269750: train loss 5.6989, val loss 5.6321
saving checkpoint to out-shakespeare-char
iter 269750: loss 5.6542, time 2896.55ms
iter 269760: loss 6.3621, time 125.28ms
iter 269770: loss 6.2959, time 125.94ms
iter 269780: loss 5.6591, time 126.03ms
iter 269790: loss 6.0341, time 127.49ms
iter 269800: loss 5.3461, time 127.03ms
iter 269810: loss 5.7615, time 125.85ms
iter 269820: loss 5.8950, time 125.76ms
iter 269830: loss 6.7793, time 125.98ms
iter 269840: loss 6.2407, time 126.18ms
iter 269850: loss 6.1012, time 126.92ms
iter 269860: loss 5.8366, time 125.61ms
iter 269870: loss 5.9193, time 124.63ms
iter 269880: loss 6.6134, time 125.84ms
iter 269890: loss 6.0330, time 125.74ms
iter 269900: loss 5.7280, time 128.54ms
iter 269910: loss 6.4239, time 125.82ms
iter 269920: loss 5.7111, time 125.78ms
iter 269930: loss 6.1473, time 126.28ms
iter 269940: loss 5.9216, time 124.91ms
iter 269950: loss 7.0370, time 125.84ms
iter 269960: loss 5.8882, time 125.91ms
iter 269970: loss 5.3725, time 128.76ms
iter 269980: loss 5.2261, time 125.95ms
iter 269990: loss 5.3465, time 125.89ms
step 270000: train loss 5.7217, val loss 5.6413
saving checkpoint to out-shakespeare-char
iter 270000: loss 5.9353, time 2886.53ms
iter 270010: loss 6.5018, time 126.03ms
iter 270020: loss 6.1852, time 126.55ms
iter 270030: loss 5.3355, time 126.44ms
iter 270040: loss 5.6174, time 125.09ms
iter 270050: loss 6.2310, time 125.97ms
iter 270060: loss 5.6516, time 126.63ms
iter 270070: loss 5.9122, time 127.55ms
iter 270080: loss 6.1284, time 126.34ms
iter 270090: loss 6.1063, time 126.06ms
iter 270100: loss 6.0257, time 125.91ms
iter 270110: loss 5.6276, time 125.98ms
iter 270120: loss 5.8269, time 126.42ms
iter 270130: loss 5.7027, time 125.98ms
iter 270140: loss 6.4932, time 126.01ms
iter 270150: loss 6.2776, time 125.31ms
iter 270160: loss 5.7237, time 125.32ms
iter 270170: loss 6.0642, time 125.22ms
iter 270180: loss 6.3885, time 125.30ms
iter 270190: loss 5.8078, time 125.41ms
iter 270200: loss 7.0327, time 125.38ms
iter 270210: loss 5.8268, time 125.44ms
iter 270220: loss 5.5658, time 127.94ms
iter 270230: loss 5.8449, time 125.81ms
iter 270240: loss 6.6359, time 126.25ms
step 270250: train loss 5.6660, val loss 5.6811
saving checkpoint to out-shakespeare-char
iter 270250: loss 6.9327, time 2913.22ms
iter 270260: loss 6.2349, time 125.69ms
iter 270270: loss 6.3442, time 124.84ms
iter 270280: loss 6.7736, time 125.43ms
iter 270290: loss 5.7460, time 125.57ms
iter 270300: loss 5.7874, time 125.54ms
iter 270310: loss 5.9630, time 125.49ms
iter 270320: loss 6.3327, time 125.37ms
iter 270330: loss 5.8496, time 125.39ms
iter 270340: loss 5.9066, time 125.92ms
iter 270350: loss 6.2496, time 127.55ms
iter 270360: loss 6.2872, time 125.44ms
iter 270370: loss 4.9823, time 125.42ms
iter 270380: loss 5.5040, time 125.40ms
iter 270390: loss 5.6275, time 125.27ms
iter 270400: loss 5.4821, time 125.26ms
iter 270410: loss 5.9790, time 125.28ms
iter 270420: loss 6.1446, time 125.17ms
iter 270430: loss 7.0944, time 125.20ms
iter 270440: loss 5.6659, time 125.11ms
iter 270450: loss 6.0360, time 125.16ms
iter 270460: loss 5.3995, time 128.36ms
iter 270470: loss 6.0051, time 125.17ms
iter 270480: loss 6.4462, time 125.31ms
iter 270490: loss 6.3553, time 125.37ms
step 270500: train loss 5.6862, val loss 5.6474
saving checkpoint to out-shakespeare-char
iter 270500: loss 6.0226, time 2891.23ms
iter 270510: loss 6.3131, time 125.34ms
iter 270520: loss 6.2284, time 125.24ms
iter 270530: loss 6.1831, time 125.25ms
iter 270540: loss 5.2763, time 125.10ms
iter 270550: loss 6.3102, time 125.44ms
iter 270560: loss 5.5565, time 125.32ms
iter 270570: loss 6.1424, time 125.20ms
iter 270580: loss 5.8431, time 125.13ms
iter 270590: loss 5.7820, time 124.58ms
iter 270600: loss 6.0466, time 125.31ms
iter 270610: loss 6.1924, time 124.22ms
iter 270620: loss 7.1087, time 125.25ms
iter 270630: loss 5.4722, time 124.74ms
iter 270640: loss 6.0441, time 125.23ms
iter 270650: loss 5.7439, time 125.14ms
iter 270660: loss 6.6934, time 125.31ms
iter 270670: loss 6.2461, time 124.64ms
iter 270680: loss 6.0344, time 125.16ms
iter 270690: loss 5.5364, time 124.92ms
iter 270700: loss 5.5101, time 128.15ms
iter 270710: loss 6.6351, time 125.02ms
iter 270720: loss 6.7104, time 125.26ms
iter 270730: loss 6.3676, time 124.88ms
iter 270740: loss 6.4173, time 125.51ms
step 270750: train loss 5.6541, val loss 5.6677
saving checkpoint to out-shakespeare-char
iter 270750: loss 5.7805, time 2888.06ms
iter 270760: loss 6.4437, time 124.98ms
iter 270770: loss 5.7361, time 125.60ms
iter 270780: loss 5.6021, time 124.06ms
iter 270790: loss 6.0959, time 125.23ms
iter 270800: loss 6.4296, time 127.37ms
iter 270810: loss 5.6902, time 124.89ms
iter 270820: loss 6.6946, time 124.07ms
iter 270830: loss 6.4966, time 124.24ms
iter 270840: loss 6.6470, time 124.20ms
iter 270850: loss 6.3357, time 125.00ms
iter 270860: loss 5.8663, time 126.29ms
iter 270870: loss 6.0705, time 128.22ms
iter 270880: loss 5.9809, time 125.37ms
iter 270890: loss 5.8447, time 125.34ms
iter 270900: loss 6.2947, time 124.62ms
iter 270910: loss 6.6756, time 124.47ms
iter 270920: loss 6.4350, time 125.56ms
iter 270930: loss 5.9042, time 124.55ms
iter 270940: loss 6.0479, time 125.15ms
iter 270950: loss 5.7509, time 125.20ms
iter 270960: loss 6.0501, time 125.62ms
iter 270970: loss 5.7157, time 125.55ms
iter 270980: loss 5.5844, time 128.22ms
iter 270990: loss 6.0139, time 125.28ms
step 271000: train loss 5.7096, val loss 5.6718
saving checkpoint to out-shakespeare-char
iter 271000: loss 7.0517, time 2914.10ms
iter 271010: loss 5.6772, time 126.56ms
iter 271020: loss 5.8693, time 125.66ms
iter 271030: loss 5.8364, time 125.25ms
iter 271040: loss 5.7032, time 125.22ms
iter 271050: loss 5.7937, time 124.61ms
iter 271060: loss 5.8220, time 124.64ms
iter 271070: loss 6.0607, time 126.31ms
iter 271080: loss 5.6377, time 124.86ms
iter 271090: loss 5.4689, time 124.88ms
iter 271100: loss 5.7636, time 124.41ms
iter 271110: loss 6.1692, time 126.21ms
iter 271120: loss 5.7033, time 126.05ms
iter 271130: loss 5.6402, time 126.57ms
iter 271140: loss 6.5379, time 128.84ms
iter 271150: loss 6.7876, time 125.92ms
iter 271160: loss 6.7092, time 124.78ms
iter 271170: loss 6.2269, time 125.61ms
iter 271180: loss 6.2030, time 128.35ms
iter 271190: loss 6.2848, time 125.92ms
iter 271200: loss 6.8960, time 126.27ms
iter 271210: loss 5.6793, time 125.82ms
iter 271220: loss 6.1449, time 125.81ms
iter 271230: loss 5.9490, time 125.37ms
iter 271240: loss 6.0845, time 125.52ms
step 271250: train loss 5.6641, val loss 5.6572
saving checkpoint to out-shakespeare-char
iter 271250: loss 5.8457, time 2887.03ms
iter 271260: loss 6.1286, time 125.63ms
iter 271270: loss 5.9927, time 125.47ms
iter 271280: loss 6.5278, time 125.46ms
iter 271290: loss 6.2875, time 125.87ms
iter 271300: loss 6.0714, time 125.88ms
iter 271310: loss 6.0271, time 125.76ms
iter 271320: loss 5.8046, time 128.58ms
iter 271330: loss 6.0013, time 125.83ms
iter 271340: loss 6.0145, time 126.37ms
iter 271350: loss 6.0397, time 127.54ms
iter 271360: loss 6.4566, time 125.88ms
iter 271370: loss 5.7745, time 125.97ms
iter 271380: loss 6.5568, time 124.71ms
iter 271390: loss 6.1451, time 125.72ms
iter 271400: loss 6.1119, time 128.77ms
iter 271410: loss 6.5676, time 125.63ms
iter 271420: loss 6.0064, time 125.96ms
iter 271430: loss 6.2713, time 126.03ms
iter 271440: loss 5.3945, time 126.11ms
iter 271450: loss 6.0712, time 125.76ms
iter 271460: loss 6.6087, time 126.12ms
iter 271470: loss 5.6570, time 125.78ms
iter 271480: loss 6.5113, time 125.69ms
iter 271490: loss 5.9983, time 125.87ms
step 271500: train loss 5.6677, val loss 5.7277
saving checkpoint to out-shakespeare-char
iter 271500: loss 6.3820, time 2881.46ms
iter 271510: loss 6.3434, time 126.03ms
iter 271520: loss 6.5023, time 126.36ms
iter 271530: loss 5.8617, time 125.88ms
iter 271540: loss 6.2558, time 128.19ms
iter 271550: loss 5.7884, time 125.57ms
iter 271560: loss 6.1252, time 125.98ms
iter 271570: loss 5.6163, time 127.04ms
iter 271580: loss 6.3833, time 125.17ms
iter 271590: loss 5.9449, time 125.88ms
iter 271600: loss 5.8188, time 125.88ms
iter 271610: loss 6.4279, time 125.78ms
iter 271620: loss 6.6846, time 125.73ms
iter 271630: loss 5.6828, time 125.98ms
iter 271640: loss 5.7986, time 126.33ms
iter 271650: loss 5.8580, time 128.54ms
iter 271660: loss 6.3787, time 126.06ms
iter 271670: loss 6.3441, time 126.27ms
iter 271680: loss 6.4298, time 125.88ms
iter 271690: loss 6.0005, time 125.95ms
iter 271700: loss 6.0829, time 125.93ms
iter 271710: loss 5.6864, time 126.42ms
iter 271720: loss 5.8675, time 125.59ms
iter 271730: loss 6.4484, time 126.01ms
iter 271740: loss 6.0397, time 125.87ms
step 271750: train loss 5.7320, val loss 5.7111
saving checkpoint to out-shakespeare-char
iter 271750: loss 6.4738, time 2898.78ms
iter 271760: loss 6.3931, time 125.72ms
iter 271770: loss 5.8241, time 125.92ms
iter 271780: loss 6.4170, time 125.46ms
iter 271790: loss 5.7526, time 125.72ms
iter 271800: loss 6.3714, time 125.87ms
iter 271810: loss 5.8890, time 125.88ms
iter 271820: loss 5.9777, time 125.99ms
iter 271830: loss 6.2294, time 125.91ms
iter 271840: loss 5.7893, time 125.86ms
iter 271850: loss 6.4514, time 125.76ms
iter 271860: loss 5.8572, time 126.20ms
iter 271870: loss 6.3156, time 125.80ms
iter 271880: loss 6.0416, time 125.72ms
iter 271890: loss 6.4057, time 125.48ms
iter 271900: loss 5.4093, time 126.41ms
iter 271910: loss 6.1500, time 125.09ms
iter 271920: loss 5.6002, time 125.69ms
iter 271930: loss 6.5708, time 125.59ms
iter 271940: loss 5.8143, time 125.66ms
iter 271950: loss 5.8926, time 125.36ms
iter 271960: loss 6.6005, time 124.95ms
iter 271970: loss 6.5066, time 125.28ms
iter 271980: loss 5.3010, time 128.29ms
iter 271990: loss 6.4983, time 125.91ms
step 272000: train loss 5.6684, val loss 5.6829
saving checkpoint to out-shakespeare-char
iter 272000: loss 6.4657, time 2902.93ms
iter 272010: loss 5.7075, time 125.38ms
iter 272020: loss 5.7299, time 125.07ms
iter 272030: loss 6.8988, time 125.52ms
iter 272040: loss 5.5132, time 125.50ms
iter 272050: loss 5.6151, time 125.27ms
iter 272060: loss 6.1314, time 128.19ms
iter 272070: loss 5.9386, time 125.40ms
iter 272080: loss 6.0584, time 125.12ms
iter 272090: loss 6.2742, time 125.48ms
iter 272100: loss 6.2572, time 125.23ms
iter 272110: loss 5.7374, time 125.93ms
iter 272120: loss 6.0272, time 124.97ms
iter 272130: loss 6.8411, time 125.18ms
iter 272140: loss 5.9860, time 125.30ms
iter 272150: loss 6.2821, time 125.13ms
iter 272160: loss 6.1482, time 124.47ms
iter 272170: loss 6.2794, time 128.08ms
iter 272180: loss 6.4996, time 125.27ms
iter 272190: loss 6.3646, time 125.74ms
iter 272200: loss 5.8416, time 125.50ms
iter 272210: loss 5.6476, time 128.20ms
iter 272220: loss 5.7630, time 125.40ms
iter 272230: loss 5.6806, time 124.05ms
iter 272240: loss 7.0437, time 125.53ms
step 272250: train loss 5.6804, val loss 5.6314
saving checkpoint to out-shakespeare-char
iter 272250: loss 6.5364, time 2902.27ms
iter 272260: loss 6.2827, time 125.60ms
iter 272270: loss 6.1830, time 129.74ms
iter 272280: loss 6.3511, time 125.88ms
iter 272290: loss 5.5964, time 125.95ms
iter 272300: loss 6.6946, time 125.47ms
iter 272310: loss 5.2570, time 125.93ms
iter 272320: loss 5.4900, time 125.57ms
iter 272330: loss 6.1482, time 125.90ms
iter 272340: loss 4.8656, time 125.56ms
iter 272350: loss 6.2651, time 125.69ms
iter 272360: loss 5.9539, time 125.65ms
iter 272370: loss 5.7797, time 125.14ms
iter 272380: loss 6.4605, time 128.40ms
iter 272390: loss 6.2869, time 125.50ms
iter 272400: loss 5.7952, time 125.62ms
iter 272410: loss 6.7845, time 125.66ms
iter 272420: loss 5.9700, time 125.67ms
iter 272430: loss 6.4239, time 125.57ms
iter 272440: loss 5.9237, time 124.75ms
iter 272450: loss 5.9328, time 125.89ms
iter 272460: loss 6.3712, time 125.58ms
iter 272470: loss 6.3064, time 125.69ms
iter 272480: loss 5.6329, time 125.94ms
iter 272490: loss 6.1305, time 128.54ms
step 272500: train loss 5.6991, val loss 5.6770
saving checkpoint to out-shakespeare-char
iter 272500: loss 6.9557, time 2889.55ms
iter 272510: loss 5.8002, time 125.92ms
iter 272520: loss 6.0971, time 126.30ms
iter 272530: loss 6.5772, time 125.76ms
iter 272540: loss 5.9270, time 124.95ms
iter 272550: loss 5.8576, time 128.76ms
iter 272560: loss 5.7997, time 125.84ms
iter 272570: loss 6.2898, time 125.59ms
iter 272580: loss 6.1807, time 125.95ms
iter 272590: loss 5.9222, time 125.53ms
iter 272600: loss 6.7877, time 125.94ms
iter 272610: loss 6.1803, time 125.57ms
iter 272620: loss 6.6845, time 125.67ms
iter 272630: loss 6.3782, time 126.08ms
iter 272640: loss 5.6387, time 126.19ms
iter 272650: loss 6.2631, time 126.00ms
iter 272660: loss 6.8024, time 125.10ms
iter 272670: loss 6.4183, time 125.48ms
iter 272680: loss 6.7102, time 125.70ms
iter 272690: loss 6.2721, time 125.43ms
iter 272700: loss 6.9669, time 128.22ms
iter 272710: loss 5.6884, time 125.36ms
iter 272720: loss 6.2710, time 125.39ms
iter 272730: loss 6.5278, time 126.49ms
iter 272740: loss 5.6078, time 125.56ms
step 272750: train loss 5.6956, val loss 5.6547
saving checkpoint to out-shakespeare-char
iter 272750: loss 6.4167, time 2895.73ms
iter 272760: loss 5.8676, time 125.31ms
iter 272770: loss 5.4987, time 125.24ms
iter 272780: loss 6.3050, time 125.12ms
iter 272790: loss 5.7737, time 125.08ms
iter 272800: loss 5.6948, time 125.23ms
iter 272810: loss 5.9167, time 125.08ms
iter 272820: loss 6.2345, time 124.82ms
iter 272830: loss 6.4207, time 128.21ms
iter 272840: loss 6.0885, time 124.99ms
iter 272850: loss 5.9728, time 125.06ms
iter 272860: loss 5.1551, time 125.06ms
iter 272870: loss 6.1000, time 125.06ms
iter 272880: loss 6.5060, time 124.53ms
iter 272890: loss 6.0278, time 125.17ms
iter 272900: loss 6.0195, time 124.73ms
iter 272910: loss 6.6159, time 125.11ms
iter 272920: loss 6.4787, time 124.91ms
iter 272930: loss 6.5058, time 124.31ms
iter 272940: loss 5.5335, time 127.94ms
iter 272950: loss 6.4221, time 125.00ms
iter 272960: loss 5.4440, time 124.99ms
iter 272970: loss 5.8071, time 125.04ms
iter 272980: loss 6.3956, time 124.96ms
iter 272990: loss 5.7504, time 124.94ms
step 273000: train loss 5.6634, val loss 5.6930
saving checkpoint to out-shakespeare-char
iter 273000: loss 6.0091, time 2889.33ms
iter 273010: loss 6.6140, time 125.65ms
iter 273020: loss 5.9986, time 125.64ms
iter 273030: loss 6.0021, time 126.72ms
iter 273040: loss 6.6772, time 125.33ms
iter 273050: loss 5.8953, time 124.96ms
iter 273060: loss 6.7363, time 125.44ms
iter 273070: loss 6.2116, time 125.38ms
iter 273080: loss 6.8342, time 125.71ms
iter 273090: loss 6.5591, time 128.43ms
iter 273100: loss 6.0119, time 125.01ms
iter 273110: loss 6.1154, time 128.36ms
iter 273120: loss 5.8585, time 125.64ms
iter 273130: loss 6.3996, time 124.79ms
iter 273140: loss 6.4063, time 125.26ms
iter 273150: loss 6.4581, time 125.73ms
iter 273160: loss 6.3452, time 125.70ms
iter 273170: loss 5.8124, time 125.78ms
iter 273180: loss 4.7110, time 125.86ms
iter 273190: loss 6.1550, time 125.68ms
iter 273200: loss 6.0254, time 125.51ms
iter 273210: loss 6.3679, time 125.56ms
iter 273220: loss 6.3401, time 128.45ms
iter 273230: loss 5.6554, time 125.62ms
iter 273240: loss 6.1362, time 129.08ms
step 273250: train loss 5.7099, val loss 5.6820
saving checkpoint to out-shakespeare-char
iter 273250: loss 6.6734, time 2887.37ms
iter 273260: loss 5.0205, time 125.75ms
iter 273270: loss 6.8101, time 125.78ms
iter 273280: loss 5.7869, time 128.38ms
iter 273290: loss 5.9741, time 125.38ms
iter 273300: loss 6.2269, time 125.56ms
iter 273310: loss 5.6176, time 125.22ms
iter 273320: loss 6.2186, time 125.70ms
iter 273330: loss 6.1048, time 125.61ms
iter 273340: loss 6.2623, time 125.75ms
iter 273350: loss 5.2688, time 125.88ms
iter 273360: loss 6.8712, time 126.21ms
iter 273370: loss 6.0679, time 125.51ms
iter 273380: loss 6.1342, time 126.03ms
iter 273390: loss 6.2548, time 128.54ms
iter 273400: loss 5.9782, time 125.69ms
iter 273410: loss 6.8571, time 125.83ms
iter 273420: loss 6.4420, time 125.72ms
iter 273430: loss 6.4337, time 125.91ms
iter 273440: loss 6.5840, time 125.67ms
iter 273450: loss 6.2767, time 125.71ms
iter 273460: loss 5.9529, time 125.69ms
iter 273470: loss 6.9440, time 125.32ms
iter 273480: loss 5.7182, time 125.54ms
iter 273490: loss 6.1529, time 125.75ms
step 273500: train loss 5.6766, val loss 5.6821
saving checkpoint to out-shakespeare-char
iter 273500: loss 7.0026, time 2918.83ms
iter 273510: loss 5.2949, time 125.12ms
iter 273520: loss 5.9221, time 125.48ms
iter 273530: loss 5.9081, time 124.45ms
iter 273540: loss 5.4910, time 125.51ms
iter 273550: loss 6.3155, time 125.62ms
iter 273560: loss 6.6778, time 128.20ms
iter 273570: loss 5.7690, time 125.32ms
iter 273580: loss 6.6683, time 125.91ms
iter 273590: loss 6.0683, time 124.41ms
iter 273600: loss 5.7747, time 124.65ms
iter 273610: loss 6.3782, time 127.25ms
iter 273620: loss 5.9789, time 125.60ms
iter 273630: loss 5.7162, time 125.08ms
iter 273640: loss 6.2575, time 125.76ms
iter 273650: loss 6.5142, time 125.65ms
iter 273660: loss 6.4181, time 125.87ms
iter 273670: loss 5.5017, time 128.97ms
iter 273680: loss 5.4681, time 125.83ms
iter 273690: loss 6.4998, time 126.09ms
iter 273700: loss 5.8953, time 124.67ms
iter 273710: loss 5.8779, time 124.78ms
iter 273720: loss 6.0080, time 125.26ms
iter 273730: loss 6.1254, time 125.39ms
iter 273740: loss 6.0288, time 128.89ms
step 273750: train loss 5.6430, val loss 5.6691
saving checkpoint to out-shakespeare-char
iter 273750: loss 6.0817, time 2877.74ms
iter 273760: loss 5.8184, time 125.80ms
iter 273770: loss 5.8030, time 126.40ms
iter 273780: loss 5.9381, time 125.55ms
iter 273790: loss 6.4833, time 126.73ms
iter 273800: loss 6.0766, time 129.06ms
iter 273810: loss 5.9645, time 124.57ms
iter 273820: loss 5.7584, time 125.65ms
iter 273830: loss 5.5176, time 125.59ms
iter 273840: loss 6.9508, time 125.87ms
iter 273850: loss 6.1069, time 125.64ms
iter 273860: loss 6.5518, time 125.21ms
iter 273870: loss 6.3137, time 124.75ms
iter 273880: loss 6.0909, time 125.04ms
iter 273890: loss 6.6143, time 124.82ms
iter 273900: loss 5.5507, time 125.20ms
iter 273910: loss 6.2367, time 128.09ms
iter 273920: loss 5.9986, time 125.28ms
iter 273930: loss 6.4639, time 125.16ms
iter 273940: loss 6.4620, time 125.31ms
iter 273950: loss 5.9287, time 125.41ms
iter 273960: loss 6.2419, time 125.30ms
iter 273970: loss 5.8690, time 126.16ms
iter 273980: loss 6.3228, time 125.04ms
iter 273990: loss 6.4464, time 125.38ms
step 274000: train loss 5.7006, val loss 5.6493
saving checkpoint to out-shakespeare-char
iter 274000: loss 6.4804, time 2907.49ms
iter 274010: loss 6.1763, time 125.39ms
iter 274020: loss 5.8853, time 125.24ms
iter 274030: loss 6.3032, time 124.93ms
iter 274040: loss 5.8386, time 128.74ms
iter 274050: loss 6.2205, time 125.74ms
iter 274060: loss 6.5562, time 125.95ms
iter 274070: loss 6.2569, time 127.59ms
iter 274080: loss 6.6968, time 125.72ms
iter 274090: loss 6.3921, time 126.54ms
iter 274100: loss 6.6843, time 126.53ms
iter 274110: loss 5.9308, time 125.54ms
iter 274120: loss 5.9400, time 125.92ms
iter 274130: loss 6.3313, time 126.16ms
iter 274140: loss 6.1512, time 124.70ms
iter 274150: loss 6.9144, time 128.76ms
iter 274160: loss 6.2756, time 125.76ms
iter 274170: loss 5.8827, time 127.22ms
iter 274180: loss 5.7800, time 125.73ms
iter 274190: loss 5.6968, time 125.88ms
iter 274200: loss 6.4148, time 125.70ms
iter 274210: loss 5.9122, time 125.48ms
iter 274220: loss 6.5310, time 125.70ms
iter 274230: loss 5.3469, time 125.55ms
iter 274240: loss 5.5635, time 125.43ms
step 274250: train loss 5.7322, val loss 5.6926
saving checkpoint to out-shakespeare-char
iter 274250: loss 6.7859, time 2901.70ms
iter 274260: loss 6.2867, time 125.71ms
iter 274270: loss 5.6843, time 125.58ms
iter 274280: loss 5.9861, time 125.48ms
iter 274290: loss 5.8805, time 125.71ms
iter 274300: loss 6.0792, time 125.54ms
iter 274310: loss 5.3718, time 124.78ms
iter 274320: loss 6.7329, time 125.74ms
iter 274330: loss 6.5975, time 125.73ms
iter 274340: loss 6.1756, time 128.52ms
iter 274350: loss 6.1706, time 125.43ms
iter 274360: loss 5.8619, time 125.78ms
iter 274370: loss 6.2541, time 124.84ms
iter 274380: loss 6.3450, time 124.86ms
iter 274390: loss 6.2572, time 126.04ms
iter 274400: loss 6.0972, time 126.52ms
iter 274410: loss 6.0246, time 125.40ms
iter 274420: loss 6.2272, time 125.63ms
iter 274430: loss 6.4347, time 125.65ms
iter 274440: loss 5.7633, time 125.25ms
iter 274450: loss 5.2854, time 128.58ms
iter 274460: loss 5.8920, time 125.47ms
iter 274470: loss 6.2389, time 125.64ms
iter 274480: loss 6.6206, time 125.90ms
iter 274490: loss 6.7608, time 125.73ms
step 274500: train loss 5.6453, val loss 5.6720
saving checkpoint to out-shakespeare-char
iter 274500: loss 5.7412, time 2883.31ms
iter 274510: loss 5.9877, time 125.38ms
iter 274520: loss 6.7803, time 125.93ms
iter 274530: loss 5.2536, time 125.62ms
iter 274540: loss 5.7989, time 126.17ms
iter 274550: loss 5.6981, time 128.88ms
iter 274560: loss 5.4317, time 125.89ms
iter 274570: loss 6.2215, time 125.44ms
iter 274580: loss 7.2026, time 125.24ms
iter 274590: loss 5.8924, time 125.96ms
iter 274600: loss 6.3576, time 125.59ms
iter 274610: loss 5.6920, time 125.94ms
iter 274620: loss 5.9084, time 126.07ms
iter 274630: loss 6.1575, time 125.81ms
iter 274640: loss 6.2743, time 126.69ms
iter 274650: loss 5.6602, time 125.54ms
iter 274660: loss 6.2833, time 127.51ms
iter 274670: loss 6.3245, time 125.73ms
iter 274680: loss 6.2053, time 126.00ms
iter 274690: loss 5.9803, time 125.61ms
iter 274700: loss 6.1827, time 125.87ms
iter 274710: loss 6.2409, time 125.95ms
iter 274720: loss 5.4191, time 125.70ms
iter 274730: loss 6.2065, time 125.79ms
iter 274740: loss 6.0182, time 126.05ms
step 274750: train loss 5.7026, val loss 5.6526
saving checkpoint to out-shakespeare-char
iter 274750: loss 5.5179, time 2878.62ms
iter 274760: loss 5.6746, time 125.03ms
iter 274770: loss 6.8698, time 125.41ms
iter 274780: loss 5.8630, time 125.46ms
iter 274790: loss 6.3234, time 126.04ms
iter 274800: loss 6.1767, time 125.04ms
iter 274810: loss 6.2368, time 125.15ms
iter 274820: loss 6.3360, time 126.45ms
iter 274830: loss 5.9817, time 125.58ms
iter 274840: loss 6.2468, time 125.66ms
iter 274850: loss 6.2264, time 126.02ms
iter 274860: loss 5.7275, time 125.93ms
iter 274870: loss 6.5543, time 125.56ms
iter 274880: loss 5.4980, time 125.50ms
iter 274890: loss 6.0447, time 125.83ms
iter 274900: loss 6.4575, time 124.75ms
iter 274910: loss 7.1279, time 128.62ms
iter 274920: loss 6.0622, time 125.77ms
iter 274930: loss 6.0677, time 125.84ms
iter 274940: loss 7.5767, time 126.32ms
iter 274950: loss 6.4184, time 125.93ms
iter 274960: loss 6.0401, time 126.31ms
iter 274970: loss 5.9474, time 125.84ms
iter 274980: loss 6.0416, time 124.64ms
iter 274990: loss 5.6256, time 124.48ms
step 275000: train loss 5.6668, val loss 5.6859
saving checkpoint to out-shakespeare-char
iter 275000: loss 6.2889, time 2881.97ms
iter 275010: loss 6.5382, time 123.85ms
iter 275020: loss 5.2841, time 126.36ms
iter 275030: loss 6.5198, time 125.45ms
iter 275040: loss 6.0298, time 125.54ms
iter 275050: loss 5.9283, time 125.19ms
iter 275060: loss 6.6891, time 125.38ms
iter 275070: loss 5.7185, time 124.69ms
iter 275080: loss 6.0156, time 124.54ms
iter 275090: loss 5.8808, time 125.38ms
iter 275100: loss 5.3978, time 125.05ms
iter 275110: loss 6.4134, time 126.09ms
iter 275120: loss 6.8507, time 124.94ms
iter 275130: loss 6.1375, time 124.84ms
iter 275140: loss 5.7251, time 125.72ms
iter 275150: loss 6.3074, time 125.44ms
iter 275160: loss 6.3851, time 128.11ms
iter 275170: loss 6.2755, time 124.94ms
iter 275180: loss 5.6222, time 125.19ms
iter 275190: loss 5.3888, time 125.49ms
iter 275200: loss 5.8015, time 125.21ms
iter 275210: loss 6.2556, time 125.19ms
iter 275220: loss 6.2986, time 125.21ms
iter 275230: loss 6.2741, time 125.67ms
iter 275240: loss 5.6747, time 125.28ms
step 275250: train loss 5.7115, val loss 5.6856
saving checkpoint to out-shakespeare-char
iter 275250: loss 5.5883, time 2872.43ms
iter 275260: loss 5.9281, time 125.57ms
iter 275270: loss 4.8415, time 125.39ms
iter 275280: loss 6.6115, time 125.09ms
iter 275290: loss 6.5624, time 125.20ms
iter 275300: loss 6.4527, time 125.89ms
iter 275310: loss 5.7644, time 125.11ms
iter 275320: loss 6.3328, time 125.58ms
iter 275330: loss 6.0782, time 128.07ms
iter 275340: loss 5.4135, time 125.22ms
iter 275350: loss 5.8245, time 124.95ms
iter 275360: loss 5.8633, time 125.25ms
iter 275370: loss 6.2527, time 125.46ms
iter 275380: loss 5.9479, time 125.23ms
iter 275390: loss 5.9545, time 123.54ms
iter 275400: loss 5.8707, time 123.99ms
iter 275410: loss 5.9392, time 128.30ms
iter 275420: loss 6.0307, time 125.83ms
iter 275430: loss 6.4418, time 125.49ms
iter 275440: loss 6.3163, time 125.44ms
iter 275450: loss 5.9095, time 125.94ms
iter 275460: loss 6.0080, time 126.71ms
iter 275470: loss 5.8706, time 126.08ms
iter 275480: loss 6.3189, time 125.85ms
iter 275490: loss 6.5554, time 124.64ms
step 275500: train loss 5.6201, val loss 5.6810
saving checkpoint to out-shakespeare-char
iter 275500: loss 6.9671, time 2892.27ms
iter 275510: loss 6.2897, time 126.20ms
iter 275520: loss 6.2546, time 126.02ms
iter 275530: loss 6.2690, time 125.19ms
iter 275540: loss 6.5731, time 127.95ms
iter 275550: loss 5.7198, time 126.02ms
iter 275560: loss 5.3523, time 125.85ms
iter 275570: loss 6.0179, time 125.72ms
iter 275580: loss 5.1667, time 125.02ms
iter 275590: loss 6.1324, time 125.93ms
iter 275600: loss 6.1139, time 125.52ms
iter 275610: loss 6.3258, time 125.82ms
iter 275620: loss 6.2637, time 125.24ms
iter 275630: loss 6.2163, time 125.37ms
iter 275640: loss 5.8998, time 125.10ms
iter 275650: loss 6.2208, time 128.59ms
iter 275660: loss 6.5359, time 125.87ms
iter 275670: loss 5.5940, time 125.54ms
iter 275680: loss 6.1530, time 125.54ms
iter 275690: loss 5.6351, time 125.38ms
iter 275700: loss 5.9113, time 125.53ms
iter 275710: loss 6.5300, time 125.86ms
iter 275720: loss 6.4349, time 125.70ms
iter 275730: loss 6.8359, time 125.81ms
iter 275740: loss 6.1100, time 125.07ms
step 275750: train loss 5.7219, val loss 5.6510
saving checkpoint to out-shakespeare-char
iter 275750: loss 5.6848, time 2870.80ms
iter 275760: loss 6.1023, time 125.91ms
iter 275770: loss 5.8243, time 124.89ms
iter 275780: loss 6.1603, time 125.48ms
iter 275790: loss 5.7666, time 124.44ms
iter 275800: loss 6.3099, time 125.79ms
iter 275810: loss 5.8172, time 127.80ms
iter 275820: loss 5.5852, time 125.65ms
iter 275830: loss 5.6847, time 125.80ms
iter 275840: loss 5.4319, time 125.65ms
iter 275850: loss 6.2557, time 125.55ms
iter 275860: loss 5.5011, time 125.59ms
iter 275870: loss 6.2421, time 125.72ms
iter 275880: loss 5.9658, time 125.61ms
iter 275890: loss 5.8399, time 125.71ms
iter 275900: loss 6.4985, time 125.69ms
iter 275910: loss 6.1781, time 125.49ms
iter 275920: loss 5.5547, time 128.46ms
iter 275930: loss 5.5353, time 125.85ms
iter 275940: loss 5.8398, time 126.15ms
iter 275950: loss 5.8669, time 125.61ms
iter 275960: loss 6.1489, time 128.63ms
iter 275970: loss 5.5414, time 125.34ms
iter 275980: loss 5.3390, time 126.25ms
iter 275990: loss 5.9672, time 125.32ms
step 276000: train loss 5.6710, val loss 5.6307
saving checkpoint to out-shakespeare-char
iter 276000: loss 5.7797, time 2872.91ms
iter 276010: loss 5.1801, time 125.70ms
iter 276020: loss 6.0468, time 125.60ms
iter 276030: loss 5.4216, time 125.19ms
iter 276040: loss 5.9557, time 125.31ms
iter 276050: loss 5.9950, time 125.17ms
iter 276060: loss 6.3529, time 125.51ms
iter 276070: loss 6.2332, time 125.32ms
iter 276080: loss 5.9191, time 127.51ms
iter 276090: loss 5.6749, time 124.98ms
iter 276100: loss 5.7454, time 127.50ms
iter 276110: loss 6.4878, time 125.09ms
iter 276120: loss 6.5756, time 125.18ms
iter 276130: loss 6.4103, time 128.81ms
iter 276140: loss 6.2807, time 124.93ms
iter 276150: loss 5.9522, time 125.30ms
iter 276160: loss 6.6480, time 125.48ms
iter 276170: loss 6.3832, time 124.65ms
iter 276180: loss 5.7951, time 125.17ms
iter 276190: loss 6.2692, time 125.69ms
iter 276200: loss 6.2648, time 125.21ms
iter 276210: loss 6.0965, time 124.87ms
iter 276220: loss 5.4959, time 125.16ms
iter 276230: loss 6.7126, time 125.72ms
iter 276240: loss 5.7788, time 128.25ms
step 276250: train loss 5.6109, val loss 5.6890
saving checkpoint to out-shakespeare-char
iter 276250: loss 5.8829, time 2882.03ms
iter 276260: loss 6.2604, time 125.56ms
iter 276270: loss 5.6318, time 125.39ms
iter 276280: loss 5.5262, time 125.57ms
iter 276290: loss 5.9677, time 125.35ms
iter 276300: loss 5.2575, time 128.22ms
iter 276310: loss 6.1009, time 125.03ms
iter 276320: loss 5.7803, time 125.32ms
iter 276330: loss 5.5350, time 126.36ms
iter 276340: loss 6.3090, time 125.37ms
iter 276350: loss 5.5832, time 125.53ms
iter 276360: loss 6.0757, time 125.04ms
iter 276370: loss 5.9211, time 125.13ms
iter 276380: loss 6.8744, time 125.12ms
iter 276390: loss 6.3429, time 125.84ms
iter 276400: loss 5.7558, time 125.87ms
iter 276410: loss 6.5506, time 125.59ms
iter 276420: loss 6.0194, time 125.77ms
iter 276430: loss 6.0903, time 125.76ms
iter 276440: loss 5.7973, time 126.20ms
iter 276450: loss 6.1876, time 128.76ms
iter 276460: loss 6.6898, time 126.10ms
iter 276470: loss 5.5788, time 125.92ms
iter 276480: loss 5.6505, time 126.00ms
iter 276490: loss 6.2811, time 125.73ms
step 276500: train loss 5.6958, val loss 5.6988
saving checkpoint to out-shakespeare-char
iter 276500: loss 6.3911, time 2883.55ms
iter 276510: loss 5.9964, time 124.10ms
iter 276520: loss 6.4909, time 122.10ms
iter 276530: loss 6.2770, time 123.36ms
iter 276540: loss 5.9130, time 121.82ms
iter 276550: loss 6.7691, time 122.82ms
iter 276560: loss 5.4539, time 121.31ms
iter 276570: loss 6.5290, time 119.81ms
iter 276580: loss 6.5399, time 121.93ms
iter 276590: loss 6.6116, time 124.72ms
iter 276600: loss 5.8977, time 126.09ms
iter 276610: loss 5.7254, time 126.90ms
iter 276620: loss 5.7819, time 125.87ms
iter 276630: loss 6.3579, time 125.63ms
iter 276640: loss 5.6659, time 125.58ms
iter 276650: loss 5.9736, time 125.82ms
iter 276660: loss 6.3672, time 126.16ms
iter 276670: loss 6.1730, time 128.14ms
iter 276680: loss 6.4448, time 125.51ms
iter 276690: loss 5.7068, time 125.22ms
iter 276700: loss 6.1135, time 125.67ms
iter 276710: loss 5.8618, time 127.85ms
iter 276720: loss 5.8094, time 124.34ms
iter 276730: loss 6.8020, time 125.57ms
iter 276740: loss 5.7941, time 125.43ms
step 276750: train loss 5.7144, val loss 5.6957
saving checkpoint to out-shakespeare-char
iter 276750: loss 6.0315, time 2869.18ms
iter 276760: loss 5.9672, time 124.17ms
iter 276770: loss 5.6574, time 128.29ms
iter 276780: loss 6.0896, time 125.37ms
iter 276790: loss 5.8171, time 125.18ms
iter 276800: loss 5.8075, time 124.71ms
iter 276810: loss 6.3095, time 125.22ms
iter 276820: loss 5.7300, time 125.86ms
iter 276830: loss 5.3492, time 126.61ms
iter 276840: loss 5.5806, time 126.09ms
iter 276850: loss 6.4338, time 128.86ms
iter 276860: loss 5.9832, time 126.16ms
iter 276870: loss 5.8810, time 125.63ms
iter 276880: loss 5.8997, time 126.15ms
iter 276890: loss 6.2089, time 127.50ms
iter 276900: loss 6.7877, time 125.07ms
iter 276910: loss 5.5859, time 126.20ms
iter 276920: loss 6.3318, time 126.14ms
iter 276930: loss 6.7966, time 125.93ms
iter 276940: loss 6.7611, time 126.03ms
iter 276950: loss 6.1798, time 124.87ms
iter 276960: loss 6.2469, time 125.69ms
iter 276970: loss 6.4710, time 128.70ms
iter 276980: loss 6.3166, time 124.49ms
iter 276990: loss 6.3808, time 125.69ms
step 277000: train loss 5.6770, val loss 5.6532
saving checkpoint to out-shakespeare-char
iter 277000: loss 6.4053, time 2872.95ms
iter 277010: loss 5.2704, time 127.41ms
iter 277020: loss 6.4454, time 125.94ms
iter 277030: loss 6.4205, time 125.75ms
iter 277040: loss 6.4632, time 126.64ms
iter 277050: loss 6.0766, time 126.43ms
iter 277060: loss 6.5653, time 126.28ms
iter 277070: loss 6.1284, time 125.26ms
iter 277080: loss 6.8741, time 125.61ms
iter 277090: loss 5.8741, time 127.96ms
iter 277100: loss 6.6308, time 127.34ms
iter 277110: loss 6.1557, time 125.07ms
iter 277120: loss 5.6521, time 125.22ms
iter 277130: loss 6.8987, time 128.45ms
iter 277140: loss 6.0872, time 125.16ms
iter 277150: loss 5.7589, time 125.21ms
iter 277160: loss 5.5691, time 125.08ms
iter 277170: loss 6.3670, time 125.42ms
iter 277180: loss 5.6885, time 126.67ms
iter 277190: loss 6.5356, time 126.08ms
iter 277200: loss 5.9683, time 127.51ms
iter 277210: loss 5.0960, time 125.64ms
iter 277220: loss 6.2456, time 124.62ms
iter 277230: loss 5.8021, time 126.32ms
iter 277240: loss 5.5533, time 124.93ms
step 277250: train loss 5.7135, val loss 5.6910
saving checkpoint to out-shakespeare-char
iter 277250: loss 6.2208, time 2906.64ms
iter 277260: loss 6.2960, time 129.05ms
iter 277270: loss 6.2883, time 125.76ms
iter 277280: loss 5.4747, time 125.91ms
iter 277290: loss 5.7748, time 126.70ms
iter 277300: loss 6.1342, time 125.27ms
iter 277310: loss 5.4566, time 125.08ms
iter 277320: loss 5.8552, time 125.34ms
iter 277330: loss 5.7440, time 125.09ms
iter 277340: loss 5.2652, time 125.17ms
iter 277350: loss 5.5183, time 125.10ms
iter 277360: loss 6.7668, time 125.06ms
iter 277370: loss 6.0817, time 128.49ms
iter 277380: loss 5.7677, time 124.96ms
iter 277390: loss 5.6977, time 124.97ms
iter 277400: loss 6.0259, time 125.17ms
iter 277410: loss 5.7380, time 127.91ms
iter 277420: loss 5.3605, time 125.62ms
iter 277430: loss 5.7896, time 125.00ms
iter 277440: loss 6.1834, time 125.32ms
iter 277450: loss 6.1443, time 125.05ms
iter 277460: loss 6.3121, time 125.18ms
iter 277470: loss 6.4998, time 125.01ms
iter 277480: loss 5.4104, time 125.04ms
iter 277490: loss 6.1417, time 125.14ms
step 277500: train loss 5.6507, val loss 5.6504
saving checkpoint to out-shakespeare-char
iter 277500: loss 6.0030, time 2900.40ms
iter 277510: loss 5.8736, time 128.48ms
iter 277520: loss 6.4864, time 125.04ms
iter 277530: loss 6.2213, time 124.94ms
iter 277540: loss 6.1793, time 125.50ms
iter 277550: loss 6.0212, time 125.20ms
iter 277560: loss 6.6854, time 125.42ms
iter 277570: loss 5.6537, time 125.30ms
iter 277580: loss 6.1461, time 125.42ms
iter 277590: loss 5.6293, time 126.59ms
iter 277600: loss 5.6880, time 125.28ms
iter 277610: loss 6.0859, time 125.32ms
iter 277620: loss 6.2333, time 128.24ms
iter 277630: loss 6.3460, time 125.36ms
iter 277640: loss 6.5227, time 125.22ms
iter 277650: loss 6.3617, time 125.63ms
iter 277660: loss 6.3531, time 125.29ms
iter 277670: loss 6.2687, time 124.76ms
iter 277680: loss 6.3762, time 124.77ms
iter 277690: loss 6.6307, time 124.94ms
iter 277700: loss 6.2776, time 124.15ms
iter 277710: loss 5.3062, time 124.96ms
iter 277720: loss 6.1387, time 125.16ms
iter 277730: loss 6.1548, time 126.99ms
iter 277740: loss 6.7631, time 124.69ms
step 277750: train loss 5.6861, val loss 5.6967
saving checkpoint to out-shakespeare-char
iter 277750: loss 6.3121, time 2904.87ms
iter 277760: loss 6.9895, time 125.66ms
iter 277770: loss 5.8486, time 125.16ms
iter 277780: loss 5.9962, time 125.18ms
iter 277790: loss 6.1358, time 125.56ms
iter 277800: loss 6.4286, time 125.81ms
iter 277810: loss 6.3213, time 125.21ms
iter 277820: loss 5.9873, time 127.93ms
iter 277830: loss 5.9511, time 125.14ms
iter 277840: loss 5.7334, time 124.94ms
iter 277850: loss 5.8656, time 124.87ms
iter 277860: loss 6.4193, time 124.93ms
iter 277870: loss 6.5056, time 125.66ms
iter 277880: loss 6.1578, time 125.60ms
iter 277890: loss 5.1559, time 128.40ms
iter 277900: loss 6.5874, time 125.35ms
iter 277910: loss 6.5275, time 125.03ms
iter 277920: loss 6.5373, time 125.19ms
iter 277930: loss 6.2007, time 125.02ms
iter 277940: loss 6.0999, time 125.28ms
iter 277950: loss 6.0665, time 125.44ms
iter 277960: loss 5.4906, time 125.30ms
iter 277970: loss 6.3625, time 125.53ms
iter 277980: loss 6.5987, time 125.37ms
iter 277990: loss 6.1432, time 125.54ms
step 278000: train loss 5.6487, val loss 5.6679
saving checkpoint to out-shakespeare-char
iter 278000: loss 6.0343, time 2903.20ms
iter 278010: loss 6.2776, time 121.67ms
iter 278020: loss 6.0340, time 121.65ms
iter 278030: loss 6.7648, time 121.49ms
iter 278040: loss 6.5717, time 122.07ms
iter 278050: loss 6.1473, time 121.64ms
iter 278060: loss 6.1516, time 121.58ms
iter 278070: loss 6.3400, time 121.59ms
iter 278080: loss 6.2459, time 121.47ms
iter 278090: loss 6.1481, time 121.49ms
iter 278100: loss 5.8739, time 121.67ms
iter 278110: loss 5.3570, time 121.51ms
iter 278120: loss 6.0361, time 121.90ms
iter 278130: loss 6.5549, time 121.48ms
iter 278140: loss 5.5087, time 121.56ms
iter 278150: loss 5.9915, time 122.23ms
iter 278160: loss 6.5400, time 121.55ms
iter 278170: loss 6.4253, time 121.45ms
iter 278180: loss 6.0309, time 121.60ms
iter 278190: loss 6.1713, time 121.52ms
iter 278200: loss 6.2051, time 121.71ms
iter 278210: loss 6.9700, time 121.24ms
iter 278220: loss 6.4914, time 122.15ms
iter 278230: loss 5.2049, time 121.55ms
iter 278240: loss 5.4094, time 121.57ms
step 278250: train loss 5.7028, val loss 5.6992
saving checkpoint to out-shakespeare-char
iter 278250: loss 6.0316, time 2894.23ms
iter 278260: loss 6.4218, time 121.45ms
iter 278270: loss 6.0850, time 124.58ms
iter 278280: loss 5.9795, time 121.45ms
iter 278290: loss 5.3637, time 124.22ms
iter 278300: loss 6.0549, time 121.51ms
iter 278310: loss 6.0007, time 124.36ms
iter 278320: loss 6.3416, time 121.65ms
iter 278330: loss 6.1302, time 124.49ms
iter 278340: loss 6.5821, time 121.61ms
iter 278350: loss 6.3484, time 124.42ms
iter 278360: loss 5.6038, time 121.58ms
iter 278370: loss 5.9913, time 124.97ms
iter 278380: loss 5.6658, time 121.66ms
iter 278390: loss 5.9472, time 124.43ms
iter 278400: loss 6.1837, time 121.62ms
iter 278410: loss 6.5721, time 123.83ms
iter 278420: loss 6.2987, time 121.41ms
iter 278430: loss 6.0271, time 125.02ms
iter 278440: loss 6.2203, time 121.71ms
iter 278450: loss 6.4960, time 124.72ms
iter 278460: loss 6.7599, time 121.82ms
iter 278470: loss 6.6585, time 124.99ms
iter 278480: loss 6.6284, time 122.22ms
iter 278490: loss 6.3300, time 125.84ms
step 278500: train loss 5.6822, val loss 5.6796
saving checkpoint to out-shakespeare-char
iter 278500: loss 6.0371, time 2899.93ms
iter 278510: loss 6.5883, time 121.62ms
iter 278520: loss 6.5555, time 124.62ms
iter 278530: loss 5.6843, time 121.59ms
iter 278540: loss 6.0897, time 124.31ms
iter 278550: loss 6.4851, time 121.52ms
iter 278560: loss 6.3076, time 123.65ms
iter 278570: loss 6.3294, time 121.34ms
iter 278580: loss 5.5948, time 124.31ms
iter 278590: loss 6.2447, time 121.28ms
iter 278600: loss 6.2039, time 124.67ms
iter 278610: loss 5.5259, time 121.35ms
iter 278620: loss 6.1445, time 124.30ms
iter 278630: loss 5.2921, time 121.61ms
iter 278640: loss 6.4330, time 124.51ms
iter 278650: loss 6.9079, time 121.38ms
iter 278660: loss 6.3269, time 123.52ms
iter 278670: loss 5.9935, time 121.62ms
iter 278680: loss 6.2162, time 123.97ms
iter 278690: loss 6.4954, time 121.40ms
iter 278700: loss 5.2039, time 124.10ms
iter 278710: loss 6.1906, time 121.35ms
iter 278720: loss 5.4810, time 124.25ms
iter 278730: loss 6.4304, time 121.86ms
iter 278740: loss 5.5681, time 122.61ms
step 278750: train loss 5.6565, val loss 5.6975
saving checkpoint to out-shakespeare-char
iter 278750: loss 6.1630, time 2882.30ms
iter 278760: loss 6.3457, time 121.71ms
iter 278770: loss 5.7541, time 122.32ms
iter 278780: loss 6.7435, time 121.30ms
iter 278790: loss 6.2828, time 122.66ms
iter 278800: loss 5.6842, time 121.19ms
iter 278810: loss 6.1823, time 122.54ms
iter 278820: loss 6.5187, time 121.33ms
iter 278830: loss 6.2324, time 122.66ms
iter 278840: loss 6.5354, time 121.32ms
iter 278850: loss 5.9624, time 122.57ms
iter 278860: loss 6.1080, time 121.26ms
iter 278870: loss 6.3802, time 123.92ms
iter 278880: loss 6.2844, time 121.64ms
iter 278890: loss 5.5294, time 122.66ms
iter 278900: loss 5.9794, time 120.88ms
iter 278910: loss 6.2789, time 123.19ms
iter 278920: loss 5.6446, time 121.35ms
iter 278930: loss 5.9122, time 122.59ms
iter 278940: loss 6.5382, time 121.63ms
iter 278950: loss 5.2418, time 122.34ms
iter 278960: loss 6.2020, time 121.34ms
iter 278970: loss 6.1011, time 122.71ms
iter 278980: loss 6.2050, time 121.34ms
iter 278990: loss 6.6658, time 123.38ms
step 279000: train loss 5.6675, val loss 5.6825
saving checkpoint to out-shakespeare-char
iter 279000: loss 6.1128, time 2891.24ms
iter 279010: loss 5.8705, time 123.60ms
iter 279020: loss 6.0460, time 121.47ms
iter 279030: loss 5.8865, time 123.55ms
iter 279040: loss 6.3320, time 121.91ms
iter 279050: loss 6.0585, time 123.82ms
iter 279060: loss 5.9064, time 121.88ms
iter 279070: loss 6.6328, time 123.72ms
iter 279080: loss 5.5223, time 122.19ms
iter 279090: loss 6.3115, time 123.48ms
iter 279100: loss 6.1245, time 121.79ms
iter 279110: loss 6.0809, time 124.35ms
iter 279120: loss 6.0679, time 122.04ms
iter 279130: loss 6.1005, time 123.33ms
iter 279140: loss 6.5537, time 121.93ms
iter 279150: loss 6.4559, time 123.20ms
iter 279160: loss 6.3672, time 122.35ms
iter 279170: loss 5.8977, time 123.49ms
iter 279180: loss 6.7738, time 121.97ms
iter 279190: loss 5.9265, time 123.14ms
iter 279200: loss 5.8583, time 121.90ms
iter 279210: loss 6.3771, time 124.15ms
iter 279220: loss 6.2174, time 121.94ms
iter 279230: loss 6.2491, time 123.69ms
iter 279240: loss 7.0531, time 121.74ms
step 279250: train loss 5.6588, val loss 5.7047
saving checkpoint to out-shakespeare-char
iter 279250: loss 6.7616, time 2922.88ms
iter 279260: loss 5.8384, time 124.73ms
iter 279270: loss 5.7508, time 125.29ms
iter 279280: loss 6.2753, time 125.50ms
iter 279290: loss 5.9104, time 125.73ms
iter 279300: loss 5.7809, time 125.29ms
iter 279310: loss 6.3597, time 125.66ms
iter 279320: loss 5.6718, time 127.80ms
iter 279330: loss 5.6539, time 125.85ms
iter 279340: loss 6.4290, time 125.95ms
iter 279350: loss 5.6900, time 126.19ms
iter 279360: loss 5.4876, time 128.52ms
iter 279370: loss 6.1010, time 125.26ms
iter 279380: loss 5.8930, time 126.16ms
iter 279390: loss 6.0757, time 126.02ms
iter 279400: loss 5.5689, time 125.62ms
iter 279410: loss 5.7763, time 125.94ms
iter 279420: loss 5.6732, time 125.98ms
iter 279430: loss 6.5399, time 125.78ms
iter 279440: loss 5.8114, time 125.69ms
iter 279450: loss 5.7641, time 125.77ms
iter 279460: loss 5.6942, time 126.04ms
iter 279470: loss 5.5406, time 126.25ms
iter 279480: loss 5.8972, time 125.71ms
iter 279490: loss 6.2013, time 125.72ms
step 279500: train loss 5.6574, val loss 5.6441
saving checkpoint to out-shakespeare-char
iter 279500: loss 6.5189, time 2892.70ms
iter 279510: loss 5.8498, time 126.29ms
iter 279520: loss 5.2861, time 125.79ms
iter 279530: loss 5.7138, time 125.66ms
iter 279540: loss 5.6884, time 125.54ms
iter 279550: loss 5.7326, time 125.51ms
iter 279560: loss 7.0091, time 125.33ms
iter 279570: loss 5.7403, time 125.26ms
iter 279580: loss 6.1647, time 125.49ms
iter 279590: loss 5.8200, time 125.68ms
iter 279600: loss 6.2632, time 124.62ms
iter 279610: loss 6.6535, time 125.58ms
iter 279620: loss 6.3736, time 128.27ms
iter 279630: loss 7.0970, time 125.26ms
iter 279640: loss 6.1165, time 125.21ms
iter 279650: loss 6.1836, time 124.72ms
iter 279660: loss 5.5969, time 125.24ms
iter 279670: loss 6.1035, time 126.33ms
iter 279680: loss 6.0485, time 126.12ms
iter 279690: loss 6.1380, time 125.70ms
iter 279700: loss 5.7877, time 125.35ms
iter 279710: loss 5.6951, time 125.54ms
iter 279720: loss 6.0440, time 125.31ms
iter 279730: loss 5.8157, time 128.45ms
iter 279740: loss 5.7461, time 125.84ms
step 279750: train loss 5.6577, val loss 5.6523
saving checkpoint to out-shakespeare-char
iter 279750: loss 6.3392, time 2887.72ms
iter 279760: loss 6.3445, time 125.94ms
iter 279770: loss 5.8033, time 125.76ms
iter 279780: loss 6.4700, time 126.09ms
iter 279790: loss 6.6971, time 124.58ms
iter 279800: loss 6.1072, time 125.65ms
iter 279810: loss 5.2243, time 121.82ms
iter 279820: loss 5.7064, time 120.90ms
iter 279830: loss 6.8895, time 124.81ms
iter 279840: loss 5.6834, time 125.54ms
iter 279850: loss 5.9427, time 126.61ms
iter 279860: loss 6.1714, time 125.34ms
iter 279870: loss 6.6452, time 124.21ms
iter 279880: loss 5.7158, time 125.18ms
iter 279890: loss 5.7078, time 125.22ms
iter 279900: loss 5.8989, time 125.30ms
iter 279910: loss 5.5965, time 126.68ms
iter 279920: loss 6.4938, time 127.93ms
iter 279930: loss 5.8522, time 126.65ms
iter 279940: loss 6.1061, time 125.79ms
iter 279950: loss 6.4634, time 125.51ms
iter 279960: loss 6.6169, time 124.52ms
iter 279970: loss 6.2241, time 126.16ms
iter 279980: loss 5.6830, time 125.81ms
iter 279990: loss 5.5936, time 124.48ms
step 280000: train loss 5.6831, val loss 5.6673
saving checkpoint to out-shakespeare-char
iter 280000: loss 5.9930, time 2886.80ms
iter 280010: loss 6.2325, time 125.31ms
iter 280020: loss 6.2452, time 126.85ms
iter 280030: loss 5.8244, time 125.53ms
iter 280040: loss 6.0242, time 125.23ms
iter 280050: loss 5.1947, time 128.55ms
iter 280060: loss 6.1066, time 125.56ms
iter 280070: loss 7.0027, time 124.12ms
iter 280080: loss 6.1155, time 125.57ms
iter 280090: loss 6.0484, time 125.48ms
iter 280100: loss 6.1997, time 125.74ms
iter 280110: loss 6.5222, time 124.76ms
iter 280120: loss 5.7948, time 128.62ms
iter 280130: loss 5.8247, time 125.69ms
iter 280140: loss 5.1248, time 124.51ms
iter 280150: loss 5.8195, time 125.87ms
iter 280160: loss 5.2568, time 125.72ms
iter 280170: loss 6.0836, time 125.72ms
iter 280180: loss 6.1889, time 124.66ms
iter 280190: loss 6.5405, time 124.89ms
iter 280200: loss 5.9406, time 125.64ms
iter 280210: loss 6.1387, time 125.14ms
iter 280220: loss 6.0734, time 125.97ms
iter 280230: loss 5.7760, time 128.80ms
iter 280240: loss 6.0913, time 125.90ms
step 280250: train loss 5.6695, val loss 5.6266
saving checkpoint to out-shakespeare-char
iter 280250: loss 5.9365, time 2899.64ms
iter 280260: loss 6.3228, time 125.54ms
iter 280270: loss 5.8331, time 125.50ms
iter 280280: loss 5.8285, time 125.60ms
iter 280290: loss 5.5405, time 127.17ms
iter 280300: loss 6.1120, time 125.52ms
iter 280310: loss 6.0752, time 125.05ms
iter 280320: loss 6.7630, time 125.12ms
iter 280330: loss 5.4790, time 125.42ms
iter 280340: loss 6.4323, time 125.25ms
iter 280350: loss 5.5568, time 125.04ms
iter 280360: loss 6.2008, time 124.99ms
iter 280370: loss 6.2253, time 124.99ms
iter 280380: loss 5.9240, time 125.26ms
iter 280390: loss 6.0570, time 125.38ms
iter 280400: loss 5.1409, time 128.18ms
iter 280410: loss 5.9086, time 125.20ms
iter 280420: loss 6.4729, time 124.22ms
iter 280430: loss 6.1705, time 125.16ms
iter 280440: loss 5.6733, time 126.65ms
iter 280450: loss 6.6030, time 125.07ms
iter 280460: loss 6.1381, time 125.16ms
iter 280470: loss 6.8327, time 124.94ms
iter 280480: loss 6.0750, time 125.35ms
iter 280490: loss 6.3407, time 125.51ms
step 280500: train loss 5.6848, val loss 5.6795
saving checkpoint to out-shakespeare-char
iter 280500: loss 6.0790, time 2891.20ms
iter 280510: loss 5.8696, time 126.89ms
iter 280520: loss 5.8024, time 124.89ms
iter 280530: loss 6.2414, time 123.93ms
iter 280540: loss 6.0060, time 126.47ms
iter 280550: loss 6.4448, time 124.85ms
iter 280560: loss 6.2630, time 126.95ms
iter 280570: loss 6.0821, time 127.29ms
iter 280580: loss 6.7839, time 126.61ms
iter 280590: loss 5.7374, time 126.91ms
iter 280600: loss 5.9764, time 127.00ms
iter 280610: loss 5.9217, time 125.79ms
iter 280620: loss 6.4265, time 125.25ms
iter 280630: loss 5.6559, time 126.30ms
iter 280640: loss 5.6880, time 128.41ms
iter 280650: loss 6.0209, time 125.64ms
iter 280660: loss 6.2554, time 125.84ms
iter 280670: loss 5.3865, time 125.60ms
iter 280680: loss 6.0164, time 128.48ms
iter 280690: loss 6.5444, time 126.14ms
iter 280700: loss 5.9773, time 127.96ms
iter 280710: loss 6.0465, time 125.36ms
iter 280720: loss 7.3504, time 124.90ms
iter 280730: loss 6.4056, time 124.73ms
iter 280740: loss 5.3470, time 125.32ms
step 280750: train loss 5.6378, val loss 5.7053
saving checkpoint to out-shakespeare-char
iter 280750: loss 5.8804, time 2887.98ms
iter 280760: loss 5.9871, time 128.50ms
iter 280770: loss 5.0702, time 124.03ms
iter 280780: loss 6.1483, time 125.62ms
iter 280790: loss 6.4307, time 124.79ms
iter 280800: loss 5.9415, time 125.83ms
iter 280810: loss 5.3740, time 126.21ms
iter 280820: loss 5.7241, time 125.83ms
iter 280830: loss 5.8511, time 125.47ms
iter 280840: loss 5.8930, time 125.77ms
iter 280850: loss 6.3078, time 125.68ms
iter 280860: loss 6.8566, time 125.77ms
iter 280870: loss 5.4436, time 124.91ms
iter 280880: loss 6.6505, time 128.46ms
iter 280890: loss 6.6246, time 126.18ms
iter 280900: loss 6.4832, time 125.71ms
iter 280910: loss 6.7477, time 125.98ms
iter 280920: loss 5.4087, time 125.76ms
iter 280930: loss 5.6769, time 124.93ms
iter 280940: loss 5.2416, time 125.74ms
iter 280950: loss 6.1431, time 125.98ms
iter 280960: loss 5.4408, time 126.15ms
iter 280970: loss 6.9243, time 125.65ms
iter 280980: loss 6.7708, time 126.63ms
iter 280990: loss 5.5622, time 124.86ms
step 281000: train loss 5.6738, val loss 5.6996
saving checkpoint to out-shakespeare-char
iter 281000: loss 5.3992, time 2889.34ms
iter 281010: loss 6.3923, time 126.03ms
iter 281020: loss 6.4099, time 127.02ms
iter 281030: loss 6.4763, time 125.63ms
iter 281040: loss 6.6099, time 125.90ms
iter 281050: loss 6.9992, time 125.91ms
iter 281060: loss 6.0008, time 125.50ms
iter 281070: loss 5.5815, time 125.50ms
iter 281080: loss 5.7273, time 126.22ms
iter 281090: loss 5.5646, time 126.26ms
iter 281100: loss 5.6477, time 125.78ms
iter 281110: loss 5.9999, time 125.80ms
iter 281120: loss 6.5404, time 127.07ms
iter 281130: loss 5.4011, time 125.14ms
iter 281140: loss 5.2520, time 126.24ms
iter 281150: loss 5.0293, time 129.30ms
iter 281160: loss 5.5791, time 124.99ms
iter 281170: loss 5.5770, time 125.80ms
iter 281180: loss 5.6285, time 125.80ms
iter 281190: loss 5.9502, time 125.33ms
iter 281200: loss 6.3353, time 125.85ms
iter 281210: loss 6.2283, time 125.58ms
iter 281220: loss 6.2538, time 127.52ms
iter 281230: loss 6.5910, time 126.36ms
iter 281240: loss 6.3847, time 128.93ms
step 281250: train loss 5.6602, val loss 5.6967
saving checkpoint to out-shakespeare-char
iter 281250: loss 6.4936, time 2877.50ms
iter 281260: loss 5.8582, time 127.49ms
iter 281270: loss 6.2322, time 125.63ms
iter 281280: loss 5.9781, time 126.17ms
iter 281290: loss 6.6050, time 126.20ms
iter 281300: loss 5.8553, time 125.82ms
iter 281310: loss 6.1610, time 125.79ms
iter 281320: loss 6.2468, time 128.58ms
iter 281330: loss 6.1548, time 125.89ms
iter 281340: loss 6.2907, time 125.73ms
iter 281350: loss 6.0635, time 125.66ms
iter 281360: loss 6.5448, time 125.76ms
iter 281370: loss 5.8161, time 125.80ms
iter 281380: loss 5.6087, time 125.66ms
iter 281390: loss 5.7122, time 125.65ms
iter 281400: loss 5.7300, time 125.68ms
iter 281410: loss 6.2354, time 125.70ms
iter 281420: loss 6.2634, time 125.55ms
iter 281430: loss 6.1680, time 125.98ms
iter 281440: loss 5.5879, time 128.64ms
iter 281450: loss 6.0125, time 125.42ms
iter 281460: loss 6.4516, time 127.09ms
iter 281470: loss 6.2715, time 125.75ms
iter 281480: loss 5.9762, time 126.26ms
iter 281490: loss 6.1546, time 125.76ms
step 281500: train loss 5.6708, val loss 5.6303
saving checkpoint to out-shakespeare-char
iter 281500: loss 5.7338, time 2878.51ms
iter 281510: loss 6.2489, time 127.59ms
iter 281520: loss 5.9644, time 125.85ms
iter 281530: loss 5.3871, time 125.84ms
iter 281540: loss 5.6902, time 125.84ms
iter 281550: loss 6.2836, time 124.72ms
iter 281560: loss 6.5975, time 125.78ms
iter 281570: loss 5.2717, time 125.68ms
iter 281580: loss 6.0056, time 126.27ms
iter 281590: loss 5.3929, time 125.75ms
iter 281600: loss 5.0947, time 125.96ms
iter 281610: loss 6.7887, time 126.02ms
iter 281620: loss 6.2673, time 125.91ms
iter 281630: loss 5.8577, time 125.96ms
iter 281640: loss 5.2938, time 125.90ms
iter 281650: loss 6.8560, time 125.12ms
iter 281660: loss 5.6124, time 125.68ms
iter 281670: loss 5.8842, time 125.87ms
iter 281680: loss 5.6584, time 128.66ms
iter 281690: loss 6.8485, time 125.28ms
iter 281700: loss 7.1189, time 124.83ms
iter 281710: loss 5.4479, time 125.73ms
iter 281720: loss 6.4954, time 126.89ms
iter 281730: loss 5.1335, time 125.94ms
iter 281740: loss 5.8479, time 125.77ms
step 281750: train loss 5.6652, val loss 5.6253
saving checkpoint to out-shakespeare-char
iter 281750: loss 6.5031, time 2851.13ms
iter 281760: loss 5.4163, time 126.03ms
iter 281770: loss 5.9424, time 125.73ms
iter 281780: loss 6.4584, time 128.70ms
iter 281790: loss 6.2460, time 125.89ms
iter 281800: loss 6.7170, time 125.70ms
iter 281810: loss 5.8706, time 126.81ms
iter 281820: loss 5.7896, time 125.72ms
iter 281830: loss 5.5872, time 126.18ms
iter 281840: loss 5.4875, time 127.68ms
iter 281850: loss 5.7701, time 125.73ms
iter 281860: loss 6.7278, time 125.76ms
iter 281870: loss 6.1905, time 125.65ms
iter 281880: loss 5.7702, time 125.95ms
iter 281890: loss 6.5881, time 125.53ms
iter 281900: loss 6.1792, time 128.55ms
iter 281910: loss 6.2533, time 125.48ms
iter 281920: loss 5.6531, time 125.87ms
iter 281930: loss 6.2068, time 125.78ms
iter 281940: loss 6.1727, time 125.62ms
iter 281950: loss 5.9001, time 126.52ms
iter 281960: loss 5.9579, time 125.64ms
iter 281970: loss 6.3080, time 125.70ms
iter 281980: loss 5.9658, time 125.79ms
iter 281990: loss 6.3170, time 129.30ms
step 282000: train loss 5.6641, val loss 5.7033
saving checkpoint to out-shakespeare-char
iter 282000: loss 5.5553, time 2880.82ms
iter 282010: loss 5.8576, time 129.28ms
iter 282020: loss 5.6524, time 125.36ms
iter 282030: loss 5.9470, time 125.78ms
iter 282040: loss 5.5247, time 125.87ms
iter 282050: loss 5.9051, time 128.50ms
iter 282060: loss 5.7924, time 126.02ms
iter 282070: loss 6.5257, time 125.53ms
iter 282080: loss 5.6484, time 125.70ms
iter 282090: loss 5.9234, time 127.00ms
iter 282100: loss 6.6696, time 125.84ms
iter 282110: loss 6.2649, time 125.92ms
iter 282120: loss 5.6474, time 126.21ms
iter 282130: loss 6.1620, time 125.88ms
iter 282140: loss 5.7179, time 125.58ms
iter 282150: loss 6.5238, time 125.88ms
iter 282160: loss 6.3999, time 125.88ms
iter 282170: loss 6.5915, time 128.98ms
iter 282180: loss 5.3094, time 125.43ms
iter 282190: loss 5.8088, time 125.65ms
iter 282200: loss 5.5290, time 125.90ms
iter 282210: loss 6.6009, time 126.22ms
iter 282220: loss 5.9212, time 125.76ms
iter 282230: loss 6.1467, time 125.88ms
iter 282240: loss 6.4199, time 125.68ms
step 282250: train loss 5.6536, val loss 5.6488
saving checkpoint to out-shakespeare-char
iter 282250: loss 6.8999, time 2871.89ms
iter 282260: loss 6.3151, time 125.82ms
iter 282270: loss 6.5712, time 128.90ms
iter 282280: loss 6.2077, time 125.81ms
iter 282290: loss 5.7115, time 125.36ms
iter 282300: loss 6.4463, time 124.11ms
iter 282310: loss 6.2635, time 125.74ms
iter 282320: loss 6.0258, time 125.99ms
iter 282330: loss 6.0892, time 126.02ms
iter 282340: loss 6.5268, time 125.88ms
iter 282350: loss 5.3844, time 126.01ms
iter 282360: loss 6.9242, time 126.01ms
iter 282370: loss 5.7599, time 125.28ms
iter 282380: loss 5.9647, time 125.46ms
iter 282390: loss 5.6371, time 128.61ms
iter 282400: loss 6.2377, time 125.13ms
iter 282410: loss 6.2082, time 126.10ms
iter 282420: loss 6.0907, time 127.22ms
iter 282430: loss 6.0702, time 125.70ms
iter 282440: loss 6.0957, time 128.69ms
iter 282450: loss 5.8368, time 126.13ms
iter 282460: loss 6.3857, time 125.84ms
iter 282470: loss 5.6822, time 125.56ms
iter 282480: loss 6.7082, time 125.73ms
iter 282490: loss 6.1183, time 125.77ms
step 282500: train loss 5.6551, val loss 5.6826
saving checkpoint to out-shakespeare-char
iter 282500: loss 6.0003, time 2857.45ms
iter 282510: loss 5.9338, time 125.48ms
iter 282520: loss 6.8571, time 125.08ms
iter 282530: loss 6.4012, time 128.48ms
iter 282540: loss 7.1560, time 126.24ms
iter 282550: loss 5.7223, time 125.86ms
iter 282560: loss 6.8477, time 125.61ms
iter 282570: loss 6.3238, time 126.81ms
iter 282580: loss 5.7326, time 125.78ms
iter 282590: loss 6.0797, time 127.13ms
iter 282600: loss 5.4645, time 126.63ms
iter 282610: loss 6.2009, time 126.03ms
iter 282620: loss 6.5062, time 125.97ms
iter 282630: loss 6.3258, time 125.76ms
iter 282640: loss 5.9036, time 125.56ms
iter 282650: loss 6.4157, time 125.85ms
iter 282660: loss 6.3084, time 126.25ms
iter 282670: loss 5.7916, time 127.17ms
iter 282680: loss 5.5593, time 125.85ms
iter 282690: loss 5.7415, time 128.73ms
iter 282700: loss 6.2973, time 125.67ms
iter 282710: loss 6.6894, time 125.81ms
iter 282720: loss 6.7732, time 125.91ms
iter 282730: loss 6.1994, time 125.65ms
iter 282740: loss 6.1950, time 125.90ms
step 282750: train loss 5.6772, val loss 5.6512
saving checkpoint to out-shakespeare-char
iter 282750: loss 6.1136, time 2893.57ms
iter 282760: loss 5.7265, time 125.41ms
iter 282770: loss 6.2977, time 125.89ms
iter 282780: loss 6.0408, time 125.80ms
iter 282790: loss 5.0106, time 125.55ms
iter 282800: loss 5.6907, time 129.49ms
iter 282810: loss 6.5738, time 126.04ms
iter 282820: loss 6.1835, time 125.59ms
iter 282830: loss 6.0880, time 125.72ms
iter 282840: loss 5.8038, time 125.86ms
iter 282850: loss 6.4736, time 125.61ms
iter 282860: loss 6.8397, time 125.87ms
iter 282870: loss 6.2383, time 125.66ms
iter 282880: loss 6.0156, time 125.83ms
iter 282890: loss 5.6081, time 124.56ms
iter 282900: loss 5.8574, time 121.39ms
iter 282910: loss 5.6064, time 124.53ms
iter 282920: loss 6.3544, time 123.51ms
iter 282930: loss 5.6068, time 122.28ms
iter 282940: loss 5.9231, time 121.48ms
iter 282950: loss 5.8251, time 122.68ms
iter 282960: loss 5.8164, time 121.53ms
iter 282970: loss 6.7951, time 122.41ms
iter 282980: loss 6.3468, time 121.55ms
iter 282990: loss 6.1359, time 122.46ms
step 283000: train loss 5.6650, val loss 5.6927
saving checkpoint to out-shakespeare-char
iter 283000: loss 6.4204, time 2896.90ms
iter 283010: loss 5.7205, time 122.13ms
iter 283020: loss 5.5563, time 121.70ms
iter 283030: loss 6.2652, time 121.21ms
iter 283040: loss 5.9123, time 121.71ms
iter 283050: loss 5.9146, time 121.63ms
iter 283060: loss 5.6496, time 121.49ms
iter 283070: loss 6.8095, time 122.49ms
iter 283080: loss 5.7081, time 121.81ms
iter 283090: loss 5.9316, time 121.72ms
iter 283100: loss 5.4448, time 124.62ms
iter 283110: loss 6.4507, time 121.54ms
iter 283120: loss 6.7237, time 124.08ms
iter 283130: loss 6.0328, time 121.47ms
iter 283140: loss 5.8237, time 124.38ms
iter 283150: loss 6.2730, time 121.67ms
iter 283160: loss 6.4862, time 124.30ms
iter 283170: loss 5.3370, time 121.16ms
iter 283180: loss 5.8683, time 124.36ms
iter 283190: loss 6.5386, time 122.47ms
iter 283200: loss 6.3839, time 122.71ms
iter 283210: loss 6.6572, time 122.18ms
iter 283220: loss 7.0290, time 122.59ms
iter 283230: loss 5.4309, time 121.66ms
iter 283240: loss 6.1906, time 122.74ms
step 283250: train loss 5.6397, val loss 5.6696
saving checkpoint to out-shakespeare-char
iter 283250: loss 5.8506, time 2891.06ms
iter 283260: loss 6.0280, time 124.57ms
iter 283270: loss 6.5470, time 121.14ms
iter 283280: loss 6.4302, time 123.37ms
iter 283290: loss 6.5885, time 120.28ms
iter 283300: loss 5.8534, time 124.74ms
iter 283310: loss 6.0527, time 121.59ms
iter 283320: loss 5.6147, time 125.65ms
iter 283330: loss 5.9806, time 121.66ms
iter 283340: loss 6.8050, time 124.77ms
iter 283350: loss 5.7873, time 121.41ms
iter 283360: loss 5.7294, time 120.58ms
iter 283370: loss 6.3938, time 121.26ms
iter 283380: loss 6.2307, time 121.67ms
iter 283390: loss 6.4028, time 123.04ms
iter 283400: loss 6.7534, time 120.23ms
iter 283410: loss 6.1028, time 122.66ms
iter 283420: loss 6.3288, time 120.84ms
iter 283430: loss 5.7556, time 122.68ms
iter 283440: loss 5.3672, time 121.88ms
iter 283450: loss 6.2210, time 121.19ms
iter 283460: loss 6.0954, time 122.89ms
iter 283470: loss 5.6689, time 121.68ms
iter 283480: loss 6.3041, time 122.06ms
iter 283490: loss 6.0236, time 122.38ms
step 283500: train loss 5.6562, val loss 5.6792
saving checkpoint to out-shakespeare-char
iter 283500: loss 5.8199, time 2885.83ms
iter 283510: loss 5.9821, time 121.65ms
iter 283520: loss 6.3470, time 121.28ms
iter 283530: loss 5.7767, time 121.72ms
iter 283540: loss 6.2225, time 121.66ms
iter 283550: loss 6.3151, time 121.53ms
iter 283560: loss 6.1551, time 121.75ms
iter 283570: loss 7.0976, time 122.33ms
iter 283580: loss 6.7118, time 121.96ms
iter 283590: loss 6.2646, time 122.64ms
iter 283600: loss 5.5120, time 120.24ms
iter 283610: loss 5.7100, time 122.10ms
iter 283620: loss 6.0558, time 121.63ms
iter 283630: loss 6.0695, time 121.60ms
iter 283640: loss 5.2933, time 121.62ms
iter 283650: loss 5.7155, time 121.40ms
iter 283660: loss 5.8102, time 121.53ms
iter 283670: loss 5.4950, time 120.69ms
iter 283680: loss 6.0625, time 121.58ms
iter 283690: loss 5.7502, time 121.22ms
iter 283700: loss 6.1548, time 121.59ms
iter 283710: loss 5.9030, time 121.43ms
iter 283720: loss 5.6892, time 121.63ms
iter 283730: loss 6.1592, time 122.17ms
iter 283740: loss 6.2629, time 122.27ms
step 283750: train loss 5.6509, val loss 5.6607
saving checkpoint to out-shakespeare-char
iter 283750: loss 6.2153, time 2879.44ms
iter 283760: loss 5.6719, time 121.61ms
iter 283770: loss 6.1666, time 123.13ms
iter 283780: loss 5.7605, time 122.22ms
iter 283790: loss 5.7954, time 121.41ms
iter 283800: loss 5.2948, time 121.96ms
iter 283810: loss 6.1501, time 121.30ms
iter 283820: loss 6.2888, time 121.30ms
iter 283830: loss 5.4233, time 120.77ms
iter 283840: loss 6.5654, time 121.82ms
iter 283850: loss 6.1026, time 121.47ms
iter 283860: loss 6.2550, time 120.99ms
iter 283870: loss 5.8906, time 120.36ms
iter 283880: loss 6.8349, time 121.24ms
iter 283890: loss 6.4911, time 121.41ms
iter 283900: loss 6.0082, time 122.33ms
iter 283910: loss 5.6338, time 121.29ms
iter 283920: loss 6.1092, time 121.32ms
iter 283930: loss 6.2715, time 121.08ms
iter 283940: loss 6.0602, time 121.29ms
iter 283950: loss 5.7405, time 121.17ms
iter 283960: loss 5.9780, time 121.33ms
iter 283970: loss 6.6552, time 121.37ms
iter 283980: loss 6.7401, time 121.28ms
iter 283990: loss 5.2983, time 121.35ms
step 284000: train loss 5.6749, val loss 5.6726
saving checkpoint to out-shakespeare-char
iter 284000: loss 5.9228, time 2893.28ms
iter 284010: loss 6.6829, time 122.77ms
iter 284020: loss 6.4862, time 121.21ms
iter 284030: loss 5.7888, time 122.24ms
iter 284040: loss 5.7779, time 121.37ms
iter 284050: loss 6.5638, time 122.64ms
iter 284060: loss 6.3180, time 121.26ms
iter 284070: loss 6.7054, time 122.40ms
iter 284080: loss 6.2844, time 121.37ms
iter 284090: loss 5.8578, time 122.65ms
iter 284100: loss 6.6255, time 121.46ms
iter 284110: loss 5.8736, time 120.64ms
iter 284120: loss 5.6141, time 122.90ms
iter 284130: loss 5.9555, time 121.82ms
iter 284140: loss 5.4562, time 121.32ms
iter 284150: loss 5.6243, time 121.17ms
iter 284160: loss 6.6188, time 120.78ms
iter 284170: loss 6.4662, time 121.41ms
iter 284180: loss 6.4285, time 121.23ms
iter 284190: loss 5.4907, time 121.49ms
iter 284200: loss 6.2531, time 122.64ms
iter 284210: loss 6.1579, time 121.66ms
iter 284220: loss 6.4502, time 121.86ms
iter 284230: loss 5.3269, time 122.77ms
iter 284240: loss 5.4569, time 121.48ms
step 284250: train loss 5.6688, val loss 5.6715
saving checkpoint to out-shakespeare-char
iter 284250: loss 5.9934, time 2903.89ms
iter 284260: loss 5.7895, time 121.46ms
iter 284270: loss 5.9967, time 122.03ms
iter 284280: loss 6.5327, time 121.31ms
iter 284290: loss 6.3817, time 121.82ms
iter 284300: loss 6.8251, time 121.74ms
iter 284310: loss 5.9850, time 121.84ms
iter 284320: loss 5.1674, time 122.22ms
iter 284330: loss 6.2709, time 122.36ms
iter 284340: loss 6.4598, time 121.94ms
iter 284350: loss 5.1289, time 121.89ms
iter 284360: loss 5.7229, time 123.46ms
iter 284370: loss 5.7274, time 121.78ms
iter 284380: loss 5.9361, time 121.75ms
iter 284390: loss 5.7411, time 122.95ms
iter 284400: loss 5.6658, time 121.78ms
iter 284410: loss 6.3249, time 120.93ms
iter 284420: loss 6.5233, time 121.58ms
iter 284430: loss 6.9731, time 121.86ms
iter 284440: loss 5.9514, time 121.85ms
iter 284450: loss 6.4906, time 121.81ms
iter 284460: loss 6.0773, time 121.98ms
iter 284470: loss 5.6443, time 123.08ms
iter 284480: loss 6.1738, time 122.26ms
iter 284490: loss 5.9911, time 121.82ms
step 284500: train loss 5.6885, val loss 5.7021
saving checkpoint to out-shakespeare-char
iter 284500: loss 5.8283, time 2904.30ms
iter 284510: loss 6.1072, time 120.42ms
iter 284520: loss 6.3244, time 121.31ms
iter 284530: loss 6.3317, time 122.67ms
iter 284540: loss 6.0345, time 121.60ms
iter 284550: loss 7.0094, time 119.70ms
iter 284560: loss 5.6803, time 121.52ms
iter 284570: loss 5.9490, time 121.57ms
iter 284580: loss 5.9319, time 121.58ms
iter 284590: loss 6.2339, time 121.34ms
iter 284600: loss 6.1800, time 121.58ms
iter 284610: loss 6.2940, time 121.60ms
iter 284620: loss 6.5798, time 121.33ms
iter 284630: loss 6.2856, time 121.28ms
iter 284640: loss 6.1534, time 122.59ms
iter 284650: loss 5.8006, time 121.73ms
iter 284660: loss 6.5105, time 121.67ms
iter 284670: loss 6.3225, time 122.93ms
iter 284680: loss 6.7495, time 121.53ms
iter 284690: loss 5.9835, time 122.60ms
iter 284700: loss 6.0287, time 121.70ms
iter 284710: loss 5.7019, time 123.62ms
iter 284720: loss 6.1109, time 122.14ms
iter 284730: loss 6.1424, time 122.77ms
iter 284740: loss 5.7950, time 121.67ms
step 284750: train loss 5.6775, val loss 5.6311
saving checkpoint to out-shakespeare-char
iter 284750: loss 5.1991, time 2894.61ms
iter 284760: loss 6.0178, time 124.39ms
iter 284770: loss 5.8958, time 121.18ms
iter 284780: loss 6.0295, time 125.54ms
iter 284790: loss 6.0029, time 122.48ms
iter 284800: loss 6.1418, time 121.03ms
iter 284810: loss 5.7339, time 122.59ms
iter 284820: loss 5.8996, time 120.88ms
iter 284830: loss 5.6930, time 124.55ms
iter 284840: loss 6.3459, time 120.90ms
iter 284850: loss 6.6218, time 122.89ms
iter 284860: loss 5.6302, time 120.73ms
iter 284870: loss 6.6141, time 123.82ms
iter 284880: loss 5.8519, time 122.25ms
iter 284890: loss 6.4435, time 123.29ms
iter 284900: loss 5.9575, time 120.38ms
iter 284910: loss 6.1735, time 120.74ms
iter 284920: loss 6.1070, time 121.72ms
iter 284930: loss 6.5045, time 122.34ms
iter 284940: loss 6.1167, time 122.19ms
iter 284950: loss 6.1901, time 121.65ms
iter 284960: loss 6.4245, time 121.52ms
iter 284970: loss 5.8635, time 121.64ms
iter 284980: loss 6.0612, time 121.98ms
iter 284990: loss 5.4844, time 122.13ms
step 285000: train loss 5.6526, val loss 5.6477
saving checkpoint to out-shakespeare-char
iter 285000: loss 5.9959, time 2917.29ms
iter 285010: loss 6.2468, time 125.38ms
iter 285020: loss 5.7307, time 125.54ms
iter 285030: loss 5.8617, time 125.38ms
iter 285040: loss 5.7574, time 125.77ms
iter 285050: loss 5.9510, time 125.90ms
iter 285060: loss 5.2996, time 124.81ms
iter 285070: loss 5.8591, time 125.50ms
iter 285080: loss 6.0201, time 124.64ms
iter 285090: loss 5.1701, time 126.98ms
iter 285100: loss 5.9955, time 125.73ms
iter 285110: loss 6.0514, time 126.10ms
iter 285120: loss 7.1037, time 124.81ms
iter 285130: loss 5.7252, time 127.91ms
iter 285140: loss 6.5854, time 125.03ms
iter 285150: loss 7.2150, time 125.52ms
iter 285160: loss 5.7516, time 125.06ms
iter 285170: loss 6.3441, time 125.83ms
iter 285180: loss 5.3606, time 125.77ms
iter 285190: loss 5.7496, time 125.63ms
iter 285200: loss 6.1667, time 125.45ms
iter 285210: loss 6.9457, time 125.88ms
iter 285220: loss 5.8202, time 124.71ms
iter 285230: loss 6.6713, time 125.87ms
iter 285240: loss 5.9826, time 125.57ms
step 285250: train loss 5.6513, val loss 5.6452
saving checkpoint to out-shakespeare-char
iter 285250: loss 6.4417, time 2900.99ms
iter 285260: loss 5.7870, time 125.62ms
iter 285270: loss 5.3000, time 127.11ms
iter 285280: loss 5.7802, time 125.37ms
iter 285290: loss 6.1717, time 123.55ms
iter 285300: loss 6.5813, time 125.41ms
iter 285310: loss 6.4562, time 127.79ms
iter 285320: loss 5.9861, time 124.96ms
iter 285330: loss 5.9748, time 124.00ms
iter 285340: loss 5.7797, time 125.00ms
iter 285350: loss 6.5470, time 125.23ms
iter 285360: loss 6.1286, time 125.18ms
iter 285370: loss 6.0644, time 124.74ms
iter 285380: loss 5.6385, time 125.49ms
iter 285390: loss 5.6133, time 124.46ms
iter 285400: loss 6.1577, time 124.89ms
iter 285410: loss 6.5457, time 124.12ms
iter 285420: loss 5.7571, time 125.20ms
iter 285430: loss 6.3362, time 128.06ms
iter 285440: loss 5.6584, time 127.53ms
iter 285450: loss 6.9160, time 125.80ms
iter 285460: loss 6.3580, time 125.85ms
iter 285470: loss 6.0555, time 125.47ms
iter 285480: loss 5.9381, time 125.81ms
iter 285490: loss 6.6253, time 125.61ms
step 285500: train loss 5.6407, val loss 5.6797
saving checkpoint to out-shakespeare-char
iter 285500: loss 5.9382, time 2903.93ms
iter 285510: loss 6.1337, time 125.67ms
iter 285520: loss 5.9669, time 126.89ms
iter 285530: loss 6.2121, time 125.19ms
iter 285540: loss 6.6282, time 129.00ms
iter 285550: loss 6.0537, time 125.54ms
iter 285560: loss 6.2729, time 125.98ms
iter 285570: loss 6.6478, time 126.40ms
iter 285580: loss 6.3851, time 125.55ms
iter 285590: loss 6.2772, time 125.54ms
iter 285600: loss 5.6196, time 125.73ms
iter 285610: loss 6.2016, time 125.74ms
iter 285620: loss 6.6260, time 125.78ms
iter 285630: loss 6.0167, time 125.48ms
iter 285640: loss 5.5610, time 125.88ms
iter 285650: loss 5.9084, time 125.59ms
iter 285660: loss 5.9729, time 128.79ms
iter 285670: loss 6.4063, time 124.82ms
iter 285680: loss 5.7587, time 127.20ms
iter 285690: loss 6.3650, time 125.08ms
iter 285700: loss 5.6563, time 125.50ms
iter 285710: loss 5.6693, time 128.47ms
iter 285720: loss 6.7936, time 125.75ms
iter 285730: loss 6.4624, time 125.07ms
iter 285740: loss 5.3871, time 125.29ms
step 285750: train loss 5.6889, val loss 5.6256
saving checkpoint to out-shakespeare-char
iter 285750: loss 5.7840, time 2891.57ms
iter 285760: loss 5.7666, time 125.43ms
iter 285770: loss 5.7921, time 128.49ms
iter 285780: loss 5.8378, time 125.25ms
iter 285790: loss 6.4431, time 125.19ms
iter 285800: loss 5.7940, time 125.21ms
iter 285810: loss 6.1800, time 125.04ms
iter 285820: loss 6.5646, time 125.01ms
iter 285830: loss 6.3098, time 126.46ms
iter 285840: loss 6.2639, time 124.93ms
iter 285850: loss 6.0623, time 125.15ms
iter 285860: loss 6.1803, time 125.03ms
iter 285870: loss 5.4964, time 124.87ms
iter 285880: loss 5.9310, time 125.25ms
iter 285890: loss 6.1776, time 128.52ms
iter 285900: loss 5.3780, time 125.44ms
iter 285910: loss 5.8288, time 125.58ms
iter 285920: loss 5.7156, time 125.58ms
iter 285930: loss 6.1714, time 125.48ms
iter 285940: loss 5.7895, time 125.30ms
iter 285950: loss 6.1934, time 125.52ms
iter 285960: loss 6.0398, time 125.49ms
iter 285970: loss 6.0598, time 125.43ms
iter 285980: loss 6.3723, time 125.01ms
iter 285990: loss 6.4978, time 125.18ms
step 286000: train loss 5.6341, val loss 5.6243
saving checkpoint to out-shakespeare-char
iter 286000: loss 6.1850, time 2895.12ms
iter 286010: loss 6.5211, time 125.34ms
iter 286020: loss 6.3266, time 125.56ms
iter 286030: loss 6.1738, time 125.29ms
iter 286040: loss 5.7920, time 125.34ms
iter 286050: loss 6.2743, time 125.56ms
iter 286060: loss 6.3439, time 125.38ms
iter 286070: loss 5.9054, time 124.72ms
iter 286080: loss 5.6247, time 126.54ms
iter 286090: loss 6.1252, time 125.44ms
iter 286100: loss 6.0834, time 125.41ms
iter 286110: loss 5.5416, time 125.35ms
iter 286120: loss 6.3084, time 125.19ms
iter 286130: loss 6.3506, time 125.09ms
iter 286140: loss 6.6567, time 127.56ms
iter 286150: loss 6.0496, time 124.97ms
iter 286160: loss 5.8721, time 125.01ms
iter 286170: loss 5.9925, time 127.18ms
iter 286180: loss 6.0541, time 125.45ms
iter 286190: loss 5.8495, time 128.77ms
iter 286200: loss 6.3791, time 125.67ms
iter 286210: loss 6.6940, time 125.54ms
iter 286220: loss 6.4012, time 124.96ms
iter 286230: loss 5.9359, time 125.50ms
iter 286240: loss 5.7628, time 125.93ms
step 286250: train loss 5.6547, val loss 5.6391
saving checkpoint to out-shakespeare-char
iter 286250: loss 6.6607, time 2879.82ms
iter 286260: loss 5.6464, time 122.25ms
iter 286270: loss 6.1585, time 120.72ms
iter 286280: loss 6.2523, time 121.77ms
iter 286290: loss 6.4225, time 121.99ms
iter 286300: loss 6.5703, time 122.13ms
iter 286310: loss 5.6021, time 121.77ms
iter 286320: loss 5.9023, time 122.95ms
iter 286330: loss 6.1110, time 123.11ms
iter 286340: loss 5.8686, time 122.04ms
iter 286350: loss 5.9359, time 121.81ms
iter 286360: loss 6.1539, time 119.76ms
iter 286370: loss 6.8369, time 122.66ms
iter 286380: loss 6.3079, time 121.87ms
iter 286390: loss 6.0773, time 121.53ms
iter 286400: loss 5.9694, time 121.28ms
iter 286410: loss 6.0129, time 125.75ms
iter 286420: loss 6.0460, time 125.75ms
iter 286430: loss 6.6173, time 125.81ms
iter 286440: loss 6.3834, time 129.62ms
iter 286450: loss 6.0928, time 125.30ms
iter 286460: loss 6.5882, time 124.43ms
iter 286470: loss 5.9819, time 125.64ms
iter 286480: loss 6.7664, time 125.40ms
iter 286490: loss 6.2323, time 125.37ms
step 286500: train loss 5.6583, val loss 5.6953
saving checkpoint to out-shakespeare-char
iter 286500: loss 5.2798, time 2893.17ms
iter 286510: loss 6.1523, time 125.44ms
iter 286520: loss 6.2881, time 125.44ms
iter 286530: loss 6.1467, time 125.30ms
iter 286540: loss 6.1959, time 124.01ms
iter 286550: loss 6.5726, time 125.28ms
iter 286560: loss 6.5522, time 124.61ms
iter 286570: loss 5.6471, time 128.19ms
iter 286580: loss 6.0959, time 125.58ms
iter 286590: loss 5.0742, time 125.24ms
iter 286600: loss 5.3619, time 125.32ms
iter 286610: loss 5.8505, time 125.01ms
iter 286620: loss 6.4960, time 125.54ms
iter 286630: loss 5.5275, time 125.54ms
iter 286640: loss 6.5455, time 125.76ms
iter 286650: loss 6.2618, time 126.81ms
iter 286660: loss 5.5179, time 125.06ms
iter 286670: loss 6.5182, time 125.36ms
iter 286680: loss 6.2013, time 126.14ms
iter 286690: loss 6.3146, time 124.65ms
iter 286700: loss 6.3514, time 125.44ms
iter 286710: loss 5.7391, time 125.75ms
iter 286720: loss 5.6868, time 124.98ms
iter 286730: loss 5.7365, time 125.60ms
iter 286740: loss 6.5402, time 125.86ms
step 286750: train loss 5.6248, val loss 5.6602
saving checkpoint to out-shakespeare-char
iter 286750: loss 5.9709, time 2903.61ms
iter 286760: loss 6.7391, time 125.72ms
iter 286770: loss 6.5561, time 128.33ms
iter 286780: loss 6.4268, time 125.38ms
iter 286790: loss 6.2857, time 125.34ms
iter 286800: loss 6.3744, time 125.53ms
iter 286810: loss 5.4025, time 125.32ms
iter 286820: loss 5.8170, time 125.59ms
iter 286830: loss 6.2199, time 126.59ms
iter 286840: loss 5.3746, time 125.59ms
iter 286850: loss 6.6921, time 125.31ms
iter 286860: loss 6.1235, time 124.92ms
iter 286870: loss 5.7938, time 126.19ms
iter 286880: loss 6.5683, time 125.60ms
iter 286890: loss 5.5959, time 128.30ms
iter 286900: loss 5.6435, time 125.36ms
iter 286910: loss 6.0261, time 125.31ms
iter 286920: loss 6.2350, time 125.64ms
iter 286930: loss 6.1246, time 125.70ms
iter 286940: loss 6.3220, time 125.51ms
iter 286950: loss 5.9630, time 125.20ms
iter 286960: loss 5.4080, time 125.17ms
iter 286970: loss 6.6385, time 125.39ms
iter 286980: loss 5.5472, time 125.24ms
iter 286990: loss 6.2543, time 125.25ms
step 287000: train loss 5.6748, val loss 5.6623
saving checkpoint to out-shakespeare-char
iter 287000: loss 5.7946, time 2878.90ms
iter 287010: loss 6.3598, time 125.68ms
iter 287020: loss 5.7347, time 125.53ms
iter 287030: loss 6.3370, time 128.26ms
iter 287040: loss 5.9602, time 125.33ms
iter 287050: loss 5.6336, time 125.56ms
iter 287060: loss 6.3804, time 125.39ms
iter 287070: loss 5.4330, time 125.70ms
iter 287080: loss 6.2660, time 125.38ms
iter 287090: loss 5.7854, time 125.37ms
iter 287100: loss 5.9536, time 125.33ms
iter 287110: loss 6.8716, time 124.66ms
iter 287120: loss 5.8704, time 125.54ms
iter 287130: loss 6.7701, time 125.41ms
iter 287140: loss 5.9653, time 125.54ms
iter 287150: loss 6.9394, time 124.55ms
iter 287160: loss 5.5889, time 124.91ms
iter 287170: loss 5.7330, time 127.25ms
iter 287180: loss 7.1275, time 125.42ms
iter 287190: loss 5.9949, time 125.53ms
iter 287200: loss 6.2997, time 128.17ms
iter 287210: loss 5.6943, time 125.73ms
iter 287220: loss 6.2984, time 125.39ms
iter 287230: loss 6.5450, time 125.70ms
iter 287240: loss 5.3215, time 125.01ms
step 287250: train loss 5.6631, val loss 5.6429
saving checkpoint to out-shakespeare-char
iter 287250: loss 5.5160, time 2877.68ms
iter 287260: loss 6.2622, time 124.84ms
iter 287270: loss 6.5803, time 125.08ms
iter 287280: loss 5.8684, time 125.94ms
iter 287290: loss 6.0159, time 125.65ms
iter 287300: loss 6.2718, time 128.10ms
iter 287310: loss 6.2678, time 124.69ms
iter 287320: loss 5.4526, time 127.32ms
iter 287330: loss 6.6494, time 122.34ms
iter 287340: loss 5.9574, time 125.96ms
iter 287350: loss 6.2670, time 125.81ms
iter 287360: loss 6.8798, time 124.89ms
iter 287370: loss 4.6730, time 125.04ms
iter 287380: loss 5.7580, time 128.06ms
iter 287390: loss 5.6286, time 125.18ms
iter 287400: loss 6.3089, time 124.82ms
iter 287410: loss 5.6239, time 125.29ms
iter 287420: loss 6.7631, time 126.93ms
iter 287430: loss 5.3054, time 124.97ms
iter 287440: loss 5.4398, time 125.38ms
iter 287450: loss 5.6767, time 125.09ms
iter 287460: loss 6.7505, time 124.82ms
iter 287470: loss 5.9751, time 124.98ms
iter 287480: loss 5.8060, time 125.12ms
iter 287490: loss 6.1051, time 125.20ms
step 287500: train loss 5.6944, val loss 5.6875
saving checkpoint to out-shakespeare-char
iter 287500: loss 5.6045, time 2884.46ms
iter 287510: loss 7.2068, time 125.39ms
iter 287520: loss 5.5231, time 124.91ms
iter 287530: loss 5.6531, time 124.46ms
iter 287540: loss 6.4079, time 125.41ms
iter 287550: loss 5.8130, time 125.56ms
iter 287560: loss 5.5103, time 127.28ms
iter 287570: loss 6.0345, time 126.90ms
iter 287580: loss 5.9219, time 125.16ms
iter 287590: loss 6.1975, time 125.23ms
iter 287600: loss 5.9491, time 123.86ms
iter 287610: loss 5.8645, time 124.75ms
iter 287620: loss 6.0446, time 125.24ms
iter 287630: loss 6.5118, time 125.42ms
iter 287640: loss 5.5694, time 127.54ms
iter 287650: loss 6.2142, time 124.27ms
iter 287660: loss 6.1487, time 125.49ms
iter 287670: loss 5.5512, time 126.81ms
iter 287680: loss 6.0465, time 125.18ms
iter 287690: loss 5.6094, time 125.26ms
iter 287700: loss 6.8899, time 125.40ms
iter 287710: loss 5.8420, time 125.59ms
iter 287720: loss 5.6451, time 125.17ms
iter 287730: loss 6.7435, time 125.58ms
iter 287740: loss 6.6654, time 125.40ms
step 287750: train loss 5.6956, val loss 5.6609
saving checkpoint to out-shakespeare-char
iter 287750: loss 6.0534, time 2881.98ms
iter 287760: loss 5.6687, time 123.73ms
iter 287770: loss 5.9767, time 121.82ms
iter 287780: loss 6.8579, time 122.27ms
iter 287790: loss 6.2472, time 121.71ms
iter 287800: loss 6.3008, time 122.40ms
iter 287810: loss 6.1723, time 122.11ms
iter 287820: loss 7.1120, time 122.94ms
iter 287830: loss 6.0231, time 121.61ms
iter 287840: loss 6.4047, time 124.82ms
iter 287850: loss 5.8535, time 121.93ms
iter 287860: loss 5.6385, time 125.01ms
iter 287870: loss 6.2809, time 121.71ms
iter 287880: loss 5.9612, time 124.91ms
iter 287890: loss 6.0187, time 121.55ms
iter 287900: loss 6.7885, time 125.72ms
iter 287910: loss 6.5101, time 121.65ms
iter 287920: loss 5.9050, time 124.89ms
iter 287930: loss 5.5715, time 123.01ms
iter 287940: loss 5.5454, time 121.84ms
iter 287950: loss 6.3156, time 121.74ms
iter 287960: loss 6.3489, time 121.56ms
iter 287970: loss 5.9691, time 121.51ms
iter 287980: loss 6.1270, time 122.24ms
iter 287990: loss 5.9511, time 121.88ms
step 288000: train loss 5.6428, val loss 5.6458
saving checkpoint to out-shakespeare-char
iter 288000: loss 5.9466, time 2898.23ms
iter 288010: loss 5.8914, time 125.66ms
iter 288020: loss 7.1848, time 125.56ms
iter 288030: loss 5.8115, time 125.48ms
iter 288040: loss 5.5184, time 125.68ms
iter 288050: loss 6.2840, time 125.78ms
iter 288060: loss 6.6505, time 125.88ms
iter 288070: loss 5.9210, time 128.26ms
iter 288080: loss 5.9314, time 125.33ms
iter 288090: loss 5.9678, time 125.52ms
iter 288100: loss 6.3266, time 125.39ms
iter 288110: loss 5.5193, time 125.55ms
iter 288120: loss 5.1794, time 121.31ms
iter 288130: loss 5.9653, time 122.30ms
iter 288140: loss 6.2219, time 121.65ms
iter 288150: loss 5.9689, time 122.48ms
iter 288160: loss 5.8944, time 121.79ms
iter 288170: loss 7.1224, time 122.37ms
iter 288180: loss 6.0010, time 121.83ms
iter 288190: loss 7.2862, time 122.93ms
iter 288200: loss 6.1196, time 121.60ms
iter 288210: loss 6.0910, time 121.91ms
iter 288220: loss 6.7472, time 122.12ms
iter 288230: loss 6.0148, time 121.51ms
iter 288240: loss 5.7993, time 122.01ms
step 288250: train loss 5.6077, val loss 5.6815
saving checkpoint to out-shakespeare-char
iter 288250: loss 6.3481, time 2899.40ms
iter 288260: loss 6.2312, time 124.78ms
iter 288270: loss 6.0625, time 121.72ms
iter 288280: loss 6.0348, time 124.60ms
iter 288290: loss 6.2144, time 121.53ms
iter 288300: loss 7.2350, time 124.64ms
iter 288310: loss 6.5905, time 121.65ms
iter 288320: loss 5.7611, time 124.69ms
iter 288330: loss 5.5337, time 121.75ms
iter 288340: loss 6.0260, time 120.40ms
iter 288350: loss 6.1046, time 121.65ms
iter 288360: loss 6.3726, time 121.02ms
iter 288370: loss 5.7189, time 122.08ms
iter 288380: loss 5.3100, time 121.55ms
iter 288390: loss 6.3678, time 121.70ms
iter 288400: loss 6.5149, time 121.46ms
iter 288410: loss 6.2923, time 121.67ms
iter 288420: loss 6.2718, time 121.63ms
iter 288430: loss 6.4522, time 121.85ms
iter 288440: loss 5.7863, time 121.92ms
iter 288450: loss 6.3655, time 122.75ms
iter 288460: loss 6.9438, time 121.63ms
iter 288470: loss 5.3164, time 121.79ms
iter 288480: loss 6.3192, time 123.64ms
iter 288490: loss 6.0485, time 121.74ms
step 288500: train loss 5.6622, val loss 5.6279
saving checkpoint to out-shakespeare-char
iter 288500: loss 6.1158, time 2907.51ms
iter 288510: loss 5.7672, time 121.82ms
iter 288520: loss 6.1751, time 121.55ms
iter 288530: loss 5.6422, time 121.71ms
iter 288540: loss 5.7699, time 121.98ms
iter 288550: loss 5.9069, time 121.77ms
iter 288560: loss 6.1088, time 121.90ms
iter 288570: loss 6.2414, time 121.86ms
iter 288580: loss 6.2772, time 121.63ms
iter 288590: loss 5.7640, time 121.72ms
iter 288600: loss 6.5796, time 121.97ms
iter 288610: loss 6.1353, time 122.94ms
iter 288620: loss 5.7057, time 121.76ms
iter 288630: loss 5.6848, time 121.60ms
iter 288640: loss 6.3384, time 123.00ms
iter 288650: loss 6.0792, time 121.57ms
iter 288660: loss 6.6701, time 123.25ms
iter 288670: loss 6.2710, time 121.76ms
iter 288680: loss 5.4484, time 123.04ms
iter 288690: loss 6.9029, time 121.70ms
iter 288700: loss 6.0664, time 123.01ms
iter 288710: loss 6.3400, time 123.60ms
iter 288720: loss 6.2679, time 120.51ms
iter 288730: loss 6.6775, time 123.11ms
iter 288740: loss 6.2155, time 121.72ms
step 288750: train loss 5.6685, val loss 5.6850
saving checkpoint to out-shakespeare-char
iter 288750: loss 6.8147, time 2900.65ms
iter 288760: loss 5.4197, time 121.53ms
iter 288770: loss 5.9778, time 121.67ms
iter 288780: loss 5.9457, time 123.06ms
iter 288790: loss 6.6785, time 121.65ms
iter 288800: loss 6.9221, time 121.83ms
iter 288810: loss 6.4192, time 121.43ms
iter 288820: loss 6.1576, time 121.25ms
iter 288830: loss 5.4977, time 121.59ms
iter 288840: loss 5.6123, time 121.66ms
iter 288850: loss 5.7387, time 121.39ms
iter 288860: loss 6.2265, time 121.68ms
iter 288870: loss 6.1516, time 121.55ms
iter 288880: loss 6.3545, time 121.72ms
iter 288890: loss 6.6124, time 123.16ms
iter 288900: loss 6.1449, time 121.48ms
iter 288910: loss 5.5700, time 121.74ms
iter 288920: loss 5.8243, time 122.73ms
iter 288930: loss 6.2944, time 121.70ms
iter 288940: loss 6.0700, time 120.94ms
iter 288950: loss 5.6187, time 121.61ms
iter 288960: loss 6.0531, time 121.56ms
iter 288970: loss 6.6107, time 122.10ms
iter 288980: loss 5.7390, time 121.46ms
iter 288990: loss 5.6495, time 121.50ms
step 289000: train loss 5.6864, val loss 5.7370
saving checkpoint to out-shakespeare-char
iter 289000: loss 6.2604, time 2905.39ms
iter 289010: loss 6.5398, time 121.54ms
iter 289020: loss 5.7542, time 121.75ms
iter 289030: loss 6.3432, time 121.73ms
iter 289040: loss 6.6921, time 122.82ms
iter 289050: loss 5.5521, time 121.49ms
iter 289060: loss 5.9096, time 121.53ms
iter 289070: loss 5.4773, time 121.52ms
iter 289080: loss 6.6368, time 121.81ms
iter 289090: loss 6.7773, time 121.71ms
iter 289100: loss 6.1751, time 121.64ms
iter 289110: loss 6.0890, time 121.62ms
iter 289120: loss 6.6710, time 121.71ms
iter 289130: loss 6.4448, time 121.68ms
iter 289140: loss 5.9022, time 121.29ms
iter 289150: loss 6.3627, time 121.64ms
iter 289160: loss 5.7915, time 122.96ms
iter 289170: loss 5.5476, time 123.05ms
iter 289180: loss 5.8854, time 121.75ms
iter 289190: loss 6.2541, time 123.22ms
iter 289200: loss 5.9329, time 121.63ms
iter 289210: loss 5.9147, time 123.16ms
iter 289220: loss 5.9189, time 121.54ms
iter 289230: loss 5.6547, time 122.94ms
iter 289240: loss 6.1583, time 121.85ms
step 289250: train loss 5.6376, val loss 5.6168
saving checkpoint to out-shakespeare-char
iter 289250: loss 5.6101, time 2895.35ms
iter 289260: loss 5.7847, time 121.36ms
iter 289270: loss 6.5312, time 121.79ms
iter 289280: loss 5.8221, time 121.66ms
iter 289290: loss 7.1163, time 121.57ms
iter 289300: loss 5.8153, time 121.63ms
iter 289310: loss 5.2300, time 121.62ms
iter 289320: loss 6.4189, time 121.56ms
iter 289330: loss 5.9220, time 121.95ms
iter 289340: loss 6.0888, time 121.61ms
iter 289350: loss 6.5518, time 121.99ms
iter 289360: loss 5.1739, time 121.97ms
iter 289370: loss 6.1668, time 122.88ms
iter 289380: loss 6.3191, time 121.44ms
iter 289390: loss 6.0387, time 120.72ms
iter 289400: loss 5.9868, time 121.96ms
iter 289410: loss 5.9710, time 121.20ms
iter 289420: loss 5.6832, time 121.73ms
iter 289430: loss 5.8449, time 121.53ms
iter 289440: loss 6.0075, time 121.64ms
iter 289450: loss 5.9086, time 121.92ms
iter 289460: loss 6.3367, time 122.92ms
iter 289470: loss 5.8998, time 121.70ms
iter 289480: loss 6.3709, time 121.62ms
iter 289490: loss 6.0994, time 121.07ms
step 289500: train loss 5.6148, val loss 5.6661
saving checkpoint to out-shakespeare-char
iter 289500: loss 6.4837, time 2871.59ms
iter 289510: loss 5.7921, time 121.87ms
iter 289520: loss 6.1370, time 123.08ms
iter 289530: loss 6.6297, time 121.79ms
iter 289540: loss 6.4953, time 123.42ms
iter 289550: loss 6.1234, time 121.91ms
iter 289560: loss 7.0364, time 123.20ms
iter 289570: loss 5.8887, time 123.37ms
iter 289580: loss 5.3997, time 121.15ms
iter 289590: loss 6.1290, time 124.43ms
iter 289600: loss 6.2939, time 121.60ms
iter 289610: loss 5.7547, time 123.31ms
iter 289620: loss 6.2531, time 123.30ms
iter 289630: loss 6.1218, time 122.94ms
iter 289640: loss 5.5698, time 121.72ms
iter 289650: loss 5.6138, time 122.93ms
iter 289660: loss 5.9628, time 121.68ms
iter 289670: loss 6.1740, time 123.30ms
iter 289680: loss 5.6725, time 121.08ms
iter 289690: loss 6.5319, time 123.26ms
iter 289700: loss 6.0083, time 121.74ms
iter 289710: loss 6.4133, time 122.99ms
iter 289720: loss 6.5822, time 121.64ms
iter 289730: loss 6.2441, time 120.28ms
iter 289740: loss 5.4815, time 123.01ms
step 289750: train loss 5.6373, val loss 5.7052
saving checkpoint to out-shakespeare-char
iter 289750: loss 6.4888, time 2903.78ms
iter 289760: loss 5.9451, time 120.53ms
iter 289770: loss 5.2303, time 124.01ms
iter 289780: loss 6.4154, time 123.37ms
iter 289790: loss 5.8647, time 121.58ms
iter 289800: loss 5.2660, time 121.70ms
iter 289810: loss 5.9287, time 121.61ms
iter 289820: loss 5.8010, time 121.63ms
iter 289830: loss 5.4881, time 121.54ms
iter 289840: loss 6.5018, time 120.99ms
iter 289850: loss 6.1372, time 121.32ms
iter 289860: loss 5.7804, time 121.59ms
iter 289870: loss 5.9558, time 121.56ms
iter 289880: loss 6.4537, time 121.62ms
iter 289890: loss 5.9467, time 123.19ms
iter 289900: loss 6.2412, time 121.57ms
iter 289910: loss 5.9223, time 122.90ms
iter 289920: loss 5.7635, time 121.56ms
iter 289930: loss 5.4536, time 123.15ms
iter 289940: loss 6.5669, time 121.79ms
iter 289950: loss 5.9298, time 122.80ms
iter 289960: loss 5.4819, time 121.73ms
iter 289970: loss 6.1869, time 122.93ms
iter 289980: loss 6.5710, time 121.63ms
iter 289990: loss 6.3376, time 122.97ms
step 290000: train loss 5.6349, val loss 5.6230
saving checkpoint to out-shakespeare-char
iter 290000: loss 6.2038, time 2909.98ms
iter 290010: loss 6.1208, time 121.62ms
iter 290020: loss 6.2128, time 123.84ms
iter 290030: loss 6.9732, time 121.55ms
iter 290040: loss 5.7618, time 124.05ms
iter 290050: loss 5.7711, time 123.06ms
iter 290060: loss 6.1461, time 121.66ms
iter 290070: loss 5.7732, time 121.89ms
iter 290080: loss 5.7167, time 121.61ms
iter 290090: loss 6.3216, time 121.42ms
iter 290100: loss 5.4173, time 121.92ms
iter 290110: loss 5.6937, time 121.83ms
iter 290120: loss 5.9314, time 122.89ms
iter 290130: loss 5.5379, time 121.80ms
iter 290140: loss 5.6231, time 121.70ms
iter 290150: loss 6.3263, time 123.24ms
iter 290160: loss 5.5087, time 122.91ms
iter 290170: loss 6.2464, time 121.63ms
iter 290180: loss 5.1162, time 123.02ms
iter 290190: loss 5.9972, time 121.71ms
iter 290200: loss 6.6665, time 122.69ms
iter 290210: loss 5.5755, time 121.73ms
iter 290220: loss 5.7549, time 122.71ms
iter 290230: loss 5.5741, time 121.64ms
iter 290240: loss 6.2848, time 122.23ms
step 290250: train loss 5.6320, val loss 5.6372
saving checkpoint to out-shakespeare-char
iter 290250: loss 6.0595, time 2906.55ms
iter 290260: loss 6.2018, time 122.91ms
iter 290270: loss 6.5912, time 121.59ms
iter 290280: loss 6.4669, time 123.00ms
iter 290290: loss 5.8418, time 121.60ms
iter 290300: loss 5.5733, time 123.02ms
iter 290310: loss 6.1509, time 122.83ms
iter 290320: loss 6.0968, time 121.72ms
iter 290330: loss 5.6510, time 123.21ms
iter 290340: loss 5.5565, time 121.86ms
iter 290350: loss 5.8108, time 122.94ms
iter 290360: loss 6.5127, time 122.07ms
iter 290370: loss 5.4298, time 123.24ms
iter 290380: loss 6.0540, time 121.71ms
iter 290390: loss 6.3328, time 122.97ms
iter 290400: loss 6.0860, time 121.80ms
iter 290410: loss 5.9575, time 122.86ms
iter 290420: loss 5.6510, time 122.97ms
iter 290430: loss 6.5242, time 122.08ms
iter 290440: loss 5.9120, time 122.85ms
iter 290450: loss 6.3549, time 121.68ms
iter 290460: loss 6.3883, time 122.73ms
iter 290470: loss 5.4717, time 121.81ms
iter 290480: loss 5.8032, time 122.90ms
iter 290490: loss 5.6538, time 121.72ms
step 290500: train loss 5.6740, val loss 5.6889
saving checkpoint to out-shakespeare-char
iter 290500: loss 5.9019, time 2904.10ms
iter 290510: loss 5.5509, time 124.10ms
iter 290520: loss 5.6171, time 121.81ms
iter 290530: loss 5.9837, time 123.95ms
iter 290540: loss 6.1497, time 121.77ms
iter 290550: loss 6.2090, time 124.04ms
iter 290560: loss 5.5000, time 122.04ms
iter 290570: loss 5.9283, time 123.96ms
iter 290580: loss 5.6241, time 122.48ms
iter 290590: loss 5.6362, time 120.38ms
iter 290600: loss 5.7217, time 121.37ms
iter 290610: loss 5.7973, time 121.42ms
iter 290620: loss 6.3338, time 121.67ms
iter 290630: loss 5.1416, time 121.65ms
iter 290640: loss 4.9434, time 121.63ms
iter 290650: loss 6.3803, time 121.74ms
iter 290660: loss 6.5503, time 121.75ms
iter 290670: loss 6.5192, time 121.85ms
iter 290680: loss 6.0711, time 122.00ms
iter 290690: loss 6.1654, time 121.69ms
iter 290700: loss 6.5650, time 123.07ms
iter 290710: loss 6.0548, time 121.95ms
iter 290720: loss 5.8729, time 122.01ms
iter 290730: loss 6.2925, time 121.62ms
iter 290740: loss 6.5489, time 121.46ms
step 290750: train loss 5.6408, val loss 5.6375
saving checkpoint to out-shakespeare-char
iter 290750: loss 6.1887, time 2909.54ms
iter 290760: loss 5.7184, time 119.69ms
iter 290770: loss 6.4307, time 120.07ms
iter 290780: loss 6.1735, time 119.67ms
iter 290790: loss 5.2808, time 119.94ms
iter 290800: loss 5.0809, time 119.68ms
iter 290810: loss 6.6703, time 120.02ms
iter 290820: loss 5.9094, time 120.01ms
iter 290830: loss 6.7898, time 120.09ms
iter 290840: loss 5.7187, time 119.88ms
iter 290850: loss 6.4347, time 119.94ms
iter 290860: loss 6.1168, time 121.04ms
iter 290870: loss 6.2427, time 119.64ms
iter 290880: loss 5.7500, time 123.22ms
iter 290890: loss 5.8861, time 119.86ms
iter 290900: loss 6.0953, time 123.24ms
iter 290910: loss 5.9857, time 119.89ms
iter 290920: loss 6.1720, time 122.24ms
iter 290930: loss 5.0811, time 120.57ms
iter 290940: loss 6.7169, time 122.40ms
iter 290950: loss 6.0764, time 119.76ms
iter 290960: loss 6.2145, time 123.89ms
iter 290970: loss 6.3172, time 123.24ms
iter 290980: loss 6.7765, time 121.47ms
iter 290990: loss 5.8958, time 121.58ms
step 291000: train loss 5.6364, val loss 5.6693
saving checkpoint to out-shakespeare-char
iter 291000: loss 5.1241, time 2899.86ms
iter 291010: loss 6.1690, time 125.32ms
iter 291020: loss 6.4001, time 127.58ms
iter 291030: loss 6.4152, time 125.90ms
iter 291040: loss 5.9085, time 125.95ms
iter 291050: loss 6.3103, time 125.99ms
iter 291060: loss 6.2792, time 126.16ms
iter 291070: loss 6.3068, time 125.52ms
iter 291080: loss 6.0713, time 125.28ms
iter 291090: loss 5.8538, time 125.78ms
iter 291100: loss 5.1043, time 127.88ms
iter 291110: loss 5.9096, time 125.86ms
iter 291120: loss 6.6024, time 126.85ms
iter 291130: loss 6.1667, time 124.60ms
iter 291140: loss 5.8713, time 125.59ms
iter 291150: loss 5.9468, time 125.93ms
iter 291160: loss 6.1899, time 125.38ms
iter 291170: loss 5.7242, time 125.24ms
iter 291180: loss 5.8446, time 125.24ms
iter 291190: loss 6.1156, time 125.39ms
iter 291200: loss 6.0789, time 125.66ms
iter 291210: loss 6.2763, time 125.41ms
iter 291220: loss 6.1613, time 125.67ms
iter 291230: loss 5.9974, time 127.80ms
iter 291240: loss 5.7375, time 126.29ms
step 291250: train loss 5.6952, val loss 5.6567
saving checkpoint to out-shakespeare-char
iter 291250: loss 5.7210, time 2888.42ms
iter 291260: loss 6.3642, time 126.53ms
iter 291270: loss 5.6755, time 125.80ms
iter 291280: loss 5.7585, time 125.21ms
iter 291290: loss 5.4873, time 126.57ms
iter 291300: loss 6.9430, time 125.48ms
iter 291310: loss 6.8619, time 128.18ms
iter 291320: loss 5.9672, time 125.69ms
iter 291330: loss 6.6827, time 125.48ms
iter 291340: loss 5.6038, time 125.45ms
iter 291350: loss 6.4107, time 124.23ms
iter 291360: loss 6.1702, time 126.39ms
iter 291370: loss 6.3343, time 124.93ms
iter 291380: loss 5.2488, time 125.03ms
iter 291390: loss 5.1631, time 125.29ms
iter 291400: loss 5.7399, time 124.73ms
iter 291410: loss 6.0694, time 126.40ms
iter 291420: loss 6.4356, time 124.47ms
iter 291430: loss 6.3832, time 128.32ms
iter 291440: loss 5.8024, time 125.70ms
iter 291450: loss 7.3148, time 126.01ms
iter 291460: loss 5.6052, time 125.82ms
iter 291470: loss 5.7105, time 125.68ms
iter 291480: loss 5.5767, time 125.75ms
iter 291490: loss 5.5573, time 125.61ms
step 291500: train loss 5.6431, val loss 5.6818
saving checkpoint to out-shakespeare-char
iter 291500: loss 5.6601, time 2894.93ms
iter 291510: loss 6.2725, time 121.55ms
iter 291520: loss 5.8949, time 122.90ms
iter 291530: loss 5.7148, time 120.93ms
iter 291540: loss 6.6251, time 122.51ms
iter 291550: loss 6.5999, time 121.51ms
iter 291560: loss 5.6728, time 122.80ms
iter 291570: loss 6.7110, time 121.32ms
iter 291580: loss 6.0674, time 123.20ms
iter 291590: loss 5.8077, time 121.15ms
iter 291600: loss 5.4623, time 122.69ms
iter 291610: loss 6.2989, time 123.02ms
iter 291620: loss 6.1711, time 121.83ms
iter 291630: loss 5.4131, time 123.24ms
iter 291640: loss 5.7673, time 121.06ms
iter 291650: loss 6.2678, time 122.36ms
iter 291660: loss 6.7571, time 121.42ms
iter 291670: loss 5.4825, time 122.94ms
iter 291680: loss 6.1988, time 121.38ms
iter 291690: loss 5.4947, time 122.78ms
iter 291700: loss 5.9116, time 121.48ms
iter 291710: loss 5.9311, time 123.08ms
iter 291720: loss 6.2568, time 123.36ms
iter 291730: loss 5.9997, time 121.71ms
iter 291740: loss 6.4967, time 122.82ms
step 291750: train loss 5.7041, val loss 5.6985
saving checkpoint to out-shakespeare-char
iter 291750: loss 5.9911, time 2874.83ms
iter 291760: loss 5.9689, time 122.36ms
iter 291770: loss 5.4590, time 122.17ms
iter 291780: loss 5.7143, time 122.13ms
iter 291790: loss 5.6695, time 122.05ms
iter 291800: loss 5.8252, time 122.05ms
iter 291810: loss 6.3847, time 121.93ms
iter 291820: loss 6.0793, time 121.38ms
iter 291830: loss 6.4958, time 121.50ms
iter 291840: loss 6.1377, time 121.35ms
iter 291850: loss 6.2236, time 121.61ms
iter 291860: loss 5.6756, time 121.42ms
iter 291870: loss 6.9706, time 121.50ms
iter 291880: loss 6.7970, time 122.95ms
iter 291890: loss 5.7130, time 121.59ms
iter 291900: loss 5.4079, time 121.81ms
iter 291910: loss 6.3737, time 121.45ms
iter 291920: loss 6.0715, time 123.24ms
iter 291930: loss 5.8012, time 122.07ms
iter 291940: loss 5.7273, time 122.90ms
iter 291950: loss 5.2834, time 121.44ms
iter 291960: loss 5.4019, time 121.88ms
iter 291970: loss 6.7450, time 121.32ms
iter 291980: loss 5.7919, time 123.49ms
iter 291990: loss 6.4145, time 122.80ms
step 292000: train loss 5.6647, val loss 5.6393
saving checkpoint to out-shakespeare-char
iter 292000: loss 6.2764, time 2905.66ms
iter 292010: loss 5.8807, time 121.46ms
iter 292020: loss 5.0846, time 123.10ms
iter 292030: loss 5.9811, time 121.93ms
iter 292040: loss 5.2449, time 123.41ms
iter 292050: loss 6.1496, time 122.78ms
iter 292060: loss 6.3806, time 121.27ms
iter 292070: loss 5.9216, time 122.83ms
iter 292080: loss 6.8831, time 121.74ms
iter 292090: loss 6.1222, time 122.81ms
iter 292100: loss 5.9174, time 121.72ms
iter 292110: loss 5.7649, time 122.74ms
iter 292120: loss 5.6241, time 121.56ms
iter 292130: loss 6.3911, time 122.82ms
iter 292140: loss 5.8765, time 121.39ms
iter 292150: loss 5.3409, time 122.49ms
iter 292160: loss 6.4913, time 122.81ms
iter 292170: loss 6.3902, time 121.62ms
iter 292180: loss 5.2362, time 121.39ms
iter 292190: loss 6.1453, time 121.76ms
iter 292200: loss 5.9189, time 122.03ms
iter 292210: loss 5.7083, time 121.75ms
iter 292220: loss 5.9451, time 120.98ms
iter 292230: loss 5.9772, time 121.59ms
iter 292240: loss 6.2720, time 121.06ms
step 292250: train loss 5.6638, val loss 5.6654
saving checkpoint to out-shakespeare-char
iter 292250: loss 6.5783, time 2906.41ms
iter 292260: loss 6.3643, time 121.67ms
iter 292270: loss 5.6163, time 121.57ms
iter 292280: loss 6.6075, time 121.71ms
iter 292290: loss 5.5154, time 121.73ms
iter 292300: loss 5.2968, time 121.62ms
iter 292310: loss 5.7809, time 121.83ms
iter 292320: loss 6.2567, time 123.05ms
iter 292330: loss 6.2893, time 122.79ms
iter 292340: loss 6.3480, time 121.41ms
iter 292350: loss 5.7151, time 122.82ms
iter 292360: loss 6.2586, time 121.46ms
iter 292370: loss 6.0256, time 122.45ms
iter 292380: loss 5.8355, time 120.70ms
iter 292390: loss 6.5641, time 122.70ms
iter 292400: loss 5.0889, time 121.40ms
iter 292410: loss 7.0106, time 123.17ms
iter 292420: loss 5.5032, time 121.47ms
iter 292430: loss 6.5558, time 122.33ms
iter 292440: loss 5.5361, time 121.43ms
iter 292450: loss 6.0736, time 120.03ms
iter 292460: loss 5.0700, time 121.49ms
iter 292470: loss 6.7024, time 121.85ms
iter 292480: loss 5.9736, time 121.99ms
iter 292490: loss 5.9074, time 121.50ms
step 292500: train loss 5.6322, val loss 5.6708
saving checkpoint to out-shakespeare-char
iter 292500: loss 6.8119, time 2912.63ms
iter 292510: loss 5.8342, time 121.23ms
iter 292520: loss 5.4439, time 121.47ms
iter 292530: loss 5.7338, time 121.67ms
iter 292540: loss 6.7188, time 122.85ms
iter 292550: loss 5.9623, time 121.58ms
iter 292560: loss 5.9230, time 122.91ms
iter 292570: loss 4.9612, time 121.53ms
iter 292580: loss 6.8872, time 122.79ms
iter 292590: loss 5.6785, time 122.91ms
iter 292600: loss 6.9391, time 121.83ms
iter 292610: loss 6.4628, time 123.37ms
iter 292620: loss 6.0837, time 121.67ms
iter 292630: loss 5.6289, time 122.88ms
iter 292640: loss 5.9610, time 121.76ms
iter 292650: loss 6.5731, time 123.15ms
iter 292660: loss 6.2696, time 121.41ms
iter 292670: loss 5.3057, time 123.22ms
iter 292680: loss 5.9493, time 121.41ms
iter 292690: loss 6.1411, time 123.11ms
iter 292700: loss 6.5832, time 123.84ms
iter 292710: loss 6.6398, time 121.53ms
iter 292720: loss 6.0093, time 123.72ms
iter 292730: loss 6.4999, time 121.74ms
iter 292740: loss 6.3682, time 124.05ms
step 292750: train loss 5.5981, val loss 5.6363
saving checkpoint to out-shakespeare-char
iter 292750: loss 6.2606, time 2902.93ms
iter 292760: loss 5.9898, time 125.35ms
iter 292770: loss 6.2765, time 124.80ms
iter 292780: loss 5.9849, time 125.47ms
iter 292790: loss 5.9142, time 125.93ms
iter 292800: loss 6.6969, time 125.22ms
iter 292810: loss 6.0707, time 125.02ms
iter 292820: loss 5.4360, time 125.05ms
iter 292830: loss 6.4738, time 125.39ms
iter 292840: loss 6.4953, time 127.65ms
iter 292850: loss 5.9404, time 126.70ms
iter 292860: loss 5.8751, time 125.08ms
iter 292870: loss 5.8292, time 125.10ms
iter 292880: loss 6.3825, time 125.12ms
iter 292890: loss 6.0294, time 125.24ms
iter 292900: loss 6.1602, time 125.17ms
iter 292910: loss 5.7452, time 124.59ms
iter 292920: loss 6.0352, time 125.42ms
iter 292930: loss 6.2055, time 125.13ms
iter 292940: loss 5.8507, time 125.21ms
iter 292950: loss 6.3016, time 125.30ms
iter 292960: loss 5.4821, time 125.10ms
iter 292970: loss 5.8606, time 127.68ms
iter 292980: loss 5.8942, time 125.37ms
iter 292990: loss 5.6626, time 125.32ms
step 293000: train loss 5.6380, val loss 5.6622
saving checkpoint to out-shakespeare-char
iter 293000: loss 6.2926, time 2857.50ms
iter 293010: loss 5.8996, time 125.06ms
iter 293020: loss 5.9051, time 125.32ms
iter 293030: loss 6.4881, time 125.17ms
iter 293040: loss 6.2465, time 127.84ms
iter 293050: loss 6.3053, time 124.20ms
iter 293060: loss 5.8900, time 125.12ms
iter 293070: loss 5.9140, time 125.22ms
iter 293080: loss 6.1610, time 127.35ms
iter 293090: loss 6.2032, time 127.81ms
iter 293100: loss 5.9495, time 124.96ms
iter 293110: loss 5.9086, time 125.15ms
iter 293120: loss 7.3427, time 125.21ms
iter 293130: loss 5.7635, time 125.14ms
iter 293140: loss 5.6952, time 124.49ms
iter 293150: loss 6.3170, time 125.07ms
iter 293160: loss 5.4037, time 124.54ms
iter 293170: loss 5.9609, time 124.96ms
iter 293180: loss 6.0711, time 124.42ms
iter 293190: loss 5.8504, time 125.14ms
iter 293200: loss 5.6826, time 124.42ms
iter 293210: loss 5.8886, time 127.67ms
iter 293220: loss 6.2702, time 124.40ms
iter 293230: loss 6.2268, time 124.59ms
iter 293240: loss 6.4863, time 124.23ms
step 293250: train loss 5.7237, val loss 5.6422
saving checkpoint to out-shakespeare-char
iter 293250: loss 6.3276, time 2880.63ms
iter 293260: loss 6.0742, time 125.15ms
iter 293270: loss 6.0086, time 125.20ms
iter 293280: loss 5.2437, time 127.58ms
iter 293290: loss 5.4093, time 125.13ms
iter 293300: loss 6.6685, time 125.19ms
iter 293310: loss 6.3170, time 125.17ms
iter 293320: loss 5.7779, time 125.63ms
iter 293330: loss 5.7487, time 127.06ms
iter 293340: loss 5.9833, time 124.87ms
iter 293350: loss 5.2116, time 125.25ms
iter 293360: loss 5.9423, time 125.40ms
iter 293370: loss 5.5645, time 125.11ms
iter 293380: loss 5.7844, time 124.93ms
iter 293390: loss 5.9329, time 125.45ms
iter 293400: loss 6.2426, time 127.41ms
iter 293410: loss 5.8218, time 125.19ms
iter 293420: loss 6.7813, time 124.77ms
iter 293430: loss 6.1962, time 124.86ms
iter 293440: loss 4.9640, time 125.41ms
iter 293450: loss 6.4140, time 125.03ms
iter 293460: loss 7.0527, time 125.47ms
iter 293470: loss 5.8063, time 125.07ms
iter 293480: loss 6.2744, time 125.48ms
iter 293490: loss 5.9704, time 125.25ms
step 293500: train loss 5.6808, val loss 5.6894
saving checkpoint to out-shakespeare-char
iter 293500: loss 5.9629, time 2857.52ms
iter 293510: loss 6.4388, time 125.16ms
iter 293520: loss 5.7861, time 125.08ms
iter 293530: loss 6.6048, time 125.10ms
iter 293540: loss 6.4864, time 125.24ms
iter 293550: loss 6.6053, time 124.84ms
iter 293560: loss 5.9015, time 127.60ms
iter 293570: loss 6.6294, time 126.24ms
iter 293580: loss 5.3983, time 125.02ms
iter 293590: loss 5.9230, time 124.98ms
iter 293600: loss 7.2549, time 125.32ms
iter 293610: loss 5.8707, time 124.99ms
iter 293620: loss 5.9080, time 124.97ms
iter 293630: loss 6.1098, time 125.05ms
iter 293640: loss 5.9969, time 125.15ms
iter 293650: loss 6.6793, time 125.00ms
iter 293660: loss 5.4948, time 126.54ms
iter 293670: loss 5.6653, time 125.11ms
iter 293680: loss 5.6958, time 125.50ms
iter 293690: loss 6.5757, time 127.44ms
iter 293700: loss 5.5848, time 125.08ms
iter 293710: loss 6.2562, time 125.25ms
iter 293720: loss 5.7821, time 125.61ms
iter 293730: loss 6.1144, time 125.18ms
iter 293740: loss 5.9474, time 125.08ms
step 293750: train loss 5.6672, val loss 5.6678
saving checkpoint to out-shakespeare-char
iter 293750: loss 5.7370, time 2887.08ms
iter 293760: loss 5.7945, time 125.40ms
iter 293770: loss 6.2749, time 125.72ms
iter 293780: loss 5.9599, time 125.12ms
iter 293790: loss 5.9416, time 125.15ms
iter 293800: loss 6.1788, time 127.67ms
iter 293810: loss 6.5872, time 125.52ms
iter 293820: loss 6.6922, time 125.13ms
iter 293830: loss 5.6878, time 124.98ms
iter 293840: loss 5.9432, time 125.30ms
iter 293850: loss 5.6097, time 125.05ms
iter 293860: loss 5.5793, time 125.01ms
iter 293870: loss 5.9259, time 125.24ms
iter 293880: loss 5.4395, time 125.36ms
iter 293890: loss 5.5724, time 125.22ms
iter 293900: loss 6.1935, time 124.97ms
iter 293910: loss 5.8327, time 126.55ms
iter 293920: loss 6.5438, time 125.28ms
iter 293930: loss 5.9222, time 126.99ms
iter 293940: loss 6.8505, time 124.83ms
iter 293950: loss 5.5588, time 125.13ms
iter 293960: loss 6.3891, time 125.57ms
iter 293970: loss 6.5982, time 125.07ms
iter 293980: loss 6.0542, time 125.10ms
iter 293990: loss 6.0194, time 124.93ms
step 294000: train loss 5.6445, val loss 5.5958
saving checkpoint to out-shakespeare-char
iter 294000: loss 6.5385, time 2886.64ms
iter 294010: loss 6.0294, time 122.34ms
iter 294020: loss 5.5853, time 121.86ms
iter 294030: loss 6.2926, time 123.83ms
iter 294040: loss 6.4663, time 124.73ms
iter 294050: loss 5.6275, time 125.39ms
iter 294060: loss 5.8549, time 126.96ms
iter 294070: loss 6.2453, time 127.01ms
iter 294080: loss 6.4247, time 125.18ms
iter 294090: loss 5.8940, time 125.43ms
iter 294100: loss 6.8373, time 125.36ms
iter 294110: loss 6.0944, time 127.48ms
iter 294120: loss 5.6147, time 125.26ms
iter 294130: loss 5.9495, time 124.64ms
iter 294140: loss 6.4354, time 125.52ms
iter 294150: loss 5.4591, time 124.32ms
iter 294160: loss 7.0571, time 125.22ms
iter 294170: loss 6.2555, time 124.64ms
iter 294180: loss 6.2507, time 125.15ms
iter 294190: loss 6.0141, time 127.54ms
iter 294200: loss 6.3557, time 125.57ms
iter 294210: loss 5.8028, time 124.43ms
iter 294220: loss 5.8825, time 126.52ms
iter 294230: loss 6.2141, time 125.34ms
iter 294240: loss 6.1059, time 125.30ms
step 294250: train loss 5.6495, val loss 5.7013
saving checkpoint to out-shakespeare-char
iter 294250: loss 5.9964, time 2882.84ms
iter 294260: loss 6.0360, time 125.42ms
iter 294270: loss 5.9491, time 125.70ms
iter 294280: loss 6.1730, time 125.36ms
iter 294290: loss 6.5349, time 125.50ms
iter 294300: loss 6.1709, time 127.97ms
iter 294310: loss 6.1061, time 126.73ms
iter 294320: loss 6.5184, time 125.44ms
iter 294330: loss 5.8472, time 125.58ms
iter 294340: loss 5.8392, time 125.32ms
iter 294350: loss 6.1076, time 125.34ms
iter 294360: loss 6.5493, time 125.44ms
iter 294370: loss 5.2739, time 125.25ms
iter 294380: loss 5.4951, time 125.64ms
iter 294390: loss 6.2304, time 125.21ms
iter 294400: loss 5.0745, time 125.60ms
iter 294410: loss 5.9281, time 126.68ms
iter 294420: loss 5.7963, time 126.26ms
iter 294430: loss 6.1324, time 126.42ms
iter 294440: loss 6.2300, time 124.67ms
iter 294450: loss 5.6827, time 124.98ms
iter 294460: loss 6.0011, time 125.45ms
iter 294470: loss 5.3669, time 125.20ms
iter 294480: loss 6.6463, time 123.60ms
iter 294490: loss 5.9257, time 124.37ms
step 294500: train loss 5.6509, val loss 5.6866
saving checkpoint to out-shakespeare-char
iter 294500: loss 5.9244, time 2888.89ms
iter 294510: loss 5.2508, time 121.22ms
iter 294520: loss 5.9438, time 122.72ms
iter 294530: loss 6.4296, time 121.63ms
iter 294540: loss 5.6342, time 122.58ms
iter 294550: loss 6.3171, time 120.86ms
iter 294560: loss 5.7478, time 123.21ms
iter 294570: loss 6.5564, time 122.85ms
iter 294580: loss 6.9706, time 122.02ms
iter 294590: loss 4.9789, time 121.46ms
iter 294600: loss 5.5021, time 121.83ms
iter 294610: loss 5.4646, time 121.75ms
iter 294620: loss 5.9175, time 121.07ms
iter 294630: loss 5.4570, time 121.02ms
iter 294640: loss 6.0358, time 121.01ms
iter 294650: loss 6.0219, time 121.49ms
iter 294660: loss 5.9017, time 121.39ms
iter 294670: loss 6.4569, time 120.80ms
iter 294680: loss 6.5386, time 120.33ms
iter 294690: loss 5.9281, time 122.01ms
iter 294700: loss 5.8373, time 121.59ms
iter 294710: loss 5.9731, time 120.58ms
iter 294720: loss 5.5022, time 121.31ms
iter 294730: loss 5.8185, time 124.70ms
iter 294740: loss 6.2970, time 124.88ms
step 294750: train loss 5.7067, val loss 5.6372
saving checkpoint to out-shakespeare-char
iter 294750: loss 6.1249, time 2887.28ms
iter 294760: loss 6.7100, time 125.85ms
iter 294770: loss 6.0326, time 125.66ms
iter 294780: loss 6.2819, time 124.96ms
iter 294790: loss 7.1798, time 123.52ms
iter 294800: loss 5.8423, time 125.59ms
iter 294810: loss 6.3859, time 125.52ms
iter 294820: loss 5.7945, time 128.18ms
iter 294830: loss 6.5265, time 126.86ms
iter 294840: loss 5.8702, time 125.49ms
iter 294850: loss 6.1732, time 125.61ms
iter 294860: loss 6.1328, time 125.56ms
iter 294870: loss 6.0894, time 125.66ms
iter 294880: loss 5.7330, time 124.48ms
iter 294890: loss 6.5224, time 123.94ms
iter 294900: loss 5.6814, time 127.69ms
iter 294910: loss 5.7775, time 125.13ms
iter 294920: loss 6.6987, time 125.49ms
iter 294930: loss 6.7290, time 124.98ms
iter 294940: loss 5.3913, time 124.68ms
iter 294950: loss 5.6041, time 125.05ms
iter 294960: loss 6.3167, time 125.28ms
iter 294970: loss 6.4947, time 123.35ms
iter 294980: loss 5.9981, time 125.07ms
iter 294990: loss 6.2775, time 124.13ms
step 295000: train loss 5.6898, val loss 5.7014
saving checkpoint to out-shakespeare-char
iter 295000: loss 6.2533, time 2892.06ms
iter 295010: loss 5.8306, time 125.92ms
iter 295020: loss 5.6358, time 126.64ms
iter 295030: loss 6.1644, time 125.09ms
iter 295040: loss 6.2875, time 124.46ms
iter 295050: loss 5.5200, time 125.46ms
iter 295060: loss 6.0056, time 124.75ms
iter 295070: loss 5.8311, time 125.25ms
iter 295080: loss 6.4456, time 126.45ms
iter 295090: loss 5.1614, time 124.28ms
iter 295100: loss 5.8813, time 124.99ms
iter 295110: loss 5.5868, time 124.92ms
iter 295120: loss 6.1408, time 124.20ms
iter 295130: loss 6.8403, time 127.62ms
iter 295140: loss 6.0344, time 125.04ms
iter 295150: loss 5.8979, time 124.90ms
iter 295160: loss 6.5940, time 124.30ms
iter 295170: loss 5.0371, time 124.98ms
iter 295180: loss 5.7006, time 124.89ms
iter 295190: loss 5.5306, time 124.00ms
iter 295200: loss 5.5821, time 124.09ms
iter 295210: loss 5.8278, time 125.16ms
iter 295220: loss 6.6265, time 124.08ms
iter 295230: loss 5.7735, time 125.14ms
iter 295240: loss 6.1214, time 127.53ms
step 295250: train loss 5.6874, val loss 5.6679
saving checkpoint to out-shakespeare-char
iter 295250: loss 6.1869, time 2896.98ms
iter 295260: loss 5.9268, time 125.10ms
iter 295270: loss 6.4390, time 125.00ms
iter 295280: loss 6.6151, time 125.13ms
iter 295290: loss 5.3897, time 126.43ms
iter 295300: loss 6.5511, time 125.25ms
iter 295310: loss 5.8255, time 124.37ms
iter 295320: loss 5.6240, time 124.77ms
iter 295330: loss 6.2601, time 127.33ms
iter 295340: loss 6.5373, time 124.81ms
iter 295350: loss 5.8324, time 123.96ms
iter 295360: loss 5.8985, time 124.00ms
iter 295370: loss 5.7850, time 127.76ms
iter 295380: loss 5.4384, time 125.03ms
iter 295390: loss 5.6409, time 124.91ms
iter 295400: loss 6.5550, time 124.25ms
iter 295410: loss 5.6776, time 126.56ms
iter 295420: loss 5.7179, time 124.78ms
iter 295430: loss 5.8675, time 124.95ms
iter 295440: loss 6.7329, time 125.12ms
iter 295450: loss 5.2150, time 127.53ms
iter 295460: loss 6.5519, time 125.03ms
iter 295470: loss 5.5160, time 124.58ms
iter 295480: loss 6.5464, time 124.23ms
iter 295490: loss 5.9093, time 123.74ms
step 295500: train loss 5.6182, val loss 5.6449
saving checkpoint to out-shakespeare-char
iter 295500: loss 5.7248, time 2905.12ms
iter 295510: loss 5.6481, time 125.05ms
iter 295520: loss 6.0216, time 125.95ms
iter 295530: loss 6.6511, time 126.02ms
iter 295540: loss 5.3583, time 125.37ms
iter 295550: loss 6.7318, time 124.98ms
iter 295560: loss 5.9319, time 124.52ms
iter 295570: loss 6.0165, time 125.43ms
iter 295580: loss 6.0598, time 127.72ms
iter 295590: loss 6.4018, time 125.31ms
iter 295600: loss 5.4882, time 125.28ms
iter 295610: loss 6.4053, time 125.63ms
iter 295620: loss 5.9170, time 125.12ms
iter 295630: loss 6.3880, time 125.69ms
iter 295640: loss 5.8902, time 125.23ms
iter 295650: loss 5.7774, time 125.36ms
iter 295660: loss 6.3255, time 127.25ms
iter 295670: loss 6.5393, time 125.33ms
iter 295680: loss 6.1586, time 126.46ms
iter 295690: loss 6.1684, time 124.65ms
iter 295700: loss 6.1758, time 125.24ms
iter 295710: loss 6.5023, time 125.23ms
iter 295720: loss 5.6709, time 126.01ms
iter 295730: loss 6.6645, time 128.18ms
iter 295740: loss 6.1371, time 125.74ms
step 295750: train loss 5.6753, val loss 5.6600
saving checkpoint to out-shakespeare-char
iter 295750: loss 6.0772, time 2894.39ms
iter 295760: loss 6.2261, time 125.10ms
iter 295770: loss 5.4894, time 127.48ms
iter 295780: loss 6.0744, time 125.09ms
iter 295790: loss 5.9131, time 125.61ms
iter 295800: loss 5.9079, time 125.04ms
iter 295810: loss 6.3144, time 125.23ms
iter 295820: loss 6.2906, time 127.07ms
iter 295830: loss 6.7139, time 125.82ms
iter 295840: loss 5.1895, time 125.30ms
iter 295850: loss 6.0733, time 127.38ms
iter 295860: loss 6.4851, time 125.65ms
iter 295870: loss 5.5663, time 124.45ms
iter 295880: loss 6.2467, time 124.52ms
iter 295890: loss 6.7933, time 124.80ms
iter 295900: loss 5.5439, time 125.66ms
iter 295910: loss 5.7551, time 125.42ms
iter 295920: loss 5.5257, time 127.04ms
iter 295930: loss 5.6332, time 125.70ms
iter 295940: loss 5.7162, time 125.54ms
iter 295950: loss 6.2159, time 125.62ms
iter 295960: loss 6.2394, time 126.36ms
iter 295970: loss 6.7276, time 127.83ms
iter 295980: loss 6.0888, time 125.61ms
iter 295990: loss 6.0200, time 125.73ms
step 296000: train loss 5.6407, val loss 5.6077
saving checkpoint to out-shakespeare-char
iter 296000: loss 6.3342, time 2914.18ms
iter 296010: loss 6.2591, time 125.88ms
iter 296020: loss 5.7841, time 125.18ms
iter 296030: loss 6.0774, time 125.16ms
iter 296040: loss 6.3883, time 125.79ms
iter 296050: loss 6.5366, time 127.58ms
iter 296060: loss 6.1830, time 126.81ms
iter 296070: loss 6.4648, time 125.35ms
iter 296080: loss 6.3836, time 125.42ms
iter 296090: loss 5.6982, time 125.49ms
iter 296100: loss 5.5639, time 125.16ms
iter 296110: loss 6.8239, time 124.74ms
iter 296120: loss 5.1661, time 125.35ms
iter 296130: loss 6.1319, time 125.27ms
iter 296140: loss 5.8989, time 125.64ms
iter 296150: loss 5.9911, time 125.46ms
iter 296160: loss 5.2369, time 126.14ms
iter 296170: loss 5.9914, time 127.64ms
iter 296180: loss 5.5954, time 125.30ms
iter 296190: loss 5.7003, time 125.20ms
iter 296200: loss 6.1824, time 125.85ms
iter 296210: loss 5.9402, time 125.44ms
iter 296220: loss 5.7840, time 125.37ms
iter 296230: loss 5.6906, time 125.36ms
iter 296240: loss 6.0619, time 125.61ms
step 296250: train loss 5.6513, val loss 5.6784
saving checkpoint to out-shakespeare-char
iter 296250: loss 5.8967, time 2891.98ms
iter 296260: loss 6.7902, time 126.48ms
iter 296270: loss 5.9450, time 125.95ms
iter 296280: loss 6.3281, time 128.33ms
iter 296290: loss 6.4614, time 125.64ms
iter 296300: loss 6.7680, time 125.90ms
iter 296310: loss 5.7416, time 126.56ms
iter 296320: loss 6.5319, time 125.88ms
iter 296330: loss 6.1797, time 126.21ms
iter 296340: loss 5.5435, time 125.99ms
iter 296350: loss 6.3635, time 125.80ms
iter 296360: loss 5.5413, time 125.74ms
iter 296370: loss 5.9452, time 126.03ms
iter 296380: loss 6.1330, time 125.82ms
iter 296390: loss 6.0329, time 126.10ms
iter 296400: loss 5.6043, time 124.45ms
iter 296410: loss 6.1731, time 125.33ms
iter 296420: loss 6.6265, time 125.63ms
iter 296430: loss 5.9506, time 125.52ms
iter 296440: loss 6.5370, time 125.72ms
iter 296450: loss 5.6113, time 127.31ms
iter 296460: loss 5.8588, time 125.87ms
iter 296470: loss 6.2371, time 124.96ms
iter 296480: loss 5.6854, time 125.79ms
iter 296490: loss 6.2829, time 128.36ms
step 296500: train loss 5.6334, val loss 5.6433
saving checkpoint to out-shakespeare-char
iter 296500: loss 5.8428, time 2879.48ms
iter 296510: loss 6.3255, time 125.54ms
iter 296520: loss 5.2905, time 125.70ms
iter 296530: loss 6.5131, time 127.93ms
iter 296540: loss 6.3176, time 125.30ms
iter 296550: loss 5.4210, time 125.45ms
iter 296560: loss 5.5771, time 127.43ms
iter 296570: loss 5.8169, time 125.77ms
iter 296580: loss 6.4841, time 125.63ms
iter 296590: loss 6.1086, time 125.68ms
iter 296600: loss 5.4610, time 125.51ms
iter 296610: loss 6.4476, time 125.72ms
iter 296620: loss 7.1318, time 125.94ms
iter 296630: loss 7.0612, time 125.33ms
iter 296640: loss 5.4572, time 126.11ms
iter 296650: loss 6.7673, time 128.21ms
iter 296660: loss 6.5589, time 126.93ms
iter 296670: loss 6.1039, time 125.58ms
iter 296680: loss 6.3579, time 125.52ms
iter 296690: loss 6.6503, time 125.51ms
iter 296700: loss 5.7688, time 127.92ms
iter 296710: loss 6.0477, time 125.46ms
iter 296720: loss 6.7651, time 125.47ms
iter 296730: loss 5.5384, time 125.70ms
iter 296740: loss 6.0013, time 125.10ms
step 296750: train loss 5.6439, val loss 5.6541
saving checkpoint to out-shakespeare-char
iter 296750: loss 5.8544, time 2889.35ms
iter 296760: loss 6.8282, time 124.86ms
iter 296770: loss 6.0874, time 124.73ms
iter 296780: loss 5.6871, time 125.04ms
iter 296790: loss 5.1877, time 125.54ms
iter 296800: loss 5.6599, time 127.94ms
iter 296810: loss 6.3087, time 124.71ms
iter 296820: loss 5.8977, time 124.67ms
iter 296830: loss 6.1849, time 125.27ms
iter 296840: loss 5.9263, time 124.37ms
iter 296850: loss 6.2824, time 124.03ms
iter 296860: loss 6.9487, time 124.50ms
iter 296870: loss 6.4373, time 125.30ms
iter 296880: loss 5.5485, time 124.54ms
iter 296890: loss 6.4652, time 124.68ms
iter 296900: loss 5.4795, time 125.09ms
iter 296910: loss 6.5353, time 126.22ms
iter 296920: loss 5.6650, time 127.42ms
iter 296930: loss 5.8341, time 124.89ms
iter 296940: loss 6.1962, time 124.07ms
iter 296950: loss 6.4647, time 124.85ms
iter 296960: loss 5.9590, time 124.37ms
iter 296970: loss 6.6763, time 124.80ms
iter 296980: loss 5.7965, time 124.75ms
iter 296990: loss 5.8905, time 126.75ms
step 297000: train loss 5.6955, val loss 5.6958
saving checkpoint to out-shakespeare-char
iter 297000: loss 6.0990, time 2901.02ms
iter 297010: loss 6.1293, time 125.23ms
iter 297020: loss 6.4235, time 125.68ms
iter 297030: loss 6.3726, time 125.94ms
iter 297040: loss 5.5102, time 125.94ms
iter 297050: loss 5.8791, time 127.69ms
iter 297060: loss 6.1169, time 125.40ms
iter 297070: loss 6.7828, time 125.47ms
iter 297080: loss 5.8695, time 126.04ms
iter 297090: loss 6.3724, time 121.37ms
iter 297100: loss 6.6984, time 121.51ms
iter 297110: loss 4.9400, time 121.05ms
iter 297120: loss 5.5348, time 122.85ms
iter 297130: loss 6.2969, time 121.17ms
iter 297140: loss 5.4901, time 122.90ms
iter 297150: loss 6.3191, time 121.70ms
iter 297160: loss 5.9562, time 121.76ms
iter 297170: loss 5.8814, time 121.70ms
iter 297180: loss 6.3211, time 121.56ms
iter 297190: loss 6.1780, time 121.44ms
iter 297200: loss 6.2153, time 120.87ms
iter 297210: loss 5.9861, time 119.97ms
iter 297220: loss 5.9894, time 121.04ms
iter 297230: loss 6.2205, time 121.65ms
iter 297240: loss 5.3947, time 121.40ms
step 297250: train loss 5.6919, val loss 5.6556
saving checkpoint to out-shakespeare-char
iter 297250: loss 6.1203, time 2893.65ms
iter 297260: loss 5.9798, time 125.82ms
iter 297270: loss 6.6893, time 126.00ms
iter 297280: loss 5.6128, time 126.65ms
iter 297290: loss 6.2143, time 125.75ms
iter 297300: loss 6.5595, time 125.60ms
iter 297310: loss 5.9560, time 124.49ms
iter 297320: loss 5.8332, time 125.82ms
iter 297330: loss 7.1749, time 125.66ms
iter 297340: loss 6.0233, time 125.80ms
iter 297350: loss 6.2019, time 125.54ms
iter 297360: loss 6.3579, time 126.02ms
iter 297370: loss 6.1113, time 125.49ms
iter 297380: loss 6.6141, time 125.11ms
iter 297390: loss 6.3146, time 127.26ms
iter 297400: loss 6.6809, time 125.82ms
iter 297410: loss 5.9341, time 125.73ms
iter 297420: loss 5.8019, time 125.65ms
iter 297430: loss 6.2031, time 125.85ms
iter 297440: loss 6.0283, time 125.89ms
iter 297450: loss 6.2858, time 126.01ms
iter 297460: loss 6.5577, time 125.19ms
iter 297470: loss 5.3904, time 125.39ms
iter 297480: loss 6.1720, time 125.07ms
iter 297490: loss 6.2266, time 127.97ms
step 297500: train loss 5.6518, val loss 5.6930
saving checkpoint to out-shakespeare-char
iter 297500: loss 5.5637, time 2873.42ms
iter 297510: loss 5.5896, time 127.83ms
iter 297520: loss 6.2408, time 125.46ms
iter 297530: loss 6.4295, time 126.43ms
iter 297540: loss 5.7659, time 126.02ms
iter 297550: loss 5.9341, time 125.39ms
iter 297560: loss 5.8736, time 128.45ms
iter 297570: loss 6.3258, time 125.59ms
iter 297580: loss 5.6633, time 125.58ms
iter 297590: loss 5.8507, time 125.78ms
iter 297600: loss 6.1608, time 127.99ms
iter 297610: loss 5.9991, time 125.60ms
iter 297620: loss 6.4226, time 125.73ms
iter 297630: loss 5.7302, time 126.92ms
iter 297640: loss 5.4856, time 125.78ms
iter 297650: loss 5.9669, time 125.48ms
iter 297660: loss 5.7303, time 125.65ms
iter 297670: loss 6.7432, time 125.59ms
iter 297680: loss 6.2607, time 128.10ms
iter 297690: loss 5.5795, time 125.41ms
iter 297700: loss 6.2180, time 125.28ms
iter 297710: loss 6.8057, time 125.80ms
iter 297720: loss 6.2779, time 125.97ms
iter 297730: loss 6.2725, time 125.61ms
iter 297740: loss 6.0615, time 127.00ms
step 297750: train loss 5.6407, val loss 5.6755
saving checkpoint to out-shakespeare-char
iter 297750: loss 5.7422, time 2884.66ms
iter 297760: loss 6.1992, time 122.93ms
iter 297770: loss 6.4057, time 121.98ms
iter 297780: loss 6.1596, time 122.95ms
iter 297790: loss 5.5047, time 123.40ms
iter 297800: loss 5.4065, time 122.43ms
iter 297810: loss 5.4740, time 122.53ms
iter 297820: loss 5.8574, time 122.01ms
iter 297830: loss 6.6135, time 121.92ms
iter 297840: loss 6.1920, time 121.83ms
iter 297850: loss 5.8632, time 121.62ms
iter 297860: loss 5.7395, time 121.67ms
iter 297870: loss 5.8956, time 122.12ms
iter 297880: loss 5.2707, time 121.79ms
iter 297890: loss 5.7509, time 121.47ms
iter 297900: loss 5.8411, time 124.16ms
iter 297910: loss 6.4422, time 121.68ms
iter 297920: loss 5.8710, time 121.84ms
iter 297930: loss 6.5152, time 121.79ms
iter 297940: loss 5.9301, time 124.18ms
iter 297950: loss 6.9391, time 121.83ms
iter 297960: loss 6.2202, time 123.99ms
iter 297970: loss 6.1001, time 121.98ms
iter 297980: loss 6.3390, time 124.25ms
iter 297990: loss 5.8945, time 122.09ms
step 298000: train loss 5.5748, val loss 5.6268
saving checkpoint to out-shakespeare-char
iter 298000: loss 5.7849, time 2896.08ms
iter 298010: loss 5.7740, time 121.84ms
iter 298020: loss 5.9389, time 123.99ms
iter 298030: loss 5.7585, time 121.75ms
iter 298040: loss 6.0099, time 123.70ms
iter 298050: loss 4.9870, time 123.12ms
iter 298060: loss 5.8933, time 121.57ms
iter 298070: loss 5.4425, time 121.79ms
iter 298080: loss 6.3374, time 121.23ms
iter 298090: loss 6.1629, time 122.45ms
iter 298100: loss 5.7282, time 121.38ms
iter 298110: loss 5.7754, time 122.56ms
iter 298120: loss 5.7602, time 121.78ms
iter 298130: loss 5.2964, time 121.42ms
iter 298140: loss 5.4464, time 122.84ms
iter 298150: loss 5.4212, time 121.60ms
iter 298160: loss 5.9263, time 122.31ms
iter 298170: loss 6.9764, time 122.88ms
iter 298180: loss 6.0733, time 121.94ms
iter 298190: loss 5.7499, time 123.03ms
iter 298200: loss 6.1367, time 121.86ms
iter 298210: loss 5.9425, time 122.95ms
iter 298220: loss 6.0175, time 121.41ms
iter 298230: loss 6.2607, time 122.73ms
iter 298240: loss 6.1891, time 121.97ms
step 298250: train loss 5.6533, val loss 5.6512
saving checkpoint to out-shakespeare-char
iter 298250: loss 5.9405, time 2889.27ms
iter 298260: loss 6.4925, time 121.78ms
iter 298270: loss 6.6161, time 122.92ms
iter 298280: loss 6.1567, time 121.78ms
iter 298290: loss 6.0697, time 122.96ms
iter 298300: loss 6.6015, time 121.89ms
iter 298310: loss 5.9180, time 122.98ms
iter 298320: loss 5.6727, time 121.88ms
iter 298330: loss 6.8238, time 122.95ms
iter 298340: loss 5.9490, time 123.06ms
iter 298350: loss 5.5701, time 121.38ms
iter 298360: loss 5.6323, time 123.03ms
iter 298370: loss 5.9363, time 121.96ms
iter 298380: loss 5.9203, time 122.89ms
iter 298390: loss 5.7937, time 121.85ms
iter 298400: loss 6.2826, time 123.36ms
iter 298410: loss 6.6347, time 121.91ms
iter 298420: loss 6.3386, time 123.03ms
iter 298430: loss 6.3001, time 121.96ms
iter 298440: loss 6.0009, time 123.98ms
iter 298450: loss 6.7102, time 121.64ms
iter 298460: loss 5.9763, time 121.09ms
iter 298470: loss 6.0995, time 121.28ms
iter 298480: loss 6.4764, time 121.92ms
iter 298490: loss 5.7174, time 121.59ms
step 298500: train loss 5.6421, val loss 5.6472
saving checkpoint to out-shakespeare-char
iter 298500: loss 5.5693, time 2883.63ms
iter 298510: loss 6.4614, time 121.73ms
iter 298520: loss 6.0765, time 122.71ms
iter 298530: loss 6.6065, time 119.54ms
iter 298540: loss 5.7854, time 122.81ms
iter 298550: loss 6.4318, time 122.99ms
iter 298560: loss 5.6593, time 120.16ms
iter 298570: loss 6.0800, time 121.48ms
iter 298580: loss 6.2148, time 121.69ms
iter 298590: loss 6.3304, time 122.07ms
iter 298600: loss 6.2821, time 122.78ms
iter 298610: loss 5.9773, time 121.77ms
iter 298620: loss 5.9019, time 121.59ms
iter 298630: loss 6.0175, time 121.66ms
iter 298640: loss 6.6273, time 121.69ms
iter 298650: loss 5.5483, time 120.73ms
iter 298660: loss 6.7156, time 125.16ms
iter 298670: loss 6.4532, time 125.12ms
iter 298680: loss 5.7740, time 124.81ms
iter 298690: loss 6.2554, time 124.33ms
iter 298700: loss 6.3665, time 124.94ms
iter 298710: loss 6.6554, time 123.07ms
iter 298720: loss 6.0887, time 122.00ms
iter 298730: loss 6.3064, time 122.83ms
iter 298740: loss 6.2288, time 121.52ms
step 298750: train loss 5.6606, val loss 5.6648
saving checkpoint to out-shakespeare-char
iter 298750: loss 5.8102, time 2909.34ms
iter 298760: loss 6.0790, time 126.05ms
iter 298770: loss 6.2174, time 128.15ms
iter 298780: loss 6.5903, time 124.90ms
iter 298790: loss 6.1316, time 125.54ms
iter 298800: loss 5.5653, time 127.20ms
iter 298810: loss 6.1018, time 132.69ms
iter 298820: loss 6.5077, time 126.58ms
iter 298830: loss 6.3560, time 125.31ms
iter 298840: loss 5.7875, time 124.57ms
iter 298850: loss 6.4677, time 125.44ms
iter 298860: loss 5.9472, time 125.63ms
iter 298870: loss 5.9329, time 125.97ms
iter 298880: loss 6.5503, time 125.49ms
iter 298890: loss 5.7291, time 124.66ms
iter 298900: loss 6.2039, time 126.01ms
iter 298910: loss 6.3444, time 124.58ms
iter 298920: loss 6.3402, time 125.61ms
iter 298930: loss 5.6160, time 128.12ms
iter 298940: loss 5.2083, time 125.61ms
iter 298950: loss 5.8703, time 126.76ms
iter 298960: loss 6.1287, time 125.91ms
iter 298970: loss 6.0226, time 124.40ms
iter 298980: loss 6.4456, time 125.37ms
iter 298990: loss 6.1900, time 125.54ms
step 299000: train loss 5.6461, val loss 5.6802
saving checkpoint to out-shakespeare-char
iter 299000: loss 5.4294, time 2873.49ms
iter 299010: loss 6.3092, time 125.55ms
iter 299020: loss 6.3558, time 123.98ms
iter 299030: loss 5.4982, time 125.30ms
iter 299040: loss 5.7332, time 125.17ms
iter 299050: loss 5.9160, time 127.73ms
iter 299060: loss 5.9733, time 122.20ms
iter 299070: loss 6.6111, time 121.86ms
iter 299080: loss 6.2077, time 121.85ms
iter 299090: loss 5.5155, time 123.32ms
iter 299100: loss 5.0757, time 123.62ms
iter 299110: loss 6.2312, time 121.83ms
iter 299120: loss 6.4477, time 122.84ms
iter 299130: loss 5.4197, time 121.89ms
iter 299140: loss 6.5954, time 123.14ms
iter 299150: loss 6.2972, time 121.75ms
iter 299160: loss 5.6140, time 122.97ms
iter 299170: loss 6.1961, time 121.66ms
iter 299180: loss 6.0659, time 122.81ms
iter 299190: loss 5.1544, time 121.80ms
iter 299200: loss 6.1682, time 122.97ms
iter 299210: loss 5.6615, time 121.68ms
iter 299220: loss 6.1614, time 121.58ms
iter 299230: loss 5.8221, time 121.79ms
iter 299240: loss 5.5049, time 122.27ms
step 299250: train loss 5.6388, val loss 5.6368
saving checkpoint to out-shakespeare-char
iter 299250: loss 6.1714, time 2882.89ms
iter 299260: loss 5.6872, time 121.89ms
iter 299270: loss 6.3793, time 121.49ms
iter 299280: loss 5.7636, time 121.85ms
iter 299290: loss 6.1606, time 121.91ms
iter 299300: loss 6.1228, time 121.97ms
iter 299310: loss 5.7132, time 121.64ms
iter 299320: loss 6.1822, time 121.92ms
iter 299330: loss 5.8599, time 121.65ms
iter 299340: loss 5.9794, time 119.57ms
iter 299350: loss 5.8672, time 122.70ms
iter 299360: loss 6.7030, time 121.56ms
iter 299370: loss 6.4171, time 122.73ms
iter 299380: loss 6.1685, time 121.56ms
iter 299390: loss 7.0182, time 122.67ms
iter 299400: loss 5.9468, time 121.53ms
iter 299410: loss 6.3145, time 122.89ms
iter 299420: loss 5.4198, time 121.67ms
iter 299430: loss 5.9568, time 121.85ms
iter 299440: loss 6.1910, time 121.70ms
iter 299450: loss 6.6445, time 123.09ms
iter 299460: loss 7.0030, time 123.36ms
iter 299470: loss 6.5646, time 121.62ms
iter 299480: loss 5.7072, time 121.67ms
iter 299490: loss 5.7656, time 121.52ms
step 299500: train loss 5.6368, val loss 5.7017
saving checkpoint to out-shakespeare-char
iter 299500: loss 6.1022, time 2882.22ms
iter 299510: loss 5.9443, time 122.09ms
iter 299520: loss 5.7453, time 122.31ms
iter 299530: loss 6.3118, time 121.99ms
iter 299540: loss 5.7369, time 121.01ms
iter 299550: loss 6.1884, time 122.15ms
iter 299560: loss 5.9811, time 121.99ms
iter 299570: loss 6.9630, time 122.96ms
iter 299580: loss 5.4609, time 122.20ms
iter 299590: loss 5.8767, time 121.78ms
iter 299600: loss 5.6152, time 121.80ms
iter 299610: loss 5.6654, time 121.73ms
iter 299620: loss 6.1226, time 123.19ms
iter 299630: loss 5.3209, time 120.99ms
iter 299640: loss 5.7993, time 121.19ms
iter 299650: loss 6.2018, time 124.28ms
iter 299660: loss 5.9912, time 123.21ms
iter 299670: loss 6.8060, time 127.82ms
iter 299680: loss 6.8166, time 125.75ms
iter 299690: loss 6.0801, time 124.97ms
iter 299700: loss 6.0614, time 125.40ms
iter 299710: loss 5.8083, time 125.75ms
iter 299720: loss 6.3036, time 125.75ms
iter 299730: loss 6.3804, time 124.64ms
iter 299740: loss 6.0644, time 125.54ms
step 299750: train loss 5.6299, val loss 5.6614
saving checkpoint to out-shakespeare-char
iter 299750: loss 5.9910, time 2908.25ms
iter 299760: loss 5.2051, time 125.87ms
iter 299770: loss 6.1099, time 124.58ms
iter 299780: loss 5.3817, time 126.04ms
iter 299790: loss 5.9372, time 128.04ms
iter 299800: loss 5.4736, time 125.71ms
iter 299810: loss 6.3730, time 126.55ms
iter 299820: loss 5.5932, time 125.71ms
iter 299830: loss 6.1672, time 125.74ms
iter 299840: loss 6.6818, time 127.92ms
iter 299850: loss 6.0168, time 125.35ms
iter 299860: loss 6.3668, time 128.04ms
iter 299870: loss 6.0873, time 125.59ms
iter 299880: loss 5.7784, time 127.83ms
iter 299890: loss 6.1000, time 125.29ms
iter 299900: loss 6.3790, time 125.27ms
iter 299910: loss 5.2582, time 125.61ms
iter 299920: loss 5.9895, time 126.00ms
iter 299930: loss 5.9377, time 126.04ms
iter 299940: loss 5.5187, time 125.27ms
iter 299950: loss 5.4952, time 125.26ms
iter 299960: loss 6.4041, time 126.96ms
iter 299970: loss 6.4545, time 125.07ms
iter 299980: loss 6.2922, time 125.46ms
iter 299990: loss 5.6541, time 124.56ms
step 300000: train loss 5.6244, val loss 5.6290
saving checkpoint to out-shakespeare-char
iter 300000: loss 5.6879, time 2875.33ms
iter 300010: loss 5.9572, time 123.05ms
iter 300020: loss 5.4759, time 119.49ms
iter 300030: loss 5.9852, time 120.41ms
iter 300040: loss 5.5711, time 119.96ms
iter 300050: loss 5.8618, time 124.31ms
iter 300060: loss 6.7878, time 122.06ms
iter 300070: loss 6.3763, time 128.80ms
iter 300080: loss 5.8486, time 124.80ms
iter 300090: loss 5.5748, time 125.88ms
iter 300100: loss 6.1192, time 126.31ms
iter 300110: loss 5.9516, time 125.93ms
iter 300120: loss 6.6291, time 124.93ms
iter 300130: loss 5.4880, time 125.75ms
iter 300140: loss 6.8412, time 126.25ms
iter 300150: loss 5.8570, time 124.99ms
iter 300160: loss 5.6584, time 125.64ms
iter 300170: loss 5.7473, time 125.88ms
iter 300180: loss 6.4745, time 126.43ms
iter 300190: loss 5.6330, time 123.57ms
iter 300200: loss 6.6117, time 125.13ms
iter 300210: loss 5.9156, time 127.97ms
iter 300220: loss 5.7888, time 125.48ms
iter 300230: loss 6.1089, time 125.66ms
iter 300240: loss 6.2658, time 125.04ms
step 300250: train loss 5.6422, val loss 5.6378
saving checkpoint to out-shakespeare-char
iter 300250: loss 6.2333, time 2872.44ms
iter 300260: loss 6.2303, time 128.31ms
iter 300270: loss 6.4834, time 126.24ms
iter 300280: loss 6.3377, time 125.78ms
iter 300290: loss 5.9779, time 124.80ms
iter 300300: loss 5.9637, time 125.56ms
iter 300310: loss 6.5744, time 125.69ms
iter 300320: loss 5.9189, time 124.70ms
iter 300330: loss 6.2624, time 127.04ms
iter 300340: loss 6.3870, time 124.65ms
iter 300350: loss 6.0868, time 126.88ms
iter 300360: loss 6.6387, time 125.20ms
iter 300370: loss 5.4175, time 124.45ms
iter 300380: loss 6.0733, time 124.34ms
iter 300390: loss 5.8565, time 124.98ms
iter 300400: loss 5.5449, time 124.98ms
iter 300410: loss 6.3157, time 127.50ms
iter 300420: loss 6.1971, time 124.00ms
iter 300430: loss 4.6575, time 124.66ms
iter 300440: loss 5.6465, time 125.57ms
iter 300450: loss 5.4231, time 126.80ms
iter 300460: loss 6.1666, time 124.50ms
iter 300470: loss 5.2313, time 126.24ms
iter 300480: loss 5.9385, time 125.79ms
iter 300490: loss 4.9998, time 124.53ms
step 300500: train loss 5.6394, val loss 5.6607
saving checkpoint to out-shakespeare-char
iter 300500: loss 5.7772, time 2885.13ms
iter 300510: loss 5.8569, time 122.24ms
iter 300520: loss 5.8628, time 121.75ms
iter 300530: loss 5.7244, time 121.57ms
iter 300540: loss 5.4913, time 120.84ms
iter 300550: loss 6.1117, time 121.59ms
iter 300560: loss 5.7495, time 121.86ms
iter 300570: loss 6.0987, time 121.48ms
iter 300580: loss 5.5394, time 121.23ms
iter 300590: loss 5.7268, time 121.44ms
iter 300600: loss 5.5648, time 121.89ms
iter 300610: loss 5.6521, time 123.08ms
iter 300620: loss 6.5620, time 121.81ms
iter 300630: loss 5.7595, time 123.20ms
iter 300640: loss 6.0644, time 121.63ms
iter 300650: loss 6.2221, time 122.63ms
iter 300660: loss 6.2490, time 122.21ms
iter 300670: loss 6.0720, time 123.10ms
iter 300680: loss 5.5693, time 121.54ms
iter 300690: loss 6.4746, time 123.03ms
iter 300700: loss 6.4412, time 121.34ms
iter 300710: loss 6.4405, time 122.55ms
iter 300720: loss 6.1715, time 122.85ms
iter 300730: loss 5.7402, time 121.48ms
iter 300740: loss 6.0741, time 121.51ms
step 300750: train loss 5.6457, val loss 5.6211
saving checkpoint to out-shakespeare-char
iter 300750: loss 6.5763, time 2878.28ms
iter 300760: loss 6.0753, time 121.71ms
iter 300770: loss 5.6640, time 123.45ms
iter 300780: loss 5.7421, time 121.40ms
iter 300790: loss 6.0326, time 121.69ms
iter 300800: loss 5.4759, time 121.45ms
iter 300810: loss 5.4615, time 121.42ms
iter 300820: loss 5.6449, time 121.46ms
iter 300830: loss 6.2263, time 121.55ms
iter 300840: loss 5.7099, time 121.56ms
iter 300850: loss 5.8135, time 121.58ms
iter 300860: loss 5.6858, time 121.59ms
iter 300870: loss 5.9444, time 122.06ms
iter 300880: loss 6.6745, time 120.03ms
iter 300890: loss 5.9940, time 121.86ms
iter 300900: loss 5.9919, time 121.57ms
iter 300910: loss 6.0975, time 121.66ms
iter 300920: loss 6.1705, time 121.56ms
iter 300930: loss 6.4204, time 123.13ms
iter 300940: loss 6.2064, time 121.74ms
iter 300950: loss 5.4764, time 121.63ms
iter 300960: loss 6.8687, time 121.62ms
iter 300970: loss 6.2074, time 121.61ms
iter 300980: loss 5.9498, time 121.72ms
iter 300990: loss 6.1400, time 122.86ms
step 301000: train loss 5.6048, val loss 5.6158
saving checkpoint to out-shakespeare-char
iter 301000: loss 6.5258, time 2892.80ms
iter 301010: loss 5.2317, time 125.86ms
iter 301020: loss 6.2066, time 125.72ms
iter 301030: loss 6.4845, time 125.49ms
iter 301040: loss 5.9277, time 125.67ms
iter 301050: loss 5.8775, time 127.76ms
iter 301060: loss 5.9004, time 126.77ms
iter 301070: loss 4.9873, time 125.64ms
iter 301080: loss 5.7090, time 125.31ms
iter 301090: loss 5.6254, time 125.61ms
iter 301100: loss 6.1493, time 125.35ms
iter 301110: loss 5.9187, time 126.11ms
iter 301120: loss 6.0303, time 125.39ms
iter 301130: loss 6.1206, time 125.59ms
iter 301140: loss 5.8666, time 127.90ms
iter 301150: loss 6.5623, time 125.61ms
iter 301160: loss 6.4436, time 125.60ms
iter 301170: loss 6.2016, time 125.88ms
iter 301180: loss 6.2390, time 125.65ms
iter 301190: loss 5.5847, time 125.61ms
iter 301200: loss 6.6581, time 124.75ms
iter 301210: loss 6.0494, time 125.87ms
iter 301220: loss 5.7420, time 125.53ms
iter 301230: loss 5.9605, time 125.62ms
iter 301240: loss 5.9332, time 125.96ms
step 301250: train loss 5.6166, val loss 5.6349
saving checkpoint to out-shakespeare-char
iter 301250: loss 5.4563, time 2899.85ms
iter 301260: loss 5.9907, time 125.64ms
iter 301270: loss 5.9291, time 125.35ms
iter 301280: loss 6.2574, time 124.29ms
iter 301290: loss 6.7164, time 125.55ms
iter 301300: loss 6.1150, time 125.24ms
iter 301310: loss 5.5277, time 125.78ms
iter 301320: loss 6.2267, time 125.15ms
iter 301330: loss 6.1155, time 125.69ms
iter 301340: loss 6.6623, time 125.34ms
iter 301350: loss 5.5453, time 125.07ms
iter 301360: loss 6.4264, time 125.89ms
iter 301370: loss 6.1209, time 125.51ms
iter 301380: loss 6.2426, time 126.95ms
iter 301390: loss 6.1857, time 125.96ms
iter 301400: loss 6.3109, time 125.65ms
iter 301410: loss 5.7253, time 125.31ms
iter 301420: loss 5.8976, time 125.03ms
iter 301430: loss 6.1287, time 125.29ms
iter 301440: loss 5.9873, time 127.73ms
iter 301450: loss 6.3473, time 125.41ms
iter 301460: loss 6.3066, time 125.70ms
iter 301470: loss 5.7181, time 125.58ms
iter 301480: loss 6.1571, time 125.53ms
iter 301490: loss 6.1442, time 125.30ms
step 301500: train loss 5.6254, val loss 5.6941
saving checkpoint to out-shakespeare-char
iter 301500: loss 6.1413, time 2891.28ms
iter 301510: loss 6.0414, time 125.85ms
iter 301520: loss 5.9293, time 126.83ms
iter 301530: loss 5.6789, time 125.27ms
iter 301540: loss 6.0928, time 128.46ms
iter 301550: loss 6.7840, time 125.33ms
iter 301560: loss 6.1982, time 126.76ms
iter 301570: loss 6.3006, time 125.39ms
iter 301580: loss 5.3396, time 125.20ms
iter 301590: loss 6.8123, time 125.36ms
iter 301600: loss 5.7279, time 125.62ms
iter 301610: loss 6.1715, time 125.56ms
iter 301620: loss 6.3668, time 124.39ms
iter 301630: loss 5.3830, time 125.38ms
iter 301640: loss 5.7239, time 125.27ms
iter 301650: loss 6.3182, time 125.39ms
iter 301660: loss 5.9791, time 125.54ms
iter 301670: loss 6.1135, time 127.37ms
iter 301680: loss 5.6722, time 125.30ms
iter 301690: loss 5.9585, time 125.38ms
iter 301700: loss 5.9923, time 125.17ms
iter 301710: loss 5.8428, time 125.24ms
iter 301720: loss 5.8851, time 127.01ms
iter 301730: loss 6.4253, time 125.54ms
iter 301740: loss 6.0653, time 125.45ms
step 301750: train loss 5.6159, val loss 5.6590
saving checkpoint to out-shakespeare-char
iter 301750: loss 6.4581, time 2881.49ms
iter 301760: loss 5.8508, time 121.98ms
iter 301770: loss 6.3635, time 122.52ms
iter 301780: loss 6.6623, time 121.48ms
iter 301790: loss 6.7529, time 121.35ms
iter 301800: loss 5.8837, time 121.59ms
iter 301810: loss 6.7164, time 121.92ms
iter 301820: loss 6.0854, time 121.50ms
iter 301830: loss 6.3456, time 122.79ms
iter 301840: loss 6.3580, time 120.59ms
iter 301850: loss 7.2007, time 121.63ms
iter 301860: loss 6.7772, time 121.54ms
iter 301870: loss 5.9510, time 121.63ms
iter 301880: loss 5.8193, time 122.48ms
iter 301890: loss 6.2243, time 121.79ms
iter 301900: loss 5.5388, time 122.73ms
iter 301910: loss 6.5911, time 121.50ms
iter 301920: loss 6.3073, time 121.90ms
iter 301930: loss 6.0018, time 121.50ms
iter 301940: loss 6.3441, time 122.53ms
iter 301950: loss 6.1763, time 121.58ms
iter 301960: loss 5.8749, time 122.69ms
iter 301970: loss 5.9351, time 121.68ms
iter 301980: loss 6.6419, time 122.77ms
iter 301990: loss 5.9703, time 122.99ms
step 302000: train loss 5.6286, val loss 5.6870
saving checkpoint to out-shakespeare-char
iter 302000: loss 5.7483, time 2895.30ms
iter 302010: loss 5.8904, time 124.18ms
iter 302020: loss 5.2370, time 121.66ms
iter 302030: loss 5.8007, time 122.91ms
iter 302040: loss 5.8501, time 121.63ms
iter 302050: loss 6.2049, time 122.83ms
iter 302060: loss 5.9111, time 123.53ms
iter 302070: loss 5.5397, time 121.39ms
iter 302080: loss 5.8056, time 123.87ms
iter 302090: loss 6.2924, time 121.65ms
iter 302100: loss 5.2825, time 123.84ms
iter 302110: loss 6.3366, time 121.82ms
iter 302120: loss 5.6372, time 123.76ms
iter 302130: loss 5.6332, time 120.86ms
iter 302140: loss 5.6004, time 123.75ms
iter 302150: loss 6.0031, time 121.95ms
iter 302160: loss 5.2751, time 123.21ms
iter 302170: loss 6.3017, time 123.77ms
iter 302180: loss 6.1236, time 121.95ms
iter 302190: loss 6.1256, time 123.56ms
iter 302200: loss 6.6956, time 121.74ms
iter 302210: loss 6.0302, time 125.83ms
iter 302220: loss 6.1386, time 121.96ms
iter 302230: loss 6.7377, time 125.96ms
iter 302240: loss 5.5903, time 126.12ms
step 302250: train loss 5.6620, val loss 5.6608
saving checkpoint to out-shakespeare-char
iter 302250: loss 5.9118, time 2888.16ms
iter 302260: loss 6.4993, time 125.88ms
iter 302270: loss 5.7888, time 125.88ms
iter 302280: loss 5.7833, time 126.08ms
iter 302290: loss 5.4088, time 127.09ms
iter 302300: loss 5.8538, time 125.63ms
iter 302310: loss 6.2291, time 127.10ms
iter 302320: loss 6.1156, time 125.78ms
iter 302330: loss 5.3669, time 125.64ms
iter 302340: loss 6.0794, time 128.08ms
iter 302350: loss 6.8423, time 125.52ms
iter 302360: loss 5.9928, time 124.67ms
iter 302370: loss 5.8822, time 125.37ms
iter 302380: loss 6.3757, time 124.55ms
iter 302390: loss 6.4525, time 125.26ms
iter 302400: loss 5.2438, time 124.89ms
iter 302410: loss 6.2285, time 126.29ms
iter 302420: loss 6.4558, time 127.94ms
iter 302430: loss 5.8501, time 125.31ms
iter 302440: loss 6.0048, time 127.33ms
iter 302450: loss 5.6827, time 125.19ms
iter 302460: loss 5.2716, time 124.71ms
iter 302470: loss 5.8231, time 126.00ms
iter 302480: loss 5.9041, time 126.38ms
iter 302490: loss 5.7229, time 125.80ms
step 302500: train loss 5.6360, val loss 5.6519
saving checkpoint to out-shakespeare-char
iter 302500: loss 6.3695, time 2875.56ms
iter 302510: loss 5.8956, time 123.22ms
iter 302520: loss 6.2364, time 121.86ms
iter 302530: loss 6.5120, time 123.26ms
iter 302540: loss 5.9177, time 122.20ms
iter 302550: loss 5.2640, time 123.11ms
iter 302560: loss 6.3088, time 121.92ms
iter 302570: loss 6.0300, time 123.26ms
iter 302580: loss 6.1135, time 122.85ms
iter 302590: loss 5.7055, time 122.94ms
iter 302600: loss 5.5705, time 121.88ms
iter 302610: loss 6.3410, time 121.78ms
iter 302620: loss 5.6996, time 121.91ms
iter 302630: loss 5.9204, time 121.87ms
iter 302640: loss 5.9442, time 121.89ms
iter 302650: loss 5.7710, time 122.17ms
iter 302660: loss 5.9195, time 121.82ms
iter 302670: loss 5.8875, time 119.75ms
iter 302680: loss 5.9393, time 122.23ms
iter 302690: loss 6.1772, time 123.08ms
iter 302700: loss 5.8365, time 121.79ms
iter 302710: loss 6.1701, time 121.97ms
iter 302720: loss 6.7113, time 121.69ms
iter 302730: loss 5.6046, time 122.02ms
iter 302740: loss 5.9701, time 123.29ms
step 302750: train loss 5.6687, val loss 5.6657
saving checkpoint to out-shakespeare-char
iter 302750: loss 5.8530, time 2898.05ms
iter 302760: loss 5.9647, time 120.36ms
iter 302770: loss 6.2869, time 121.86ms
iter 302780: loss 5.4349, time 121.92ms
iter 302790: loss 6.1554, time 121.91ms
iter 302800: loss 5.9140, time 122.96ms
iter 302810: loss 5.7837, time 123.53ms
iter 302820: loss 5.8815, time 122.96ms
iter 302830: loss 6.6082, time 121.86ms
iter 302840: loss 6.3014, time 121.95ms
iter 302850: loss 5.8118, time 121.87ms
iter 302860: loss 5.7026, time 122.05ms
iter 302870: loss 5.8405, time 123.11ms
iter 302880: loss 6.0608, time 122.22ms
iter 302890: loss 5.6735, time 123.12ms
iter 302900: loss 5.9207, time 122.00ms
iter 302910: loss 5.4432, time 122.95ms
iter 302920: loss 5.8458, time 122.28ms
iter 302930: loss 5.7796, time 123.90ms
iter 302940: loss 5.6841, time 122.88ms
iter 302950: loss 6.0947, time 123.16ms
iter 302960: loss 6.3544, time 122.47ms
iter 302970: loss 6.5781, time 124.66ms
iter 302980: loss 5.8786, time 123.04ms
iter 302990: loss 6.4080, time 122.04ms
step 303000: train loss 5.6103, val loss 5.6831
saving checkpoint to out-shakespeare-char
iter 303000: loss 5.6645, time 2894.23ms
iter 303010: loss 5.9947, time 121.78ms
iter 303020: loss 6.4588, time 122.11ms
iter 303030: loss 6.5423, time 123.50ms
iter 303040: loss 6.2886, time 119.73ms
iter 303050: loss 5.7184, time 121.54ms
iter 303060: loss 6.3787, time 121.66ms
iter 303070: loss 6.1932, time 121.94ms
iter 303080: loss 5.8180, time 121.80ms
iter 303090: loss 6.0940, time 121.94ms
iter 303100: loss 6.0368, time 122.05ms
iter 303110: loss 5.9739, time 121.80ms
iter 303120: loss 5.7041, time 121.30ms
iter 303130: loss 6.0419, time 121.81ms
iter 303140: loss 5.9848, time 121.86ms
iter 303150: loss 6.3807, time 123.40ms
iter 303160: loss 5.6021, time 120.90ms
iter 303170: loss 5.4514, time 121.72ms
iter 303180: loss 6.2797, time 121.80ms
iter 303190: loss 5.8914, time 121.95ms
iter 303200: loss 6.5165, time 122.04ms
iter 303210: loss 5.9624, time 122.12ms
iter 303220: loss 6.3757, time 121.85ms
iter 303230: loss 5.6589, time 122.01ms
iter 303240: loss 5.8146, time 121.44ms
step 303250: train loss 5.6370, val loss 5.5699
saving checkpoint to out-shakespeare-char
iter 303250: loss 6.4622, time 2890.03ms
iter 303260: loss 6.2292, time 122.03ms
iter 303270: loss 5.8263, time 121.80ms
iter 303280: loss 5.7330, time 121.96ms
iter 303290: loss 6.6796, time 121.60ms
iter 303300: loss 5.6766, time 122.54ms
iter 303310: loss 6.5341, time 120.52ms
iter 303320: loss 6.1130, time 122.00ms
iter 303330: loss 6.4581, time 124.28ms
iter 303340: loss 5.5042, time 122.39ms
iter 303350: loss 5.8641, time 124.62ms
iter 303360: loss 6.3322, time 121.89ms
iter 303370: loss 6.1154, time 124.10ms
iter 303380: loss 5.6300, time 121.97ms
iter 303390: loss 5.7429, time 124.08ms
iter 303400: loss 6.1932, time 121.95ms
iter 303410: loss 6.4214, time 124.16ms
iter 303420: loss 5.4683, time 123.09ms
iter 303430: loss 5.9384, time 123.03ms
iter 303440: loss 6.2730, time 121.96ms
iter 303450: loss 6.4875, time 123.00ms
iter 303460: loss 5.8801, time 121.87ms
iter 303470: loss 6.0694, time 122.51ms
iter 303480: loss 6.0444, time 122.30ms
iter 303490: loss 6.1059, time 122.23ms
step 303500: train loss 5.5864, val loss 5.5803
saving checkpoint to out-shakespeare-char
iter 303500: loss 5.6296, time 2889.66ms
iter 303510: loss 6.5678, time 122.98ms
iter 303520: loss 6.4491, time 121.92ms
iter 303530: loss 6.2548, time 123.01ms
iter 303540: loss 5.5852, time 121.48ms
iter 303550: loss 5.9880, time 122.95ms
iter 303560: loss 6.2001, time 122.48ms
iter 303570: loss 5.8004, time 123.09ms
iter 303580: loss 5.7104, time 121.94ms
iter 303590: loss 6.4923, time 123.07ms
iter 303600: loss 6.6027, time 123.39ms
iter 303610: loss 6.2910, time 121.88ms
iter 303620: loss 6.3281, time 123.30ms
iter 303630: loss 5.4601, time 121.95ms
iter 303640: loss 6.1456, time 123.00ms
iter 303650: loss 5.9653, time 121.96ms
iter 303660: loss 6.1381, time 123.06ms
iter 303670: loss 5.8053, time 121.96ms
iter 303680: loss 6.0977, time 122.90ms
iter 303690: loss 5.7205, time 122.38ms
iter 303700: loss 5.6374, time 123.12ms
iter 303710: loss 6.1230, time 122.83ms
iter 303720: loss 6.1997, time 122.54ms
iter 303730: loss 5.7794, time 123.36ms
iter 303740: loss 5.6051, time 122.27ms
step 303750: train loss 5.6402, val loss 5.6418
saving checkpoint to out-shakespeare-char
iter 303750: loss 7.2217, time 2897.99ms
iter 303760: loss 5.8461, time 125.88ms
iter 303770: loss 6.0265, time 125.80ms
iter 303780: loss 6.5392, time 125.98ms
iter 303790: loss 6.4733, time 124.89ms
iter 303800: loss 5.6062, time 126.23ms
iter 303810: loss 6.1530, time 126.84ms
iter 303820: loss 6.2988, time 126.16ms
iter 303830: loss 5.3644, time 128.55ms
iter 303840: loss 6.5278, time 125.74ms
iter 303850: loss 5.9495, time 126.40ms
iter 303860: loss 5.9272, time 126.56ms
iter 303870: loss 5.8403, time 125.95ms
iter 303880: loss 5.8119, time 128.59ms
iter 303890: loss 5.9421, time 125.89ms
iter 303900: loss 6.0724, time 125.91ms
iter 303910: loss 6.2771, time 126.26ms
iter 303920: loss 5.6420, time 126.04ms
iter 303930: loss 5.8687, time 125.34ms
iter 303940: loss 6.8253, time 126.13ms
iter 303950: loss 5.9340, time 127.46ms
iter 303960: loss 6.0375, time 126.49ms
iter 303970: loss 6.7154, time 125.87ms
iter 303980: loss 6.0692, time 125.96ms
iter 303990: loss 6.0193, time 126.43ms
step 304000: train loss 5.6307, val loss 5.6496
saving checkpoint to out-shakespeare-char
iter 304000: loss 5.7295, time 2894.19ms
iter 304010: loss 6.1680, time 125.68ms
iter 304020: loss 5.9779, time 126.21ms
iter 304030: loss 6.2504, time 127.76ms
iter 304040: loss 6.1234, time 126.25ms
iter 304050: loss 6.0426, time 126.16ms
iter 304060: loss 5.5899, time 125.79ms
iter 304070: loss 6.0126, time 128.29ms
iter 304080: loss 5.7909, time 125.68ms
iter 304090: loss 5.5936, time 127.21ms
iter 304100: loss 5.6118, time 125.59ms
iter 304110: loss 6.3041, time 125.92ms
iter 304120: loss 6.0300, time 126.71ms
iter 304130: loss 6.2781, time 125.78ms
iter 304140: loss 6.5608, time 126.01ms
iter 304150: loss 5.4395, time 128.34ms
iter 304160: loss 6.0733, time 125.89ms
iter 304170: loss 5.7102, time 126.47ms
iter 304180: loss 6.3031, time 126.32ms
iter 304190: loss 6.5483, time 127.23ms
iter 304200: loss 5.4172, time 126.30ms
iter 304210: loss 6.0155, time 125.68ms
iter 304220: loss 6.4680, time 125.47ms
iter 304230: loss 5.9081, time 125.38ms
iter 304240: loss 5.4867, time 125.83ms
step 304250: train loss 5.6681, val loss 5.6195
saving checkpoint to out-shakespeare-char
iter 304250: loss 5.8110, time 2900.66ms
iter 304260: loss 5.7580, time 126.09ms
iter 304270: loss 5.6025, time 125.35ms
iter 304280: loss 5.7383, time 128.02ms
iter 304290: loss 5.8008, time 125.55ms
iter 304300: loss 6.8764, time 125.73ms
iter 304310: loss 6.1074, time 125.92ms
iter 304320: loss 5.8393, time 128.54ms
iter 304330: loss 5.8744, time 127.19ms
iter 304340: loss 6.1529, time 125.72ms
iter 304350: loss 5.7128, time 125.92ms
iter 304360: loss 5.9378, time 125.80ms
iter 304370: loss 6.2537, time 125.59ms
iter 304380: loss 6.0594, time 126.22ms
iter 304390: loss 6.1414, time 125.95ms
iter 304400: loss 5.6304, time 128.29ms
iter 304410: loss 6.0586, time 125.87ms
iter 304420: loss 5.6508, time 125.82ms
iter 304430: loss 5.8189, time 127.37ms
iter 304440: loss 5.1796, time 125.81ms
iter 304450: loss 6.3822, time 126.13ms
iter 304460: loss 6.1909, time 125.78ms
iter 304470: loss 5.6740, time 125.60ms
iter 304480: loss 5.7277, time 125.90ms
iter 304490: loss 5.8117, time 125.34ms
step 304500: train loss 5.6578, val loss 5.6776
saving checkpoint to out-shakespeare-char
iter 304500: loss 6.1161, time 2870.05ms
iter 304510: loss 6.4546, time 126.31ms
iter 304520: loss 5.4669, time 128.53ms
iter 304530: loss 6.0461, time 125.67ms
iter 304540: loss 5.7742, time 125.71ms
iter 304550: loss 5.6929, time 125.87ms
iter 304560: loss 6.4766, time 125.78ms
iter 304570: loss 5.8077, time 125.71ms
iter 304580: loss 6.1590, time 126.02ms
iter 304590: loss 6.5610, time 126.23ms
iter 304600: loss 5.6161, time 128.02ms
iter 304610: loss 5.9835, time 125.60ms
iter 304620: loss 6.4100, time 125.92ms
iter 304630: loss 6.5116, time 125.66ms
iter 304640: loss 6.1804, time 125.71ms
iter 304650: loss 6.1602, time 126.28ms
iter 304660: loss 6.2807, time 125.57ms
iter 304670: loss 5.2849, time 127.05ms
iter 304680: loss 6.0137, time 125.86ms
iter 304690: loss 6.0817, time 125.75ms
iter 304700: loss 6.9890, time 125.90ms
iter 304710: loss 6.4302, time 125.93ms
iter 304720: loss 6.1220, time 128.11ms
iter 304730: loss 6.2808, time 124.43ms
iter 304740: loss 6.0538, time 126.03ms
step 304750: train loss 5.6584, val loss 5.6933
saving checkpoint to out-shakespeare-char
iter 304750: loss 6.1600, time 2876.49ms
iter 304760: loss 5.8457, time 128.03ms
iter 304770: loss 6.5009, time 125.51ms
iter 304780: loss 5.8116, time 125.58ms
iter 304790: loss 6.1795, time 125.32ms
iter 304800: loss 6.3749, time 127.40ms
iter 304810: loss 6.0865, time 125.65ms
iter 304820: loss 6.2875, time 125.55ms
iter 304830: loss 6.2844, time 125.78ms
iter 304840: loss 5.6010, time 125.97ms
iter 304850: loss 5.6517, time 125.69ms
iter 304860: loss 6.1047, time 125.94ms
iter 304870: loss 5.3051, time 125.57ms
iter 304880: loss 6.6205, time 125.74ms
iter 304890: loss 6.2807, time 125.96ms
iter 304900: loss 6.4444, time 125.63ms
iter 304910: loss 6.5745, time 127.34ms
iter 304920: loss 5.6721, time 125.67ms
iter 304930: loss 6.3137, time 128.32ms
iter 304940: loss 6.1818, time 125.72ms
iter 304950: loss 6.1189, time 125.74ms
iter 304960: loss 6.8973, time 125.39ms
iter 304970: loss 6.1775, time 125.96ms
iter 304980: loss 5.7637, time 125.61ms
iter 304990: loss 5.9683, time 125.84ms
step 305000: train loss 5.6198, val loss 5.6852
saving checkpoint to out-shakespeare-char
iter 305000: loss 6.3142, time 2883.52ms
iter 305010: loss 5.9017, time 125.89ms
iter 305020: loss 5.7180, time 125.66ms
iter 305030: loss 5.7221, time 126.06ms
iter 305040: loss 5.6267, time 128.08ms
iter 305050: loss 6.0831, time 125.85ms
iter 305060: loss 6.2899, time 126.27ms
iter 305070: loss 5.8165, time 127.34ms
iter 305080: loss 6.0611, time 125.89ms
iter 305090: loss 5.4861, time 128.04ms
iter 305100: loss 5.8748, time 125.95ms
iter 305110: loss 5.4394, time 126.41ms
iter 305120: loss 5.9022, time 125.85ms
iter 305130: loss 5.6947, time 125.59ms
iter 305140: loss 5.1914, time 125.47ms
iter 305150: loss 5.6582, time 125.70ms
iter 305160: loss 5.9926, time 125.72ms
iter 305170: loss 5.4522, time 127.09ms
iter 305180: loss 5.9965, time 125.59ms
iter 305190: loss 5.7800, time 125.60ms
iter 305200: loss 5.8089, time 126.18ms
iter 305210: loss 5.8772, time 128.32ms
iter 305220: loss 5.9932, time 125.80ms
iter 305230: loss 6.1059, time 125.81ms
iter 305240: loss 6.6224, time 126.26ms
step 305250: train loss 5.6153, val loss 5.6469
saving checkpoint to out-shakespeare-char
iter 305250: loss 5.8787, time 2895.06ms
iter 305260: loss 6.1931, time 125.78ms
iter 305270: loss 5.2290, time 126.24ms
iter 305280: loss 6.0749, time 126.05ms
iter 305290: loss 5.8089, time 127.95ms
iter 305300: loss 6.2450, time 125.68ms
iter 305310: loss 5.5891, time 127.13ms
iter 305320: loss 5.4776, time 126.30ms
iter 305330: loss 5.8388, time 126.53ms
iter 305340: loss 6.0834, time 125.79ms
iter 305350: loss 6.2038, time 125.38ms
iter 305360: loss 5.4210, time 125.02ms
iter 305370: loss 5.7126, time 125.37ms
iter 305380: loss 6.4741, time 125.31ms
iter 305390: loss 6.0372, time 124.74ms
iter 305400: loss 5.5545, time 125.13ms
iter 305410: loss 6.4075, time 127.32ms
iter 305420: loss 5.6042, time 125.40ms
iter 305430: loss 6.0194, time 124.59ms
iter 305440: loss 6.8407, time 125.40ms
iter 305450: loss 6.8400, time 125.90ms
iter 305460: loss 6.5455, time 125.18ms
iter 305470: loss 5.6267, time 125.76ms
iter 305480: loss 5.7073, time 125.03ms
iter 305490: loss 6.0796, time 125.15ms
step 305500: train loss 5.6549, val loss 5.5963
saving checkpoint to out-shakespeare-char
iter 305500: loss 5.6352, time 2910.37ms
iter 305510: loss 6.1902, time 125.56ms
iter 305520: loss 5.7940, time 125.77ms
iter 305530: loss 6.1007, time 128.20ms
iter 305540: loss 6.2926, time 125.48ms
iter 305550: loss 6.2511, time 125.69ms
iter 305560: loss 6.6428, time 125.64ms
iter 305570: loss 6.2815, time 125.46ms
iter 305580: loss 5.8764, time 125.24ms
iter 305590: loss 5.7606, time 125.50ms
iter 305600: loss 5.7640, time 125.45ms
iter 305610: loss 6.6265, time 125.74ms
iter 305620: loss 6.7976, time 125.39ms
iter 305630: loss 5.8670, time 125.33ms
iter 305640: loss 5.5835, time 125.38ms
iter 305650: loss 5.9108, time 128.13ms
iter 305660: loss 5.7920, time 127.56ms
iter 305670: loss 5.8932, time 125.12ms
iter 305680: loss 6.0890, time 125.35ms
iter 305690: loss 6.5494, time 125.61ms
iter 305700: loss 6.2311, time 125.68ms
iter 305710: loss 5.9064, time 125.04ms
iter 305720: loss 6.4292, time 125.29ms
iter 305730: loss 6.7261, time 125.35ms
iter 305740: loss 6.0472, time 125.15ms
step 305750: train loss 5.6849, val loss 5.6681
saving checkpoint to out-shakespeare-char
iter 305750: loss 6.3424, time 2872.68ms
iter 305760: loss 6.3018, time 126.02ms
iter 305770: loss 6.0116, time 128.21ms
iter 305780: loss 6.9772, time 125.40ms
iter 305790: loss 5.7720, time 125.45ms
iter 305800: loss 5.7150, time 126.60ms
iter 305810: loss 6.3496, time 125.87ms
iter 305820: loss 5.5744, time 125.35ms
iter 305830: loss 6.2452, time 125.41ms
iter 305840: loss 6.8100, time 125.30ms
iter 305850: loss 5.9713, time 125.67ms
iter 305860: loss 6.1768, time 125.33ms
iter 305870: loss 5.8341, time 124.79ms
iter 305880: loss 5.6239, time 125.56ms
iter 305890: loss 5.9977, time 127.65ms
iter 305900: loss 5.9755, time 128.12ms
iter 305910: loss 5.7977, time 125.43ms
iter 305920: loss 5.7867, time 125.93ms
iter 305930: loss 5.6108, time 125.35ms
iter 305940: loss 5.7742, time 124.53ms
iter 305950: loss 5.2417, time 125.45ms
iter 305960: loss 7.0363, time 125.30ms
iter 305970: loss 6.3523, time 125.40ms
iter 305980: loss 6.4737, time 125.50ms
iter 305990: loss 6.2557, time 124.74ms
step 306000: train loss 5.6448, val loss 5.6752
saving checkpoint to out-shakespeare-char
iter 306000: loss 5.9013, time 2866.57ms
iter 306010: loss 6.1132, time 125.07ms
iter 306020: loss 5.5768, time 125.95ms
iter 306030: loss 5.8369, time 127.86ms
iter 306040: loss 5.7979, time 127.52ms
iter 306050: loss 6.4025, time 125.27ms
iter 306060: loss 6.3955, time 125.31ms
iter 306070: loss 6.8180, time 125.52ms
iter 306080: loss 5.9049, time 125.23ms
iter 306090: loss 5.9927, time 125.27ms
iter 306100: loss 6.5433, time 125.47ms
iter 306110: loss 5.2239, time 125.45ms
iter 306120: loss 5.8581, time 125.14ms
iter 306130: loss 5.6875, time 126.80ms
iter 306140: loss 6.1421, time 125.17ms
iter 306150: loss 6.2203, time 125.37ms
iter 306160: loss 5.6961, time 127.54ms
iter 306170: loss 6.3552, time 125.18ms
iter 306180: loss 5.5575, time 125.30ms
iter 306190: loss 6.4863, time 125.54ms
iter 306200: loss 5.8829, time 125.76ms
iter 306210: loss 6.1794, time 125.50ms
iter 306220: loss 6.0557, time 125.61ms
iter 306230: loss 6.8243, time 125.37ms
iter 306240: loss 5.7516, time 127.52ms
step 306250: train loss 5.6809, val loss 5.6501
saving checkpoint to out-shakespeare-char
iter 306250: loss 6.2630, time 2901.57ms
iter 306260: loss 6.1121, time 125.47ms
iter 306270: loss 5.9889, time 125.50ms
iter 306280: loss 6.4256, time 126.53ms
iter 306290: loss 6.0794, time 126.90ms
iter 306300: loss 6.2329, time 124.81ms
iter 306310: loss 6.9743, time 124.93ms
iter 306320: loss 5.7110, time 124.85ms
iter 306330: loss 5.6290, time 125.40ms
iter 306340: loss 6.1701, time 124.99ms
iter 306350: loss 5.6932, time 125.09ms
iter 306360: loss 5.4388, time 124.94ms
iter 306370: loss 6.0009, time 125.19ms
iter 306380: loss 6.6364, time 127.03ms
iter 306390: loss 6.4999, time 125.22ms
iter 306400: loss 6.2879, time 125.21ms
iter 306410: loss 6.5993, time 125.82ms
iter 306420: loss 6.2191, time 124.83ms
iter 306430: loss 6.1965, time 125.51ms
iter 306440: loss 6.5454, time 127.49ms
iter 306450: loss 5.9517, time 125.09ms
iter 306460: loss 6.6663, time 125.17ms
iter 306470: loss 5.9748, time 125.18ms
iter 306480: loss 6.1119, time 126.47ms
iter 306490: loss 6.0219, time 125.01ms
step 306500: train loss 5.6568, val loss 5.6174
saving checkpoint to out-shakespeare-char
iter 306500: loss 6.1542, time 2921.73ms
iter 306510: loss 6.3083, time 125.29ms
iter 306520: loss 6.0568, time 126.86ms
iter 306530: loss 5.8166, time 125.28ms
iter 306540: loss 6.3339, time 127.62ms
iter 306550: loss 5.9934, time 125.30ms
iter 306560: loss 6.2621, time 124.98ms
iter 306570: loss 6.3106, time 125.38ms
iter 306580: loss 5.8972, time 125.04ms
iter 306590: loss 5.8140, time 125.03ms
iter 306600: loss 6.1246, time 125.21ms
iter 306610: loss 6.0314, time 125.09ms
iter 306620: loss 6.0046, time 125.13ms
iter 306630: loss 5.8741, time 125.11ms
iter 306640: loss 6.1926, time 125.19ms
iter 306650: loss 5.9542, time 124.79ms
iter 306660: loss 5.4581, time 127.45ms
iter 306670: loss 5.4701, time 125.25ms
iter 306680: loss 5.9411, time 125.35ms
iter 306690: loss 6.4627, time 125.21ms
iter 306700: loss 5.7958, time 124.90ms
iter 306710: loss 6.5577, time 124.83ms
iter 306720: loss 6.0037, time 126.02ms
iter 306730: loss 6.2983, time 125.16ms
iter 306740: loss 6.4878, time 125.28ms
step 306750: train loss 5.6348, val loss 5.6424
saving checkpoint to out-shakespeare-char
iter 306750: loss 5.7737, time 2905.19ms
iter 306760: loss 5.2155, time 123.25ms
iter 306770: loss 6.1774, time 121.67ms
iter 306780: loss 6.0519, time 121.42ms
iter 306790: loss 5.7495, time 121.96ms
iter 306800: loss 5.6512, time 121.77ms
iter 306810: loss 5.6491, time 121.73ms
iter 306820: loss 6.2176, time 127.30ms
iter 306830: loss 6.1757, time 125.51ms
iter 306840: loss 6.0667, time 125.78ms
iter 306850: loss 6.5330, time 125.61ms
iter 306860: loss 5.6502, time 125.75ms
iter 306870: loss 5.4058, time 125.44ms
iter 306880: loss 6.2215, time 125.66ms
iter 306890: loss 5.9336, time 126.15ms
iter 306900: loss 5.6523, time 128.38ms
iter 306910: loss 6.6451, time 125.80ms
iter 306920: loss 6.0722, time 126.00ms
iter 306930: loss 6.4670, time 124.91ms
iter 306940: loss 5.2755, time 126.06ms
iter 306950: loss 6.0939, time 124.76ms
iter 306960: loss 5.8351, time 126.91ms
iter 306970: loss 5.8436, time 124.13ms
iter 306980: loss 5.9645, time 124.98ms
iter 306990: loss 5.9918, time 124.41ms
step 307000: train loss 5.6559, val loss 5.6228
saving checkpoint to out-shakespeare-char
iter 307000: loss 5.4259, time 2899.14ms
iter 307010: loss 5.7476, time 124.98ms
iter 307020: loss 6.1745, time 126.63ms
iter 307030: loss 6.6291, time 126.22ms
iter 307040: loss 6.5700, time 128.30ms
iter 307050: loss 5.9417, time 126.33ms
iter 307060: loss 6.2241, time 124.99ms
iter 307070: loss 5.9012, time 124.73ms
iter 307080: loss 5.8272, time 125.79ms
iter 307090: loss 5.5690, time 126.03ms
iter 307100: loss 6.4172, time 127.46ms
iter 307110: loss 5.8139, time 124.85ms
iter 307120: loss 6.7969, time 125.54ms
iter 307130: loss 6.4548, time 125.87ms
iter 307140: loss 5.1804, time 125.78ms
iter 307150: loss 5.4457, time 125.36ms
iter 307160: loss 5.4919, time 128.03ms
iter 307170: loss 5.8688, time 125.23ms
iter 307180: loss 6.2092, time 124.86ms
iter 307190: loss 5.8597, time 125.35ms
iter 307200: loss 6.2133, time 126.04ms
iter 307210: loss 5.5021, time 125.40ms
iter 307220: loss 5.9496, time 125.29ms
iter 307230: loss 6.1955, time 125.14ms
iter 307240: loss 5.6866, time 125.54ms
step 307250: train loss 5.6792, val loss 5.6585
saving checkpoint to out-shakespeare-char
iter 307250: loss 5.8228, time 2899.08ms
iter 307260: loss 5.1210, time 126.15ms
iter 307270: loss 6.4696, time 125.47ms
iter 307280: loss 5.8157, time 126.76ms
iter 307290: loss 6.2888, time 124.95ms
iter 307300: loss 5.9489, time 125.61ms
iter 307310: loss 5.9825, time 127.23ms
iter 307320: loss 5.7830, time 127.97ms
iter 307330: loss 5.8161, time 125.88ms
iter 307340: loss 5.8962, time 125.80ms
iter 307350: loss 5.9471, time 125.79ms
iter 307360: loss 6.3055, time 125.02ms
iter 307370: loss 5.9323, time 126.43ms
iter 307380: loss 5.6806, time 125.83ms
iter 307390: loss 5.8228, time 125.61ms
iter 307400: loss 6.8029, time 124.72ms
iter 307410: loss 5.6969, time 125.22ms
iter 307420: loss 6.4249, time 125.45ms
iter 307430: loss 6.2082, time 125.86ms
iter 307440: loss 6.6576, time 127.71ms
iter 307450: loss 5.5233, time 126.64ms
iter 307460: loss 5.9839, time 125.38ms
iter 307470: loss 6.1766, time 125.37ms
iter 307480: loss 6.3813, time 125.83ms
iter 307490: loss 5.7580, time 125.70ms
step 307500: train loss 5.6575, val loss 5.6393
saving checkpoint to out-shakespeare-char
iter 307500: loss 6.7390, time 2879.74ms
iter 307510: loss 6.8132, time 125.57ms
iter 307520: loss 6.1627, time 126.01ms
iter 307530: loss 5.9010, time 126.18ms
iter 307540: loss 5.7841, time 126.23ms
iter 307550: loss 5.9597, time 125.83ms
iter 307560: loss 5.7991, time 125.81ms
iter 307570: loss 5.9656, time 124.95ms
iter 307580: loss 6.2513, time 125.47ms
iter 307590: loss 5.6981, time 125.63ms
iter 307600: loss 5.4280, time 128.26ms
iter 307610: loss 5.4361, time 126.88ms
iter 307620: loss 5.9691, time 125.55ms
iter 307630: loss 5.7029, time 125.56ms
iter 307640: loss 6.2428, time 125.88ms
iter 307650: loss 5.9235, time 125.92ms
iter 307660: loss 6.2951, time 125.63ms
iter 307670: loss 6.1087, time 125.83ms
iter 307680: loss 6.5712, time 125.68ms
iter 307690: loss 5.9269, time 125.62ms
iter 307700: loss 6.3902, time 125.38ms
iter 307710: loss 6.0979, time 125.62ms
iter 307720: loss 5.6757, time 124.76ms
iter 307730: loss 5.0785, time 128.22ms
iter 307740: loss 6.2408, time 124.15ms
step 307750: train loss 5.5963, val loss 5.5678
saving checkpoint to out-shakespeare-char
iter 307750: loss 6.8199, time 2878.86ms
iter 307760: loss 5.2993, time 125.73ms
iter 307770: loss 5.9087, time 125.78ms
iter 307780: loss 6.4230, time 125.75ms
iter 307790: loss 6.2898, time 126.03ms
iter 307800: loss 5.8057, time 125.18ms
iter 307810: loss 6.3087, time 127.96ms
iter 307820: loss 6.6135, time 125.85ms
iter 307830: loss 5.9868, time 126.49ms
iter 307840: loss 5.7131, time 126.23ms
iter 307850: loss 6.0586, time 127.10ms
iter 307860: loss 6.5569, time 125.51ms
iter 307870: loss 6.3797, time 125.47ms
iter 307880: loss 6.5372, time 126.21ms
iter 307890: loss 6.9364, time 125.58ms
iter 307900: loss 5.6659, time 125.83ms
iter 307910: loss 4.8097, time 125.41ms
iter 307920: loss 6.5232, time 125.99ms
iter 307930: loss 6.3727, time 125.46ms
iter 307940: loss 5.6933, time 125.48ms
iter 307950: loss 6.1501, time 126.29ms
iter 307960: loss 5.3803, time 125.75ms
iter 307970: loss 6.0558, time 125.95ms
iter 307980: loss 6.1138, time 128.00ms
iter 307990: loss 6.0408, time 125.90ms
step 308000: train loss 5.5994, val loss 5.6347
saving checkpoint to out-shakespeare-char
iter 308000: loss 6.2095, time 2907.85ms
iter 308010: loss 6.5908, time 121.13ms
iter 308020: loss 6.3564, time 121.12ms
iter 308030: loss 5.9606, time 121.95ms
iter 308040: loss 6.2010, time 122.13ms
iter 308050: loss 5.9928, time 121.81ms
iter 308060: loss 6.1412, time 121.78ms
iter 308070: loss 6.9164, time 122.01ms
iter 308080: loss 6.2834, time 121.82ms
iter 308090: loss 6.0591, time 122.02ms
iter 308100: loss 6.6468, time 121.73ms
iter 308110: loss 5.4523, time 123.09ms
iter 308120: loss 5.2927, time 122.00ms
iter 308130: loss 5.7292, time 122.58ms
iter 308140: loss 5.1726, time 121.80ms
iter 308150: loss 6.7872, time 121.92ms
iter 308160: loss 5.8235, time 121.96ms
iter 308170: loss 6.5361, time 121.95ms
iter 308180: loss 5.6318, time 121.89ms
iter 308190: loss 6.0021, time 122.02ms
iter 308200: loss 6.7264, time 121.22ms
iter 308210: loss 6.6415, time 123.09ms
iter 308220: loss 5.8349, time 123.26ms
iter 308230: loss 5.9755, time 123.23ms
iter 308240: loss 6.2189, time 121.97ms
step 308250: train loss 5.5824, val loss 5.6247
saving checkpoint to out-shakespeare-char
iter 308250: loss 6.1707, time 2893.67ms
iter 308260: loss 5.9913, time 119.09ms
iter 308270: loss 6.3973, time 122.07ms
iter 308280: loss 6.1789, time 121.56ms
iter 308290: loss 6.0622, time 121.14ms
iter 308300: loss 6.6423, time 122.91ms
iter 308310: loss 6.2445, time 121.01ms
iter 308320: loss 6.1976, time 123.14ms
iter 308330: loss 5.7800, time 121.92ms
iter 308340: loss 6.9418, time 123.84ms
iter 308350: loss 6.4247, time 121.69ms
iter 308360: loss 5.3958, time 124.02ms
iter 308370: loss 6.1849, time 122.02ms
iter 308380: loss 5.8498, time 122.11ms
iter 308390: loss 5.8698, time 121.99ms
iter 308400: loss 5.5419, time 121.78ms
iter 308410: loss 6.1894, time 121.70ms
iter 308420: loss 5.7806, time 121.53ms
iter 308430: loss 6.1661, time 121.60ms
iter 308440: loss 6.9135, time 121.47ms
iter 308450: loss 6.7716, time 121.67ms
iter 308460: loss 5.7614, time 121.37ms
iter 308470: loss 5.9372, time 121.50ms
iter 308480: loss 6.1347, time 121.40ms
iter 308490: loss 5.9894, time 123.03ms
step 308500: train loss 5.6961, val loss 5.6105
saving checkpoint to out-shakespeare-char
iter 308500: loss 6.5905, time 2879.72ms
iter 308510: loss 5.6764, time 122.18ms
iter 308520: loss 6.3500, time 121.44ms
iter 308530: loss 6.0595, time 122.97ms
iter 308540: loss 6.7260, time 122.97ms
iter 308550: loss 6.0681, time 123.09ms
iter 308560: loss 6.3301, time 123.00ms
iter 308570: loss 5.6862, time 121.71ms
iter 308580: loss 6.4125, time 124.05ms
iter 308590: loss 5.3565, time 122.33ms
iter 308600: loss 5.8370, time 123.25ms
iter 308610: loss 6.3194, time 121.03ms
iter 308620: loss 5.6729, time 123.16ms
iter 308630: loss 5.7246, time 121.98ms
iter 308640: loss 6.1528, time 122.97ms
iter 308650: loss 7.2008, time 123.14ms
iter 308660: loss 5.6420, time 121.82ms
iter 308670: loss 6.0382, time 122.95ms
iter 308680: loss 6.6455, time 121.83ms
iter 308690: loss 5.3977, time 122.85ms
iter 308700: loss 5.7113, time 122.00ms
iter 308710: loss 5.8698, time 123.21ms
iter 308720: loss 5.6962, time 121.51ms
iter 308730: loss 6.5833, time 122.86ms
iter 308740: loss 5.5655, time 122.02ms
step 308750: train loss 5.6936, val loss 5.6146
saving checkpoint to out-shakespeare-char
iter 308750: loss 5.6603, time 2893.06ms
iter 308760: loss 5.6945, time 121.36ms
iter 308770: loss 6.5676, time 124.24ms
iter 308780: loss 6.0958, time 121.84ms
iter 308790: loss 6.0872, time 123.12ms
iter 308800: loss 5.4148, time 122.95ms
iter 308810: loss 5.8233, time 123.34ms
iter 308820: loss 6.1041, time 121.80ms
iter 308830: loss 6.1539, time 123.76ms
iter 308840: loss 6.5667, time 122.15ms
iter 308850: loss 5.4560, time 122.45ms
iter 308860: loss 5.6778, time 120.77ms
iter 308870: loss 5.4605, time 122.63ms
iter 308880: loss 5.9294, time 121.41ms
iter 308890: loss 6.7816, time 122.83ms
iter 308900: loss 6.1380, time 123.10ms
iter 308910: loss 6.3625, time 122.50ms
iter 308920: loss 5.5331, time 121.46ms
iter 308930: loss 5.3227, time 121.33ms
iter 308940: loss 5.8411, time 121.42ms
iter 308950: loss 5.8126, time 121.55ms
iter 308960: loss 5.4373, time 121.47ms
iter 308970: loss 6.7254, time 121.48ms
iter 308980: loss 6.5043, time 121.59ms
iter 308990: loss 6.3845, time 121.58ms
step 309000: train loss 5.6122, val loss 5.6475
saving checkpoint to out-shakespeare-char
iter 309000: loss 6.2596, time 2885.86ms
iter 309010: loss 5.7386, time 121.41ms
iter 309020: loss 5.4254, time 121.57ms
iter 309030: loss 6.1094, time 126.14ms
iter 309040: loss 6.3431, time 128.53ms
iter 309050: loss 5.9141, time 126.15ms
iter 309060: loss 6.0394, time 125.76ms
iter 309070: loss 5.8657, time 125.92ms
iter 309080: loss 6.5802, time 125.72ms
iter 309090: loss 5.5940, time 125.97ms
iter 309100: loss 5.4759, time 125.25ms
iter 309110: loss 5.8300, time 125.76ms
iter 309120: loss 6.5020, time 125.79ms
iter 309130: loss 5.3269, time 125.90ms
iter 309140: loss 6.4280, time 125.80ms
iter 309150: loss 5.4434, time 127.34ms
iter 309160: loss 5.3763, time 125.36ms
iter 309170: loss 5.8819, time 125.05ms
iter 309180: loss 5.8266, time 124.80ms
iter 309190: loss 6.2156, time 125.20ms
iter 309200: loss 6.0799, time 125.14ms
iter 309210: loss 6.3691, time 127.56ms
iter 309220: loss 5.5783, time 124.86ms
iter 309230: loss 6.8977, time 124.71ms
iter 309240: loss 6.1490, time 125.03ms
step 309250: train loss 5.6540, val loss 5.5913
saving checkpoint to out-shakespeare-char
iter 309250: loss 5.7559, time 2886.35ms
iter 309260: loss 6.5171, time 125.77ms
iter 309270: loss 5.8395, time 125.52ms
iter 309280: loss 6.1696, time 125.68ms
iter 309290: loss 5.3591, time 124.92ms
iter 309300: loss 6.1586, time 125.17ms
iter 309310: loss 5.9296, time 125.72ms
iter 309320: loss 6.2852, time 125.03ms
iter 309330: loss 5.6921, time 126.01ms
iter 309340: loss 5.5813, time 128.03ms
iter 309350: loss 5.9880, time 125.42ms
iter 309360: loss 6.1935, time 125.41ms
iter 309370: loss 6.4185, time 125.44ms
iter 309380: loss 6.6764, time 125.64ms
iter 309390: loss 5.4604, time 127.02ms
iter 309400: loss 6.3602, time 125.54ms
iter 309410: loss 6.0075, time 125.63ms
iter 309420: loss 6.1765, time 126.02ms
iter 309430: loss 6.8920, time 125.47ms
iter 309440: loss 6.4366, time 125.43ms
iter 309450: loss 6.1107, time 125.67ms
iter 309460: loss 6.4504, time 123.23ms
iter 309470: loss 6.1936, time 122.04ms
iter 309480: loss 6.1472, time 123.41ms
iter 309490: loss 5.7142, time 122.56ms
step 309500: train loss 5.5983, val loss 5.6272
saving checkpoint to out-shakespeare-char
iter 309500: loss 5.8850, time 2908.73ms
iter 309510: loss 5.7140, time 121.80ms
iter 309520: loss 6.2073, time 121.83ms
iter 309530: loss 5.9121, time 121.70ms
iter 309540: loss 5.6789, time 123.30ms
iter 309550: loss 4.8477, time 121.59ms
iter 309560: loss 6.1465, time 122.86ms
iter 309570: loss 5.7729, time 122.12ms
iter 309580: loss 6.0286, time 122.70ms
iter 309590: loss 6.0446, time 121.64ms
iter 309600: loss 6.1753, time 121.79ms
iter 309610: loss 6.0753, time 121.73ms
iter 309620: loss 6.8910, time 122.85ms
iter 309630: loss 6.4130, time 121.74ms
iter 309640: loss 6.2932, time 126.72ms
iter 309650: loss 6.1895, time 126.60ms
iter 309660: loss 5.9835, time 124.95ms
iter 309670: loss 5.4879, time 124.69ms
iter 309680: loss 6.1671, time 126.80ms
iter 309690: loss 6.2851, time 127.44ms
iter 309700: loss 5.9816, time 125.20ms
iter 309710: loss 5.9596, time 125.23ms
iter 309720: loss 6.1248, time 125.21ms
iter 309730: loss 6.5500, time 124.87ms
iter 309740: loss 5.8552, time 125.36ms
step 309750: train loss 5.6650, val loss 5.6515
saving checkpoint to out-shakespeare-char
iter 309750: loss 5.9948, time 2869.20ms
iter 309760: loss 5.3271, time 125.14ms
iter 309770: loss 4.9113, time 125.36ms
iter 309780: loss 6.0980, time 124.84ms
iter 309790: loss 6.2423, time 128.35ms
iter 309800: loss 6.5580, time 125.75ms
iter 309810: loss 6.2150, time 127.89ms
iter 309820: loss 5.6779, time 124.88ms
iter 309830: loss 5.4555, time 125.90ms
iter 309840: loss 5.6956, time 123.83ms
iter 309850: loss 6.0777, time 125.68ms
iter 309860: loss 6.2116, time 125.76ms
iter 309870: loss 6.0434, time 125.69ms
iter 309880: loss 5.3451, time 126.10ms
iter 309890: loss 5.7113, time 126.99ms
iter 309900: loss 6.3767, time 125.64ms
iter 309910: loss 6.2341, time 125.54ms
iter 309920: loss 5.8363, time 125.29ms
iter 309930: loss 5.5308, time 125.59ms
iter 309940: loss 5.3216, time 124.22ms
iter 309950: loss 6.4070, time 125.30ms
iter 309960: loss 6.0266, time 125.56ms
iter 309970: loss 5.8098, time 125.49ms
iter 309980: loss 5.8367, time 124.43ms
iter 309990: loss 6.5183, time 126.02ms
step 310000: train loss 5.6117, val loss 5.6270
saving checkpoint to out-shakespeare-char
iter 310000: loss 5.9694, time 2893.23ms
iter 310010: loss 5.9171, time 121.63ms
iter 310020: loss 5.7470, time 122.22ms
iter 310030: loss 6.6316, time 120.30ms
iter 310040: loss 6.5182, time 121.52ms
iter 310050: loss 6.2066, time 121.67ms
iter 310060: loss 5.6346, time 121.67ms
iter 310070: loss 6.2772, time 121.72ms
iter 310080: loss 5.8228, time 121.66ms
iter 310090: loss 5.2111, time 121.91ms
iter 310100: loss 6.9242, time 120.62ms
iter 310110: loss 6.2221, time 122.07ms
iter 310120: loss 5.9317, time 121.70ms
iter 310130: loss 6.0645, time 121.51ms
iter 310140: loss 6.0577, time 122.74ms
iter 310150: loss 6.3979, time 123.22ms
iter 310160: loss 6.2319, time 121.56ms
iter 310170: loss 6.1372, time 121.91ms
iter 310180: loss 6.1095, time 122.27ms
iter 310190: loss 5.9524, time 121.89ms
iter 310200: loss 6.2486, time 121.37ms
iter 310210: loss 6.6911, time 121.95ms
iter 310220: loss 5.8443, time 121.71ms
iter 310230: loss 5.6687, time 121.55ms
iter 310240: loss 5.4588, time 121.81ms
step 310250: train loss 5.6671, val loss 5.6533
saving checkpoint to out-shakespeare-char
iter 310250: loss 6.2283, time 2896.95ms
iter 310260: loss 6.4643, time 121.77ms
iter 310270: loss 6.5346, time 121.78ms
iter 310280: loss 6.2345, time 121.92ms
iter 310290: loss 5.9210, time 121.89ms
iter 310300: loss 6.5884, time 123.06ms
iter 310310: loss 5.9474, time 121.66ms
iter 310320: loss 6.0804, time 121.72ms
iter 310330: loss 5.5596, time 121.81ms
iter 310340: loss 5.4075, time 121.75ms
iter 310350: loss 5.5462, time 121.64ms
iter 310360: loss 5.4605, time 121.99ms
iter 310370: loss 6.9060, time 122.11ms
iter 310380: loss 6.1381, time 122.00ms
iter 310390: loss 5.0968, time 121.77ms
iter 310400: loss 5.9632, time 121.81ms
iter 310410: loss 6.0631, time 122.80ms
iter 310420: loss 5.2329, time 121.79ms
iter 310430: loss 5.7526, time 121.88ms
iter 310440: loss 6.3677, time 121.66ms
iter 310450: loss 6.1217, time 121.66ms
iter 310460: loss 5.5791, time 122.12ms
iter 310470: loss 6.1423, time 121.77ms
iter 310480: loss 5.4432, time 121.73ms
iter 310490: loss 5.9054, time 121.70ms
step 310500: train loss 5.6310, val loss 5.6616
saving checkpoint to out-shakespeare-char
iter 310500: loss 6.6592, time 2897.74ms
iter 310510: loss 5.9160, time 123.34ms
iter 310520: loss 5.7584, time 121.92ms
iter 310530: loss 6.3067, time 123.93ms
iter 310540: loss 6.0864, time 121.96ms
iter 310550: loss 6.3572, time 124.44ms
iter 310560: loss 6.6162, time 121.68ms
iter 310570: loss 6.0290, time 122.66ms
iter 310580: loss 5.9313, time 121.89ms
iter 310590: loss 5.6609, time 120.63ms
iter 310600: loss 5.8077, time 121.63ms
iter 310610: loss 5.7258, time 121.72ms
iter 310620: loss 5.8865, time 121.88ms
iter 310630: loss 5.9633, time 121.88ms
iter 310640: loss 5.6191, time 121.84ms
iter 310650: loss 6.0922, time 121.63ms
iter 310660: loss 5.6078, time 121.39ms
iter 310670: loss 5.0831, time 121.70ms
iter 310680: loss 6.1606, time 122.87ms
iter 310690: loss 6.1179, time 121.74ms
iter 310700: loss 5.7268, time 122.95ms
iter 310710: loss 6.4004, time 120.75ms
iter 310720: loss 6.3625, time 122.88ms
iter 310730: loss 5.3653, time 121.67ms
iter 310740: loss 6.2396, time 123.14ms
step 310750: train loss 5.6285, val loss 5.6344
saving checkpoint to out-shakespeare-char
iter 310750: loss 5.7952, time 2897.04ms
iter 310760: loss 6.0535, time 121.69ms
iter 310770: loss 6.2369, time 121.51ms
iter 310780: loss 5.8562, time 121.71ms
iter 310790: loss 6.4356, time 121.61ms
iter 310800: loss 6.3222, time 121.46ms
iter 310810: loss 6.5804, time 121.48ms
iter 310820: loss 6.1969, time 121.64ms
iter 310830: loss 6.1400, time 121.70ms
iter 310840: loss 6.1007, time 122.57ms
iter 310850: loss 5.4061, time 123.00ms
iter 310860: loss 6.4745, time 121.58ms
iter 310870: loss 6.0899, time 122.96ms
iter 310880: loss 5.5529, time 121.69ms
iter 310890: loss 5.0337, time 122.89ms
iter 310900: loss 6.0270, time 121.97ms
iter 310910: loss 6.4270, time 122.85ms
iter 310920: loss 6.0459, time 121.96ms
iter 310930: loss 5.8581, time 123.15ms
iter 310940: loss 6.1859, time 121.82ms
iter 310950: loss 5.8858, time 123.41ms
iter 310960: loss 6.5401, time 124.13ms
iter 310970: loss 6.2273, time 121.88ms
iter 310980: loss 5.6593, time 123.88ms
iter 310990: loss 6.2244, time 121.79ms
step 311000: train loss 5.6407, val loss 5.6281
saving checkpoint to out-shakespeare-char
iter 311000: loss 6.0024, time 2899.80ms
iter 311010: loss 5.8444, time 122.04ms
iter 311020: loss 5.3431, time 124.01ms
iter 311030: loss 5.4730, time 122.59ms
iter 311040: loss 6.7474, time 123.87ms
iter 311050: loss 6.4863, time 121.67ms
iter 311060: loss 5.7308, time 123.87ms
iter 311070: loss 5.7497, time 121.95ms
iter 311080: loss 6.0599, time 124.05ms
iter 311090: loss 6.3257, time 121.74ms
iter 311100: loss 6.3175, time 123.92ms
iter 311110: loss 6.5183, time 123.03ms
iter 311120: loss 5.4755, time 122.16ms
iter 311130: loss 5.9632, time 122.44ms
iter 311140: loss 5.5037, time 121.70ms
iter 311150: loss 6.4121, time 122.82ms
iter 311160: loss 6.7890, time 121.83ms
iter 311170: loss 6.0412, time 122.91ms
iter 311180: loss 5.9968, time 122.36ms
iter 311190: loss 6.2450, time 123.05ms
iter 311200: loss 6.1609, time 122.79ms
iter 311210: loss 6.1703, time 122.54ms
iter 311220: loss 6.1203, time 122.79ms
iter 311230: loss 5.3463, time 121.70ms
iter 311240: loss 5.4114, time 121.78ms
step 311250: train loss 5.6397, val loss 5.5956
saving checkpoint to out-shakespeare-char
iter 311250: loss 7.0839, time 2879.67ms
iter 311260: loss 6.3904, time 122.22ms
iter 311270: loss 5.7554, time 121.97ms
iter 311280: loss 6.3481, time 122.35ms
iter 311290: loss 6.2206, time 121.12ms
iter 311300: loss 6.5077, time 123.06ms
iter 311310: loss 6.1563, time 121.94ms
iter 311320: loss 6.1415, time 122.72ms
iter 311330: loss 6.7443, time 121.19ms
iter 311340: loss 6.0245, time 122.26ms
iter 311350: loss 5.9413, time 121.13ms
iter 311360: loss 6.3160, time 122.94ms
iter 311370: loss 6.1531, time 122.76ms
iter 311380: loss 5.6071, time 121.71ms
iter 311390: loss 5.8184, time 121.33ms
iter 311400: loss 5.6084, time 120.78ms
iter 311410: loss 5.3089, time 121.22ms
iter 311420: loss 5.8025, time 121.69ms
iter 311430: loss 6.3168, time 121.89ms
iter 311440: loss 6.5050, time 121.81ms
iter 311450: loss 6.5037, time 121.55ms
iter 311460: loss 6.1178, time 120.78ms
iter 311470: loss 5.9158, time 123.73ms
iter 311480: loss 6.1496, time 127.55ms
iter 311490: loss 6.2424, time 126.22ms
step 311500: train loss 5.6633, val loss 5.6425
saving checkpoint to out-shakespeare-char
iter 311500: loss 5.4080, time 2896.78ms
iter 311510: loss 6.5029, time 128.47ms
iter 311520: loss 5.9821, time 125.28ms
iter 311530: loss 6.3321, time 125.84ms
iter 311540: loss 6.0026, time 125.83ms
iter 311550: loss 6.3224, time 126.29ms
iter 311560: loss 6.3438, time 127.42ms
iter 311570: loss 6.6455, time 121.85ms
iter 311580: loss 6.0645, time 123.05ms
iter 311590: loss 6.2515, time 122.46ms
iter 311600: loss 6.1953, time 122.83ms
iter 311610: loss 5.5165, time 121.57ms
iter 311620: loss 6.0982, time 122.98ms
iter 311630: loss 5.9215, time 123.07ms
iter 311640: loss 5.9557, time 121.43ms
iter 311650: loss 5.9191, time 123.38ms
iter 311660: loss 5.8946, time 121.42ms
iter 311670: loss 6.3460, time 123.33ms
iter 311680: loss 5.5495, time 121.77ms
iter 311690: loss 4.7877, time 123.22ms
iter 311700: loss 7.0620, time 121.91ms
iter 311710: loss 5.8694, time 122.15ms
iter 311720: loss 5.9298, time 121.60ms
iter 311730: loss 6.1885, time 123.14ms
iter 311740: loss 6.0375, time 120.55ms
step 311750: train loss 5.6521, val loss 5.6523
saving checkpoint to out-shakespeare-char
iter 311750: loss 5.9844, time 2883.17ms
iter 311760: loss 7.1326, time 122.44ms
iter 311770: loss 6.0749, time 120.90ms
iter 311780: loss 5.7256, time 121.00ms
iter 311790: loss 5.8840, time 123.08ms
iter 311800: loss 6.0784, time 121.89ms
iter 311810: loss 6.4257, time 122.94ms
iter 311820: loss 5.7122, time 121.02ms
iter 311830: loss 6.0573, time 121.85ms
iter 311840: loss 6.2315, time 121.19ms
iter 311850: loss 5.2885, time 122.89ms
iter 311860: loss 6.2621, time 121.73ms
iter 311870: loss 5.7685, time 122.93ms
iter 311880: loss 6.1963, time 121.67ms
iter 311890: loss 5.6289, time 122.11ms
iter 311900: loss 6.0371, time 122.32ms
iter 311910: loss 6.3281, time 121.76ms
iter 311920: loss 5.9198, time 121.70ms
iter 311930: loss 6.1363, time 122.17ms
iter 311940: loss 5.9068, time 121.32ms
iter 311950: loss 5.8838, time 120.87ms
iter 311960: loss 6.2350, time 121.01ms
iter 311970: loss 5.6727, time 121.72ms
iter 311980: loss 6.3741, time 121.82ms
iter 311990: loss 6.0480, time 121.71ms
step 312000: train loss 5.6229, val loss 5.6562
saving checkpoint to out-shakespeare-char
iter 312000: loss 7.1050, time 2894.44ms
iter 312010: loss 5.5891, time 121.81ms
iter 312020: loss 6.0524, time 121.58ms
iter 312030: loss 5.0641, time 121.19ms
iter 312040: loss 5.6226, time 121.94ms
iter 312050: loss 5.8088, time 121.79ms
iter 312060: loss 5.8144, time 122.72ms
iter 312070: loss 6.6199, time 121.42ms
iter 312080: loss 5.2607, time 121.75ms
iter 312090: loss 6.5103, time 122.90ms
iter 312100: loss 5.6357, time 122.19ms
iter 312110: loss 5.9763, time 122.88ms
iter 312120: loss 6.4723, time 121.93ms
iter 312130: loss 5.8483, time 122.36ms
iter 312140: loss 5.4345, time 121.69ms
iter 312150: loss 6.3834, time 122.04ms
iter 312160: loss 6.2662, time 121.84ms
iter 312170: loss 5.7035, time 122.29ms
iter 312180: loss 6.5779, time 121.94ms
iter 312190: loss 6.2779, time 123.54ms
iter 312200: loss 6.6308, time 122.10ms
iter 312210: loss 6.2506, time 123.23ms
iter 312220: loss 6.0796, time 122.25ms
iter 312230: loss 6.4996, time 123.21ms
iter 312240: loss 6.2922, time 121.32ms
step 312250: train loss 5.6307, val loss 5.6281
saving checkpoint to out-shakespeare-char
iter 312250: loss 5.8692, time 2887.45ms
iter 312260: loss 5.3934, time 121.71ms
iter 312270: loss 5.5216, time 121.88ms
iter 312280: loss 5.7881, time 120.82ms
iter 312290: loss 5.9843, time 122.00ms
iter 312300: loss 6.1788, time 121.26ms
iter 312310: loss 6.6819, time 122.92ms
iter 312320: loss 6.0151, time 122.19ms
iter 312330: loss 6.2623, time 122.22ms
iter 312340: loss 6.2773, time 121.97ms
iter 312350: loss 6.3537, time 120.97ms
iter 312360: loss 6.2249, time 122.89ms
iter 312370: loss 6.5167, time 121.76ms
iter 312380: loss 5.6150, time 122.65ms
iter 312390: loss 5.3932, time 121.43ms
iter 312400: loss 6.5928, time 121.86ms
iter 312410: loss 6.0606, time 121.13ms
iter 312420: loss 5.9454, time 122.90ms
iter 312430: loss 6.0880, time 121.65ms
iter 312440: loss 6.1181, time 123.28ms
iter 312450: loss 5.9987, time 123.79ms
iter 312460: loss 5.5857, time 121.53ms
iter 312470: loss 6.0435, time 123.83ms
iter 312480: loss 5.9683, time 121.78ms
iter 312490: loss 6.0364, time 123.79ms
step 312500: train loss 5.6230, val loss 5.6449
saving checkpoint to out-shakespeare-char
iter 312500: loss 5.7765, time 2903.27ms
iter 312510: loss 5.1972, time 123.32ms
iter 312520: loss 5.9508, time 121.44ms
iter 312530: loss 5.9746, time 122.97ms
iter 312540: loss 5.9790, time 122.03ms
iter 312550: loss 5.7886, time 123.54ms
iter 312560: loss 5.9249, time 122.03ms
iter 312570: loss 5.9865, time 122.51ms
iter 312580: loss 5.5652, time 121.89ms
iter 312590: loss 6.4238, time 123.37ms
iter 312600: loss 5.7268, time 123.48ms
iter 312610: loss 6.0050, time 122.51ms
iter 312620: loss 5.8555, time 123.07ms
iter 312630: loss 6.4998, time 121.95ms
iter 312640: loss 5.7226, time 123.10ms
iter 312650: loss 6.4942, time 122.04ms
iter 312660: loss 5.3416, time 123.48ms
iter 312670: loss 6.6169, time 121.53ms
iter 312680: loss 6.0982, time 123.21ms
iter 312690: loss 6.0192, time 126.32ms
iter 312700: loss 6.1495, time 125.84ms
iter 312710: loss 6.3758, time 125.94ms
iter 312720: loss 5.5500, time 128.75ms
iter 312730: loss 5.8161, time 125.90ms
iter 312740: loss 5.2809, time 126.04ms
step 312750: train loss 5.6466, val loss 5.6182
saving checkpoint to out-shakespeare-char
iter 312750: loss 5.7728, time 2887.44ms
iter 312760: loss 6.2243, time 125.44ms
iter 312770: loss 6.0933, time 124.75ms
iter 312780: loss 5.1684, time 125.19ms
iter 312790: loss 5.9753, time 124.47ms
iter 312800: loss 5.8068, time 124.83ms
iter 312810: loss 6.0685, time 124.34ms
iter 312820: loss 5.8261, time 123.96ms
iter 312830: loss 6.6378, time 125.33ms
iter 312840: loss 5.5456, time 124.78ms
iter 312850: loss 5.2320, time 125.75ms
iter 312860: loss 6.0431, time 125.01ms
iter 312870: loss 5.3837, time 125.04ms
iter 312880: loss 5.9052, time 127.04ms
iter 312890: loss 6.0934, time 125.50ms
iter 312900: loss 6.1283, time 127.69ms
iter 312910: loss 5.9972, time 124.99ms
iter 312920: loss 6.2370, time 124.42ms
iter 312930: loss 5.3485, time 124.69ms
iter 312940: loss 6.0447, time 125.07ms
iter 312950: loss 5.3977, time 126.49ms
iter 312960: loss 6.1681, time 125.02ms
iter 312970: loss 5.5714, time 124.62ms
iter 312980: loss 5.7500, time 125.15ms
iter 312990: loss 5.4020, time 125.24ms
step 313000: train loss 5.6471, val loss 5.6249
saving checkpoint to out-shakespeare-char
iter 313000: loss 6.6571, time 2903.53ms
iter 313010: loss 6.1729, time 125.22ms
iter 313020: loss 6.0745, time 125.54ms
iter 313030: loss 5.8609, time 125.56ms
iter 313040: loss 5.8278, time 125.64ms
iter 313050: loss 6.7165, time 126.51ms
iter 313060: loss 6.4940, time 125.49ms
iter 313070: loss 5.5341, time 125.54ms
iter 313080: loss 6.0126, time 125.42ms
iter 313090: loss 6.3968, time 127.18ms
iter 313100: loss 6.1139, time 125.46ms
iter 313110: loss 5.8520, time 127.91ms
iter 313120: loss 6.4953, time 125.45ms
iter 313130: loss 5.6870, time 125.47ms
iter 313140: loss 5.9793, time 125.38ms
iter 313150: loss 5.9401, time 125.55ms
iter 313160: loss 6.0648, time 125.41ms
iter 313170: loss 6.0533, time 125.34ms
iter 313180: loss 5.7960, time 126.22ms
iter 313190: loss 5.6278, time 127.13ms
iter 313200: loss 6.3434, time 125.66ms
iter 313210: loss 6.5007, time 125.50ms
iter 313220: loss 5.4132, time 125.78ms
iter 313230: loss 6.1885, time 127.57ms
iter 313240: loss 6.0800, time 125.52ms
step 313250: train loss 5.5741, val loss 5.6634
saving checkpoint to out-shakespeare-char
iter 313250: loss 5.9217, time 2900.35ms
iter 313260: loss 6.0605, time 125.54ms
iter 313270: loss 5.9065, time 126.16ms
iter 313280: loss 5.7223, time 125.69ms
iter 313290: loss 7.2401, time 125.64ms
iter 313300: loss 6.0977, time 125.65ms
iter 313310: loss 5.6039, time 121.98ms
iter 313320: loss 5.5111, time 121.64ms
iter 313330: loss 5.6501, time 121.43ms
iter 313340: loss 6.5194, time 122.86ms
iter 313350: loss 6.1696, time 121.71ms
iter 313360: loss 5.8149, time 121.45ms
iter 313370: loss 6.8479, time 121.68ms
iter 313380: loss 6.4754, time 121.60ms
iter 313390: loss 5.8861, time 121.61ms
iter 313400: loss 6.3362, time 121.99ms
iter 313410: loss 5.7802, time 121.66ms
iter 313420: loss 6.0498, time 121.52ms
iter 313430: loss 6.1163, time 121.57ms
iter 313440: loss 5.5483, time 121.70ms
iter 313450: loss 5.6951, time 119.92ms
iter 313460: loss 5.8813, time 121.72ms
iter 313470: loss 6.9195, time 121.57ms
iter 313480: loss 5.8828, time 121.63ms
iter 313490: loss 6.1425, time 121.70ms
step 313500: train loss 5.6086, val loss 5.6534
saving checkpoint to out-shakespeare-char
iter 313500: loss 6.0919, time 2892.26ms
iter 313510: loss 5.9940, time 121.98ms
iter 313520: loss 5.2953, time 121.75ms
iter 313530: loss 6.3934, time 121.51ms
iter 313540: loss 5.8614, time 122.00ms
iter 313550: loss 5.8845, time 121.59ms
iter 313560: loss 5.8996, time 121.90ms
iter 313570: loss 5.7079, time 121.59ms
iter 313580: loss 5.3658, time 121.64ms
iter 313590: loss 5.6083, time 121.51ms
iter 313600: loss 6.0627, time 121.47ms
iter 313610: loss 6.1618, time 121.88ms
iter 313620: loss 5.6115, time 122.67ms
iter 313630: loss 6.0240, time 121.67ms
iter 313640: loss 6.6139, time 122.83ms
iter 313650: loss 6.2872, time 121.72ms
iter 313660: loss 6.3312, time 122.48ms
iter 313670: loss 6.2106, time 121.88ms
iter 313680: loss 6.1656, time 123.37ms
iter 313690: loss 6.2503, time 121.27ms
iter 313700: loss 5.7284, time 122.86ms
iter 313710: loss 5.9461, time 121.60ms
iter 313720: loss 5.7903, time 123.07ms
iter 313730: loss 5.7382, time 121.77ms
iter 313740: loss 5.7241, time 121.49ms
step 313750: train loss 5.6278, val loss 5.6758
saving checkpoint to out-shakespeare-char
iter 313750: loss 4.6605, time 2911.19ms
iter 313760: loss 5.9406, time 124.85ms
iter 313770: loss 6.3176, time 126.53ms
iter 313780: loss 6.1071, time 124.48ms
iter 313790: loss 5.6942, time 126.75ms
iter 313800: loss 6.7578, time 126.49ms
iter 313810: loss 5.4228, time 124.75ms
iter 313820: loss 6.1057, time 124.32ms
iter 313830: loss 6.1716, time 125.25ms
iter 313840: loss 5.6652, time 124.21ms
iter 313850: loss 5.9944, time 124.95ms
iter 313860: loss 6.2848, time 124.82ms
iter 313870: loss 6.6776, time 126.66ms
iter 313880: loss 6.1158, time 124.37ms
iter 313890: loss 5.8134, time 127.74ms
iter 313900: loss 6.6587, time 124.24ms
iter 313910: loss 6.4285, time 124.73ms
iter 313920: loss 6.8945, time 125.05ms
iter 313930: loss 6.1127, time 125.57ms
iter 313940: loss 5.7625, time 125.23ms
iter 313950: loss 6.4178, time 125.74ms
iter 313960: loss 6.2750, time 125.66ms
iter 313970: loss 5.7432, time 126.76ms
iter 313980: loss 6.0148, time 126.13ms
iter 313990: loss 6.0412, time 125.68ms
step 314000: train loss 5.6024, val loss 5.6275
saving checkpoint to out-shakespeare-char
iter 314000: loss 6.6141, time 2897.77ms
iter 314010: loss 5.8337, time 127.46ms
iter 314020: loss 7.0000, time 126.12ms
iter 314030: loss 6.1530, time 126.17ms
iter 314040: loss 5.8400, time 125.67ms
iter 314050: loss 6.0901, time 125.68ms
iter 314060: loss 6.2434, time 125.79ms
iter 314070: loss 6.8040, time 125.82ms
iter 314080: loss 5.6992, time 125.65ms
iter 314090: loss 5.2402, time 125.48ms
iter 314100: loss 5.8842, time 125.79ms
iter 314110: loss 6.4487, time 125.76ms
iter 314120: loss 6.2886, time 128.00ms
iter 314130: loss 5.7123, time 125.73ms
iter 314140: loss 6.1906, time 125.85ms
iter 314150: loss 6.4115, time 125.59ms
iter 314160: loss 5.7573, time 127.94ms
iter 314170: loss 5.3914, time 125.66ms
iter 314180: loss 5.6266, time 126.65ms
iter 314190: loss 6.1958, time 126.00ms
iter 314200: loss 6.5520, time 127.08ms
iter 314210: loss 6.4831, time 125.54ms
iter 314220: loss 6.3583, time 125.92ms
iter 314230: loss 6.5724, time 125.99ms
iter 314240: loss 6.2377, time 125.91ms
step 314250: train loss 5.6002, val loss 5.6542
saving checkpoint to out-shakespeare-char
iter 314250: loss 6.7298, time 2882.29ms
iter 314260: loss 5.9961, time 125.82ms
iter 314270: loss 5.4413, time 125.38ms
iter 314280: loss 6.1066, time 125.88ms
iter 314290: loss 5.3322, time 125.21ms
iter 314300: loss 5.8364, time 125.88ms
iter 314310: loss 5.7222, time 125.91ms
iter 314320: loss 5.5036, time 125.51ms
iter 314330: loss 5.6740, time 127.97ms
iter 314340: loss 5.9744, time 125.58ms
iter 314350: loss 6.1624, time 125.55ms
iter 314360: loss 6.4807, time 124.69ms
iter 314370: loss 5.7027, time 125.97ms
iter 314380: loss 6.2533, time 126.05ms
iter 314390: loss 5.7197, time 125.70ms
iter 314400: loss 6.2762, time 125.09ms
iter 314410: loss 6.3772, time 125.87ms
iter 314420: loss 6.3855, time 125.56ms
iter 314430: loss 5.5340, time 125.20ms
iter 314440: loss 5.9547, time 125.96ms
iter 314450: loss 5.9936, time 125.73ms
iter 314460: loss 6.5142, time 127.73ms
iter 314470: loss 5.9081, time 125.86ms
iter 314480: loss 6.0062, time 125.73ms
iter 314490: loss 6.5007, time 126.03ms
step 314500: train loss 5.6308, val loss 5.6211
saving checkpoint to out-shakespeare-char
iter 314500: loss 6.0545, time 2875.84ms
iter 314510: loss 5.5748, time 121.67ms
iter 314520: loss 6.3175, time 123.15ms
iter 314530: loss 5.6853, time 121.67ms
iter 314540: loss 5.9513, time 124.47ms
iter 314550: loss 6.0750, time 122.08ms
iter 314560: loss 6.7795, time 123.86ms
iter 314570: loss 5.7959, time 121.94ms
iter 314580: loss 5.4577, time 124.34ms
iter 314590: loss 6.1661, time 122.24ms
iter 314600: loss 6.2342, time 122.46ms
iter 314610: loss 6.1530, time 120.98ms
iter 314620: loss 6.3694, time 121.83ms
iter 314630: loss 6.1428, time 121.69ms
iter 314640: loss 5.9292, time 121.83ms
iter 314650: loss 6.1039, time 121.75ms
iter 314660: loss 5.9792, time 122.39ms
iter 314670: loss 5.9428, time 121.80ms
iter 314680: loss 6.3533, time 121.64ms
iter 314690: loss 6.5285, time 121.84ms
iter 314700: loss 6.0678, time 121.04ms
iter 314710: loss 6.0437, time 122.95ms
iter 314720: loss 6.5622, time 122.46ms
iter 314730: loss 5.5876, time 121.78ms
iter 314740: loss 6.4940, time 122.11ms
step 314750: train loss 5.6476, val loss 5.6455
saving checkpoint to out-shakespeare-char
iter 314750: loss 6.0432, time 2896.40ms
iter 314760: loss 5.9161, time 125.40ms
iter 314770: loss 5.9001, time 125.30ms
iter 314780: loss 5.3930, time 125.43ms
iter 314790: loss 6.0758, time 128.02ms
iter 314800: loss 6.0575, time 124.77ms
iter 314810: loss 6.4411, time 125.57ms
iter 314820: loss 5.6379, time 125.62ms
iter 314830: loss 6.5366, time 125.50ms
iter 314840: loss 5.7021, time 125.59ms
iter 314850: loss 5.6338, time 126.26ms
iter 314860: loss 6.4347, time 126.61ms
iter 314870: loss 5.8726, time 128.23ms
iter 314880: loss 5.8317, time 125.25ms
iter 314890: loss 5.9053, time 125.59ms
iter 314900: loss 5.8740, time 125.53ms
iter 314910: loss 6.5053, time 125.73ms
iter 314920: loss 6.3848, time 125.38ms
iter 314930: loss 5.7742, time 125.54ms
iter 314940: loss 6.0843, time 125.76ms
iter 314950: loss 5.5261, time 127.01ms
iter 314960: loss 5.3779, time 125.53ms
iter 314970: loss 5.4926, time 125.48ms
iter 314980: loss 5.6196, time 125.97ms
iter 314990: loss 6.5441, time 127.85ms
step 315000: train loss 5.6315, val loss 5.5638
saving checkpoint to out-shakespeare-char
iter 315000: loss 5.9862, time 2877.24ms
iter 315010: loss 5.9317, time 125.04ms
iter 315020: loss 6.0103, time 125.68ms
iter 315030: loss 5.8568, time 121.71ms
iter 315040: loss 6.2837, time 120.77ms
iter 315050: loss 6.3186, time 121.98ms
iter 315060: loss 5.5882, time 122.49ms
iter 315070: loss 6.4003, time 121.89ms
iter 315080: loss 6.0784, time 121.98ms
iter 315090: loss 5.8063, time 121.69ms
iter 315100: loss 6.2866, time 122.01ms
iter 315110: loss 6.1695, time 121.87ms
iter 315120: loss 5.8882, time 123.29ms
iter 315130: loss 5.6728, time 125.60ms
iter 315140: loss 5.5900, time 122.14ms
iter 315150: loss 5.4186, time 124.40ms
iter 315160: loss 6.1887, time 122.32ms
iter 315170: loss 6.2872, time 123.92ms
iter 315180: loss 5.2617, time 122.16ms
iter 315190: loss 6.7030, time 124.74ms
iter 315200: loss 6.1150, time 122.04ms
iter 315210: loss 6.0478, time 124.20ms
iter 315220: loss 6.2512, time 121.52ms
iter 315230: loss 5.5093, time 120.88ms
iter 315240: loss 6.3419, time 122.68ms
step 315250: train loss 5.6080, val loss 5.6648
saving checkpoint to out-shakespeare-char
iter 315250: loss 5.7744, time 2884.66ms
iter 315260: loss 6.2607, time 124.47ms
iter 315270: loss 5.8667, time 121.39ms
iter 315280: loss 5.2808, time 123.59ms
iter 315290: loss 6.3668, time 122.01ms
iter 315300: loss 6.2790, time 121.96ms
iter 315310: loss 6.5168, time 121.12ms
iter 315320: loss 5.6809, time 121.82ms
iter 315330: loss 6.0019, time 121.88ms
iter 315340: loss 6.3960, time 121.94ms
iter 315350: loss 5.6999, time 121.93ms
iter 315360: loss 6.0024, time 122.23ms
iter 315370: loss 6.0414, time 121.85ms
iter 315380: loss 6.0146, time 121.77ms
iter 315390: loss 6.2269, time 123.29ms
iter 315400: loss 5.3377, time 121.85ms
iter 315410: loss 6.8709, time 123.06ms
iter 315420: loss 6.2503, time 121.94ms
iter 315430: loss 6.1083, time 123.05ms
iter 315440: loss 5.7721, time 121.82ms
iter 315450: loss 5.7054, time 123.20ms
iter 315460: loss 6.2472, time 122.13ms
iter 315470: loss 5.8762, time 122.86ms
iter 315480: loss 5.5872, time 121.86ms
iter 315490: loss 6.2448, time 122.15ms
step 315500: train loss 5.6145, val loss 5.6523
saving checkpoint to out-shakespeare-char
iter 315500: loss 5.9664, time 2892.71ms
iter 315510: loss 5.2150, time 125.82ms
iter 315520: loss 6.1531, time 128.56ms
iter 315530: loss 5.7075, time 125.94ms
iter 315540: loss 5.4825, time 127.25ms
iter 315550: loss 5.8698, time 127.73ms
iter 315560: loss 6.1518, time 125.42ms
iter 315570: loss 5.8935, time 125.77ms
iter 315580: loss 6.6441, time 124.83ms
iter 315590: loss 5.9704, time 125.88ms
iter 315600: loss 6.0049, time 127.26ms
iter 315610: loss 5.5023, time 124.91ms
iter 315620: loss 6.1828, time 125.14ms
iter 315630: loss 6.4140, time 125.86ms
iter 315640: loss 6.4274, time 126.27ms
iter 315650: loss 6.3585, time 124.88ms
iter 315660: loss 6.4892, time 125.90ms
iter 315670: loss 5.9226, time 126.07ms
iter 315680: loss 5.7613, time 127.62ms
iter 315690: loss 6.2811, time 125.14ms
iter 315700: loss 6.0839, time 125.22ms
iter 315710: loss 6.4103, time 121.08ms
iter 315720: loss 5.7236, time 125.28ms
iter 315730: loss 5.5171, time 125.82ms
iter 315740: loss 4.7566, time 125.64ms
step 315750: train loss 5.6260, val loss 5.6508
saving checkpoint to out-shakespeare-char
iter 315750: loss 5.9975, time 2926.25ms
iter 315760: loss 6.0811, time 120.99ms
iter 315770: loss 6.4023, time 120.40ms
iter 315780: loss 6.1928, time 122.97ms
iter 315790: loss 6.3514, time 121.81ms
iter 315800: loss 6.4168, time 121.16ms
iter 315810: loss 6.2251, time 120.60ms
iter 315820: loss 6.4162, time 121.28ms
iter 315830: loss 6.2333, time 120.28ms
iter 315840: loss 5.3982, time 120.85ms
iter 315850: loss 5.1906, time 121.23ms
iter 315860: loss 5.4358, time 120.92ms
iter 315870: loss 5.9009, time 120.84ms
iter 315880: loss 6.1761, time 122.53ms
iter 315890: loss 5.5257, time 121.16ms
iter 315900: loss 6.8841, time 122.56ms
iter 315910: loss 5.8742, time 120.74ms
iter 315920: loss 6.0107, time 122.41ms
iter 315930: loss 6.8028, time 120.42ms
iter 315940: loss 6.3159, time 122.25ms
iter 315950: loss 5.8909, time 120.59ms
iter 315960: loss 6.0124, time 122.44ms
iter 315970: loss 5.2572, time 120.74ms
iter 315980: loss 6.7658, time 122.42ms
iter 315990: loss 5.7923, time 122.87ms
step 316000: train loss 5.6479, val loss 5.5894
saving checkpoint to out-shakespeare-char
iter 316000: loss 5.8542, time 2871.22ms
iter 316010: loss 5.9360, time 123.64ms
iter 316020: loss 5.6652, time 121.32ms
iter 316030: loss 5.3904, time 123.08ms
iter 316040: loss 5.7534, time 123.11ms
iter 316050: loss 5.8237, time 121.40ms
iter 316060: loss 6.3823, time 121.08ms
iter 316070: loss 6.1767, time 122.16ms
iter 316080: loss 6.0034, time 122.12ms
iter 316090: loss 5.9835, time 120.88ms
iter 316100: loss 6.1854, time 121.65ms
iter 316110: loss 5.9515, time 120.74ms
iter 316120: loss 6.0436, time 121.27ms
iter 316130: loss 6.6887, time 121.66ms
iter 316140: loss 5.2193, time 121.84ms
iter 316150: loss 5.9675, time 121.86ms
iter 316160: loss 6.2127, time 121.97ms
iter 316170: loss 6.2671, time 121.56ms
iter 316180: loss 5.7580, time 122.76ms
iter 316190: loss 5.8093, time 121.97ms
iter 316200: loss 6.2654, time 122.12ms
iter 316210: loss 5.8900, time 121.80ms
iter 316220: loss 6.0503, time 122.85ms
iter 316230: loss 6.3524, time 121.89ms
iter 316240: loss 5.4006, time 121.73ms
step 316250: train loss 5.6175, val loss 5.6262
saving checkpoint to out-shakespeare-char
iter 316250: loss 5.7952, time 2901.92ms
iter 316260: loss 6.0601, time 119.72ms
iter 316270: loss 5.5638, time 120.54ms
iter 316280: loss 5.3477, time 120.53ms
iter 316290: loss 6.3566, time 123.15ms
iter 316300: loss 5.9369, time 121.52ms
iter 316310: loss 5.1952, time 121.88ms
iter 316320: loss 5.7060, time 121.52ms
iter 316330: loss 6.3623, time 124.55ms
iter 316340: loss 5.7760, time 127.97ms
iter 316350: loss 5.6346, time 125.38ms
iter 316360: loss 6.0507, time 125.16ms
iter 316370: loss 5.7180, time 125.95ms
iter 316380: loss 5.1616, time 125.39ms
iter 316390: loss 6.8608, time 125.79ms
iter 316400: loss 6.5868, time 124.98ms
iter 316410: loss 5.6848, time 126.40ms
iter 316420: loss 5.4173, time 125.89ms
iter 316430: loss 5.5393, time 124.82ms
iter 316440: loss 5.9763, time 125.65ms
iter 316450: loss 6.1195, time 126.09ms
iter 316460: loss 6.0389, time 128.30ms
iter 316470: loss 6.3084, time 124.28ms
iter 316480: loss 6.1039, time 124.85ms
iter 316490: loss 6.1254, time 125.62ms
step 316500: train loss 5.6434, val loss 5.6177
saving checkpoint to out-shakespeare-char
iter 316500: loss 6.3497, time 2876.90ms
iter 316510: loss 6.5053, time 126.15ms
iter 316520: loss 5.7975, time 125.24ms
iter 316530: loss 5.4076, time 125.04ms
iter 316540: loss 5.7301, time 124.91ms
iter 316550: loss 6.7554, time 124.95ms
iter 316560: loss 5.9201, time 125.82ms
iter 316570: loss 5.6026, time 126.44ms
iter 316580: loss 5.5762, time 127.35ms
iter 316590: loss 5.8558, time 127.49ms
iter 316600: loss 5.9713, time 124.71ms
iter 316610: loss 6.0337, time 126.04ms
iter 316620: loss 5.3927, time 125.65ms
iter 316630: loss 5.8548, time 125.14ms
iter 316640: loss 6.3821, time 125.80ms
iter 316650: loss 6.2053, time 125.07ms
iter 316660: loss 6.2740, time 125.99ms
iter 316670: loss 6.4684, time 122.91ms
iter 316680: loss 6.1605, time 126.08ms
iter 316690: loss 6.4577, time 125.57ms
iter 316700: loss 5.6635, time 125.68ms
iter 316710: loss 5.7423, time 125.58ms
iter 316720: loss 6.1423, time 128.20ms
iter 316730: loss 5.7373, time 125.68ms
iter 316740: loss 5.8593, time 125.74ms
step 316750: train loss 5.6379, val loss 5.6604
saving checkpoint to out-shakespeare-char
iter 316750: loss 6.0820, time 2873.86ms
iter 316760: loss 5.5116, time 125.68ms
iter 316770: loss 4.9979, time 125.10ms
iter 316780: loss 6.4679, time 125.90ms
iter 316790: loss 5.8688, time 128.34ms
iter 316800: loss 5.6832, time 125.52ms
iter 316810: loss 6.4032, time 126.25ms
iter 316820: loss 6.3128, time 126.09ms
iter 316830: loss 5.4487, time 125.99ms
iter 316840: loss 5.9411, time 125.58ms
iter 316850: loss 5.8511, time 127.75ms
iter 316860: loss 5.4282, time 125.66ms
iter 316870: loss 5.8097, time 125.50ms
iter 316880: loss 6.5543, time 127.10ms
iter 316890: loss 6.4775, time 128.80ms
iter 316900: loss 6.4283, time 126.16ms
iter 316910: loss 5.5613, time 124.40ms
iter 316920: loss 5.7311, time 126.05ms
iter 316930: loss 6.2657, time 125.73ms
iter 316940: loss 5.6381, time 125.70ms
iter 316950: loss 5.8209, time 125.90ms
iter 316960: loss 6.0991, time 128.73ms
iter 316970: loss 6.1935, time 125.67ms
iter 316980: loss 6.4405, time 125.71ms
iter 316990: loss 5.9741, time 123.96ms
step 317000: train loss 5.5920, val loss 5.6455
saving checkpoint to out-shakespeare-char
iter 317000: loss 5.6746, time 2877.23ms
iter 317010: loss 5.9011, time 125.81ms
iter 317020: loss 6.6448, time 125.83ms
iter 317030: loss 5.5395, time 126.56ms
iter 317040: loss 5.7926, time 126.02ms
iter 317050: loss 5.8579, time 124.93ms
iter 317060: loss 6.0617, time 125.51ms
iter 317070: loss 6.0928, time 127.59ms
iter 317080: loss 5.6187, time 125.47ms
iter 317090: loss 6.0954, time 124.17ms
iter 317100: loss 5.8459, time 124.91ms
iter 317110: loss 6.7340, time 127.69ms
iter 317120: loss 5.9138, time 125.51ms
iter 317130: loss 6.4862, time 124.95ms
iter 317140: loss 6.3066, time 125.45ms
iter 317150: loss 7.0086, time 127.54ms
iter 317160: loss 6.5391, time 126.55ms
iter 317170: loss 6.4367, time 125.24ms
iter 317180: loss 6.2655, time 125.08ms
iter 317190: loss 6.6146, time 125.61ms
iter 317200: loss 6.4072, time 125.28ms
iter 317210: loss 5.3565, time 125.78ms
iter 317220: loss 6.2537, time 125.19ms
iter 317230: loss 5.7285, time 125.65ms
iter 317240: loss 5.5927, time 125.62ms
step 317250: train loss 5.5814, val loss 5.6337
saving checkpoint to out-shakespeare-char
iter 317250: loss 5.5174, time 2860.52ms
iter 317260: loss 6.4186, time 123.86ms
iter 317270: loss 6.3957, time 122.19ms
iter 317280: loss 5.3772, time 123.72ms
iter 317290: loss 5.9739, time 121.40ms
iter 317300: loss 6.1146, time 123.67ms
iter 317310: loss 6.3426, time 121.69ms
iter 317320: loss 5.7307, time 123.04ms
iter 317330: loss 5.6436, time 121.45ms
iter 317340: loss 6.3430, time 123.58ms
iter 317350: loss 6.1550, time 121.17ms
iter 317360: loss 5.7802, time 123.54ms
iter 317370: loss 5.5680, time 121.44ms
iter 317380: loss 5.8946, time 123.54ms
iter 317390: loss 5.6919, time 121.21ms
iter 317400: loss 6.0178, time 122.45ms
iter 317410: loss 6.4370, time 121.06ms
iter 317420: loss 6.6803, time 123.56ms
iter 317430: loss 6.2885, time 121.74ms
iter 317440: loss 5.4445, time 123.24ms
iter 317450: loss 6.0447, time 121.48ms
iter 317460: loss 6.4109, time 123.40ms
iter 317470: loss 6.0945, time 121.67ms
iter 317480: loss 6.4651, time 123.76ms
iter 317490: loss 6.0249, time 121.41ms
step 317500: train loss 5.6538, val loss 5.6341
saving checkpoint to out-shakespeare-char
iter 317500: loss 6.6330, time 2887.77ms
iter 317510: loss 6.1448, time 121.63ms
iter 317520: loss 5.7506, time 123.02ms
iter 317530: loss 5.5422, time 121.35ms
iter 317540: loss 6.5322, time 122.95ms
iter 317550: loss 6.4429, time 120.84ms
iter 317560: loss 6.4382, time 122.42ms
iter 317570: loss 5.3613, time 121.52ms
iter 317580: loss 6.2395, time 122.38ms
iter 317590: loss 5.7075, time 120.46ms
iter 317600: loss 6.6280, time 122.18ms
iter 317610: loss 6.8603, time 121.39ms
iter 317620: loss 5.3760, time 122.07ms
iter 317630: loss 5.5828, time 127.19ms
iter 317640: loss 6.3146, time 125.04ms
iter 317650: loss 5.3317, time 125.23ms
iter 317660: loss 5.6305, time 125.07ms
iter 317670: loss 5.9331, time 124.93ms
iter 317680: loss 6.0949, time 124.22ms
iter 317690: loss 6.0376, time 124.97ms
iter 317700: loss 5.8689, time 125.08ms
iter 317710: loss 5.5926, time 124.89ms
iter 317720: loss 5.5588, time 125.34ms
iter 317730: loss 5.7982, time 127.50ms
iter 317740: loss 6.0154, time 125.67ms
step 317750: train loss 5.6393, val loss 5.6669
saving checkpoint to out-shakespeare-char
iter 317750: loss 6.1449, time 2896.86ms
iter 317760: loss 6.1687, time 125.81ms
iter 317770: loss 5.5078, time 125.63ms
iter 317780: loss 5.7099, time 125.83ms
iter 317790: loss 5.6770, time 124.51ms
iter 317800: loss 5.7481, time 125.51ms
iter 317810: loss 5.7789, time 125.35ms
iter 317820: loss 6.2583, time 127.82ms
iter 317830: loss 5.5434, time 125.76ms
iter 317840: loss 5.9706, time 125.53ms
iter 317850: loss 5.3294, time 125.76ms
iter 317860: loss 6.5183, time 125.61ms
iter 317870: loss 5.4153, time 125.87ms
iter 317880: loss 6.1437, time 125.43ms
iter 317890: loss 6.0192, time 125.85ms
iter 317900: loss 6.0390, time 125.66ms
iter 317910: loss 6.7713, time 125.80ms
iter 317920: loss 6.8634, time 125.65ms
iter 317930: loss 6.3206, time 128.34ms
iter 317940: loss 5.8669, time 125.52ms
iter 317950: loss 6.6630, time 126.17ms
iter 317960: loss 5.9704, time 126.21ms
iter 317970: loss 5.7135, time 125.95ms
iter 317980: loss 5.8867, time 125.83ms
iter 317990: loss 6.0515, time 125.92ms
step 318000: train loss 5.5802, val loss 5.5970
saving checkpoint to out-shakespeare-char
iter 318000: loss 5.2908, time 2882.08ms
iter 318010: loss 6.1312, time 125.89ms
iter 318020: loss 5.7662, time 125.17ms
iter 318030: loss 5.8895, time 124.28ms
iter 318040: loss 5.3013, time 125.33ms
iter 318050: loss 5.3654, time 126.85ms
iter 318060: loss 6.1936, time 125.15ms
iter 318070: loss 6.4459, time 125.53ms
iter 318080: loss 6.3357, time 125.24ms
iter 318090: loss 6.4625, time 125.19ms
iter 318100: loss 6.1377, time 125.77ms
iter 318110: loss 6.0766, time 124.82ms
iter 318120: loss 5.6363, time 125.23ms
iter 318130: loss 5.5056, time 124.46ms
iter 318140: loss 6.4814, time 125.19ms
iter 318150: loss 5.8503, time 124.41ms
iter 318160: loss 6.0759, time 127.71ms
iter 318170: loss 5.6349, time 124.19ms
iter 318180: loss 6.4208, time 125.15ms
iter 318190: loss 6.1220, time 125.60ms
iter 318200: loss 6.1388, time 125.64ms
iter 318210: loss 6.4589, time 124.72ms
iter 318220: loss 5.9875, time 125.63ms
iter 318230: loss 5.7790, time 125.56ms
iter 318240: loss 5.4356, time 125.56ms
step 318250: train loss 5.6498, val loss 5.6894
saving checkpoint to out-shakespeare-char
iter 318250: loss 6.5851, time 2899.15ms
iter 318260: loss 5.9624, time 124.59ms
iter 318270: loss 6.4709, time 124.31ms
iter 318280: loss 5.9700, time 127.92ms
iter 318290: loss 5.7620, time 124.96ms
iter 318300: loss 5.9342, time 125.17ms
iter 318310: loss 6.3362, time 125.41ms
iter 318320: loss 6.1151, time 125.13ms
iter 318330: loss 5.7943, time 125.13ms
iter 318340: loss 6.1639, time 125.03ms
iter 318350: loss 5.7277, time 124.52ms
iter 318360: loss 4.8633, time 124.95ms
iter 318370: loss 6.3168, time 121.39ms
iter 318380: loss 5.8848, time 122.34ms
iter 318390: loss 6.0693, time 121.40ms
iter 318400: loss 5.8552, time 121.58ms
iter 318410: loss 5.8516, time 121.60ms
iter 318420: loss 6.0085, time 121.95ms
iter 318430: loss 6.0469, time 121.00ms
iter 318440: loss 6.1482, time 121.88ms
iter 318450: loss 6.0355, time 121.85ms
iter 318460: loss 5.8803, time 121.90ms
iter 318470: loss 6.2477, time 121.44ms
iter 318480: loss 5.4145, time 121.51ms
iter 318490: loss 6.2051, time 121.29ms
step 318500: train loss 5.6151, val loss 5.6424
saving checkpoint to out-shakespeare-char
iter 318500: loss 6.2819, time 2896.00ms
iter 318510: loss 6.0299, time 121.73ms
iter 318520: loss 6.5968, time 122.85ms
iter 318530: loss 5.6343, time 121.45ms
iter 318540: loss 5.7621, time 122.68ms
iter 318550: loss 5.6959, time 121.79ms
iter 318560: loss 5.6957, time 122.49ms
iter 318570: loss 6.8452, time 122.28ms
iter 318580: loss 6.3082, time 122.56ms
iter 318590: loss 7.2784, time 121.37ms
iter 318600: loss 5.8846, time 122.65ms
iter 318610: loss 6.1732, time 121.71ms
iter 318620: loss 7.2643, time 123.08ms
iter 318630: loss 6.1313, time 121.48ms
iter 318640: loss 5.3694, time 122.05ms
iter 318650: loss 6.1158, time 121.79ms
iter 318660: loss 6.1048, time 122.59ms
iter 318670: loss 6.7967, time 121.48ms
iter 318680: loss 5.0954, time 122.69ms
iter 318690: loss 6.3412, time 121.44ms
iter 318700: loss 5.5483, time 122.58ms
iter 318710: loss 6.5426, time 121.43ms
iter 318720: loss 5.5259, time 122.92ms
iter 318730: loss 5.9745, time 121.37ms
iter 318740: loss 5.9432, time 122.55ms
step 318750: train loss 5.6470, val loss 5.6226
saving checkpoint to out-shakespeare-char
iter 318750: loss 5.7501, time 2884.03ms
iter 318760: loss 6.1037, time 121.54ms
iter 318770: loss 5.6887, time 121.29ms
iter 318780: loss 5.9313, time 121.90ms
iter 318790: loss 5.9564, time 121.35ms
iter 318800: loss 6.2029, time 121.35ms
iter 318810: loss 6.0266, time 121.72ms
iter 318820: loss 6.3558, time 122.60ms
iter 318830: loss 6.3559, time 121.30ms
iter 318840: loss 5.4589, time 122.46ms
iter 318850: loss 6.3604, time 121.71ms
iter 318860: loss 6.7055, time 122.99ms
iter 318870: loss 5.8177, time 121.59ms
iter 318880: loss 6.0221, time 122.76ms
iter 318890: loss 4.7948, time 120.96ms
iter 318900: loss 6.1555, time 122.47ms
iter 318910: loss 6.2922, time 120.67ms
iter 318920: loss 6.1480, time 122.74ms
iter 318930: loss 5.5799, time 122.04ms
iter 318940: loss 6.2126, time 122.68ms
iter 318950: loss 5.9045, time 121.58ms
iter 318960: loss 6.7392, time 122.59ms
iter 318970: loss 5.7414, time 122.41ms
iter 318980: loss 5.3047, time 122.57ms
iter 318990: loss 5.5682, time 121.61ms
step 319000: train loss 5.6586, val loss 5.6172
saving checkpoint to out-shakespeare-char
iter 319000: loss 5.5691, time 2885.24ms
iter 319010: loss 6.1419, time 121.64ms
iter 319020: loss 5.7690, time 120.96ms
iter 319030: loss 5.8257, time 121.61ms
iter 319040: loss 5.7695, time 121.28ms
iter 319050: loss 5.9143, time 121.55ms
iter 319060: loss 6.2292, time 121.36ms
iter 319070: loss 5.7865, time 121.36ms
iter 319080: loss 5.3193, time 121.42ms
iter 319090: loss 6.0006, time 121.58ms
iter 319100: loss 5.7827, time 121.54ms
iter 319110: loss 5.9407, time 121.47ms
iter 319120: loss 7.1978, time 121.46ms
iter 319130: loss 6.0129, time 121.09ms
iter 319140: loss 5.2531, time 121.48ms
iter 319150: loss 6.2348, time 120.90ms
iter 319160: loss 6.5743, time 121.51ms
iter 319170: loss 5.6652, time 121.86ms
iter 319180: loss 6.6611, time 121.59ms
iter 319190: loss 5.8296, time 121.49ms
iter 319200: loss 6.2519, time 121.73ms
iter 319210: loss 5.3848, time 121.39ms
iter 319220: loss 6.7617, time 121.48ms
iter 319230: loss 6.3589, time 121.76ms
iter 319240: loss 5.1504, time 121.24ms
step 319250: train loss 5.6205, val loss 5.6385
saving checkpoint to out-shakespeare-char
iter 319250: loss 5.7211, time 2889.59ms
iter 319260: loss 5.5685, time 120.81ms
iter 319270: loss 5.9815, time 123.73ms
iter 319280: loss 5.6838, time 121.50ms
iter 319290: loss 5.3928, time 123.65ms
iter 319300: loss 5.2586, time 121.58ms
iter 319310: loss 6.2386, time 123.68ms
iter 319320: loss 6.4327, time 121.74ms
iter 319330: loss 6.4642, time 124.42ms
iter 319340: loss 5.7951, time 121.93ms
iter 319350: loss 5.5669, time 123.99ms
iter 319360: loss 4.9522, time 121.33ms
iter 319370: loss 6.6592, time 123.77ms
iter 319380: loss 6.6541, time 121.17ms
iter 319390: loss 5.8121, time 124.08ms
iter 319400: loss 5.8366, time 121.39ms
iter 319410: loss 5.7830, time 123.60ms
iter 319420: loss 6.1863, time 121.86ms
iter 319430: loss 5.7487, time 123.83ms
iter 319440: loss 5.9383, time 121.55ms
iter 319450: loss 5.5534, time 123.64ms
iter 319460: loss 5.8333, time 122.66ms
iter 319470: loss 5.4463, time 123.65ms
iter 319480: loss 5.7192, time 121.39ms
iter 319490: loss 5.7337, time 123.92ms
step 319500: train loss 5.6163, val loss 5.6260
saving checkpoint to out-shakespeare-char
iter 319500: loss 6.2671, time 2876.64ms
iter 319510: loss 6.4200, time 121.81ms
iter 319520: loss 5.6632, time 121.54ms
iter 319530: loss 6.0175, time 121.76ms
iter 319540: loss 5.5934, time 121.46ms
iter 319550: loss 6.2316, time 121.33ms
iter 319560: loss 6.0292, time 121.67ms
iter 319570: loss 6.0213, time 121.56ms
iter 319580: loss 5.8663, time 121.47ms
iter 319590: loss 6.0700, time 121.51ms
iter 319600: loss 6.0097, time 121.45ms
iter 319610: loss 5.9772, time 121.31ms
iter 319620: loss 6.3334, time 121.48ms
iter 319630: loss 6.0473, time 121.44ms
iter 319640: loss 5.8615, time 121.52ms
iter 319650: loss 6.3919, time 121.37ms
iter 319660: loss 5.7488, time 121.58ms
iter 319670: loss 5.7049, time 121.45ms
iter 319680: loss 5.5987, time 121.51ms
iter 319690: loss 6.7634, time 121.46ms
iter 319700: loss 6.2902, time 121.89ms
iter 319710: loss 6.0125, time 120.65ms
iter 319720: loss 5.7150, time 121.32ms
iter 319730: loss 6.7341, time 121.51ms
iter 319740: loss 5.9280, time 121.55ms
step 319750: train loss 5.5846, val loss 5.6107
saving checkpoint to out-shakespeare-char
iter 319750: loss 5.8932, time 2882.75ms
iter 319760: loss 5.6314, time 121.69ms
iter 319770: loss 5.7545, time 121.68ms
iter 319780: loss 6.1778, time 121.57ms
iter 319790: loss 6.0707, time 122.60ms
iter 319800: loss 5.9791, time 121.16ms
iter 319810: loss 5.9012, time 121.61ms
iter 319820: loss 6.8359, time 121.47ms
iter 319830: loss 6.0240, time 121.65ms
iter 319840: loss 5.5628, time 121.61ms
iter 319850: loss 6.0828, time 121.61ms
iter 319860: loss 5.8844, time 121.29ms
iter 319870: loss 4.8508, time 121.48ms
iter 319880: loss 6.1743, time 121.58ms
iter 319890: loss 5.5334, time 121.66ms
iter 319900: loss 6.3360, time 121.55ms
iter 319910: loss 6.1952, time 121.54ms
iter 319920: loss 5.4419, time 121.65ms
iter 319930: loss 6.0591, time 121.57ms
iter 319940: loss 4.9993, time 121.61ms
iter 319950: loss 5.8784, time 121.08ms
iter 319960: loss 6.4393, time 121.45ms
iter 319970: loss 5.6789, time 121.53ms
iter 319980: loss 5.0467, time 121.73ms
iter 319990: loss 6.1361, time 121.48ms
step 320000: train loss 5.6411, val loss 5.6543
saving checkpoint to out-shakespeare-char
iter 320000: loss 5.8273, time 2885.62ms
iter 320010: loss 6.2029, time 121.49ms
iter 320020: loss 5.2495, time 121.50ms
iter 320030: loss 6.4973, time 121.90ms
iter 320040: loss 6.7234, time 120.61ms
iter 320050: loss 5.7076, time 121.44ms
iter 320060: loss 5.9906, time 121.75ms
iter 320070: loss 6.4398, time 121.42ms
iter 320080: loss 6.0289, time 121.52ms
iter 320090: loss 5.9324, time 121.44ms
iter 320100: loss 6.2226, time 121.53ms
iter 320110: loss 6.4200, time 121.41ms
iter 320120: loss 6.3972, time 121.32ms
iter 320130: loss 6.4899, time 121.25ms
iter 320140: loss 6.3687, time 121.39ms
iter 320150: loss 6.0238, time 121.26ms
iter 320160: loss 5.6694, time 121.41ms
iter 320170: loss 5.2687, time 121.56ms
iter 320180: loss 5.9405, time 121.73ms
iter 320190: loss 5.7797, time 120.83ms
iter 320200: loss 6.6009, time 121.57ms
iter 320210: loss 6.1144, time 121.87ms
iter 320220: loss 6.0944, time 121.55ms
iter 320230: loss 6.3749, time 121.65ms
iter 320240: loss 5.3182, time 121.35ms
step 320250: train loss 5.5743, val loss 5.6209
saving checkpoint to out-shakespeare-char
iter 320250: loss 5.9400, time 2881.60ms
iter 320260: loss 6.3236, time 123.80ms
iter 320270: loss 5.8718, time 120.65ms
iter 320280: loss 6.3311, time 123.46ms
iter 320290: loss 5.9650, time 121.28ms
iter 320300: loss 6.1138, time 122.90ms
iter 320310: loss 6.0574, time 121.35ms
iter 320320: loss 6.0678, time 123.55ms
iter 320330: loss 5.5994, time 121.33ms
iter 320340: loss 6.2209, time 123.81ms
iter 320350: loss 6.1040, time 121.40ms
iter 320360: loss 5.5090, time 123.48ms
iter 320370: loss 5.7282, time 121.91ms
iter 320380: loss 6.0741, time 123.44ms
iter 320390: loss 6.0432, time 121.67ms
iter 320400: loss 6.1873, time 123.53ms
iter 320410: loss 5.6272, time 121.21ms
iter 320420: loss 7.2927, time 124.01ms
iter 320430: loss 5.1700, time 121.01ms
iter 320440: loss 6.3420, time 123.45ms
iter 320450: loss 5.6520, time 120.81ms
iter 320460: loss 6.0687, time 123.39ms
iter 320470: loss 6.0285, time 121.33ms
iter 320480: loss 5.7163, time 123.59ms
iter 320490: loss 5.8924, time 121.27ms
step 320500: train loss 5.6486, val loss 5.6542
saving checkpoint to out-shakespeare-char
iter 320500: loss 6.0639, time 2879.09ms
iter 320510: loss 5.9462, time 121.84ms
iter 320520: loss 6.2331, time 122.82ms
iter 320530: loss 6.3560, time 119.80ms
iter 320540: loss 6.1365, time 122.05ms
iter 320550: loss 6.1750, time 121.43ms
iter 320560: loss 6.6612, time 122.58ms
iter 320570: loss 6.5166, time 121.43ms
iter 320580: loss 5.6396, time 122.51ms
iter 320590: loss 6.7347, time 121.42ms
iter 320600: loss 5.6881, time 122.69ms
iter 320610: loss 6.3246, time 121.45ms
iter 320620: loss 6.3856, time 122.34ms
iter 320630: loss 6.6346, time 121.34ms
iter 320640: loss 5.6674, time 122.30ms
iter 320650: loss 6.5863, time 121.41ms
iter 320660: loss 5.9501, time 123.05ms
iter 320670: loss 5.9701, time 122.94ms
iter 320680: loss 6.1543, time 122.86ms
iter 320690: loss 6.4040, time 121.56ms
iter 320700: loss 5.9997, time 122.77ms
iter 320710: loss 6.0127, time 121.36ms
iter 320720: loss 5.6374, time 122.67ms
iter 320730: loss 6.2527, time 121.56ms
iter 320740: loss 5.7735, time 122.58ms
step 320750: train loss 5.5933, val loss 5.6418
saving checkpoint to out-shakespeare-char
iter 320750: loss 6.2553, time 2894.56ms
iter 320760: loss 6.7810, time 121.87ms
iter 320770: loss 6.0396, time 121.77ms
iter 320780: loss 5.5799, time 121.87ms
iter 320790: loss 5.7812, time 122.95ms
iter 320800: loss 5.8559, time 121.78ms
iter 320810: loss 6.1693, time 121.55ms
iter 320820: loss 6.0432, time 121.79ms
iter 320830: loss 5.8032, time 121.86ms
iter 320840: loss 6.4284, time 121.73ms
iter 320850: loss 5.7284, time 121.75ms
iter 320860: loss 5.9469, time 122.18ms
iter 320870: loss 5.9509, time 121.92ms
iter 320880: loss 5.7090, time 121.23ms
iter 320890: loss 6.8130, time 121.70ms
iter 320900: loss 5.7583, time 121.78ms
iter 320910: loss 6.0422, time 121.69ms
iter 320920: loss 5.9788, time 121.87ms
iter 320930: loss 5.8788, time 122.24ms
iter 320940: loss 5.9336, time 121.88ms
iter 320950: loss 5.4080, time 121.64ms
iter 320960: loss 6.2232, time 121.89ms
iter 320970: loss 4.9797, time 121.78ms
iter 320980: loss 6.6693, time 121.84ms
iter 320990: loss 5.4924, time 121.77ms
step 321000: train loss 5.6645, val loss 5.5832
saving checkpoint to out-shakespeare-char
iter 321000: loss 6.0395, time 2883.53ms
iter 321010: loss 6.6115, time 121.73ms
iter 321020: loss 6.1046, time 121.45ms
iter 321030: loss 5.6988, time 121.49ms
iter 321040: loss 5.7263, time 121.94ms
iter 321050: loss 6.1153, time 121.49ms
iter 321060: loss 6.2293, time 121.53ms
iter 321070: loss 6.4338, time 121.59ms
iter 321080: loss 6.3771, time 121.62ms
iter 321090: loss 5.8037, time 121.26ms
iter 321100: loss 5.6152, time 121.43ms
iter 321110: loss 5.9394, time 121.42ms
iter 321120: loss 5.5488, time 121.72ms
iter 321130: loss 5.4504, time 121.46ms
iter 321140: loss 5.5247, time 121.42ms
iter 321150: loss 5.9860, time 122.67ms
iter 321160: loss 6.6542, time 121.40ms
iter 321170: loss 6.2956, time 121.46ms
iter 321180: loss 5.8187, time 121.34ms
iter 321190: loss 6.2985, time 121.47ms
iter 321200: loss 5.9044, time 121.85ms
iter 321210: loss 5.7974, time 120.55ms
iter 321220: loss 6.3140, time 121.39ms
iter 321230: loss 6.1951, time 121.54ms
iter 321240: loss 5.6907, time 121.58ms
step 321250: train loss 5.5682, val loss 5.6000
saving checkpoint to out-shakespeare-char
iter 321250: loss 5.6685, time 2902.58ms
iter 321260: loss 5.1589, time 121.76ms
iter 321270: loss 5.8631, time 122.25ms
iter 321280: loss 6.0175, time 121.85ms
iter 321290: loss 6.1095, time 121.77ms
iter 321300: loss 6.4551, time 121.88ms
iter 321310: loss 6.0289, time 121.98ms
iter 321320: loss 6.0363, time 121.66ms
iter 321330: loss 6.1756, time 121.91ms
iter 321340: loss 6.0033, time 121.69ms
iter 321350: loss 6.0559, time 121.78ms
iter 321360: loss 5.4033, time 122.05ms
iter 321370: loss 5.4391, time 122.40ms
iter 321380: loss 5.9936, time 121.80ms
iter 321390: loss 5.8967, time 121.42ms
iter 321400: loss 5.8036, time 121.87ms
iter 321410: loss 6.0393, time 121.84ms
iter 321420: loss 5.9236, time 122.05ms
iter 321430: loss 6.1720, time 121.76ms
iter 321440: loss 5.9789, time 122.15ms
iter 321450: loss 5.7139, time 121.78ms
iter 321460: loss 5.3361, time 121.82ms
iter 321470: loss 6.0084, time 121.78ms
iter 321480: loss 5.7875, time 121.89ms
iter 321490: loss 6.0405, time 121.86ms
step 321500: train loss 5.6237, val loss 5.6120
saving checkpoint to out-shakespeare-char
iter 321500: loss 6.6123, time 2898.86ms
iter 321510: loss 5.4129, time 123.92ms
iter 321520: loss 6.3280, time 121.32ms
iter 321530: loss 6.5879, time 123.63ms
iter 321540: loss 5.7749, time 121.40ms
iter 321550: loss 6.4253, time 123.86ms
iter 321560: loss 7.1550, time 121.47ms
iter 321570: loss 6.0210, time 123.09ms
iter 321580: loss 6.1038, time 121.27ms
iter 321590: loss 5.8731, time 123.97ms
iter 321600: loss 5.7711, time 121.36ms
iter 321610: loss 5.2498, time 123.46ms
iter 321620: loss 6.1708, time 121.46ms
iter 321630: loss 6.2371, time 123.32ms
iter 321640: loss 5.5816, time 121.41ms
iter 321650: loss 6.5566, time 123.56ms
iter 321660: loss 6.1158, time 121.53ms
iter 321670: loss 6.4276, time 123.58ms
iter 321680: loss 5.7968, time 121.41ms
iter 321690: loss 5.4894, time 123.57ms
iter 321700: loss 6.1959, time 121.39ms
iter 321710: loss 6.1090, time 123.64ms
iter 321720: loss 6.5689, time 121.81ms
iter 321730: loss 6.1563, time 124.34ms
iter 321740: loss 5.6805, time 121.44ms
step 321750: train loss 5.5956, val loss 5.5884
saving checkpoint to out-shakespeare-char
iter 321750: loss 5.9465, time 2897.49ms
iter 321760: loss 5.9166, time 124.45ms
iter 321770: loss 6.2775, time 125.11ms
iter 321780: loss 6.4718, time 125.15ms
iter 321790: loss 6.7012, time 125.20ms
iter 321800: loss 6.6930, time 125.45ms
iter 321810: loss 6.0575, time 125.09ms
iter 321820: loss 6.2603, time 124.81ms
iter 321830: loss 5.9716, time 127.48ms
iter 321840: loss 6.0094, time 124.88ms
iter 321850: loss 5.7320, time 125.20ms
iter 321860: loss 6.1200, time 125.26ms
iter 321870: loss 6.6511, time 125.11ms
iter 321880: loss 5.9558, time 125.07ms
iter 321890: loss 6.4117, time 125.04ms
iter 321900: loss 5.9450, time 125.49ms
iter 321910: loss 5.6480, time 124.97ms
iter 321920: loss 6.7284, time 124.91ms
iter 321930: loss 5.7794, time 125.21ms
iter 321940: loss 5.9199, time 127.49ms
iter 321950: loss 6.1252, time 125.00ms
iter 321960: loss 6.1043, time 124.86ms
iter 321970: loss 5.9919, time 125.06ms
iter 321980: loss 6.3180, time 124.93ms
iter 321990: loss 5.5816, time 124.84ms
step 322000: train loss 5.6173, val loss 5.5877
saving checkpoint to out-shakespeare-char
iter 322000: loss 5.7515, time 2881.10ms
iter 322010: loss 6.2510, time 125.16ms
iter 322020: loss 6.3897, time 125.57ms
iter 322030: loss 5.4961, time 124.98ms
iter 322040: loss 6.1389, time 125.18ms
iter 322050: loss 5.9655, time 125.40ms
iter 322060: loss 6.2668, time 127.36ms
iter 322070: loss 6.3346, time 125.10ms
iter 322080: loss 6.5732, time 125.04ms
iter 322090: loss 6.0555, time 125.30ms
iter 322100: loss 5.5168, time 124.72ms
iter 322110: loss 6.8455, time 124.93ms
iter 322120: loss 5.6115, time 125.10ms
iter 322130: loss 6.0737, time 123.68ms
iter 322140: loss 6.5673, time 124.89ms
iter 322150: loss 5.5020, time 125.13ms
iter 322160: loss 6.1966, time 125.04ms
iter 322170: loss 6.1968, time 127.57ms
iter 322180: loss 6.3683, time 125.12ms
iter 322190: loss 6.2761, time 124.95ms
iter 322200: loss 5.3292, time 125.14ms
iter 322210: loss 5.3003, time 125.19ms
iter 322220: loss 6.6076, time 124.05ms
iter 322230: loss 6.7610, time 125.04ms
iter 322240: loss 6.3425, time 125.39ms
step 322250: train loss 5.5869, val loss 5.6127
saving checkpoint to out-shakespeare-char
iter 322250: loss 6.0466, time 2893.17ms
iter 322260: loss 5.2777, time 125.69ms
iter 322270: loss 5.8874, time 125.63ms
iter 322280: loss 7.3256, time 125.51ms
iter 322290: loss 6.0044, time 125.58ms
iter 322300: loss 5.1785, time 125.72ms
iter 322310: loss 5.2291, time 125.78ms
iter 322320: loss 5.6907, time 125.29ms
iter 322330: loss 5.8000, time 125.24ms
iter 322340: loss 6.3527, time 125.32ms
iter 322350: loss 6.0852, time 127.76ms
iter 322360: loss 5.5111, time 125.01ms
iter 322370: loss 6.4216, time 125.24ms
iter 322380: loss 6.3901, time 125.41ms
iter 322390: loss 5.6566, time 125.31ms
iter 322400: loss 5.6358, time 125.26ms
iter 322410: loss 6.0941, time 125.33ms
iter 322420: loss 6.1507, time 127.56ms
iter 322430: loss 6.5783, time 125.38ms
iter 322440: loss 5.4700, time 124.84ms
iter 322450: loss 6.3848, time 125.42ms
iter 322460: loss 6.7226, time 124.87ms
iter 322470: loss 5.9217, time 125.36ms
iter 322480: loss 6.0947, time 125.91ms
iter 322490: loss 6.3335, time 128.13ms
step 322500: train loss 5.6484, val loss 5.6600
saving checkpoint to out-shakespeare-char
iter 322500: loss 5.7316, time 2903.02ms
iter 322510: loss 6.1025, time 125.38ms
iter 322520: loss 5.2769, time 125.18ms
iter 322530: loss 6.0310, time 127.30ms
iter 322540: loss 5.6556, time 125.33ms
iter 322550: loss 6.1885, time 125.31ms
iter 322560: loss 6.0427, time 125.17ms
iter 322570: loss 5.2478, time 125.04ms
iter 322580: loss 5.8766, time 125.58ms
iter 322590: loss 6.0514, time 125.70ms
iter 322600: loss 6.1738, time 127.95ms
iter 322610: loss 6.2829, time 125.70ms
iter 322620: loss 6.2201, time 125.60ms
iter 322630: loss 6.5153, time 124.61ms
iter 322640: loss 6.0890, time 125.69ms
iter 322650: loss 5.7969, time 125.47ms
iter 322660: loss 5.5999, time 125.53ms
iter 322670: loss 6.2445, time 125.54ms
iter 322680: loss 6.1380, time 125.51ms
iter 322690: loss 6.5736, time 125.75ms
iter 322700: loss 5.7079, time 125.52ms
iter 322710: loss 6.6193, time 128.27ms
iter 322720: loss 5.5402, time 124.69ms
iter 322730: loss 5.8379, time 125.44ms
iter 322740: loss 5.8007, time 125.28ms
step 322750: train loss 5.6512, val loss 5.6603
saving checkpoint to out-shakespeare-char
iter 322750: loss 6.3091, time 2872.57ms
iter 322760: loss 6.5997, time 125.64ms
iter 322770: loss 6.8008, time 125.81ms
iter 322780: loss 5.9947, time 125.58ms
iter 322790: loss 5.7434, time 125.64ms
iter 322800: loss 6.0689, time 125.66ms
iter 322810: loss 6.2999, time 125.38ms
iter 322820: loss 6.1585, time 125.74ms
iter 322830: loss 6.1598, time 128.19ms
iter 322840: loss 6.1406, time 125.52ms
iter 322850: loss 6.2286, time 126.04ms
iter 322860: loss 5.3026, time 125.62ms
iter 322870: loss 6.0051, time 125.43ms
iter 322880: loss 6.2664, time 124.80ms
iter 322890: loss 5.9678, time 126.27ms
iter 322900: loss 5.7847, time 125.59ms
iter 322910: loss 6.3055, time 127.92ms
iter 322920: loss 6.3001, time 126.02ms
iter 322930: loss 6.9734, time 126.04ms
iter 322940: loss 6.2194, time 125.93ms
iter 322950: loss 6.0464, time 125.90ms
iter 322960: loss 6.5545, time 125.48ms
iter 322970: loss 5.9837, time 125.58ms
iter 322980: loss 7.0266, time 125.80ms
iter 322990: loss 6.6986, time 126.03ms
step 323000: train loss 5.6300, val loss 5.6557
saving checkpoint to out-shakespeare-char
iter 323000: loss 5.9262, time 2879.02ms
iter 323010: loss 5.9293, time 126.37ms
iter 323020: loss 5.4293, time 125.89ms
iter 323030: loss 5.0650, time 125.73ms
iter 323040: loss 5.7784, time 127.91ms
iter 323050: loss 5.6519, time 125.43ms
iter 323060: loss 6.1608, time 125.30ms
iter 323070: loss 5.9383, time 125.41ms
iter 323080: loss 5.9493, time 125.58ms
iter 323090: loss 6.2179, time 125.55ms
iter 323100: loss 6.0538, time 125.46ms
iter 323110: loss 6.3703, time 125.99ms
iter 323120: loss 5.9697, time 125.30ms
iter 323130: loss 5.9476, time 125.62ms
iter 323140: loss 5.9435, time 125.83ms
iter 323150: loss 6.2134, time 127.90ms
iter 323160: loss 6.4878, time 125.34ms
iter 323170: loss 6.0723, time 125.34ms
iter 323180: loss 5.5022, time 125.92ms
iter 323190: loss 6.0618, time 126.52ms
iter 323200: loss 5.5296, time 125.36ms
iter 323210: loss 6.0633, time 125.33ms
iter 323220: loss 5.5953, time 125.63ms
iter 323230: loss 5.2154, time 125.15ms
iter 323240: loss 6.6629, time 125.50ms
step 323250: train loss 5.6363, val loss 5.6133
saving checkpoint to out-shakespeare-char
iter 323250: loss 6.6618, time 2869.37ms
iter 323260: loss 6.4607, time 125.57ms
iter 323270: loss 5.8593, time 125.23ms
iter 323280: loss 6.8204, time 125.39ms
iter 323290: loss 5.3466, time 125.68ms
iter 323300: loss 5.7334, time 127.80ms
iter 323310: loss 6.5189, time 125.25ms
iter 323320: loss 6.4770, time 125.46ms
iter 323330: loss 6.6435, time 125.18ms
iter 323340: loss 6.1270, time 125.73ms
iter 323350: loss 6.3372, time 125.43ms
iter 323360: loss 6.0278, time 126.72ms
iter 323370: loss 6.6864, time 125.44ms
iter 323380: loss 5.9040, time 124.78ms
iter 323390: loss 6.0583, time 125.69ms
iter 323400: loss 6.0124, time 125.28ms
iter 323410: loss 5.7622, time 125.45ms
iter 323420: loss 5.8763, time 125.46ms
iter 323430: loss 5.6626, time 125.44ms
iter 323440: loss 5.5817, time 128.08ms
iter 323450: loss 5.8057, time 125.65ms
iter 323460: loss 5.5737, time 126.41ms
iter 323470: loss 5.7036, time 125.71ms
iter 323480: loss 5.8714, time 125.06ms
iter 323490: loss 5.8639, time 125.46ms
step 323500: train loss 5.6493, val loss 5.6453
saving checkpoint to out-shakespeare-char
iter 323500: loss 5.8689, time 2889.88ms
iter 323510: loss 6.4531, time 123.37ms
iter 323520: loss 5.7248, time 121.67ms
iter 323530: loss 5.8737, time 122.69ms
iter 323540: loss 6.0377, time 121.69ms
iter 323550: loss 6.2703, time 123.02ms
iter 323560: loss 6.1231, time 121.72ms
iter 323570: loss 6.5654, time 122.67ms
iter 323580: loss 6.2698, time 121.59ms
iter 323590: loss 6.1251, time 123.12ms
iter 323600: loss 6.0861, time 121.12ms
iter 323610: loss 6.1534, time 122.10ms
iter 323620: loss 6.1335, time 121.46ms
iter 323630: loss 5.5816, time 122.57ms
iter 323640: loss 6.1754, time 121.98ms
iter 323650: loss 5.4373, time 122.74ms
iter 323660: loss 6.4630, time 121.36ms
iter 323670: loss 6.6067, time 122.79ms
iter 323680: loss 6.0952, time 121.25ms
iter 323690: loss 5.7668, time 122.96ms
iter 323700: loss 5.4146, time 121.55ms
iter 323710: loss 6.8984, time 122.47ms
iter 323720: loss 6.1123, time 121.57ms
iter 323730: loss 6.3591, time 122.69ms
iter 323740: loss 5.8038, time 121.76ms
step 323750: train loss 5.6112, val loss 5.5780
saving checkpoint to out-shakespeare-char
iter 323750: loss 6.1795, time 2889.52ms
iter 323760: loss 5.8312, time 122.04ms
iter 323770: loss 6.0248, time 121.85ms
iter 323780: loss 5.5076, time 122.09ms
iter 323790: loss 6.4472, time 121.85ms
iter 323800: loss 6.5008, time 122.34ms
iter 323810: loss 6.5739, time 121.76ms
iter 323820: loss 6.0714, time 121.95ms
iter 323830: loss 6.1271, time 121.78ms
iter 323840: loss 5.6811, time 122.13ms
iter 323850: loss 6.6502, time 121.73ms
iter 323860: loss 5.7185, time 121.88ms
iter 323870: loss 6.4088, time 121.88ms
iter 323880: loss 5.2029, time 121.88ms
iter 323890: loss 5.8970, time 122.51ms
iter 323900: loss 6.6419, time 122.00ms
iter 323910: loss 5.8822, time 121.99ms
iter 323920: loss 5.6729, time 121.90ms
iter 323930: loss 5.4072, time 121.81ms
iter 323940: loss 5.7010, time 121.65ms
iter 323950: loss 5.4879, time 121.89ms
iter 323960: loss 6.1628, time 122.13ms
iter 323970: loss 6.3845, time 121.92ms
iter 323980: loss 6.4900, time 121.84ms
iter 323990: loss 6.0202, time 121.79ms
step 324000: train loss 5.5730, val loss 5.6542
saving checkpoint to out-shakespeare-char
iter 324000: loss 5.5782, time 2895.73ms
iter 324010: loss 6.3715, time 121.45ms
iter 324020: loss 6.2568, time 121.72ms
iter 324030: loss 6.0731, time 120.81ms
iter 324040: loss 6.8650, time 121.90ms
iter 324050: loss 5.9454, time 121.43ms
iter 324060: loss 6.6579, time 121.41ms
iter 324070: loss 5.5326, time 121.38ms
iter 324080: loss 5.9546, time 121.41ms
iter 324090: loss 5.4413, time 121.80ms
iter 324100: loss 6.6127, time 121.54ms
iter 324110: loss 5.7245, time 121.34ms
iter 324120: loss 6.4831, time 121.39ms
iter 324130: loss 6.5742, time 121.33ms
iter 324140: loss 5.5094, time 121.86ms
iter 324150: loss 6.1221, time 120.86ms
iter 324160: loss 6.3741, time 121.86ms
iter 324170: loss 6.1230, time 121.48ms
iter 324180: loss 6.1999, time 121.42ms
iter 324190: loss 5.7732, time 121.61ms
iter 324200: loss 6.4088, time 121.92ms
iter 324210: loss 5.3029, time 121.06ms
iter 324220: loss 5.2093, time 121.23ms
iter 324230: loss 5.7633, time 121.27ms
iter 324240: loss 5.5410, time 121.84ms
step 324250: train loss 5.6151, val loss 5.7068
saving checkpoint to out-shakespeare-char
iter 324250: loss 6.0127, time 2885.84ms
iter 324260: loss 5.8954, time 122.74ms
iter 324270: loss 5.9761, time 121.57ms
iter 324280: loss 6.8399, time 122.71ms
iter 324290: loss 6.0978, time 121.56ms
iter 324300: loss 6.4050, time 122.95ms
iter 324310: loss 6.0533, time 121.53ms
iter 324320: loss 6.8193, time 122.63ms
iter 324330: loss 6.3032, time 121.50ms
iter 324340: loss 6.9251, time 122.89ms
iter 324350: loss 5.4474, time 121.49ms
iter 324360: loss 5.7709, time 122.87ms
iter 324370: loss 5.9802, time 121.02ms
iter 324380: loss 6.3130, time 122.94ms
iter 324390: loss 5.6920, time 122.01ms
iter 324400: loss 6.4194, time 122.55ms
iter 324410: loss 6.5365, time 121.50ms
iter 324420: loss 5.9637, time 122.62ms
iter 324430: loss 5.8094, time 121.33ms
iter 324440: loss 5.8783, time 123.26ms
iter 324450: loss 6.5792, time 121.41ms
iter 324460: loss 5.3996, time 122.54ms
iter 324470: loss 5.9112, time 121.25ms
iter 324480: loss 6.3017, time 123.01ms
iter 324490: loss 6.8232, time 121.55ms
step 324500: train loss 5.6202, val loss 5.6024
saving checkpoint to out-shakespeare-char
iter 324500: loss 6.5880, time 2894.82ms
iter 324510: loss 6.3677, time 122.66ms
iter 324520: loss 5.3263, time 120.95ms
iter 324530: loss 6.0875, time 122.74ms
iter 324540: loss 6.1279, time 121.30ms
iter 324550: loss 5.4708, time 122.75ms
iter 324560: loss 6.0894, time 121.27ms
iter 324570: loss 5.7769, time 122.53ms
iter 324580: loss 6.7291, time 121.48ms
iter 324590: loss 6.6164, time 122.71ms
iter 324600: loss 6.2932, time 121.78ms
iter 324610: loss 6.3863, time 122.57ms
iter 324620: loss 5.5501, time 121.65ms
iter 324630: loss 5.8561, time 122.48ms
iter 324640: loss 5.3614, time 121.76ms
iter 324650: loss 6.1458, time 122.58ms
iter 324660: loss 6.1103, time 121.45ms
iter 324670: loss 5.7434, time 122.48ms
iter 324680: loss 6.0406, time 121.99ms
iter 324690: loss 5.8556, time 122.92ms
iter 324700: loss 5.7458, time 121.65ms
iter 324710: loss 5.9715, time 122.62ms
iter 324720: loss 6.2067, time 121.62ms
iter 324730: loss 5.6166, time 122.53ms
iter 324740: loss 5.8926, time 121.56ms
step 324750: train loss 5.6350, val loss 5.5885
saving checkpoint to out-shakespeare-char
iter 324750: loss 5.6405, time 2888.45ms
iter 324760: loss 6.0681, time 121.58ms
iter 324770: loss 6.3137, time 121.44ms
iter 324780: loss 5.6221, time 121.55ms
iter 324790: loss 5.5981, time 121.28ms
iter 324800: loss 5.7846, time 121.84ms
iter 324810: loss 5.8739, time 121.68ms
iter 324820: loss 6.7871, time 121.82ms
iter 324830: loss 5.0555, time 121.87ms
iter 324840: loss 5.8817, time 121.96ms
iter 324850: loss 6.0689, time 121.73ms
iter 324860: loss 6.2701, time 120.50ms
iter 324870: loss 6.4841, time 121.54ms
iter 324880: loss 5.6799, time 121.54ms
iter 324890: loss 6.6186, time 120.78ms
iter 324900: loss 6.5400, time 121.60ms
iter 324910: loss 5.8508, time 121.03ms
iter 324920: loss 6.6758, time 121.36ms
iter 324930: loss 5.5892, time 121.37ms
iter 324940: loss 6.3351, time 120.85ms
iter 324950: loss 6.2581, time 121.64ms
iter 324960: loss 6.0444, time 121.60ms
iter 324970: loss 5.4973, time 121.49ms
iter 324980: loss 6.2731, time 121.40ms
iter 324990: loss 5.9481, time 122.46ms
step 325000: train loss 5.6826, val loss 5.6389
saving checkpoint to out-shakespeare-char
iter 325000: loss 6.3069, time 2885.75ms
iter 325010: loss 6.0419, time 121.93ms
iter 325020: loss 5.8388, time 121.98ms
iter 325030: loss 6.3714, time 121.70ms
iter 325040: loss 5.7958, time 121.46ms
iter 325050: loss 5.9767, time 121.80ms
iter 325060: loss 5.4240, time 121.46ms
iter 325070: loss 5.6804, time 121.86ms
iter 325080: loss 7.1515, time 121.29ms
iter 325090: loss 6.2180, time 121.61ms
iter 325100: loss 5.9068, time 122.17ms
iter 325110: loss 5.7115, time 121.51ms
iter 325120: loss 5.8593, time 122.17ms
iter 325130: loss 6.0252, time 121.90ms
iter 325140: loss 6.5775, time 121.92ms
iter 325150: loss 5.6512, time 121.47ms
iter 325160: loss 6.4682, time 121.50ms
iter 325170: loss 5.7356, time 122.27ms
iter 325180: loss 6.6860, time 121.78ms
iter 325190: loss 5.9634, time 122.01ms
iter 325200: loss 5.3389, time 121.89ms
iter 325210: loss 5.1674, time 122.15ms
iter 325220: loss 6.8496, time 122.41ms
iter 325230: loss 6.4732, time 121.65ms
iter 325240: loss 5.9937, time 123.02ms
step 325250: train loss 5.6604, val loss 5.6068
saving checkpoint to out-shakespeare-char
iter 325250: loss 5.9927, time 2894.38ms
iter 325260: loss 6.3533, time 126.17ms
iter 325270: loss 5.3142, time 125.74ms
iter 325280: loss 5.6899, time 126.08ms
iter 325290: loss 4.7858, time 125.76ms
iter 325300: loss 5.6500, time 127.98ms
iter 325310: loss 5.8185, time 125.51ms
iter 325320: loss 5.8557, time 126.12ms
iter 325330: loss 6.1400, time 126.05ms
iter 325340: loss 5.6575, time 125.49ms
iter 325350: loss 5.5525, time 125.40ms
iter 325360: loss 5.3222, time 125.57ms
iter 325370: loss 5.9419, time 126.23ms
iter 325380: loss 5.4538, time 125.62ms
iter 325390: loss 5.5439, time 125.09ms
iter 325400: loss 5.5385, time 125.41ms
iter 325410: loss 6.0697, time 128.94ms
iter 325420: loss 5.8574, time 125.32ms
iter 325430: loss 5.7673, time 125.14ms
iter 325440: loss 5.8465, time 126.20ms
iter 325450: loss 5.4764, time 128.63ms
iter 325460: loss 5.4171, time 126.21ms
iter 325470: loss 5.8328, time 126.04ms
iter 325480: loss 5.6547, time 126.60ms
iter 325490: loss 5.5200, time 125.98ms
step 325500: train loss 5.6414, val loss 5.6028
saving checkpoint to out-shakespeare-char
iter 325500: loss 5.3945, time 2893.95ms
iter 325510: loss 6.0251, time 125.33ms
iter 325520: loss 6.1138, time 125.33ms
iter 325530: loss 6.0406, time 125.55ms
iter 325540: loss 6.5502, time 125.27ms
iter 325550: loss 5.2504, time 125.12ms
iter 325560: loss 5.7180, time 125.37ms
iter 325570: loss 6.8595, time 128.68ms
iter 325580: loss 5.9610, time 125.92ms
iter 325590: loss 5.4860, time 125.60ms
iter 325600: loss 6.2719, time 125.78ms
iter 325610: loss 5.9667, time 125.25ms
iter 325620: loss 5.6670, time 125.15ms
iter 325630: loss 6.2949, time 125.05ms
iter 325640: loss 5.6681, time 125.43ms
iter 325650: loss 5.7363, time 125.15ms
iter 325660: loss 5.6617, time 125.20ms
iter 325670: loss 6.5807, time 125.18ms
iter 325680: loss 5.6001, time 127.70ms
iter 325690: loss 5.9389, time 125.25ms
iter 325700: loss 5.8972, time 125.16ms
iter 325710: loss 6.0213, time 125.29ms
iter 325720: loss 6.3379, time 125.23ms
iter 325730: loss 5.2625, time 125.25ms
iter 325740: loss 6.2409, time 125.25ms
step 325750: train loss 5.6276, val loss 5.6069
saving checkpoint to out-shakespeare-char
iter 325750: loss 6.1915, time 2897.05ms
iter 325760: loss 6.4465, time 125.16ms
iter 325770: loss 5.2419, time 124.87ms
iter 325780: loss 6.1536, time 124.64ms
iter 325790: loss 6.3695, time 126.98ms
iter 325800: loss 5.8005, time 124.71ms
iter 325810: loss 6.5563, time 124.44ms
iter 325820: loss 6.1134, time 124.13ms
iter 325830: loss 6.4496, time 124.51ms
iter 325840: loss 6.1905, time 126.07ms
iter 325850: loss 6.3848, time 125.04ms
iter 325860: loss 6.5054, time 124.70ms
iter 325870: loss 6.1983, time 124.97ms
iter 325880: loss 5.7581, time 125.17ms
iter 325890: loss 6.2898, time 125.20ms
iter 325900: loss 5.7826, time 125.54ms
iter 325910: loss 5.9261, time 125.47ms
iter 325920: loss 6.1060, time 127.74ms
iter 325930: loss 6.3225, time 125.28ms
iter 325940: loss 6.1200, time 124.99ms
iter 325950: loss 5.3895, time 125.19ms
iter 325960: loss 6.1316, time 125.22ms
iter 325970: loss 5.6770, time 125.25ms
iter 325980: loss 5.6823, time 125.10ms
iter 325990: loss 5.6153, time 125.12ms
step 326000: train loss 5.6420, val loss 5.6012
saving checkpoint to out-shakespeare-char
iter 326000: loss 6.5313, time 2883.53ms
iter 326010: loss 6.0061, time 123.82ms
iter 326020: loss 6.8330, time 125.20ms
iter 326030: loss 6.1693, time 126.05ms
iter 326040: loss 5.4577, time 125.08ms
iter 326050: loss 5.6279, time 125.12ms
iter 326060: loss 6.0836, time 125.10ms
iter 326070: loss 5.6258, time 125.27ms
iter 326080: loss 5.7019, time 125.09ms
iter 326090: loss 5.8162, time 125.01ms
iter 326100: loss 5.9211, time 125.17ms
iter 326110: loss 5.7416, time 127.51ms
iter 326120: loss 6.1110, time 125.03ms
iter 326130: loss 5.8990, time 124.97ms
iter 326140: loss 6.2181, time 125.90ms
iter 326150: loss 5.9070, time 125.08ms
iter 326160: loss 6.2088, time 124.13ms
iter 326170: loss 6.1672, time 125.22ms
iter 326180: loss 5.9558, time 125.09ms
iter 326190: loss 6.4632, time 125.30ms
iter 326200: loss 5.9705, time 125.11ms
iter 326210: loss 6.5988, time 125.15ms
iter 326220: loss 5.6066, time 127.51ms
iter 326230: loss 6.7447, time 125.10ms
iter 326240: loss 6.3030, time 125.32ms
step 326250: train loss 5.5791, val loss 5.6120
saving checkpoint to out-shakespeare-char
iter 326250: loss 6.7055, time 2891.97ms
iter 326260: loss 5.8596, time 127.62ms
iter 326270: loss 6.1584, time 126.42ms
iter 326280: loss 5.9672, time 125.11ms
iter 326290: loss 5.8568, time 124.73ms
iter 326300: loss 5.7679, time 124.42ms
iter 326310: loss 6.8583, time 124.74ms
iter 326320: loss 6.1397, time 125.22ms
iter 326330: loss 5.6439, time 124.59ms
iter 326340: loss 5.9299, time 125.59ms
iter 326350: loss 6.0075, time 125.15ms
iter 326360: loss 6.1499, time 125.32ms
iter 326370: loss 6.2585, time 125.53ms
iter 326380: loss 5.5205, time 125.72ms
iter 326390: loss 6.1401, time 125.32ms
iter 326400: loss 6.2203, time 125.42ms
iter 326410: loss 6.0439, time 124.54ms
iter 326420: loss 5.7786, time 124.30ms
iter 326430: loss 5.8622, time 125.57ms
iter 326440: loss 5.8834, time 125.24ms
iter 326450: loss 6.9284, time 124.28ms
iter 326460: loss 5.9193, time 125.92ms
iter 326470: loss 5.7018, time 125.49ms
iter 326480: loss 6.1609, time 125.69ms
iter 326490: loss 6.0226, time 128.13ms
step 326500: train loss 5.6358, val loss 5.6676
saving checkpoint to out-shakespeare-char
iter 326500: loss 5.2110, time 2886.98ms
iter 326510: loss 6.4917, time 125.83ms
iter 326520: loss 5.6427, time 126.23ms
iter 326530: loss 5.8354, time 128.55ms
iter 326540: loss 5.9253, time 124.82ms
iter 326550: loss 5.8614, time 125.76ms
iter 326560: loss 5.9099, time 125.97ms
iter 326570: loss 5.5918, time 125.68ms
iter 326580: loss 5.9261, time 125.22ms
iter 326590: loss 5.4912, time 125.80ms
iter 326600: loss 5.5155, time 126.02ms
iter 326610: loss 5.3496, time 125.63ms
iter 326620: loss 5.9160, time 125.66ms
iter 326630: loss 5.7191, time 126.00ms
iter 326640: loss 5.3651, time 127.60ms
iter 326650: loss 6.0885, time 125.54ms
iter 326660: loss 6.5191, time 125.62ms
iter 326670: loss 6.2271, time 125.87ms
iter 326680: loss 5.7855, time 125.72ms
iter 326690: loss 6.2538, time 125.89ms
iter 326700: loss 6.2000, time 125.70ms
iter 326710: loss 6.4916, time 125.35ms
iter 326720: loss 7.4180, time 125.22ms
iter 326730: loss 5.8674, time 125.68ms
iter 326740: loss 5.2220, time 125.71ms
step 326750: train loss 5.6096, val loss 5.6048
saving checkpoint to out-shakespeare-char
iter 326750: loss 6.1530, time 2905.80ms
iter 326760: loss 6.1429, time 121.66ms
iter 326770: loss 5.8159, time 122.36ms
iter 326780: loss 5.9695, time 121.55ms
iter 326790: loss 6.2411, time 122.65ms
iter 326800: loss 6.3465, time 121.25ms
iter 326810: loss 5.7071, time 122.94ms
iter 326820: loss 6.3667, time 121.36ms
iter 326830: loss 5.8776, time 122.59ms
iter 326840: loss 6.1243, time 120.64ms
iter 326850: loss 6.5841, time 122.71ms
iter 326860: loss 6.0956, time 121.39ms
iter 326870: loss 6.0138, time 122.80ms
iter 326880: loss 6.7372, time 121.35ms
iter 326890: loss 6.4835, time 122.63ms
iter 326900: loss 6.0138, time 121.57ms
iter 326910: loss 5.7357, time 122.73ms
iter 326920: loss 5.8276, time 121.63ms
iter 326930: loss 6.0843, time 122.69ms
iter 326940: loss 5.0070, time 121.45ms
iter 326950: loss 5.6278, time 122.62ms
iter 326960: loss 5.6276, time 121.12ms
iter 326970: loss 6.0360, time 122.42ms
iter 326980: loss 6.3752, time 121.78ms
iter 326990: loss 6.2000, time 123.03ms
step 327000: train loss 5.6293, val loss 5.6759
saving checkpoint to out-shakespeare-char
iter 327000: loss 6.1492, time 2887.20ms
iter 327010: loss 6.0512, time 122.04ms
iter 327020: loss 5.7632, time 121.63ms
iter 327030: loss 6.3764, time 121.57ms
iter 327040: loss 6.0854, time 122.09ms
iter 327050: loss 6.4421, time 121.51ms
iter 327060: loss 5.6007, time 121.66ms
iter 327070: loss 6.2581, time 120.79ms
iter 327080: loss 4.9230, time 121.57ms
iter 327090: loss 6.2315, time 121.52ms
iter 327100: loss 6.5224, time 121.44ms
iter 327110: loss 5.3805, time 121.62ms
iter 327120: loss 6.0021, time 121.26ms
iter 327130: loss 6.7539, time 121.92ms
iter 327140: loss 5.9332, time 121.70ms
iter 327150: loss 5.9391, time 121.34ms
iter 327160: loss 5.8870, time 121.60ms
iter 327170: loss 6.3130, time 121.29ms
iter 327180: loss 5.4389, time 121.64ms
iter 327190: loss 5.9524, time 121.15ms
iter 327200: loss 5.3774, time 121.51ms
iter 327210: loss 6.0541, time 121.40ms
iter 327220: loss 5.4017, time 121.56ms
iter 327230: loss 6.0487, time 121.89ms
iter 327240: loss 6.1355, time 120.61ms
step 327250: train loss 5.6339, val loss 5.6176
saving checkpoint to out-shakespeare-char
iter 327250: loss 5.7896, time 2882.84ms
iter 327260: loss 5.6834, time 121.90ms
iter 327270: loss 5.9669, time 121.63ms
iter 327280: loss 5.5274, time 121.76ms
iter 327290: loss 6.2084, time 121.77ms
iter 327300: loss 6.3752, time 121.84ms
iter 327310: loss 5.4325, time 122.04ms
iter 327320: loss 6.5010, time 121.69ms
iter 327330: loss 6.0556, time 122.02ms
iter 327340: loss 6.1002, time 122.01ms
iter 327350: loss 6.1558, time 122.01ms
iter 327360: loss 5.8254, time 121.94ms
iter 327370: loss 5.9256, time 121.64ms
iter 327380: loss 6.7128, time 120.85ms
iter 327390: loss 5.6234, time 121.83ms
iter 327400: loss 5.4042, time 121.88ms
iter 327410: loss 5.7621, time 121.60ms
iter 327420: loss 5.7220, time 121.25ms
iter 327430: loss 5.7286, time 121.89ms
iter 327440: loss 5.8130, time 121.83ms
iter 327450: loss 6.4009, time 122.98ms
iter 327460: loss 6.3964, time 121.72ms
iter 327470: loss 5.6142, time 122.02ms
iter 327480: loss 5.9771, time 121.93ms
iter 327490: loss 5.7364, time 122.26ms
step 327500: train loss 5.6096, val loss 5.6575
saving checkpoint to out-shakespeare-char
iter 327500: loss 5.5280, time 2903.15ms
iter 327510: loss 5.5158, time 125.65ms
iter 327520: loss 5.4557, time 126.17ms
iter 327530: loss 5.2595, time 127.71ms
iter 327540: loss 6.2466, time 127.08ms
iter 327550: loss 5.9485, time 125.43ms
iter 327560: loss 5.5680, time 125.31ms
iter 327570: loss 6.5894, time 124.96ms
iter 327580: loss 6.1548, time 125.69ms
iter 327590: loss 6.1871, time 125.14ms
iter 327600: loss 5.9485, time 127.95ms
iter 327610: loss 5.0188, time 125.00ms
iter 327620: loss 6.1997, time 125.60ms
iter 327630: loss 5.7680, time 124.95ms
iter 327640: loss 6.4306, time 125.99ms
iter 327650: loss 5.7879, time 125.32ms
iter 327660: loss 6.0670, time 125.65ms
iter 327670: loss 5.9051, time 125.32ms
iter 327680: loss 6.4872, time 125.93ms
iter 327690: loss 5.7801, time 125.74ms
iter 327700: loss 5.9372, time 125.64ms
iter 327710: loss 6.2224, time 125.74ms
iter 327720: loss 6.2547, time 128.21ms
iter 327730: loss 5.3308, time 125.61ms
iter 327740: loss 6.0766, time 125.73ms
step 327750: train loss 5.6281, val loss 5.6217
saving checkpoint to out-shakespeare-char
iter 327750: loss 5.9315, time 2898.14ms
iter 327760: loss 5.9851, time 121.60ms
iter 327770: loss 6.1074, time 122.59ms
iter 327780: loss 6.1178, time 121.43ms
iter 327790: loss 6.3473, time 121.55ms
iter 327800: loss 6.6667, time 121.50ms
iter 327810: loss 6.0329, time 121.70ms
iter 327820: loss 6.3409, time 121.88ms
iter 327830: loss 5.3714, time 122.52ms
iter 327840: loss 6.1832, time 122.01ms
iter 327850: loss 6.1885, time 121.59ms
iter 327860: loss 6.3620, time 121.55ms
iter 327870: loss 6.1731, time 121.55ms
iter 327880: loss 6.2913, time 121.57ms
iter 327890: loss 5.7592, time 121.65ms
iter 327900: loss 5.9598, time 121.62ms
iter 327910: loss 6.0438, time 121.75ms
iter 327920: loss 5.7272, time 121.51ms
iter 327930: loss 5.9405, time 121.55ms
iter 327940: loss 6.5173, time 121.57ms
iter 327950: loss 5.3169, time 121.58ms
iter 327960: loss 5.8553, time 121.64ms
iter 327970: loss 6.2140, time 121.55ms
iter 327980: loss 5.4527, time 121.54ms
iter 327990: loss 6.6898, time 121.58ms
step 328000: train loss 5.6163, val loss 5.6017
saving checkpoint to out-shakespeare-char
iter 328000: loss 5.9260, time 2876.10ms
iter 328010: loss 5.7463, time 121.57ms
iter 328020: loss 6.2715, time 121.59ms
iter 328030: loss 4.9033, time 120.56ms
iter 328040: loss 5.7859, time 120.53ms
iter 328050: loss 5.4994, time 121.61ms
iter 328060: loss 6.7500, time 121.56ms
iter 328070: loss 6.3489, time 120.59ms
iter 328080: loss 5.2541, time 121.97ms
iter 328090: loss 5.7998, time 121.56ms
iter 328100: loss 6.1372, time 121.51ms
iter 328110: loss 6.0786, time 121.48ms
iter 328120: loss 6.1463, time 121.56ms
iter 328130: loss 6.1651, time 121.54ms
iter 328140: loss 5.3150, time 121.35ms
iter 328150: loss 6.3174, time 121.32ms
iter 328160: loss 6.0556, time 121.52ms
iter 328170: loss 6.4678, time 121.62ms
iter 328180: loss 5.6223, time 121.52ms
iter 328190: loss 5.1307, time 121.98ms
iter 328200: loss 5.8465, time 120.51ms
iter 328210: loss 5.7933, time 121.51ms
iter 328220: loss 6.1663, time 121.64ms
iter 328230: loss 6.0129, time 122.16ms
iter 328240: loss 5.6236, time 122.27ms
step 328250: train loss 5.5745, val loss 5.6586
saving checkpoint to out-shakespeare-char
iter 328250: loss 5.4703, time 2883.57ms
iter 328260: loss 6.1385, time 122.25ms
iter 328270: loss 5.4796, time 121.98ms
iter 328280: loss 6.1012, time 121.91ms
iter 328290: loss 5.8015, time 122.43ms
iter 328300: loss 6.0906, time 121.66ms
iter 328310: loss 5.4402, time 121.81ms
iter 328320: loss 6.0095, time 121.69ms
iter 328330: loss 6.3707, time 121.15ms
iter 328340: loss 5.8107, time 121.83ms
iter 328350: loss 6.2716, time 121.92ms
iter 328360: loss 6.5654, time 122.24ms
iter 328370: loss 6.2665, time 122.30ms
iter 328380: loss 5.1018, time 121.94ms
iter 328390: loss 5.1729, time 122.18ms
iter 328400: loss 6.2175, time 121.77ms
iter 328410: loss 6.8944, time 121.75ms
iter 328420: loss 5.6388, time 121.98ms
iter 328430: loss 6.7673, time 122.17ms
iter 328440: loss 6.3397, time 122.00ms
iter 328450: loss 6.7186, time 122.58ms
iter 328460: loss 6.3678, time 121.87ms
iter 328470: loss 6.2190, time 121.89ms
iter 328480: loss 5.5757, time 121.91ms
iter 328490: loss 6.2037, time 123.31ms
step 328500: train loss 5.6483, val loss 5.6248
saving checkpoint to out-shakespeare-char
iter 328500: loss 5.3747, time 2886.59ms
iter 328510: loss 6.8806, time 121.93ms
iter 328520: loss 6.2487, time 122.26ms
iter 328530: loss 6.1298, time 121.82ms
iter 328540: loss 6.4125, time 121.13ms
iter 328550: loss 6.3372, time 121.78ms
iter 328560: loss 6.1185, time 122.30ms
iter 328570: loss 6.1842, time 121.82ms
iter 328580: loss 5.5703, time 121.85ms
iter 328590: loss 5.5802, time 121.67ms
iter 328600: loss 6.2586, time 121.70ms
iter 328610: loss 6.2508, time 121.63ms
iter 328620: loss 5.9187, time 121.94ms
iter 328630: loss 5.1902, time 122.37ms
iter 328640: loss 7.0639, time 121.77ms
iter 328650: loss 6.2114, time 121.80ms
iter 328660: loss 6.0253, time 121.93ms
iter 328670: loss 5.9805, time 121.88ms
iter 328680: loss 5.6099, time 122.08ms
iter 328690: loss 5.8438, time 121.76ms
iter 328700: loss 6.3122, time 121.95ms
iter 328710: loss 5.7093, time 122.21ms
iter 328720: loss 6.8703, time 121.63ms
iter 328730: loss 6.0896, time 122.04ms
iter 328740: loss 5.1963, time 121.93ms
step 328750: train loss 5.6217, val loss 5.6080
saving checkpoint to out-shakespeare-char
iter 328750: loss 5.5869, time 2907.54ms
iter 328760: loss 5.7525, time 123.74ms
iter 328770: loss 6.2159, time 122.96ms
iter 328780: loss 6.4415, time 124.17ms
iter 328790: loss 6.2225, time 121.44ms
iter 328800: loss 5.5575, time 122.49ms
iter 328810: loss 6.5152, time 121.54ms
iter 328820: loss 6.9411, time 123.70ms
iter 328830: loss 5.2006, time 121.67ms
iter 328840: loss 5.7093, time 123.64ms
iter 328850: loss 6.5856, time 121.51ms
iter 328860: loss 6.1384, time 123.67ms
iter 328870: loss 5.4785, time 121.57ms
iter 328880: loss 5.4950, time 123.72ms
iter 328890: loss 6.5441, time 121.48ms
iter 328900: loss 6.5784, time 123.90ms
iter 328910: loss 5.9353, time 121.68ms
iter 328920: loss 6.0826, time 123.59ms
iter 328930: loss 5.8012, time 121.42ms
iter 328940: loss 5.7002, time 122.87ms
iter 328950: loss 5.6469, time 121.47ms
iter 328960: loss 5.9545, time 124.11ms
iter 328970: loss 5.9972, time 121.56ms
iter 328980: loss 6.1278, time 123.75ms
iter 328990: loss 5.5733, time 121.92ms
step 329000: train loss 5.5960, val loss 5.6370
saving checkpoint to out-shakespeare-char
iter 329000: loss 6.4505, time 2885.39ms
iter 329010: loss 6.2112, time 121.73ms
iter 329020: loss 6.2536, time 122.24ms
iter 329030: loss 5.7262, time 121.66ms
iter 329040: loss 6.6273, time 123.11ms
iter 329050: loss 6.2546, time 122.03ms
iter 329060: loss 4.9200, time 123.08ms
iter 329070: loss 6.2720, time 121.62ms
iter 329080: loss 5.9331, time 122.63ms
iter 329090: loss 5.0544, time 121.45ms
iter 329100: loss 5.6687, time 123.45ms
iter 329110: loss 6.1334, time 121.42ms
iter 329120: loss 6.0097, time 122.33ms
iter 329130: loss 5.7612, time 121.70ms
iter 329140: loss 6.4417, time 121.95ms
iter 329150: loss 6.8417, time 121.49ms
iter 329160: loss 5.9330, time 122.49ms
iter 329170: loss 6.6366, time 121.55ms
iter 329180: loss 5.6653, time 122.98ms
iter 329190: loss 5.4057, time 121.36ms
iter 329200: loss 5.5609, time 122.81ms
iter 329210: loss 6.4499, time 120.67ms
iter 329220: loss 6.5261, time 122.66ms
iter 329230: loss 6.2496, time 121.24ms
iter 329240: loss 6.1803, time 122.57ms
step 329250: train loss 5.6059, val loss 5.6183
saving checkpoint to out-shakespeare-char
iter 329250: loss 5.5210, time 2891.14ms
iter 329260: loss 6.3042, time 121.66ms
iter 329270: loss 6.3169, time 121.46ms
iter 329280: loss 6.3857, time 121.50ms
iter 329290: loss 5.9084, time 121.45ms
iter 329300: loss 6.2983, time 121.76ms
iter 329310: loss 6.1103, time 120.79ms
iter 329320: loss 5.1906, time 121.39ms
iter 329330: loss 5.9235, time 121.58ms
iter 329340: loss 6.1175, time 121.64ms
iter 329350: loss 5.9199, time 121.62ms
iter 329360: loss 6.5444, time 121.90ms
iter 329370: loss 6.3763, time 121.57ms
iter 329380: loss 5.7227, time 121.46ms
iter 329390: loss 5.2813, time 121.65ms
iter 329400: loss 5.3561, time 122.98ms
iter 329410: loss 5.7970, time 122.19ms
iter 329420: loss 6.5840, time 122.22ms
iter 329430: loss 6.2184, time 121.97ms
iter 329440: loss 5.5865, time 121.11ms
iter 329450: loss 6.2606, time 122.56ms
iter 329460: loss 5.5274, time 121.54ms
iter 329470: loss 5.8757, time 121.88ms
iter 329480: loss 5.6657, time 121.76ms
iter 329490: loss 6.0659, time 121.37ms
step 329500: train loss 5.6151, val loss 5.6524
saving checkpoint to out-shakespeare-char
iter 329500: loss 5.8234, time 2892.75ms
iter 329510: loss 5.7055, time 121.61ms
iter 329520: loss 5.8384, time 121.82ms
iter 329530: loss 5.6936, time 121.74ms
iter 329540: loss 5.8821, time 121.36ms
iter 329550: loss 5.7447, time 121.56ms
iter 329560: loss 5.8809, time 121.63ms
iter 329570: loss 5.2325, time 122.05ms
iter 329580: loss 6.7539, time 122.91ms
iter 329590: loss 5.8965, time 122.74ms
iter 329600: loss 6.2605, time 122.27ms
iter 329610: loss 5.8451, time 121.65ms
iter 329620: loss 6.3343, time 120.65ms
iter 329630: loss 5.9587, time 121.72ms
iter 329640: loss 5.8092, time 120.65ms
iter 329650: loss 5.8863, time 121.56ms
iter 329660: loss 5.9695, time 121.51ms
iter 329670: loss 6.7224, time 120.80ms
iter 329680: loss 6.2193, time 121.40ms
iter 329690: loss 5.7715, time 123.02ms
iter 329700: loss 5.7974, time 122.72ms
iter 329710: loss 6.6190, time 121.50ms
iter 329720: loss 5.7224, time 122.96ms
iter 329730: loss 6.1186, time 121.58ms
iter 329740: loss 5.5648, time 122.75ms
step 329750: train loss 5.5702, val loss 5.6283
saving checkpoint to out-shakespeare-char
iter 329750: loss 6.3088, time 2882.19ms
iter 329760: loss 6.1146, time 122.73ms
iter 329770: loss 6.1411, time 121.46ms
iter 329780: loss 5.8038, time 122.81ms
iter 329790: loss 5.4780, time 121.39ms
iter 329800: loss 5.9630, time 122.55ms
iter 329810: loss 6.2700, time 121.67ms
iter 329820: loss 6.1259, time 122.73ms
iter 329830: loss 5.9635, time 121.85ms
iter 329840: loss 6.2248, time 122.93ms
iter 329850: loss 6.4088, time 122.69ms
iter 329860: loss 5.5885, time 121.55ms
iter 329870: loss 6.0535, time 122.83ms
iter 329880: loss 5.3220, time 121.63ms
iter 329890: loss 6.3855, time 122.64ms
iter 329900: loss 6.1945, time 121.48ms
iter 329910: loss 6.3719, time 122.95ms
iter 329920: loss 5.6243, time 121.47ms
iter 329930: loss 6.2247, time 122.90ms
iter 329940: loss 5.9675, time 121.62ms
iter 329950: loss 6.5246, time 122.36ms
iter 329960: loss 5.7205, time 122.97ms
iter 329970: loss 6.1543, time 121.43ms
iter 329980: loss 6.0780, time 123.61ms
iter 329990: loss 5.8918, time 120.70ms
step 330000: train loss 5.5673, val loss 5.6348
saving checkpoint to out-shakespeare-char
iter 330000: loss 6.0906, time 2901.53ms
iter 330010: loss 6.0666, time 126.75ms
iter 330020: loss 5.8945, time 126.18ms
iter 330030: loss 5.6544, time 125.91ms
iter 330040: loss 6.0088, time 125.55ms
iter 330050: loss 5.7233, time 126.15ms
iter 330060: loss 7.0659, time 128.50ms
iter 330070: loss 6.1034, time 125.76ms
iter 330080: loss 6.2773, time 125.74ms
iter 330090: loss 6.7115, time 125.52ms
iter 330100: loss 6.0172, time 125.82ms
iter 330110: loss 6.7529, time 126.97ms
iter 330120: loss 5.9257, time 125.39ms
iter 330130: loss 5.6721, time 125.67ms
iter 330140: loss 6.5288, time 126.21ms
iter 330150: loss 6.3009, time 125.56ms
iter 330160: loss 6.1011, time 125.35ms
iter 330170: loss 6.1356, time 125.45ms
iter 330180: loss 6.3907, time 125.92ms
iter 330190: loss 6.3633, time 127.11ms
iter 330200: loss 5.8455, time 125.85ms
iter 330210: loss 6.2664, time 125.37ms
iter 330220: loss 6.0935, time 126.17ms
iter 330230: loss 6.6894, time 126.77ms
iter 330240: loss 5.7671, time 124.89ms
step 330250: train loss 5.6235, val loss 5.6016
saving checkpoint to out-shakespeare-char
iter 330250: loss 6.3555, time 2890.46ms
iter 330260: loss 6.0369, time 125.33ms
iter 330270: loss 5.9490, time 125.39ms
iter 330280: loss 5.8468, time 127.49ms
iter 330290: loss 5.3645, time 124.18ms
iter 330300: loss 6.2241, time 125.61ms
iter 330310: loss 6.1318, time 125.06ms
iter 330320: loss 5.8130, time 125.21ms
iter 330330: loss 6.0080, time 125.65ms
iter 330340: loss 6.3392, time 126.24ms
iter 330350: loss 5.9319, time 126.68ms
iter 330360: loss 5.3980, time 125.39ms
iter 330370: loss 6.0396, time 125.70ms
iter 330380: loss 5.8186, time 125.98ms
iter 330390: loss 6.0059, time 128.23ms
iter 330400: loss 6.1858, time 125.06ms
iter 330410: loss 6.1369, time 125.81ms
iter 330420: loss 5.5957, time 125.83ms
iter 330430: loss 5.6931, time 124.93ms
iter 330440: loss 6.6179, time 125.66ms
iter 330450: loss 5.6997, time 127.24ms
iter 330460: loss 6.0994, time 125.78ms
iter 330470: loss 6.2739, time 128.31ms
iter 330480: loss 6.2580, time 125.64ms
iter 330490: loss 5.1148, time 125.89ms
step 330500: train loss 5.5658, val loss 5.5958
saving checkpoint to out-shakespeare-char
iter 330500: loss 6.1631, time 2901.38ms
iter 330510: loss 6.1800, time 125.25ms
iter 330520: loss 6.1303, time 125.19ms
iter 330530: loss 5.6216, time 124.76ms
iter 330540: loss 6.0063, time 125.36ms
iter 330550: loss 6.5970, time 125.26ms
iter 330560: loss 5.9510, time 124.07ms
iter 330570: loss 6.6314, time 124.16ms
iter 330580: loss 5.5895, time 125.48ms
iter 330590: loss 5.6788, time 125.15ms
iter 330600: loss 6.0274, time 124.51ms
iter 330610: loss 5.9141, time 127.56ms
iter 330620: loss 6.1250, time 125.26ms
iter 330630: loss 5.7475, time 125.53ms
iter 330640: loss 6.4241, time 124.99ms
iter 330650: loss 5.1492, time 125.56ms
iter 330660: loss 5.8363, time 125.75ms
iter 330670: loss 6.5624, time 125.68ms
iter 330680: loss 6.1423, time 125.83ms
iter 330690: loss 6.3325, time 127.97ms
iter 330700: loss 5.5966, time 124.16ms
iter 330710: loss 6.5441, time 125.67ms
iter 330720: loss 5.8042, time 125.85ms
iter 330730: loss 6.5078, time 124.94ms
iter 330740: loss 6.5704, time 121.79ms
step 330750: train loss 5.5915, val loss 5.6284
saving checkpoint to out-shakespeare-char
iter 330750: loss 5.7027, time 2894.75ms
iter 330760: loss 5.7472, time 121.85ms
iter 330770: loss 5.7992, time 122.20ms
iter 330780: loss 4.8923, time 121.53ms
iter 330790: loss 5.9300, time 122.74ms
iter 330800: loss 5.9846, time 121.42ms
iter 330810: loss 5.9894, time 122.72ms
iter 330820: loss 7.0994, time 120.91ms
iter 330830: loss 5.5377, time 123.32ms
iter 330840: loss 5.6788, time 121.70ms
iter 330850: loss 6.4601, time 122.92ms
iter 330860: loss 5.6635, time 122.61ms
iter 330870: loss 6.3118, time 121.64ms
iter 330880: loss 6.4146, time 123.22ms
iter 330890: loss 5.8833, time 121.62ms
iter 330900: loss 6.2272, time 122.60ms
iter 330910: loss 6.7224, time 121.74ms
iter 330920: loss 6.1998, time 122.83ms
iter 330930: loss 6.4030, time 122.26ms
iter 330940: loss 5.5816, time 122.74ms
iter 330950: loss 5.9714, time 121.63ms
iter 330960: loss 6.4709, time 122.77ms
iter 330970: loss 6.2441, time 122.69ms
iter 330980: loss 6.0534, time 121.62ms
iter 330990: loss 6.4068, time 122.73ms
step 331000: train loss 5.6609, val loss 5.6064
saving checkpoint to out-shakespeare-char
iter 331000: loss 6.5194, time 2901.18ms
iter 331010: loss 5.9330, time 126.85ms
iter 331020: loss 6.4597, time 125.51ms
iter 331030: loss 5.9962, time 125.38ms
iter 331040: loss 6.1398, time 124.86ms
iter 331050: loss 5.7844, time 124.78ms
iter 331060: loss 6.5221, time 124.96ms
iter 331070: loss 5.6953, time 125.46ms
iter 331080: loss 5.8060, time 127.59ms
iter 331090: loss 6.2413, time 125.40ms
iter 331100: loss 6.2849, time 125.32ms
iter 331110: loss 5.6313, time 125.35ms
iter 331120: loss 5.7969, time 124.87ms
iter 331130: loss 6.2688, time 124.44ms
iter 331140: loss 6.4945, time 125.31ms
iter 331150: loss 6.3828, time 125.26ms
iter 331160: loss 5.9758, time 125.84ms
iter 331170: loss 5.9205, time 125.39ms
iter 331180: loss 5.4330, time 125.33ms
iter 331190: loss 6.3027, time 125.61ms
iter 331200: loss 5.9373, time 128.14ms
iter 331210: loss 5.5584, time 128.05ms
iter 331220: loss 5.9743, time 124.70ms
iter 331230: loss 5.7926, time 123.90ms
iter 331240: loss 6.4755, time 126.76ms
step 331250: train loss 5.6385, val loss 5.6220
saving checkpoint to out-shakespeare-char
iter 331250: loss 6.7028, time 2896.88ms
iter 331260: loss 6.1127, time 124.64ms
iter 331270: loss 6.2069, time 124.24ms
iter 331280: loss 6.3110, time 124.66ms
iter 331290: loss 6.1839, time 124.30ms
iter 331300: loss 5.8648, time 124.40ms
iter 331310: loss 5.8333, time 123.13ms
iter 331320: loss 6.3409, time 124.24ms
iter 331330: loss 6.2162, time 124.46ms
iter 331340: loss 6.4710, time 124.46ms
iter 331350: loss 5.8574, time 124.56ms
iter 331360: loss 5.6243, time 124.91ms
iter 331370: loss 5.8278, time 125.21ms
iter 331380: loss 5.8641, time 127.52ms
iter 331390: loss 5.2510, time 125.04ms
iter 331400: loss 6.2649, time 124.33ms
iter 331410: loss 5.5981, time 125.00ms
iter 331420: loss 5.8024, time 125.26ms
iter 331430: loss 5.6560, time 125.46ms
iter 331440: loss 6.0178, time 125.28ms
iter 331450: loss 6.0211, time 125.20ms
iter 331460: loss 5.7159, time 124.51ms
iter 331470: loss 5.8152, time 124.83ms
iter 331480: loss 6.4775, time 127.37ms
iter 331490: loss 6.3542, time 125.30ms
step 331500: train loss 5.5774, val loss 5.6384
saving checkpoint to out-shakespeare-char
iter 331500: loss 5.7586, time 2878.90ms
iter 331510: loss 6.4220, time 126.00ms
iter 331520: loss 5.3838, time 128.08ms
iter 331530: loss 5.1358, time 125.62ms
iter 331540: loss 6.7076, time 126.69ms
iter 331550: loss 5.0401, time 125.82ms
iter 331560: loss 5.4067, time 125.65ms
iter 331570: loss 6.2687, time 125.80ms
iter 331580: loss 6.0538, time 125.58ms
iter 331590: loss 5.4847, time 125.61ms
iter 331600: loss 6.2241, time 125.62ms
iter 331610: loss 5.6764, time 126.00ms
iter 331620: loss 6.2008, time 126.42ms
iter 331630: loss 6.3400, time 128.01ms
iter 331640: loss 5.9383, time 126.30ms
iter 331650: loss 6.5409, time 125.51ms
iter 331660: loss 6.3537, time 125.63ms
iter 331670: loss 5.7506, time 125.50ms
iter 331680: loss 5.6985, time 125.54ms
iter 331690: loss 6.1181, time 125.58ms
iter 331700: loss 6.1815, time 125.65ms
iter 331710: loss 5.7487, time 125.81ms
iter 331720: loss 6.3314, time 125.38ms
iter 331730: loss 5.8245, time 125.78ms
iter 331740: loss 5.6253, time 128.02ms
step 331750: train loss 5.6128, val loss 5.6118
saving checkpoint to out-shakespeare-char
iter 331750: loss 6.0551, time 2892.53ms
iter 331760: loss 5.8114, time 122.44ms
iter 331770: loss 6.4614, time 121.06ms
iter 331780: loss 6.6501, time 121.22ms
iter 331790: loss 5.7443, time 121.84ms
iter 331800: loss 5.7174, time 121.86ms
iter 331810: loss 5.4617, time 122.24ms
iter 331820: loss 6.3148, time 121.38ms
iter 331830: loss 5.4153, time 121.68ms
iter 331840: loss 5.9134, time 121.61ms
iter 331850: loss 5.9029, time 120.73ms
iter 331860: loss 5.6552, time 121.60ms
iter 331870: loss 6.1459, time 121.80ms
iter 331880: loss 6.4886, time 121.68ms
iter 331890: loss 5.7597, time 121.94ms
iter 331900: loss 5.8086, time 121.47ms
iter 331910: loss 6.2183, time 121.76ms
iter 331920: loss 5.7253, time 121.81ms
iter 331930: loss 6.2084, time 121.04ms
iter 331940: loss 5.4495, time 121.68ms
iter 331950: loss 5.9834, time 121.64ms
iter 331960: loss 6.3663, time 121.28ms
iter 331970: loss 5.7858, time 121.67ms
iter 331980: loss 5.3622, time 121.55ms
iter 331990: loss 5.8254, time 122.01ms
step 332000: train loss 5.6445, val loss 5.6085
saving checkpoint to out-shakespeare-char
iter 332000: loss 5.7762, time 2878.50ms
iter 332010: loss 6.2144, time 123.87ms
iter 332020: loss 5.7064, time 121.46ms
iter 332030: loss 5.3791, time 123.92ms
iter 332040: loss 5.8389, time 121.70ms
iter 332050: loss 5.8148, time 123.96ms
iter 332060: loss 5.2370, time 122.02ms
iter 332070: loss 5.5635, time 124.03ms
iter 332080: loss 6.3911, time 121.78ms
iter 332090: loss 5.4738, time 123.81ms
iter 332100: loss 6.0748, time 121.31ms
iter 332110: loss 6.0976, time 123.81ms
iter 332120: loss 6.1036, time 121.00ms
iter 332130: loss 5.8232, time 123.74ms
iter 332140: loss 6.2237, time 121.63ms
iter 332150: loss 6.5796, time 123.84ms
iter 332160: loss 6.1276, time 121.81ms
iter 332170: loss 5.4661, time 123.86ms
iter 332180: loss 6.6085, time 121.64ms
iter 332190: loss 4.7232, time 123.88ms
iter 332200: loss 5.7724, time 121.36ms
iter 332210: loss 5.7189, time 123.90ms
iter 332220: loss 5.7494, time 122.43ms
iter 332230: loss 6.1343, time 124.26ms
iter 332240: loss 6.3779, time 121.57ms
step 332250: train loss 5.6347, val loss 5.6074
saving checkpoint to out-shakespeare-char
iter 332250: loss 4.8483, time 2887.84ms
iter 332260: loss 6.0112, time 124.00ms
iter 332270: loss 6.0767, time 121.84ms
iter 332280: loss 6.6610, time 123.97ms
iter 332290: loss 6.2174, time 121.76ms
iter 332300: loss 5.7393, time 123.76ms
iter 332310: loss 5.9815, time 121.54ms
iter 332320: loss 6.4454, time 123.82ms
iter 332330: loss 5.7268, time 121.57ms
iter 332340: loss 5.3371, time 123.22ms
iter 332350: loss 6.8195, time 121.56ms
iter 332360: loss 5.0807, time 123.82ms
iter 332370: loss 6.1626, time 121.60ms
iter 332380: loss 6.4480, time 123.74ms
iter 332390: loss 6.2729, time 121.30ms
iter 332400: loss 5.5986, time 123.93ms
iter 332410: loss 5.2337, time 121.88ms
iter 332420: loss 6.2300, time 123.55ms
iter 332430: loss 6.0200, time 121.78ms
iter 332440: loss 6.4093, time 123.99ms
iter 332450: loss 6.2827, time 121.60ms
iter 332460: loss 6.7311, time 123.85ms
iter 332470: loss 5.7121, time 121.54ms
iter 332480: loss 6.1617, time 123.62ms
iter 332490: loss 6.1180, time 121.64ms
step 332500: train loss 5.6129, val loss 5.6720
saving checkpoint to out-shakespeare-char
iter 332500: loss 6.1334, time 2887.85ms
iter 332510: loss 6.1262, time 121.96ms
iter 332520: loss 6.3776, time 122.17ms
iter 332530: loss 6.6614, time 121.76ms
iter 332540: loss 5.7347, time 121.85ms
iter 332550: loss 5.0372, time 121.17ms
iter 332560: loss 6.6313, time 122.55ms
iter 332570: loss 6.0749, time 121.64ms
iter 332580: loss 5.6536, time 121.91ms
iter 332590: loss 7.2344, time 121.83ms
iter 332600: loss 5.6727, time 122.00ms
iter 332610: loss 6.1831, time 121.67ms
iter 332620: loss 6.0706, time 122.78ms
iter 332630: loss 5.8345, time 121.87ms
iter 332640: loss 5.4805, time 121.90ms
iter 332650: loss 5.7831, time 121.67ms
iter 332660: loss 5.5635, time 122.70ms
iter 332670: loss 6.1978, time 121.87ms
iter 332680: loss 6.6227, time 123.98ms
iter 332690: loss 5.9661, time 121.88ms
iter 332700: loss 6.1113, time 122.19ms
iter 332710: loss 6.6780, time 121.72ms
iter 332720: loss 5.7991, time 122.08ms
iter 332730: loss 5.7133, time 121.79ms
iter 332740: loss 5.9919, time 122.16ms
step 332750: train loss 5.5973, val loss 5.6281
saving checkpoint to out-shakespeare-char
iter 332750: loss 7.0016, time 2878.00ms
iter 332760: loss 6.2821, time 121.50ms
iter 332770: loss 6.4161, time 121.70ms
iter 332780: loss 6.1699, time 121.80ms
iter 332790: loss 5.6223, time 121.53ms
iter 332800: loss 5.9320, time 121.53ms
iter 332810: loss 6.0546, time 121.62ms
iter 332820: loss 5.9264, time 122.12ms
iter 332830: loss 6.2162, time 121.55ms
iter 332840: loss 5.9372, time 121.61ms
iter 332850: loss 5.9069, time 121.57ms
iter 332860: loss 5.7270, time 121.54ms
iter 332870: loss 6.2837, time 121.58ms
iter 332880: loss 5.9164, time 121.74ms
iter 332890: loss 5.6418, time 121.61ms
iter 332900: loss 6.3412, time 121.69ms
iter 332910: loss 5.8694, time 121.47ms
iter 332920: loss 5.9075, time 121.78ms
iter 332930: loss 5.8576, time 121.44ms
iter 332940: loss 6.3976, time 121.53ms
iter 332950: loss 6.1665, time 121.35ms
iter 332960: loss 5.1973, time 121.46ms
iter 332970: loss 6.1915, time 121.48ms
iter 332980: loss 6.1755, time 121.37ms
iter 332990: loss 6.7786, time 121.43ms
step 333000: train loss 5.6353, val loss 5.6056
saving checkpoint to out-shakespeare-char
iter 333000: loss 5.6485, time 2867.77ms
iter 333010: loss 5.6452, time 124.62ms
iter 333020: loss 5.6522, time 123.91ms
iter 333030: loss 5.2238, time 124.34ms
iter 333040: loss 6.4652, time 124.36ms
iter 333050: loss 5.7978, time 124.47ms
iter 333060: loss 5.5166, time 124.62ms
iter 333070: loss 6.0886, time 124.67ms
iter 333080: loss 6.8953, time 125.18ms
iter 333090: loss 5.7361, time 127.15ms
iter 333100: loss 5.5841, time 126.63ms
iter 333110: loss 6.0076, time 125.89ms
iter 333120: loss 6.5360, time 125.27ms
iter 333130: loss 6.4580, time 124.26ms
iter 333140: loss 5.9569, time 124.41ms
iter 333150: loss 6.3130, time 124.95ms
iter 333160: loss 5.5464, time 125.63ms
iter 333170: loss 6.6466, time 125.09ms
iter 333180: loss 5.6594, time 125.10ms
iter 333190: loss 5.8596, time 124.94ms
iter 333200: loss 5.7213, time 128.36ms
iter 333210: loss 5.3304, time 125.30ms
iter 333220: loss 6.3802, time 125.10ms
iter 333230: loss 6.1186, time 124.71ms
iter 333240: loss 5.9773, time 125.24ms
step 333250: train loss 5.6663, val loss 5.6595
saving checkpoint to out-shakespeare-char
iter 333250: loss 5.4742, time 2884.35ms
iter 333260: loss 5.8118, time 125.08ms
iter 333270: loss 6.2619, time 125.07ms
iter 333280: loss 6.5501, time 126.59ms
iter 333290: loss 5.7721, time 125.32ms
iter 333300: loss 5.9315, time 124.93ms
iter 333310: loss 6.6820, time 124.12ms
iter 333320: loss 5.3865, time 123.92ms
iter 333330: loss 5.7105, time 125.23ms
iter 333340: loss 5.6864, time 125.44ms
iter 333350: loss 5.7001, time 127.77ms
iter 333360: loss 6.3644, time 124.54ms
iter 333370: loss 6.2662, time 125.18ms
iter 333380: loss 6.3239, time 125.15ms
iter 333390: loss 6.0720, time 127.79ms
iter 333400: loss 6.1225, time 124.16ms
iter 333410: loss 6.1271, time 125.43ms
iter 333420: loss 5.9427, time 124.47ms
iter 333430: loss 5.0357, time 125.51ms
iter 333440: loss 6.2120, time 125.47ms
iter 333450: loss 5.0545, time 124.81ms
iter 333460: loss 5.9938, time 125.47ms
iter 333470: loss 5.8884, time 125.13ms
iter 333480: loss 6.1523, time 124.83ms
iter 333490: loss 5.9393, time 125.02ms
step 333500: train loss 5.5869, val loss 5.6174
saving checkpoint to out-shakespeare-char
iter 333500: loss 6.0933, time 2872.53ms
iter 333510: loss 5.6206, time 125.53ms
iter 333520: loss 5.2437, time 124.67ms
iter 333530: loss 6.1853, time 125.90ms
iter 333540: loss 6.4450, time 124.61ms
iter 333550: loss 5.8676, time 125.50ms
iter 333560: loss 6.5830, time 126.62ms
iter 333570: loss 6.8237, time 124.57ms
iter 333580: loss 6.0843, time 124.41ms
iter 333590: loss 5.7892, time 125.53ms
iter 333600: loss 5.9644, time 123.68ms
iter 333610: loss 5.9652, time 124.79ms
iter 333620: loss 6.1110, time 125.15ms
iter 333630: loss 6.2301, time 125.02ms
iter 333640: loss 5.5610, time 125.34ms
iter 333650: loss 6.3111, time 125.58ms
iter 333660: loss 5.8602, time 126.70ms
iter 333670: loss 6.7074, time 125.34ms
iter 333680: loss 5.9662, time 125.49ms
iter 333690: loss 5.5122, time 125.43ms
iter 333700: loss 5.9784, time 125.12ms
iter 333710: loss 5.9559, time 125.45ms
iter 333720: loss 5.9329, time 125.14ms
iter 333730: loss 5.3698, time 125.05ms
iter 333740: loss 6.0209, time 125.60ms
step 333750: train loss 5.6271, val loss 5.6507
saving checkpoint to out-shakespeare-char
iter 333750: loss 5.9352, time 2904.21ms
iter 333760: loss 5.5458, time 124.92ms
iter 333770: loss 6.0478, time 125.34ms
iter 333780: loss 6.2402, time 125.27ms
iter 333790: loss 6.1377, time 125.72ms
iter 333800: loss 4.8055, time 123.92ms
iter 333810: loss 5.5273, time 125.64ms
iter 333820: loss 5.9624, time 126.94ms
iter 333830: loss 5.9870, time 125.36ms
iter 333840: loss 6.0510, time 124.07ms
iter 333850: loss 6.1657, time 125.56ms
iter 333860: loss 5.7603, time 124.55ms
iter 333870: loss 6.0564, time 125.61ms
iter 333880: loss 6.3684, time 124.08ms
iter 333890: loss 6.0409, time 124.71ms
iter 333900: loss 6.6594, time 124.61ms
iter 333910: loss 6.8520, time 125.19ms
iter 333920: loss 6.4751, time 125.14ms
iter 333930: loss 5.9389, time 124.49ms
iter 333940: loss 5.8072, time 124.94ms
iter 333950: loss 5.6059, time 125.50ms
iter 333960: loss 6.6231, time 125.30ms
iter 333970: loss 6.2933, time 125.48ms
iter 333980: loss 5.2024, time 124.32ms
iter 333990: loss 5.6052, time 124.77ms
step 334000: train loss 5.6283, val loss 5.5988
saving checkpoint to out-shakespeare-char
iter 334000: loss 5.8776, time 2883.78ms
iter 334010: loss 6.5223, time 124.07ms
iter 334020: loss 5.2006, time 125.30ms
iter 334030: loss 6.7754, time 125.42ms
iter 334040: loss 5.7310, time 125.93ms
iter 334050: loss 5.8278, time 125.89ms
iter 334060: loss 6.0921, time 126.20ms
iter 334070: loss 5.8961, time 125.98ms
iter 334080: loss 5.8497, time 126.81ms
iter 334090: loss 6.0174, time 126.12ms
iter 334100: loss 5.4442, time 125.82ms
iter 334110: loss 5.2936, time 128.09ms
iter 334120: loss 5.6140, time 125.83ms
iter 334130: loss 6.3143, time 126.12ms
iter 334140: loss 5.4419, time 125.75ms
iter 334150: loss 6.0276, time 125.22ms
iter 334160: loss 5.7467, time 125.50ms
iter 334170: loss 6.5487, time 125.71ms
iter 334180: loss 5.1997, time 125.56ms
iter 334190: loss 6.1678, time 125.45ms
iter 334200: loss 5.4106, time 125.74ms
iter 334210: loss 4.9224, time 126.39ms
iter 334220: loss 6.0463, time 124.79ms
iter 334230: loss 5.7214, time 126.50ms
iter 334240: loss 5.1794, time 120.54ms
step 334250: train loss 5.6455, val loss 5.6521
saving checkpoint to out-shakespeare-char
iter 334250: loss 5.8482, time 2873.19ms
iter 334260: loss 5.4430, time 121.72ms
iter 334270: loss 6.5424, time 122.73ms
iter 334280: loss 5.6816, time 121.49ms
iter 334290: loss 6.4945, time 122.68ms
iter 334300: loss 5.6210, time 121.54ms
iter 334310: loss 6.1603, time 122.60ms
iter 334320: loss 5.9894, time 120.96ms
iter 334330: loss 5.9225, time 122.91ms
iter 334340: loss 5.7290, time 120.94ms
iter 334350: loss 6.5754, time 122.68ms
iter 334360: loss 5.8814, time 121.49ms
iter 334370: loss 6.1466, time 122.91ms
iter 334380: loss 5.4436, time 121.44ms
iter 334390: loss 5.7098, time 122.59ms
iter 334400: loss 5.7687, time 121.51ms
iter 334410: loss 5.6086, time 122.98ms
iter 334420: loss 5.6719, time 120.70ms
iter 334430: loss 6.6764, time 122.48ms
iter 334440: loss 5.1293, time 121.62ms
iter 334450: loss 6.0889, time 122.76ms
iter 334460: loss 5.2774, time 121.68ms
iter 334470: loss 5.5833, time 122.90ms
iter 334480: loss 6.3893, time 122.02ms
iter 334490: loss 6.2836, time 122.80ms
step 334500: train loss 5.5970, val loss 5.6647
saving checkpoint to out-shakespeare-char
iter 334500: loss 5.7591, time 2900.65ms
iter 334510: loss 5.8503, time 125.74ms
iter 334520: loss 5.7403, time 125.90ms
iter 334530: loss 6.2169, time 126.84ms
iter 334540: loss 6.0519, time 125.48ms
iter 334550: loss 6.1096, time 123.81ms
iter 334560: loss 6.0103, time 125.47ms
iter 334570: loss 5.6500, time 125.88ms
iter 334580: loss 5.1733, time 124.63ms
iter 334590: loss 5.8541, time 125.57ms
iter 334600: loss 5.3676, time 125.40ms
iter 334610: loss 5.9362, time 127.57ms
iter 334620: loss 6.0398, time 125.57ms
iter 334630: loss 5.6425, time 125.16ms
iter 334640: loss 5.4716, time 124.26ms
iter 334650: loss 5.8341, time 125.39ms
iter 334660: loss 6.7519, time 124.24ms
iter 334670: loss 5.5405, time 125.58ms
iter 334680: loss 6.3019, time 125.98ms
iter 334690: loss 5.6649, time 125.88ms
iter 334700: loss 6.1966, time 125.74ms
iter 334710: loss 5.8158, time 125.66ms
iter 334720: loss 5.5877, time 128.48ms
iter 334730: loss 6.1338, time 125.56ms
iter 334740: loss 5.9157, time 125.97ms
step 334750: train loss 5.5970, val loss 5.6240
saving checkpoint to out-shakespeare-char
iter 334750: loss 5.3409, time 2896.16ms
iter 334760: loss 6.0325, time 125.74ms
iter 334770: loss 5.6627, time 125.64ms
iter 334780: loss 5.7428, time 125.95ms
iter 334790: loss 5.9214, time 125.69ms
iter 334800: loss 5.4359, time 126.13ms
iter 334810: loss 6.8295, time 125.82ms
iter 334820: loss 6.2087, time 125.79ms
iter 334830: loss 6.0310, time 125.46ms
iter 334840: loss 6.8287, time 126.21ms
iter 334850: loss 6.5024, time 126.07ms
iter 334860: loss 6.2383, time 121.47ms
iter 334870: loss 6.2002, time 121.07ms
iter 334880: loss 5.7283, time 121.97ms
iter 334890: loss 6.0830, time 121.94ms
iter 334900: loss 5.7422, time 121.90ms
iter 334910: loss 5.0461, time 121.25ms
iter 334920: loss 6.3275, time 121.03ms
iter 334930: loss 5.4282, time 121.53ms
iter 334940: loss 5.1678, time 121.39ms
iter 334950: loss 6.4077, time 121.77ms
iter 334960: loss 5.5761, time 121.51ms
iter 334970: loss 5.6209, time 121.50ms
iter 334980: loss 5.9503, time 121.40ms
iter 334990: loss 5.8357, time 121.39ms
step 335000: train loss 5.6297, val loss 5.6567
saving checkpoint to out-shakespeare-char
iter 335000: loss 6.4805, time 2882.64ms
iter 335010: loss 6.4828, time 121.84ms
iter 335020: loss 6.1680, time 123.68ms
iter 335030: loss 5.4235, time 121.40ms
iter 335040: loss 7.3194, time 121.71ms
iter 335050: loss 5.8228, time 121.99ms
iter 335060: loss 6.0214, time 121.55ms
iter 335070: loss 6.5047, time 121.24ms
iter 335080: loss 6.0424, time 121.03ms
iter 335090: loss 5.9741, time 121.49ms
iter 335100: loss 6.1080, time 121.86ms
iter 335110: loss 5.8861, time 121.26ms
iter 335120: loss 6.4195, time 121.48ms
iter 335130: loss 5.8255, time 121.92ms
iter 335140: loss 5.9456, time 121.53ms
iter 335150: loss 6.4188, time 121.91ms
iter 335160: loss 5.7543, time 121.26ms
iter 335170: loss 5.9981, time 121.52ms
iter 335180: loss 5.8847, time 121.50ms
iter 335190: loss 5.7694, time 121.38ms
iter 335200: loss 6.6039, time 121.66ms
iter 335210: loss 5.4610, time 121.12ms
iter 335220: loss 5.8414, time 121.19ms
iter 335230: loss 6.2304, time 121.29ms
iter 335240: loss 6.8391, time 121.55ms
step 335250: train loss 5.6323, val loss 5.6572
saving checkpoint to out-shakespeare-char
iter 335250: loss 6.6456, time 2887.36ms
iter 335260: loss 6.2711, time 122.86ms
iter 335270: loss 6.6552, time 121.46ms
iter 335280: loss 6.0072, time 122.56ms
iter 335290: loss 5.3663, time 121.00ms
iter 335300: loss 5.8372, time 123.00ms
iter 335310: loss 5.6802, time 121.86ms
iter 335320: loss 6.3903, time 122.71ms
iter 335330: loss 6.1306, time 121.52ms
iter 335340: loss 5.3996, time 122.00ms
iter 335350: loss 5.9798, time 121.43ms
iter 335360: loss 5.0660, time 122.61ms
iter 335370: loss 6.4755, time 121.50ms
iter 335380: loss 6.5615, time 122.53ms
iter 335390: loss 5.3483, time 121.45ms
iter 335400: loss 5.9728, time 122.83ms
iter 335410: loss 5.9577, time 121.63ms
iter 335420: loss 6.0550, time 122.64ms
iter 335430: loss 5.9826, time 121.37ms
iter 335440: loss 5.8381, time 122.46ms
iter 335450: loss 6.8004, time 121.95ms
iter 335460: loss 6.1280, time 122.58ms
iter 335470: loss 5.6606, time 121.62ms
iter 335480: loss 5.5426, time 122.63ms
iter 335490: loss 6.1311, time 121.49ms
step 335500: train loss 5.5612, val loss 5.6289
saving checkpoint to out-shakespeare-char
iter 335500: loss 5.7569, time 2900.54ms
iter 335510: loss 5.8420, time 127.57ms
iter 335520: loss 6.1117, time 125.43ms
iter 335530: loss 6.1401, time 124.88ms
iter 335540: loss 6.7672, time 125.08ms
iter 335550: loss 6.3160, time 129.36ms
iter 335560: loss 5.6574, time 125.51ms
iter 335570: loss 5.6422, time 125.93ms
iter 335580: loss 6.0261, time 125.14ms
iter 335590: loss 6.3411, time 126.28ms
iter 335600: loss 6.3190, time 125.88ms
iter 335610: loss 5.8952, time 125.75ms
iter 335620: loss 6.0700, time 127.68ms
iter 335630: loss 5.1191, time 125.77ms
iter 335640: loss 5.7191, time 125.49ms
iter 335650: loss 6.2806, time 126.16ms
iter 335660: loss 6.5744, time 125.63ms
iter 335670: loss 5.8639, time 125.67ms
iter 335680: loss 6.0892, time 125.78ms
iter 335690: loss 5.6580, time 124.83ms
iter 335700: loss 6.2771, time 125.64ms
iter 335710: loss 6.2584, time 125.76ms
iter 335720: loss 6.2342, time 125.99ms
iter 335730: loss 5.1657, time 128.46ms
iter 335740: loss 5.4465, time 125.58ms
step 335750: train loss 5.6138, val loss 5.6384
saving checkpoint to out-shakespeare-char
iter 335750: loss 6.1675, time 2898.34ms
iter 335760: loss 5.0412, time 125.71ms
iter 335770: loss 5.7837, time 124.29ms
iter 335780: loss 6.0622, time 125.44ms
iter 335790: loss 5.8366, time 124.94ms
iter 335800: loss 5.4899, time 125.49ms
iter 335810: loss 5.9141, time 125.00ms
iter 335820: loss 6.1499, time 125.01ms
iter 335830: loss 6.3498, time 125.26ms
iter 335840: loss 7.2093, time 127.61ms
iter 335850: loss 6.5306, time 125.08ms
iter 335860: loss 5.8638, time 125.23ms
iter 335870: loss 6.0636, time 126.21ms
iter 335880: loss 6.3208, time 127.36ms
iter 335890: loss 5.9191, time 125.02ms
iter 335900: loss 6.3484, time 124.29ms
iter 335910: loss 5.8172, time 125.18ms
iter 335920: loss 6.6401, time 124.89ms
iter 335930: loss 5.4333, time 124.80ms
iter 335940: loss 6.3676, time 125.08ms
iter 335950: loss 6.0998, time 125.48ms
iter 335960: loss 5.9492, time 124.95ms
iter 335970: loss 5.9415, time 124.89ms
iter 335980: loss 5.8581, time 124.85ms
iter 335990: loss 6.7559, time 127.79ms
step 336000: train loss 5.6709, val loss 5.6101
saving checkpoint to out-shakespeare-char
iter 336000: loss 5.8143, time 2901.77ms
iter 336010: loss 6.1199, time 124.56ms
iter 336020: loss 5.5884, time 124.34ms
iter 336030: loss 6.1875, time 124.85ms
iter 336040: loss 5.9432, time 125.18ms
iter 336050: loss 6.3797, time 125.06ms
iter 336060: loss 5.4391, time 125.57ms
iter 336070: loss 6.0003, time 125.24ms
iter 336080: loss 5.9554, time 124.87ms
iter 336090: loss 5.6495, time 125.05ms
iter 336100: loss 5.4373, time 125.39ms
iter 336110: loss 6.3209, time 125.05ms
iter 336120: loss 5.5080, time 125.14ms
iter 336130: loss 6.1341, time 124.91ms
iter 336140: loss 5.2402, time 125.23ms
iter 336150: loss 5.9915, time 125.50ms
iter 336160: loss 5.4778, time 125.02ms
iter 336170: loss 5.4022, time 124.93ms
iter 336180: loss 6.1228, time 125.02ms
iter 336190: loss 5.5865, time 125.22ms
iter 336200: loss 6.2968, time 127.65ms
iter 336210: loss 6.1073, time 124.98ms
iter 336220: loss 5.8517, time 125.74ms
iter 336230: loss 5.8933, time 123.94ms
iter 336240: loss 5.5915, time 125.13ms
step 336250: train loss 5.6011, val loss 5.6421
saving checkpoint to out-shakespeare-char
iter 336250: loss 5.8817, time 2911.40ms
iter 336260: loss 5.6730, time 125.18ms
iter 336270: loss 6.3925, time 125.22ms
iter 336280: loss 6.2399, time 124.97ms
iter 336290: loss 6.3250, time 125.47ms
iter 336300: loss 5.9772, time 125.00ms
iter 336310: loss 5.7944, time 125.24ms
iter 336320: loss 5.7058, time 125.49ms
iter 336330: loss 6.4900, time 127.34ms
iter 336340: loss 5.6075, time 125.05ms
iter 336350: loss 6.5706, time 125.42ms
iter 336360: loss 5.3317, time 125.90ms
iter 336370: loss 5.6665, time 125.22ms
iter 336380: loss 6.3392, time 124.92ms
iter 336390: loss 5.5712, time 124.84ms
iter 336400: loss 6.2176, time 125.49ms
iter 336410: loss 6.5418, time 125.03ms
iter 336420: loss 5.9454, time 125.05ms
iter 336430: loss 5.8842, time 125.31ms
iter 336440: loss 6.3241, time 127.70ms
iter 336450: loss 6.4364, time 124.87ms
iter 336460: loss 6.5364, time 124.86ms
iter 336470: loss 6.5469, time 125.10ms
iter 336480: loss 6.2788, time 125.49ms
iter 336490: loss 6.1242, time 125.36ms
step 336500: train loss 5.5867, val loss 5.6702
saving checkpoint to out-shakespeare-char
iter 336500: loss 5.9536, time 2901.74ms
iter 336510: loss 7.0283, time 125.06ms
iter 336520: loss 6.1498, time 125.15ms
iter 336530: loss 5.2754, time 125.13ms
iter 336540: loss 5.9683, time 124.86ms
iter 336550: loss 5.3637, time 124.32ms
iter 336560: loss 5.9156, time 125.16ms
iter 336570: loss 5.7481, time 125.41ms
iter 336580: loss 5.4887, time 125.18ms
iter 336590: loss 6.1643, time 125.09ms
iter 336600: loss 6.7061, time 125.43ms
iter 336610: loss 5.3930, time 124.32ms
iter 336620: loss 5.4687, time 125.32ms
iter 336630: loss 6.0252, time 125.07ms
iter 336640: loss 5.9097, time 124.88ms
iter 336650: loss 5.6581, time 124.86ms
iter 336660: loss 5.6387, time 125.00ms
iter 336670: loss 6.1448, time 124.75ms
iter 336680: loss 5.9928, time 124.86ms
iter 336690: loss 6.3491, time 125.92ms
iter 336700: loss 6.0858, time 125.14ms
iter 336710: loss 5.1151, time 124.84ms
iter 336720: loss 7.0580, time 124.88ms
iter 336730: loss 5.6684, time 124.88ms
iter 336740: loss 6.2716, time 127.64ms
step 336750: train loss 5.6087, val loss 5.6046
saving checkpoint to out-shakespeare-char
iter 336750: loss 5.4774, time 2866.51ms
iter 336760: loss 6.4217, time 125.15ms
iter 336770: loss 5.6522, time 124.77ms
iter 336780: loss 5.4908, time 127.34ms
iter 336790: loss 5.6348, time 125.23ms
iter 336800: loss 6.1815, time 124.93ms
iter 336810: loss 6.5250, time 124.74ms
iter 336820: loss 5.6412, time 126.09ms
iter 336830: loss 6.5833, time 125.79ms
iter 336840: loss 5.1542, time 126.64ms
iter 336850: loss 6.3570, time 125.06ms
iter 336860: loss 6.1662, time 125.50ms
iter 336870: loss 5.9512, time 125.49ms
iter 336880: loss 5.6036, time 125.71ms
iter 336890: loss 5.7115, time 127.74ms
iter 336900: loss 5.3931, time 125.43ms
iter 336910: loss 6.2821, time 125.27ms
iter 336920: loss 6.0680, time 125.65ms
iter 336930: loss 6.3569, time 125.35ms
iter 336940: loss 6.0555, time 125.50ms
iter 336950: loss 5.6615, time 125.08ms
iter 336960: loss 6.0968, time 124.88ms
iter 336970: loss 5.8488, time 125.37ms
iter 336980: loss 5.7921, time 125.33ms
iter 336990: loss 6.9635, time 125.10ms
step 337000: train loss 5.6343, val loss 5.5629
saving checkpoint to out-shakespeare-char
iter 337000: loss 6.0783, time 2902.78ms
iter 337010: loss 6.0871, time 126.06ms
iter 337020: loss 5.5445, time 126.00ms
iter 337030: loss 5.5497, time 126.32ms
iter 337040: loss 6.0202, time 126.30ms
iter 337050: loss 5.7008, time 126.01ms
iter 337060: loss 6.2982, time 125.66ms
iter 337070: loss 5.9703, time 125.25ms
iter 337080: loss 6.8495, time 127.86ms
iter 337090: loss 6.1382, time 125.53ms
iter 337100: loss 5.0488, time 125.09ms
iter 337110: loss 5.3710, time 126.07ms
iter 337120: loss 6.1280, time 125.25ms
iter 337130: loss 6.1037, time 125.92ms
iter 337140: loss 6.0108, time 126.02ms
iter 337150: loss 5.7390, time 128.22ms
iter 337160: loss 5.8479, time 125.67ms
iter 337170: loss 6.3579, time 125.73ms
iter 337180: loss 5.6362, time 125.77ms
iter 337190: loss 5.9447, time 125.62ms
iter 337200: loss 5.8862, time 125.66ms
iter 337210: loss 6.1400, time 126.01ms
iter 337220: loss 6.2160, time 125.93ms
iter 337230: loss 5.7190, time 126.13ms
iter 337240: loss 5.6091, time 126.16ms
step 337250: train loss 5.6170, val loss 5.6336
saving checkpoint to out-shakespeare-char
iter 337250: loss 5.6299, time 2901.35ms
iter 337260: loss 5.8624, time 127.25ms
iter 337270: loss 5.4541, time 124.46ms
iter 337280: loss 6.2714, time 124.85ms
iter 337290: loss 5.7579, time 126.10ms
iter 337300: loss 5.6152, time 125.64ms
iter 337310: loss 5.8638, time 125.70ms
iter 337320: loss 6.2394, time 126.00ms
iter 337330: loss 6.3121, time 128.01ms
iter 337340: loss 6.4270, time 125.88ms
iter 337350: loss 6.5535, time 124.99ms
iter 337360: loss 5.5296, time 125.34ms
iter 337370: loss 5.6140, time 125.01ms
iter 337380: loss 5.8761, time 124.47ms
iter 337390: loss 5.9963, time 125.08ms
iter 337400: loss 5.8179, time 125.24ms
iter 337410: loss 6.0689, time 126.25ms
iter 337420: loss 6.2985, time 125.12ms
iter 337430: loss 5.7324, time 125.15ms
iter 337440: loss 5.8578, time 127.46ms
iter 337450: loss 6.0095, time 124.94ms
iter 337460: loss 5.7280, time 125.12ms
iter 337470: loss 5.6847, time 125.21ms
iter 337480: loss 5.9654, time 127.56ms
iter 337490: loss 6.0301, time 124.86ms
step 337500: train loss 5.6324, val loss 5.6205
saving checkpoint to out-shakespeare-char
iter 337500: loss 6.2647, time 2894.19ms
iter 337510: loss 5.6552, time 125.61ms
iter 337520: loss 6.5098, time 126.63ms
iter 337530: loss 5.8258, time 124.72ms
iter 337540: loss 5.7393, time 125.26ms
iter 337550: loss 5.6736, time 126.15ms
iter 337560: loss 5.5506, time 125.41ms
iter 337570: loss 5.8628, time 125.49ms
iter 337580: loss 5.6165, time 125.54ms
iter 337590: loss 6.4159, time 126.15ms
iter 337600: loss 5.5232, time 125.41ms
iter 337610: loss 5.9480, time 126.60ms
iter 337620: loss 6.1275, time 125.22ms
iter 337630: loss 6.2849, time 125.94ms
iter 337640: loss 5.8631, time 125.71ms
iter 337650: loss 6.0898, time 124.66ms
iter 337660: loss 6.3257, time 125.69ms
iter 337670: loss 6.2697, time 128.21ms
iter 337680: loss 5.7487, time 125.30ms
iter 337690: loss 6.2276, time 124.93ms
iter 337700: loss 6.3008, time 124.92ms
iter 337710: loss 6.1175, time 125.83ms
iter 337720: loss 5.3379, time 125.57ms
iter 337730: loss 6.0969, time 125.37ms
iter 337740: loss 7.0571, time 125.60ms
step 337750: train loss 5.6165, val loss 5.5348
saving checkpoint to out-shakespeare-char
iter 337750: loss 6.4541, time 2897.36ms
iter 337760: loss 6.0937, time 125.36ms
iter 337770: loss 5.7010, time 125.38ms
iter 337780: loss 6.1131, time 125.52ms
iter 337790: loss 5.4351, time 125.04ms
iter 337800: loss 5.9829, time 125.54ms
iter 337810: loss 5.9823, time 125.59ms
iter 337820: loss 6.1326, time 125.83ms
iter 337830: loss 7.0956, time 125.67ms
iter 337840: loss 6.5092, time 125.83ms
iter 337850: loss 6.1049, time 126.25ms
iter 337860: loss 5.7985, time 128.15ms
iter 337870: loss 5.9356, time 125.42ms
iter 337880: loss 5.0336, time 125.98ms
iter 337890: loss 5.1514, time 125.83ms
iter 337900: loss 5.8812, time 125.91ms
iter 337910: loss 5.3903, time 125.50ms
iter 337920: loss 6.0300, time 124.97ms
iter 337930: loss 6.1983, time 128.23ms
iter 337940: loss 5.7494, time 125.69ms
iter 337950: loss 5.0458, time 125.55ms
iter 337960: loss 5.8239, time 125.72ms
iter 337970: loss 5.6868, time 125.58ms
iter 337980: loss 5.7600, time 125.84ms
iter 337990: loss 5.5214, time 125.66ms
step 338000: train loss 5.6008, val loss 5.5989
saving checkpoint to out-shakespeare-char
iter 338000: loss 6.2478, time 2870.92ms
iter 338010: loss 5.4129, time 125.39ms
iter 338020: loss 5.7783, time 124.60ms
iter 338030: loss 6.2521, time 125.95ms
iter 338040: loss 5.6512, time 125.20ms
iter 338050: loss 6.3272, time 127.68ms
iter 338060: loss 5.7694, time 125.25ms
iter 338070: loss 6.4625, time 125.21ms
iter 338080: loss 5.7628, time 125.91ms
iter 338090: loss 6.2496, time 127.93ms
iter 338100: loss 5.5022, time 125.17ms
iter 338110: loss 5.6648, time 124.73ms
iter 338120: loss 6.5458, time 124.77ms
iter 338130: loss 5.4472, time 124.22ms
iter 338140: loss 5.8172, time 125.28ms
iter 338150: loss 6.3936, time 125.38ms
iter 338160: loss 6.5700, time 125.52ms
iter 338170: loss 6.4109, time 125.30ms
iter 338180: loss 5.9091, time 125.88ms
iter 338190: loss 6.1080, time 125.76ms
iter 338200: loss 5.7963, time 127.62ms
iter 338210: loss 5.7418, time 125.36ms
iter 338220: loss 5.4635, time 125.32ms
iter 338230: loss 5.2134, time 125.26ms
iter 338240: loss 6.0872, time 127.91ms
step 338250: train loss 5.5605, val loss 5.6476
saving checkpoint to out-shakespeare-char
iter 338250: loss 5.2625, time 2896.49ms
iter 338260: loss 6.3781, time 125.73ms
iter 338270: loss 5.9351, time 125.34ms
iter 338280: loss 6.1121, time 126.00ms
iter 338290: loss 6.0839, time 125.27ms
iter 338300: loss 5.9444, time 124.21ms
iter 338310: loss 5.4587, time 125.45ms
iter 338320: loss 6.0355, time 125.51ms
iter 338330: loss 6.3194, time 125.19ms
iter 338340: loss 5.5943, time 125.72ms
iter 338350: loss 6.0789, time 125.37ms
iter 338360: loss 5.7890, time 125.49ms
iter 338370: loss 6.5833, time 125.51ms
iter 338380: loss 5.4767, time 124.66ms
iter 338390: loss 6.1767, time 125.85ms
iter 338400: loss 6.2561, time 125.55ms
iter 338410: loss 5.5649, time 125.75ms
iter 338420: loss 5.9786, time 125.28ms
iter 338430: loss 6.1183, time 125.45ms
iter 338440: loss 6.3537, time 126.12ms
iter 338450: loss 6.0818, time 124.13ms
iter 338460: loss 6.0128, time 125.30ms
iter 338470: loss 5.8178, time 125.76ms
iter 338480: loss 6.2565, time 125.13ms
iter 338490: loss 5.9998, time 125.38ms
step 338500: train loss 5.5867, val loss 5.6096
saving checkpoint to out-shakespeare-char
iter 338500: loss 5.4362, time 2888.71ms
iter 338510: loss 5.7493, time 125.96ms
iter 338520: loss 5.8545, time 125.76ms
iter 338530: loss 6.5645, time 125.90ms
iter 338540: loss 6.0132, time 125.44ms
iter 338550: loss 6.5243, time 125.69ms
iter 338560: loss 5.9059, time 126.19ms
iter 338570: loss 5.4223, time 125.70ms
iter 338580: loss 6.2159, time 125.87ms
iter 338590: loss 6.6647, time 128.11ms
iter 338600: loss 6.5271, time 125.55ms
iter 338610: loss 6.4513, time 124.88ms
iter 338620: loss 5.5849, time 125.76ms
iter 338630: loss 5.5608, time 124.18ms
iter 338640: loss 6.0989, time 125.71ms
iter 338650: loss 5.6480, time 126.07ms
iter 338660: loss 6.7048, time 128.12ms
iter 338670: loss 6.3643, time 123.04ms
iter 338680: loss 5.4273, time 125.73ms
iter 338690: loss 5.4917, time 125.77ms
iter 338700: loss 6.2782, time 125.38ms
iter 338710: loss 5.9859, time 125.42ms
iter 338720: loss 6.0297, time 125.95ms
iter 338730: loss 6.2407, time 125.41ms
iter 338740: loss 6.3265, time 125.58ms
step 338750: train loss 5.6206, val loss 5.6361
saving checkpoint to out-shakespeare-char
iter 338750: loss 6.2440, time 2883.40ms
iter 338760: loss 5.7935, time 125.01ms
iter 338770: loss 5.8104, time 125.14ms
iter 338780: loss 5.2822, time 124.64ms
iter 338790: loss 6.4517, time 125.06ms
iter 338800: loss 6.3008, time 125.39ms
iter 338810: loss 6.6349, time 127.65ms
iter 338820: loss 6.1140, time 124.76ms
iter 338830: loss 5.8592, time 125.11ms
iter 338840: loss 6.0056, time 126.04ms
iter 338850: loss 5.8452, time 125.37ms
iter 338860: loss 5.6054, time 125.15ms
iter 338870: loss 5.4779, time 126.43ms
iter 338880: loss 6.4750, time 125.94ms
iter 338890: loss 6.2033, time 125.24ms
iter 338900: loss 6.2158, time 125.64ms
iter 338910: loss 5.3852, time 125.56ms
iter 338920: loss 5.9244, time 128.04ms
iter 338930: loss 5.4304, time 125.27ms
iter 338940: loss 6.4054, time 125.47ms
iter 338950: loss 6.5706, time 125.50ms
iter 338960: loss 6.4416, time 125.51ms
iter 338970: loss 6.3662, time 125.50ms
iter 338980: loss 5.8426, time 125.05ms
iter 338990: loss 5.4744, time 125.89ms
step 339000: train loss 5.5937, val loss 5.6135
saving checkpoint to out-shakespeare-char
iter 339000: loss 5.9257, time 2888.06ms
iter 339010: loss 6.5652, time 125.65ms
iter 339020: loss 5.9404, time 125.33ms
iter 339030: loss 5.7309, time 125.39ms
iter 339040: loss 5.9770, time 125.86ms
iter 339050: loss 5.7012, time 125.52ms
iter 339060: loss 5.8857, time 125.62ms
iter 339070: loss 6.8193, time 126.24ms
iter 339080: loss 5.8855, time 125.61ms
iter 339090: loss 5.8680, time 125.15ms
iter 339100: loss 5.9923, time 125.37ms
iter 339110: loss 6.2061, time 127.98ms
iter 339120: loss 6.2164, time 125.36ms
iter 339130: loss 6.6186, time 125.27ms
iter 339140: loss 5.6395, time 125.28ms
iter 339150: loss 5.7498, time 127.35ms
iter 339160: loss 6.5871, time 125.44ms
iter 339170: loss 6.0749, time 125.33ms
iter 339180: loss 6.1131, time 125.53ms
iter 339190: loss 6.8273, time 125.67ms
iter 339200: loss 6.0674, time 125.46ms
iter 339210: loss 5.8367, time 125.39ms
iter 339220: loss 6.4178, time 127.71ms
iter 339230: loss 6.0448, time 125.74ms
iter 339240: loss 6.5057, time 125.43ms
step 339250: train loss 5.5647, val loss 5.6605
saving checkpoint to out-shakespeare-char
iter 339250: loss 4.9470, time 2884.33ms
iter 339260: loss 5.8618, time 125.50ms
iter 339270: loss 6.3970, time 125.13ms
iter 339280: loss 6.2017, time 125.24ms
iter 339290: loss 6.3883, time 125.19ms
iter 339300: loss 6.2282, time 127.70ms
iter 339310: loss 5.7644, time 125.07ms
iter 339320: loss 5.9454, time 124.90ms
iter 339330: loss 5.8085, time 125.31ms
iter 339340: loss 6.2226, time 125.03ms
iter 339350: loss 6.1266, time 125.22ms
iter 339360: loss 5.7320, time 124.57ms
iter 339370: loss 6.1177, time 125.38ms
iter 339380: loss 6.4960, time 125.36ms
iter 339390: loss 6.4749, time 125.65ms
iter 339400: loss 6.9360, time 125.64ms
iter 339410: loss 5.8302, time 125.97ms
iter 339420: loss 5.9244, time 125.61ms
iter 339430: loss 6.0958, time 125.81ms
iter 339440: loss 6.5333, time 125.76ms
iter 339450: loss 6.4376, time 128.15ms
iter 339460: loss 6.2961, time 125.63ms
iter 339470: loss 6.1810, time 125.53ms
iter 339480: loss 5.1899, time 125.67ms
iter 339490: loss 5.4822, time 125.57ms
step 339500: train loss 5.5970, val loss 5.6083
saving checkpoint to out-shakespeare-char
iter 339500: loss 5.4884, time 2880.96ms
iter 339510: loss 6.3295, time 121.54ms
iter 339520: loss 5.4454, time 122.02ms
iter 339530: loss 6.0881, time 123.06ms
iter 339540: loss 4.8468, time 121.95ms
iter 339550: loss 5.9186, time 121.71ms
iter 339560: loss 5.7447, time 121.90ms
iter 339570: loss 5.6724, time 121.22ms
iter 339580: loss 5.2511, time 121.94ms
iter 339590: loss 5.9501, time 121.78ms
iter 339600: loss 5.5900, time 121.94ms
iter 339610: loss 5.2365, time 121.59ms
iter 339620: loss 6.2494, time 121.99ms
iter 339630: loss 5.7050, time 122.10ms
iter 339640: loss 5.8984, time 121.73ms
iter 339650: loss 6.2088, time 121.07ms
iter 339660: loss 5.0004, time 121.96ms
iter 339670: loss 5.6062, time 121.87ms
iter 339680: loss 5.7668, time 121.82ms
iter 339690: loss 6.3963, time 122.24ms
iter 339700: loss 5.7648, time 121.91ms
iter 339710: loss 5.0020, time 121.83ms
iter 339720: loss 6.0351, time 121.95ms
iter 339730: loss 6.4091, time 121.74ms
iter 339740: loss 6.4276, time 121.83ms
step 339750: train loss 5.6190, val loss 5.6483
saving checkpoint to out-shakespeare-char
iter 339750: loss 6.0029, time 2887.45ms
iter 339760: loss 6.3030, time 126.14ms
iter 339770: loss 6.1681, time 125.53ms
iter 339780: loss 7.3952, time 125.68ms
iter 339790: loss 6.0474, time 125.32ms
iter 339800: loss 5.4362, time 128.28ms
iter 339810: loss 5.7798, time 125.42ms
iter 339820: loss 6.4757, time 125.33ms
iter 339830: loss 5.4410, time 125.70ms
iter 339840: loss 6.0353, time 125.87ms
iter 339850: loss 5.7341, time 125.09ms
iter 339860: loss 5.9497, time 125.53ms
iter 339870: loss 5.6329, time 128.25ms
iter 339880: loss 5.7895, time 125.47ms
iter 339890: loss 6.4735, time 125.66ms
iter 339900: loss 5.7021, time 125.40ms
iter 339910: loss 6.0047, time 125.62ms
iter 339920: loss 6.4714, time 124.57ms
iter 339930: loss 5.4346, time 125.56ms
iter 339940: loss 6.0404, time 127.86ms
iter 339950: loss 5.9851, time 125.55ms
iter 339960: loss 5.8956, time 125.59ms
iter 339970: loss 6.2261, time 125.78ms
iter 339980: loss 6.1213, time 125.52ms
iter 339990: loss 5.8000, time 124.91ms
step 340000: train loss 5.6383, val loss 5.5941
saving checkpoint to out-shakespeare-char
iter 340000: loss 5.8890, time 2885.51ms
iter 340010: loss 5.7255, time 124.38ms
iter 340020: loss 5.8849, time 121.50ms
iter 340030: loss 5.6294, time 123.57ms
iter 340040: loss 6.1658, time 120.72ms
iter 340050: loss 5.8042, time 123.88ms
iter 340060: loss 5.6498, time 121.94ms
iter 340070: loss 5.4919, time 123.77ms
iter 340080: loss 5.8385, time 121.57ms
iter 340090: loss 5.8000, time 123.96ms
iter 340100: loss 6.1472, time 121.69ms
iter 340110: loss 6.0685, time 123.91ms
iter 340120: loss 5.8911, time 121.63ms
iter 340130: loss 5.9833, time 123.88ms
iter 340140: loss 5.9344, time 121.33ms
iter 340150: loss 6.2294, time 124.44ms
iter 340160: loss 6.0189, time 121.06ms
iter 340170: loss 5.1562, time 123.95ms
iter 340180: loss 5.8285, time 121.47ms
iter 340190: loss 6.4467, time 123.59ms
iter 340200: loss 5.5898, time 121.61ms
iter 340210: loss 6.1457, time 123.97ms
iter 340220: loss 6.0422, time 120.67ms
iter 340230: loss 6.1583, time 123.70ms
iter 340240: loss 6.0368, time 121.47ms
step 340250: train loss 5.6937, val loss 5.6186
saving checkpoint to out-shakespeare-char
iter 340250: loss 6.5235, time 2897.95ms
iter 340260: loss 5.9775, time 121.55ms
iter 340270: loss 6.2390, time 122.69ms
iter 340280: loss 5.7340, time 121.41ms
iter 340290: loss 6.1205, time 121.79ms
iter 340300: loss 6.3076, time 121.69ms
iter 340310: loss 5.9028, time 123.23ms
iter 340320: loss 6.3294, time 121.34ms
iter 340330: loss 6.2586, time 122.66ms
iter 340340: loss 5.5747, time 121.76ms
iter 340350: loss 5.4716, time 122.94ms
iter 340360: loss 6.2180, time 122.06ms
iter 340370: loss 5.9710, time 122.44ms
iter 340380: loss 6.2700, time 119.32ms
iter 340390: loss 6.1951, time 122.88ms
iter 340400: loss 6.3260, time 121.28ms
iter 340410: loss 6.6491, time 122.33ms
iter 340420: loss 5.8068, time 122.01ms
iter 340430: loss 6.5498, time 122.48ms
iter 340440: loss 6.6652, time 121.55ms
iter 340450: loss 6.2487, time 122.92ms
iter 340460: loss 6.2290, time 122.38ms
iter 340470: loss 6.0500, time 121.37ms
iter 340480: loss 5.8524, time 120.89ms
iter 340490: loss 6.4170, time 123.18ms
step 340500: train loss 5.5818, val loss 5.5873
saving checkpoint to out-shakespeare-char
iter 340500: loss 5.5857, time 2891.62ms
iter 340510: loss 6.4809, time 121.59ms
iter 340520: loss 5.7363, time 121.58ms
iter 340530: loss 6.6154, time 121.29ms
iter 340540: loss 6.0206, time 121.91ms
iter 340550: loss 6.9781, time 121.75ms
iter 340560: loss 5.6575, time 120.91ms
iter 340570: loss 6.5264, time 122.77ms
iter 340580: loss 6.0378, time 121.79ms
iter 340590: loss 5.9184, time 121.91ms
iter 340600: loss 5.6779, time 121.48ms
iter 340610: loss 6.1487, time 121.81ms
iter 340620: loss 5.9746, time 120.35ms
iter 340630: loss 6.5087, time 122.15ms
iter 340640: loss 5.2606, time 121.47ms
iter 340650: loss 6.3725, time 122.23ms
iter 340660: loss 6.1982, time 121.59ms
iter 340670: loss 6.5155, time 121.98ms
iter 340680: loss 6.5082, time 121.65ms
iter 340690: loss 6.5202, time 121.64ms
iter 340700: loss 6.2254, time 121.70ms
iter 340710: loss 6.0282, time 121.36ms
iter 340720: loss 6.5653, time 121.79ms
iter 340730: loss 5.8798, time 121.61ms
iter 340740: loss 5.8324, time 121.76ms
step 340750: train loss 5.7027, val loss 5.6382
saving checkpoint to out-shakespeare-char
iter 340750: loss 5.8299, time 2889.12ms
iter 340760: loss 6.4930, time 125.79ms
iter 340770: loss 6.4150, time 128.25ms
iter 340780: loss 5.6635, time 125.56ms
iter 340790: loss 6.5328, time 125.57ms
iter 340800: loss 5.7147, time 125.69ms
iter 340810: loss 6.1482, time 125.69ms
iter 340820: loss 5.5282, time 125.55ms
iter 340830: loss 5.6115, time 125.79ms
iter 340840: loss 5.8233, time 125.72ms
iter 340850: loss 5.6342, time 125.63ms
iter 340860: loss 5.3692, time 126.12ms
iter 340870: loss 6.0252, time 128.20ms
iter 340880: loss 6.4909, time 125.62ms
iter 340890: loss 5.6803, time 125.66ms
iter 340900: loss 5.8569, time 125.72ms
iter 340910: loss 6.1041, time 125.49ms
iter 340920: loss 6.7071, time 125.88ms
iter 340930: loss 6.5638, time 125.64ms
iter 340940: loss 5.7845, time 125.63ms
iter 340950: loss 6.4966, time 125.37ms
iter 340960: loss 5.7882, time 125.82ms
iter 340970: loss 5.8073, time 125.65ms
iter 340980: loss 5.8115, time 128.06ms
iter 340990: loss 6.7128, time 125.58ms
step 341000: train loss 5.6155, val loss 5.5656
saving checkpoint to out-shakespeare-char
iter 341000: loss 5.1671, time 2886.56ms
iter 341010: loss 6.0048, time 125.92ms
iter 341020: loss 6.0364, time 128.24ms
iter 341030: loss 5.6927, time 125.65ms
iter 341040: loss 5.1019, time 125.76ms
iter 341050: loss 6.1756, time 125.81ms
iter 341060: loss 6.2557, time 125.36ms
iter 341070: loss 5.9755, time 125.28ms
iter 341080: loss 6.1012, time 125.10ms
iter 341090: loss 6.5434, time 125.42ms
iter 341100: loss 6.2512, time 125.61ms
iter 341110: loss 5.2762, time 125.57ms
iter 341120: loss 5.7231, time 126.00ms
iter 341130: loss 6.1594, time 128.06ms
iter 341140: loss 5.3239, time 125.20ms
iter 341150: loss 5.8672, time 125.42ms
iter 341160: loss 5.8099, time 125.63ms
iter 341170: loss 5.8300, time 125.28ms
iter 341180: loss 5.1867, time 125.37ms
iter 341190: loss 6.4797, time 125.13ms
iter 341200: loss 6.3517, time 125.65ms
iter 341210: loss 6.3423, time 125.27ms
iter 341220: loss 6.4221, time 125.42ms
iter 341230: loss 5.8317, time 125.38ms
iter 341240: loss 5.9771, time 127.66ms
step 341250: train loss 5.6130, val loss 5.5691
saving checkpoint to out-shakespeare-char
iter 341250: loss 5.9745, time 2899.57ms
iter 341260: loss 5.3972, time 125.88ms
iter 341270: loss 5.9309, time 126.07ms
iter 341280: loss 5.9633, time 128.10ms
iter 341290: loss 5.1298, time 125.65ms
iter 341300: loss 6.5123, time 125.76ms
iter 341310: loss 5.8925, time 125.65ms
iter 341320: loss 6.0995, time 128.03ms
iter 341330: loss 6.0787, time 126.12ms
iter 341340: loss 5.5542, time 126.09ms
iter 341350: loss 5.8781, time 126.28ms
iter 341360: loss 5.5655, time 126.43ms
iter 341370: loss 6.3094, time 128.23ms
iter 341380: loss 5.5196, time 125.90ms
iter 341390: loss 6.1443, time 125.98ms
iter 341400: loss 5.5881, time 126.35ms
iter 341410: loss 6.0180, time 126.20ms
iter 341420: loss 6.0980, time 126.16ms
iter 341430: loss 6.0053, time 126.14ms
iter 341440: loss 5.7214, time 126.05ms
iter 341450: loss 5.6600, time 125.73ms
iter 341460: loss 6.2463, time 124.59ms
iter 341470: loss 6.2899, time 125.77ms
iter 341480: loss 5.9571, time 128.12ms
iter 341490: loss 6.2515, time 125.52ms
step 341500: train loss 5.6686, val loss 5.6430
saving checkpoint to out-shakespeare-char
iter 341500: loss 6.3348, time 2879.28ms
iter 341510: loss 5.5455, time 125.60ms
iter 341520: loss 5.7153, time 126.04ms
iter 341530: loss 5.2944, time 126.34ms
iter 341540: loss 5.5529, time 125.81ms
iter 341550: loss 5.5610, time 126.00ms
iter 341560: loss 6.1551, time 125.98ms
iter 341570: loss 6.1933, time 125.64ms
iter 341580: loss 5.7767, time 125.98ms
iter 341590: loss 6.2739, time 126.34ms
iter 341600: loss 6.5126, time 128.18ms
iter 341610: loss 6.1340, time 125.59ms
iter 341620: loss 5.1704, time 125.97ms
iter 341630: loss 6.2435, time 125.67ms
iter 341640: loss 5.1996, time 125.95ms
iter 341650: loss 5.8431, time 125.59ms
iter 341660: loss 6.1127, time 119.54ms
iter 341670: loss 5.1378, time 119.38ms
iter 341680: loss 5.1960, time 120.30ms
iter 341690: loss 5.7389, time 119.57ms
iter 341700: loss 6.0010, time 119.50ms
iter 341710: loss 6.0855, time 119.38ms
iter 341720: loss 5.0968, time 119.60ms
iter 341730: loss 5.4931, time 120.27ms
iter 341740: loss 5.8450, time 119.88ms
step 341750: train loss 5.6258, val loss 5.5936
saving checkpoint to out-shakespeare-char
iter 341750: loss 5.4641, time 2893.60ms
iter 341760: loss 5.6638, time 125.27ms
iter 341770: loss 6.4887, time 125.58ms
iter 341780: loss 6.4917, time 125.44ms
iter 341790: loss 5.4672, time 125.99ms
iter 341800: loss 6.1828, time 125.63ms
iter 341810: loss 6.7040, time 126.25ms
iter 341820: loss 5.4449, time 125.77ms
iter 341830: loss 7.0309, time 125.45ms
iter 341840: loss 6.3307, time 125.49ms
iter 341850: loss 6.0187, time 125.90ms
iter 341860: loss 5.5649, time 128.14ms
iter 341870: loss 5.4941, time 126.00ms
iter 341880: loss 5.4687, time 125.57ms
iter 341890: loss 5.6992, time 125.94ms
iter 341900: loss 6.4171, time 124.61ms
iter 341910: loss 5.6008, time 126.08ms
iter 341920: loss 5.8221, time 125.69ms
iter 341930: loss 6.7893, time 125.46ms
iter 341940: loss 6.2040, time 125.85ms
iter 341950: loss 6.9649, time 126.23ms
iter 341960: loss 6.7554, time 126.19ms
iter 341970: loss 6.2096, time 128.33ms
iter 341980: loss 6.7417, time 125.04ms
iter 341990: loss 6.1021, time 126.36ms
step 342000: train loss 5.5910, val loss 5.6241
saving checkpoint to out-shakespeare-char
iter 342000: loss 5.5177, time 2902.23ms
iter 342010: loss 6.5202, time 125.47ms
iter 342020: loss 5.4876, time 125.63ms
iter 342030: loss 6.1129, time 127.53ms
iter 342040: loss 6.0755, time 125.45ms
iter 342050: loss 6.1802, time 125.16ms
iter 342060: loss 6.3694, time 125.73ms
iter 342070: loss 6.8947, time 124.65ms
iter 342080: loss 6.5227, time 125.28ms
iter 342090: loss 6.2391, time 125.56ms
iter 342100: loss 6.1823, time 125.40ms
iter 342110: loss 6.0584, time 125.10ms
iter 342120: loss 6.1405, time 125.10ms
iter 342130: loss 5.8541, time 125.44ms
iter 342140: loss 5.6091, time 128.12ms
iter 342150: loss 6.1736, time 124.71ms
iter 342160: loss 6.8340, time 125.19ms
iter 342170: loss 5.8599, time 125.34ms
iter 342180: loss 5.7394, time 125.38ms
iter 342190: loss 5.7608, time 125.59ms
iter 342200: loss 5.6568, time 125.47ms
iter 342210: loss 5.6356, time 125.54ms
iter 342220: loss 6.1322, time 119.64ms
iter 342230: loss 5.6961, time 120.94ms
iter 342240: loss 6.3015, time 119.79ms
step 342250: train loss 5.6021, val loss 5.6201
saving checkpoint to out-shakespeare-char
iter 342250: loss 5.9647, time 2892.09ms
iter 342260: loss 6.0326, time 124.96ms
iter 342270: loss 6.0724, time 125.49ms
iter 342280: loss 6.1522, time 125.43ms
iter 342290: loss 6.2067, time 125.28ms
iter 342300: loss 6.0432, time 126.97ms
iter 342310: loss 5.6618, time 125.65ms
iter 342320: loss 5.9708, time 125.51ms
iter 342330: loss 5.5788, time 125.22ms
iter 342340: loss 5.6762, time 124.62ms
iter 342350: loss 5.8801, time 125.19ms
iter 342360: loss 5.1442, time 125.58ms
iter 342370: loss 6.0161, time 125.80ms
iter 342380: loss 6.6520, time 125.64ms
iter 342390: loss 5.7672, time 125.53ms
iter 342400: loss 6.7202, time 125.26ms
iter 342410: loss 6.2166, time 127.71ms
iter 342420: loss 5.8634, time 124.71ms
iter 342430: loss 6.8093, time 125.18ms
iter 342440: loss 6.6685, time 125.30ms
iter 342450: loss 5.4795, time 125.46ms
iter 342460: loss 5.4886, time 125.23ms
iter 342470: loss 6.3994, time 124.61ms
iter 342480: loss 5.4425, time 124.59ms
iter 342490: loss 5.7526, time 125.24ms
step 342500: train loss 5.6139, val loss 5.5365
saving checkpoint to out-shakespeare-char
iter 342500: loss 5.4171, time 2874.57ms
iter 342510: loss 5.1496, time 125.17ms
iter 342520: loss 6.0249, time 125.32ms
iter 342530: loss 6.1449, time 127.46ms
iter 342540: loss 5.6009, time 125.42ms
iter 342550: loss 6.3844, time 125.20ms
iter 342560: loss 6.6146, time 125.64ms
iter 342570: loss 5.1692, time 124.58ms
iter 342580: loss 5.8757, time 125.18ms
iter 342590: loss 6.0246, time 125.06ms
iter 342600: loss 6.2055, time 125.69ms
iter 342610: loss 6.3532, time 124.52ms
iter 342620: loss 5.7420, time 125.37ms
iter 342630: loss 6.4845, time 125.26ms
iter 342640: loss 6.0747, time 127.54ms
iter 342650: loss 6.4523, time 124.94ms
iter 342660: loss 5.7641, time 125.29ms
iter 342670: loss 5.7375, time 125.71ms
iter 342680: loss 6.1861, time 125.43ms
iter 342690: loss 6.8022, time 125.42ms
iter 342700: loss 6.1207, time 125.29ms
iter 342710: loss 6.4129, time 125.61ms
iter 342720: loss 6.2649, time 125.25ms
iter 342730: loss 5.6992, time 125.28ms
iter 342740: loss 6.1982, time 125.65ms
step 342750: train loss 5.5845, val loss 5.5873
saving checkpoint to out-shakespeare-char
iter 342750: loss 6.0579, time 2879.15ms
iter 342760: loss 5.8860, time 125.33ms
iter 342770: loss 6.4790, time 125.57ms
iter 342780: loss 5.7100, time 125.21ms
iter 342790: loss 6.1227, time 126.74ms
iter 342800: loss 5.6144, time 125.27ms
iter 342810: loss 5.8869, time 125.51ms
iter 342820: loss 5.6069, time 125.41ms
iter 342830: loss 5.2314, time 124.22ms
iter 342840: loss 6.0734, time 125.10ms
iter 342850: loss 5.1433, time 126.24ms
iter 342860: loss 5.9842, time 125.45ms
iter 342870: loss 6.8994, time 125.78ms
iter 342880: loss 6.0375, time 125.95ms
iter 342890: loss 5.7070, time 125.48ms
iter 342900: loss 6.3886, time 127.80ms
iter 342910: loss 5.7704, time 125.25ms
iter 342920: loss 6.1530, time 125.41ms
iter 342930: loss 5.7884, time 125.40ms
iter 342940: loss 5.8771, time 125.18ms
iter 342950: loss 5.8938, time 125.40ms
iter 342960: loss 5.4586, time 125.17ms
iter 342970: loss 5.9577, time 125.28ms
iter 342980: loss 6.5440, time 125.20ms
iter 342990: loss 5.9298, time 125.35ms
step 343000: train loss 5.6128, val loss 5.5548
saving checkpoint to out-shakespeare-char
iter 343000: loss 5.9761, time 2902.61ms
iter 343010: loss 5.8781, time 125.20ms
iter 343020: loss 6.4736, time 124.59ms
iter 343030: loss 6.3475, time 125.34ms
iter 343040: loss 6.0055, time 125.38ms
iter 343050: loss 5.8777, time 125.41ms
iter 343060: loss 5.5089, time 126.65ms
iter 343070: loss 5.4763, time 125.19ms
iter 343080: loss 5.9524, time 125.32ms
iter 343090: loss 6.1908, time 125.70ms
iter 343100: loss 6.7730, time 125.92ms
iter 343110: loss 6.8287, time 124.33ms
iter 343120: loss 5.4053, time 125.15ms
iter 343130: loss 5.4850, time 125.51ms
iter 343140: loss 6.0209, time 123.97ms
iter 343150: loss 5.7612, time 124.50ms
iter 343160: loss 6.1538, time 125.37ms
iter 343170: loss 5.6712, time 128.29ms
iter 343180: loss 6.0609, time 125.32ms
iter 343190: loss 6.4951, time 125.39ms
iter 343200: loss 5.9909, time 125.57ms
iter 343210: loss 5.8823, time 125.65ms
iter 343220: loss 5.8299, time 125.06ms
iter 343230: loss 6.3870, time 125.27ms
iter 343240: loss 6.4919, time 125.56ms
step 343250: train loss 5.6047, val loss 5.6053
saving checkpoint to out-shakespeare-char
iter 343250: loss 6.6614, time 2878.63ms
iter 343260: loss 6.5011, time 125.65ms
iter 343270: loss 6.8541, time 125.28ms
iter 343280: loss 5.7088, time 125.50ms
iter 343290: loss 5.7312, time 125.58ms
iter 343300: loss 6.6951, time 124.68ms
iter 343310: loss 5.7263, time 125.26ms
iter 343320: loss 6.1951, time 124.70ms
iter 343330: loss 5.8197, time 125.59ms
iter 343340: loss 5.4055, time 127.90ms
iter 343350: loss 5.8430, time 125.75ms
iter 343360: loss 6.3793, time 125.86ms
iter 343370: loss 6.1910, time 125.22ms
iter 343380: loss 5.9453, time 124.64ms
iter 343390: loss 6.3023, time 125.25ms
iter 343400: loss 5.5724, time 124.65ms
iter 343410: loss 5.5312, time 125.99ms
iter 343420: loss 4.7931, time 125.26ms
iter 343430: loss 6.0084, time 125.69ms
iter 343440: loss 6.3721, time 125.76ms
iter 343450: loss 5.9743, time 125.56ms
iter 343460: loss 5.9028, time 124.81ms
iter 343470: loss 6.0958, time 125.39ms
iter 343480: loss 5.5649, time 126.07ms
iter 343490: loss 5.8872, time 126.77ms
step 343500: train loss 5.6557, val loss 5.5919
saving checkpoint to out-shakespeare-char
iter 343500: loss 6.3309, time 2877.00ms
iter 343510: loss 5.3023, time 125.03ms
iter 343520: loss 6.2653, time 126.50ms
iter 343530: loss 6.0775, time 125.30ms
iter 343540: loss 5.3737, time 128.21ms
iter 343550: loss 6.2098, time 125.80ms
iter 343560: loss 6.0895, time 124.68ms
iter 343570: loss 6.5311, time 123.84ms
iter 343580: loss 6.0921, time 125.35ms
iter 343590: loss 6.2014, time 125.18ms
iter 343600: loss 5.4816, time 125.10ms
iter 343610: loss 5.6020, time 124.81ms
iter 343620: loss 5.8171, time 125.48ms
iter 343630: loss 6.4606, time 125.73ms
iter 343640: loss 5.7240, time 125.26ms
iter 343650: loss 5.6988, time 128.38ms
iter 343660: loss 5.5204, time 125.69ms
iter 343670: loss 5.0527, time 125.66ms
iter 343680: loss 6.4616, time 125.44ms
iter 343690: loss 5.6081, time 125.59ms
iter 343700: loss 5.7911, time 125.77ms
iter 343710: loss 5.5730, time 125.89ms
iter 343720: loss 6.4772, time 127.06ms
iter 343730: loss 5.2297, time 125.57ms
iter 343740: loss 5.5286, time 124.72ms
step 343750: train loss 5.6412, val loss 5.5824
saving checkpoint to out-shakespeare-char
iter 343750: loss 5.7098, time 2905.52ms
iter 343760: loss 5.3308, time 125.51ms
iter 343770: loss 6.1224, time 124.51ms
iter 343780: loss 5.9859, time 125.15ms
iter 343790: loss 6.4161, time 125.32ms
iter 343800: loss 6.3480, time 127.52ms
iter 343810: loss 5.9482, time 125.16ms
iter 343820: loss 6.5242, time 125.08ms
iter 343830: loss 5.9379, time 124.52ms
iter 343840: loss 5.9198, time 125.26ms
iter 343850: loss 6.5642, time 125.24ms
iter 343860: loss 5.8384, time 125.02ms
iter 343870: loss 6.3144, time 125.28ms
iter 343880: loss 6.1177, time 125.08ms
iter 343890: loss 5.7180, time 125.39ms
iter 343900: loss 5.7039, time 125.13ms
iter 343910: loss 6.4493, time 127.88ms
iter 343920: loss 5.9886, time 125.21ms
iter 343930: loss 6.8051, time 125.62ms
iter 343940: loss 5.9224, time 125.52ms
iter 343950: loss 5.7715, time 125.25ms
iter 343960: loss 5.6223, time 125.20ms
iter 343970: loss 6.0814, time 125.69ms
iter 343980: loss 5.7139, time 125.34ms
iter 343990: loss 6.2549, time 124.42ms
step 344000: train loss 5.5872, val loss 5.5661
saving checkpoint to out-shakespeare-char
iter 344000: loss 6.3894, time 2893.59ms
iter 344010: loss 6.8359, time 125.65ms
iter 344020: loss 6.3310, time 125.34ms
iter 344030: loss 5.6162, time 125.29ms
iter 344040: loss 6.3126, time 125.63ms
iter 344050: loss 5.2551, time 125.33ms
iter 344060: loss 6.2473, time 125.73ms
iter 344070: loss 5.8132, time 125.91ms
iter 344080: loss 6.3866, time 126.38ms
iter 344090: loss 5.7705, time 126.41ms
iter 344100: loss 6.5084, time 124.96ms
iter 344110: loss 5.4234, time 125.15ms
iter 344120: loss 5.6753, time 125.37ms
iter 344130: loss 4.8095, time 125.99ms
iter 344140: loss 5.8020, time 123.61ms
iter 344150: loss 6.4700, time 125.65ms
iter 344160: loss 5.2552, time 126.05ms
iter 344170: loss 5.9540, time 125.37ms
iter 344180: loss 5.8852, time 124.80ms
iter 344190: loss 6.0507, time 125.05ms
iter 344200: loss 5.5869, time 127.83ms
iter 344210: loss 5.9692, time 125.27ms
iter 344220: loss 6.2286, time 125.41ms
iter 344230: loss 6.1896, time 125.49ms
iter 344240: loss 5.0747, time 125.25ms
step 344250: train loss 5.6282, val loss 5.5962
saving checkpoint to out-shakespeare-char
iter 344250: loss 6.1768, time 2876.95ms
iter 344260: loss 6.1034, time 125.44ms
iter 344270: loss 6.3749, time 124.83ms
iter 344280: loss 5.9975, time 123.94ms
iter 344290: loss 6.3848, time 125.42ms
iter 344300: loss 5.4711, time 125.61ms
iter 344310: loss 6.2166, time 125.26ms
iter 344320: loss 5.6429, time 125.33ms
iter 344330: loss 6.1968, time 124.76ms
iter 344340: loss 6.0936, time 125.05ms
iter 344350: loss 5.4557, time 125.39ms
iter 344360: loss 6.2618, time 125.23ms
iter 344370: loss 5.9014, time 125.29ms
iter 344380: loss 6.6462, time 125.87ms
iter 344390: loss 6.5116, time 125.14ms
iter 344400: loss 6.0977, time 125.86ms
iter 344410: loss 6.1609, time 128.00ms
iter 344420: loss 5.7112, time 121.90ms
iter 344430: loss 6.2321, time 123.22ms
iter 344440: loss 6.0903, time 122.77ms
iter 344450: loss 6.1986, time 121.55ms
iter 344460: loss 6.4203, time 124.46ms
iter 344470: loss 5.7665, time 123.92ms
iter 344480: loss 5.7254, time 122.60ms
iter 344490: loss 6.3261, time 123.00ms
step 344500: train loss 5.6037, val loss 5.6209
saving checkpoint to out-shakespeare-char
iter 344500: loss 5.6579, time 2898.26ms
iter 344510: loss 6.0404, time 121.77ms
iter 344520: loss 6.3886, time 120.81ms
iter 344530: loss 6.4314, time 121.33ms
iter 344540: loss 5.1755, time 119.50ms
iter 344550: loss 5.1792, time 121.66ms
iter 344560: loss 5.9814, time 124.14ms
iter 344570: loss 5.7866, time 121.65ms
iter 344580: loss 6.8478, time 125.16ms
iter 344590: loss 5.9255, time 121.47ms
iter 344600: loss 5.6459, time 124.48ms
iter 344610: loss 6.2552, time 122.03ms
iter 344620: loss 6.0970, time 125.18ms
iter 344630: loss 5.7711, time 125.34ms
iter 344640: loss 6.2270, time 125.88ms
iter 344650: loss 6.3046, time 124.92ms
iter 344660: loss 5.8581, time 125.60ms
iter 344670: loss 5.8340, time 124.79ms
iter 344680: loss 6.2725, time 125.54ms
iter 344690: loss 5.7306, time 127.39ms
iter 344700: loss 6.2955, time 127.54ms
iter 344710: loss 5.7663, time 125.51ms
iter 344720: loss 5.7551, time 125.73ms
iter 344730: loss 6.0911, time 125.13ms
iter 344740: loss 6.2273, time 125.31ms
step 344750: train loss 5.6112, val loss 5.5961
saving checkpoint to out-shakespeare-char
iter 344750: loss 6.9754, time 2875.58ms
iter 344760: loss 4.8747, time 125.57ms
iter 344770: loss 5.8996, time 125.32ms
iter 344780: loss 6.0780, time 124.89ms
iter 344790: loss 5.6232, time 125.35ms
iter 344800: loss 7.2547, time 125.39ms
iter 344810: loss 6.2086, time 127.88ms
iter 344820: loss 5.8300, time 125.46ms
iter 344830: loss 5.7074, time 125.12ms
iter 344840: loss 6.1965, time 125.62ms
iter 344850: loss 6.2344, time 125.35ms
iter 344860: loss 6.5129, time 124.22ms
iter 344870: loss 6.7429, time 125.56ms
iter 344880: loss 6.4688, time 125.64ms
iter 344890: loss 6.0168, time 125.37ms
iter 344900: loss 5.6472, time 125.44ms
iter 344910: loss 5.4178, time 125.82ms
iter 344920: loss 6.3827, time 127.60ms
iter 344930: loss 5.9673, time 125.35ms
iter 344940: loss 6.8370, time 125.65ms
iter 344950: loss 6.3270, time 125.65ms
iter 344960: loss 5.6977, time 125.54ms
iter 344970: loss 6.2160, time 125.03ms
iter 344980: loss 5.8365, time 125.51ms
iter 344990: loss 6.0777, time 126.30ms
step 345000: train loss 5.5416, val loss 5.6275
saving checkpoint to out-shakespeare-char
iter 345000: loss 5.9666, time 2893.78ms
iter 345010: loss 5.6541, time 125.87ms
iter 345020: loss 6.1536, time 125.83ms
iter 345030: loss 6.4729, time 125.66ms
iter 345040: loss 5.4032, time 125.87ms
iter 345050: loss 6.1992, time 124.82ms
iter 345060: loss 6.9398, time 124.64ms
iter 345070: loss 5.4195, time 126.37ms
iter 345080: loss 6.2210, time 125.45ms
iter 345090: loss 5.8138, time 128.17ms
iter 345100: loss 6.4087, time 125.67ms
iter 345110: loss 6.0998, time 126.35ms
iter 345120: loss 5.9680, time 125.98ms
iter 345130: loss 5.5482, time 125.38ms
iter 345140: loss 6.5225, time 125.60ms
iter 345150: loss 5.7097, time 125.64ms
iter 345160: loss 6.4650, time 125.89ms
iter 345170: loss 6.2180, time 125.36ms
iter 345180: loss 6.5879, time 125.88ms
iter 345190: loss 5.7498, time 125.75ms
iter 345200: loss 5.1446, time 128.35ms
iter 345210: loss 6.4810, time 125.63ms
iter 345220: loss 5.4667, time 125.97ms
iter 345230: loss 6.0661, time 126.67ms
iter 345240: loss 6.0909, time 125.78ms
step 345250: train loss 5.6302, val loss 5.5658
saving checkpoint to out-shakespeare-char
iter 345250: loss 6.4105, time 2862.01ms
iter 345260: loss 5.8715, time 125.32ms
iter 345270: loss 6.0700, time 125.16ms
iter 345280: loss 6.7337, time 125.11ms
iter 345290: loss 6.4774, time 125.80ms
iter 345300: loss 5.5538, time 125.92ms
iter 345310: loss 5.2030, time 125.81ms
iter 345320: loss 6.2765, time 126.01ms
iter 345330: loss 5.6302, time 126.67ms
iter 345340: loss 5.6596, time 126.15ms
iter 345350: loss 5.6420, time 128.46ms
iter 345360: loss 5.9035, time 126.34ms
iter 345370: loss 6.2328, time 125.82ms
iter 345380: loss 5.6353, time 126.02ms
iter 345390: loss 6.8410, time 125.77ms
iter 345400: loss 5.4937, time 126.11ms
iter 345410: loss 6.0469, time 126.82ms
iter 345420: loss 5.8971, time 125.77ms
iter 345430: loss 6.6077, time 125.71ms
iter 345440: loss 6.5332, time 125.63ms
iter 345450: loss 5.9203, time 125.64ms
iter 345460: loss 5.7924, time 126.04ms
iter 345470: loss 5.7054, time 129.84ms
iter 345480: loss 5.7777, time 125.82ms
iter 345490: loss 6.0686, time 128.04ms
step 345500: train loss 5.6112, val loss 5.6637
saving checkpoint to out-shakespeare-char
iter 345500: loss 5.5744, time 2880.88ms
iter 345510: loss 6.0355, time 121.70ms
iter 345520: loss 6.2606, time 121.67ms
iter 345530: loss 6.2782, time 121.40ms
iter 345540: loss 5.6834, time 121.12ms
iter 345550: loss 6.1309, time 122.09ms
iter 345560: loss 6.0759, time 121.48ms
iter 345570: loss 5.5397, time 121.91ms
iter 345580: loss 5.7976, time 121.48ms
iter 345590: loss 6.4785, time 121.49ms
iter 345600: loss 5.8015, time 121.51ms
iter 345610: loss 5.9821, time 121.86ms
iter 345620: loss 5.9360, time 121.23ms
iter 345630: loss 5.4199, time 121.96ms
iter 345640: loss 6.4175, time 120.24ms
iter 345650: loss 6.2708, time 121.82ms
iter 345660: loss 6.9320, time 121.98ms
iter 345670: loss 6.6358, time 121.47ms
iter 345680: loss 5.5979, time 121.68ms
iter 345690: loss 6.3190, time 121.53ms
iter 345700: loss 5.5311, time 121.52ms
iter 345710: loss 5.7554, time 121.58ms
iter 345720: loss 5.3188, time 120.54ms
iter 345730: loss 5.5822, time 122.08ms
iter 345740: loss 5.7405, time 121.67ms
step 345750: train loss 5.5987, val loss 5.5727
saving checkpoint to out-shakespeare-char
iter 345750: loss 5.7207, time 2884.67ms
iter 345760: loss 5.8618, time 121.55ms
iter 345770: loss 6.3112, time 121.85ms
iter 345780: loss 6.3176, time 121.74ms
iter 345790: loss 6.2180, time 121.58ms
iter 345800: loss 5.8234, time 120.75ms
iter 345810: loss 6.0044, time 121.47ms
iter 345820: loss 5.5427, time 120.48ms
iter 345830: loss 6.1810, time 121.58ms
iter 345840: loss 6.0636, time 121.32ms
iter 345850: loss 6.0863, time 120.64ms
iter 345860: loss 6.4013, time 120.67ms
iter 345870: loss 5.8151, time 121.47ms
iter 345880: loss 6.2421, time 121.68ms
iter 345890: loss 6.4112, time 121.99ms
iter 345900: loss 6.0472, time 121.50ms
iter 345910: loss 5.9619, time 121.36ms
iter 345920: loss 5.0868, time 121.44ms
iter 345930: loss 6.0416, time 121.42ms
iter 345940: loss 6.3184, time 121.51ms
iter 345950: loss 6.0219, time 121.33ms
iter 345960: loss 5.4461, time 121.32ms
iter 345970: loss 6.3055, time 121.51ms
iter 345980: loss 5.7266, time 121.45ms
iter 345990: loss 6.1294, time 121.45ms
step 346000: train loss 5.5949, val loss 5.6284
saving checkpoint to out-shakespeare-char
iter 346000: loss 5.9145, time 2884.41ms
iter 346010: loss 6.2678, time 121.19ms
iter 346020: loss 6.5903, time 121.73ms
iter 346030: loss 5.7058, time 120.96ms
iter 346040: loss 6.4617, time 120.10ms
iter 346050: loss 5.6123, time 121.42ms
iter 346060: loss 5.7226, time 121.39ms
iter 346070: loss 5.3730, time 121.46ms
iter 346080: loss 5.7227, time 120.95ms
iter 346090: loss 6.9146, time 120.58ms
iter 346100: loss 5.7952, time 120.42ms
iter 346110: loss 6.1836, time 121.24ms
iter 346120: loss 5.4593, time 121.50ms
iter 346130: loss 5.5100, time 120.59ms
iter 346140: loss 5.2099, time 121.32ms
iter 346150: loss 5.7767, time 121.64ms
iter 346160: loss 6.7017, time 121.55ms
iter 346170: loss 6.2348, time 121.51ms
iter 346180: loss 5.7773, time 121.42ms
iter 346190: loss 6.2974, time 121.43ms
iter 346200: loss 5.7645, time 121.47ms
iter 346210: loss 6.0234, time 121.44ms
iter 346220: loss 4.9275, time 121.41ms
iter 346230: loss 5.8988, time 121.53ms
iter 346240: loss 5.9262, time 120.86ms
step 346250: train loss 5.5709, val loss 5.5857
saving checkpoint to out-shakespeare-char
iter 346250: loss 6.0996, time 2885.14ms
iter 346260: loss 6.1284, time 121.66ms
iter 346270: loss 5.8040, time 122.99ms
iter 346280: loss 5.5704, time 121.44ms
iter 346290: loss 5.8073, time 122.81ms
iter 346300: loss 6.2213, time 121.57ms
iter 346310: loss 5.7277, time 122.73ms
iter 346320: loss 5.4690, time 121.93ms
iter 346330: loss 5.7226, time 123.58ms
iter 346340: loss 5.7926, time 121.59ms
iter 346350: loss 5.8187, time 122.82ms
iter 346360: loss 5.4337, time 121.59ms
iter 346370: loss 6.4088, time 122.25ms
iter 346380: loss 6.1160, time 121.68ms
iter 346390: loss 6.3321, time 122.71ms
iter 346400: loss 6.0663, time 122.11ms
iter 346410: loss 6.2371, time 123.05ms
iter 346420: loss 5.4663, time 121.52ms
iter 346430: loss 6.4338, time 123.01ms
iter 346440: loss 6.2721, time 121.65ms
iter 346450: loss 6.0954, time 122.68ms
iter 346460: loss 5.6905, time 121.37ms
iter 346470: loss 6.6293, time 122.69ms
iter 346480: loss 5.5239, time 121.73ms
iter 346490: loss 5.9443, time 122.66ms
step 346500: train loss 5.6674, val loss 5.5641
saving checkpoint to out-shakespeare-char
iter 346500: loss 5.7560, time 2887.12ms
iter 346510: loss 5.9580, time 122.79ms
iter 346520: loss 5.4975, time 121.35ms
iter 346530: loss 5.8769, time 121.30ms
iter 346540: loss 5.8670, time 119.73ms
iter 346550: loss 6.7340, time 120.79ms
iter 346560: loss 6.1608, time 119.71ms
iter 346570: loss 5.7187, time 120.95ms
iter 346580: loss 6.6443, time 119.98ms
iter 346590: loss 6.1799, time 121.32ms
iter 346600: loss 6.9738, time 120.03ms
iter 346610: loss 6.2079, time 121.17ms
iter 346620: loss 6.2837, time 120.53ms
iter 346630: loss 6.1202, time 122.40ms
iter 346640: loss 6.6038, time 119.80ms
iter 346650: loss 5.8574, time 122.98ms
iter 346660: loss 5.6550, time 122.00ms
iter 346670: loss 6.4961, time 123.11ms
iter 346680: loss 5.7741, time 122.33ms
iter 346690: loss 6.3824, time 126.13ms
iter 346700: loss 6.4687, time 127.81ms
iter 346710: loss 6.2991, time 125.35ms
iter 346720: loss 5.8521, time 124.97ms
iter 346730: loss 6.3827, time 125.71ms
iter 346740: loss 6.1588, time 124.54ms
step 346750: train loss 5.5695, val loss 5.6016
saving checkpoint to out-shakespeare-char
iter 346750: loss 6.2720, time 2848.81ms
iter 346760: loss 6.4733, time 125.73ms
iter 346770: loss 5.7057, time 125.59ms
iter 346780: loss 5.5678, time 125.87ms
iter 346790: loss 4.9503, time 125.86ms
iter 346800: loss 6.1258, time 125.44ms
iter 346810: loss 6.5357, time 125.35ms
iter 346820: loss 6.2637, time 127.79ms
iter 346830: loss 5.6876, time 125.26ms
iter 346840: loss 5.9194, time 125.18ms
iter 346850: loss 6.1265, time 126.25ms
iter 346860: loss 5.9234, time 125.24ms
iter 346870: loss 5.2678, time 125.33ms
iter 346880: loss 6.2066, time 125.36ms
iter 346890: loss 5.2967, time 125.63ms
iter 346900: loss 6.2195, time 125.34ms
iter 346910: loss 6.4987, time 125.35ms
iter 346920: loss 6.5061, time 125.08ms
iter 346930: loss 5.9260, time 125.33ms
iter 346940: loss 5.6861, time 125.25ms
iter 346950: loss 5.5372, time 125.22ms
iter 346960: loss 6.6073, time 125.56ms
iter 346970: loss 5.9041, time 127.86ms
iter 346980: loss 6.2610, time 125.28ms
iter 346990: loss 6.5914, time 125.26ms
step 347000: train loss 5.6440, val loss 5.5990
saving checkpoint to out-shakespeare-char
iter 347000: loss 6.1504, time 2909.67ms
iter 347010: loss 6.2122, time 121.52ms
iter 347020: loss 5.5415, time 120.50ms
iter 347030: loss 6.6495, time 120.90ms
iter 347040: loss 6.3611, time 121.66ms
iter 347050: loss 5.7493, time 121.83ms
iter 347060: loss 6.4584, time 121.57ms
iter 347070: loss 6.5059, time 121.76ms
iter 347080: loss 6.2156, time 121.66ms
iter 347090: loss 6.1690, time 121.98ms
iter 347100: loss 6.2896, time 121.67ms
iter 347110: loss 6.1969, time 122.38ms
iter 347120: loss 6.0896, time 120.75ms
iter 347130: loss 5.3616, time 120.41ms
iter 347140: loss 6.4419, time 121.68ms
iter 347150: loss 6.0778, time 121.76ms
iter 347160: loss 5.6063, time 121.45ms
iter 347170: loss 5.7522, time 122.09ms
iter 347180: loss 6.5913, time 122.82ms
iter 347190: loss 5.4784, time 122.07ms
iter 347200: loss 6.1060, time 121.59ms
iter 347210: loss 6.0038, time 121.96ms
iter 347220: loss 5.8206, time 120.53ms
iter 347230: loss 5.7779, time 121.04ms
iter 347240: loss 5.8689, time 121.75ms
step 347250: train loss 5.6071, val loss 5.5777
saving checkpoint to out-shakespeare-char
iter 347250: loss 5.9091, time 2915.65ms
iter 347260: loss 5.2030, time 125.94ms
iter 347270: loss 5.6629, time 125.66ms
iter 347280: loss 6.4694, time 125.79ms
iter 347290: loss 6.2673, time 126.04ms
iter 347300: loss 6.8219, time 125.70ms
iter 347310: loss 5.8030, time 125.83ms
iter 347320: loss 5.1726, time 126.12ms
iter 347330: loss 6.2750, time 126.13ms
iter 347340: loss 5.6162, time 126.00ms
iter 347350: loss 5.3217, time 125.09ms
iter 347360: loss 5.7776, time 125.98ms
iter 347370: loss 6.3492, time 126.29ms
iter 347380: loss 5.3961, time 125.85ms
iter 347390: loss 5.5210, time 127.16ms
iter 347400: loss 6.1593, time 126.06ms
iter 347410: loss 6.1123, time 126.32ms
iter 347420: loss 5.7116, time 125.63ms
iter 347430: loss 5.8271, time 125.61ms
iter 347440: loss 5.6999, time 125.78ms
iter 347450: loss 6.1092, time 125.08ms
iter 347460: loss 6.1969, time 125.60ms
iter 347470: loss 6.0801, time 125.25ms
iter 347480: loss 6.2550, time 125.28ms
iter 347490: loss 6.0784, time 124.76ms
step 347500: train loss 5.6133, val loss 5.5891
saving checkpoint to out-shakespeare-char
iter 347500: loss 6.6618, time 2891.10ms
iter 347510: loss 6.8520, time 126.12ms
iter 347520: loss 5.6671, time 128.18ms
iter 347530: loss 5.6945, time 125.60ms
iter 347540: loss 5.9785, time 125.53ms
iter 347550: loss 6.0050, time 125.74ms
iter 347560: loss 6.7224, time 125.96ms
iter 347570: loss 5.3253, time 125.76ms
iter 347580: loss 6.3897, time 125.56ms
iter 347590: loss 5.5389, time 125.62ms
iter 347600: loss 6.3658, time 125.64ms
iter 347610: loss 5.4791, time 125.63ms
iter 347620: loss 6.1724, time 125.66ms
iter 347630: loss 5.5973, time 125.65ms
iter 347640: loss 6.1067, time 128.11ms
iter 347650: loss 5.1771, time 125.53ms
iter 347660: loss 6.3661, time 125.60ms
iter 347670: loss 5.3236, time 125.13ms
iter 347680: loss 5.6943, time 125.70ms
iter 347690: loss 6.1710, time 125.44ms
iter 347700: loss 5.9160, time 126.21ms
iter 347710: loss 5.2186, time 125.80ms
iter 347720: loss 5.7004, time 125.72ms
iter 347730: loss 5.7711, time 128.34ms
iter 347740: loss 5.8311, time 125.70ms
step 347750: train loss 5.5600, val loss 5.6269
saving checkpoint to out-shakespeare-char
iter 347750: loss 5.5052, time 2899.96ms
iter 347760: loss 6.0505, time 125.22ms
iter 347770: loss 5.2964, time 125.05ms
iter 347780: loss 5.5910, time 124.46ms
iter 347790: loss 6.1578, time 125.31ms
iter 347800: loss 5.6572, time 125.32ms
iter 347810: loss 6.2911, time 125.22ms
iter 347820: loss 5.5108, time 124.95ms
iter 347830: loss 6.9040, time 125.05ms
iter 347840: loss 6.3671, time 125.16ms
iter 347850: loss 6.7124, time 127.92ms
iter 347860: loss 6.1141, time 125.52ms
iter 347870: loss 5.9112, time 124.50ms
iter 347880: loss 5.6290, time 125.71ms
iter 347890: loss 6.0857, time 125.46ms
iter 347900: loss 6.6474, time 125.57ms
iter 347910: loss 5.2658, time 125.55ms
iter 347920: loss 5.7593, time 125.51ms
iter 347930: loss 6.0076, time 125.66ms
iter 347940: loss 5.8657, time 124.85ms
iter 347950: loss 5.8707, time 125.65ms
iter 347960: loss 6.0571, time 128.17ms
iter 347970: loss 6.3855, time 125.56ms
iter 347980: loss 5.8864, time 125.64ms
iter 347990: loss 6.2195, time 125.82ms
step 348000: train loss 5.5724, val loss 5.6006
saving checkpoint to out-shakespeare-char
iter 348000: loss 5.7895, time 2880.78ms
iter 348010: loss 5.6068, time 124.18ms
iter 348020: loss 6.4799, time 122.09ms
iter 348030: loss 6.4582, time 123.04ms
iter 348040: loss 5.6635, time 121.65ms
iter 348050: loss 6.1939, time 124.08ms
iter 348060: loss 6.6330, time 121.74ms
iter 348070: loss 6.1403, time 124.63ms
iter 348080: loss 6.1112, time 121.58ms
iter 348090: loss 6.0094, time 124.10ms
iter 348100: loss 6.3582, time 122.02ms
iter 348110: loss 5.9477, time 124.39ms
iter 348120: loss 5.4575, time 121.94ms
iter 348130: loss 6.0760, time 123.41ms
iter 348140: loss 5.6984, time 122.11ms
iter 348150: loss 5.9294, time 124.15ms
iter 348160: loss 5.4847, time 122.03ms
iter 348170: loss 5.5100, time 124.38ms
iter 348180: loss 6.4292, time 121.96ms
iter 348190: loss 5.7321, time 124.10ms
iter 348200: loss 5.9442, time 122.19ms
iter 348210: loss 5.9656, time 124.21ms
iter 348220: loss 6.0736, time 122.04ms
iter 348230: loss 5.6642, time 123.56ms
iter 348240: loss 6.3611, time 122.06ms
step 348250: train loss 5.5877, val loss 5.6034
saving checkpoint to out-shakespeare-char
iter 348250: loss 6.6712, time 2881.43ms
iter 348260: loss 6.6768, time 121.43ms
iter 348270: loss 6.5342, time 121.76ms
iter 348280: loss 5.0771, time 121.86ms
iter 348290: loss 6.5857, time 121.56ms
iter 348300: loss 6.6564, time 121.66ms
iter 348310: loss 6.3435, time 121.51ms
iter 348320: loss 6.6785, time 121.13ms
iter 348330: loss 5.9192, time 121.60ms
iter 348340: loss 5.8848, time 121.67ms
iter 348350: loss 6.2620, time 121.21ms
iter 348360: loss 6.4236, time 121.57ms
iter 348370: loss 5.9938, time 121.70ms
iter 348380: loss 5.8777, time 121.60ms
iter 348390: loss 6.4610, time 121.69ms
iter 348400: loss 5.5957, time 120.52ms
iter 348410: loss 6.2525, time 122.05ms
iter 348420: loss 6.4862, time 121.84ms
iter 348430: loss 6.5643, time 121.68ms
iter 348440: loss 5.7598, time 121.69ms
iter 348450: loss 6.3282, time 121.48ms
iter 348460: loss 5.9442, time 121.53ms
iter 348470: loss 5.6133, time 121.34ms
iter 348480: loss 6.0555, time 121.35ms
iter 348490: loss 6.3476, time 121.87ms
step 348500: train loss 5.6020, val loss 5.5726
saving checkpoint to out-shakespeare-char
iter 348500: loss 5.8249, time 2866.40ms
iter 348510: loss 6.2414, time 121.92ms
iter 348520: loss 5.7073, time 121.91ms
iter 348530: loss 5.6626, time 120.61ms
iter 348540: loss 6.1012, time 121.87ms
iter 348550: loss 6.1901, time 121.88ms
iter 348560: loss 5.8218, time 121.56ms
iter 348570: loss 6.7652, time 121.41ms
iter 348580: loss 5.9655, time 121.29ms
iter 348590: loss 5.9924, time 122.12ms
iter 348600: loss 6.0558, time 121.77ms
iter 348610: loss 6.1514, time 121.74ms
iter 348620: loss 6.1606, time 121.29ms
iter 348630: loss 6.5588, time 121.87ms
iter 348640: loss 5.9074, time 121.97ms
iter 348650: loss 6.0088, time 122.48ms
iter 348660: loss 5.7069, time 121.76ms
iter 348670: loss 5.7589, time 121.60ms
iter 348680: loss 5.3865, time 121.68ms
iter 348690: loss 6.2064, time 121.88ms
iter 348700: loss 5.7357, time 121.72ms
iter 348710: loss 6.1445, time 121.70ms
iter 348720: loss 6.3441, time 121.46ms
iter 348730: loss 6.3730, time 121.81ms
iter 348740: loss 5.9618, time 121.86ms
step 348750: train loss 5.6705, val loss 5.6149
saving checkpoint to out-shakespeare-char
iter 348750: loss 6.2099, time 2866.53ms
iter 348760: loss 5.5942, time 121.79ms
iter 348770: loss 6.1954, time 121.82ms
iter 348780: loss 5.8851, time 121.85ms
iter 348790: loss 5.5296, time 121.57ms
iter 348800: loss 6.4826, time 121.56ms
iter 348810: loss 6.6051, time 121.75ms
iter 348820: loss 6.2760, time 121.75ms
iter 348830: loss 5.7996, time 121.55ms
iter 348840: loss 6.1612, time 121.62ms
iter 348850: loss 5.7956, time 121.69ms
iter 348860: loss 6.1851, time 121.67ms
iter 348870: loss 6.0023, time 121.67ms
iter 348880: loss 6.0366, time 121.92ms
iter 348890: loss 5.7111, time 121.57ms
iter 348900: loss 6.1978, time 121.55ms
iter 348910: loss 6.0591, time 121.13ms
iter 348920: loss 5.8287, time 121.82ms
iter 348930: loss 5.8917, time 121.14ms
iter 348940: loss 6.0081, time 121.26ms
iter 348950: loss 5.2405, time 121.26ms
iter 348960: loss 6.4171, time 121.68ms
iter 348970: loss 6.4960, time 121.43ms
iter 348980: loss 5.5508, time 121.54ms
iter 348990: loss 5.8154, time 121.34ms
step 349000: train loss 5.6183, val loss 5.6051
saving checkpoint to out-shakespeare-char
iter 349000: loss 6.5002, time 2863.42ms
iter 349010: loss 6.6129, time 121.60ms
iter 349020: loss 5.3090, time 123.98ms
iter 349030: loss 5.5733, time 121.54ms
iter 349040: loss 6.6378, time 123.34ms
iter 349050: loss 5.0976, time 121.36ms
iter 349060: loss 6.7201, time 123.47ms
iter 349070: loss 5.8204, time 121.75ms
iter 349080: loss 6.3012, time 123.58ms
iter 349090: loss 5.6532, time 121.93ms
iter 349100: loss 5.4966, time 123.50ms
iter 349110: loss 6.5895, time 121.68ms
iter 349120: loss 5.5672, time 124.21ms
iter 349130: loss 6.0561, time 121.49ms
iter 349140: loss 5.5108, time 123.92ms
iter 349150: loss 6.3003, time 121.39ms
iter 349160: loss 5.6457, time 123.83ms
iter 349170: loss 6.1408, time 121.85ms
iter 349180: loss 6.3637, time 123.80ms
iter 349190: loss 6.0405, time 121.68ms
iter 349200: loss 6.2018, time 124.22ms
iter 349210: loss 6.0893, time 121.74ms
iter 349220: loss 6.0558, time 123.87ms
iter 349230: loss 6.1571, time 121.39ms
iter 349240: loss 6.1469, time 124.20ms
step 349250: train loss 5.5971, val loss 5.5515
saving checkpoint to out-shakespeare-char
iter 349250: loss 6.4801, time 2874.59ms
iter 349260: loss 5.8244, time 124.97ms
iter 349270: loss 6.0318, time 128.28ms
iter 349280: loss 6.3496, time 126.17ms
iter 349290: loss 5.8485, time 125.73ms
iter 349300: loss 6.5775, time 125.91ms
iter 349310: loss 6.5873, time 125.49ms
iter 349320: loss 5.5908, time 125.70ms
iter 349330: loss 6.1734, time 125.70ms
iter 349340: loss 5.9779, time 125.49ms
iter 349350: loss 6.4335, time 125.43ms
iter 349360: loss 6.7995, time 125.51ms
iter 349370: loss 6.6731, time 125.66ms
iter 349380: loss 5.8033, time 125.77ms
iter 349390: loss 5.9833, time 125.44ms
iter 349400: loss 5.8884, time 125.63ms
iter 349410: loss 6.3139, time 125.83ms
iter 349420: loss 5.8331, time 128.27ms
iter 349430: loss 5.9433, time 125.64ms
iter 349440: loss 6.5201, time 125.87ms
iter 349450: loss 5.9668, time 125.88ms
iter 349460: loss 6.4159, time 125.87ms
iter 349470: loss 6.6084, time 125.62ms
iter 349480: loss 5.6084, time 125.77ms
iter 349490: loss 5.4053, time 126.12ms
step 349500: train loss 5.6623, val loss 5.6492
saving checkpoint to out-shakespeare-char
iter 349500: loss 6.5041, time 2920.63ms
iter 349510: loss 6.0559, time 126.40ms
iter 349520: loss 6.2879, time 125.70ms
iter 349530: loss 6.1806, time 127.38ms
iter 349540: loss 5.9116, time 124.45ms
iter 349550: loss 6.0095, time 125.30ms
iter 349560: loss 6.1981, time 125.21ms
iter 349570: loss 6.0320, time 124.82ms
iter 349580: loss 6.7946, time 125.95ms
iter 349590: loss 6.3613, time 125.66ms
iter 349600: loss 6.1347, time 125.58ms
iter 349610: loss 6.5892, time 125.96ms
iter 349620: loss 5.8523, time 125.95ms
iter 349630: loss 5.6007, time 125.60ms
iter 349640: loss 7.0513, time 127.98ms
iter 349650: loss 6.0342, time 125.82ms
iter 349660: loss 5.6857, time 125.42ms
iter 349670: loss 5.9298, time 125.70ms
iter 349680: loss 6.1245, time 125.25ms
iter 349690: loss 6.9502, time 125.39ms
iter 349700: loss 5.7533, time 124.79ms
iter 349710: loss 6.0733, time 125.67ms
iter 349720: loss 5.9379, time 125.53ms
iter 349730: loss 6.4449, time 125.35ms
iter 349740: loss 5.8123, time 125.86ms
step 349750: train loss 5.5769, val loss 5.6319
saving checkpoint to out-shakespeare-char
iter 349750: loss 5.6410, time 2899.62ms
iter 349760: loss 5.7003, time 126.81ms
iter 349770: loss 5.4644, time 127.81ms
iter 349780: loss 6.0780, time 125.03ms
iter 349790: loss 6.1840, time 126.23ms
iter 349800: loss 5.9544, time 126.19ms
iter 349810: loss 5.8416, time 125.36ms
iter 349820: loss 5.6313, time 124.97ms
iter 349830: loss 6.1043, time 126.32ms
iter 349840: loss 6.4197, time 125.91ms
iter 349850: loss 5.7471, time 125.84ms
iter 349860: loss 5.4260, time 125.77ms
iter 349870: loss 6.0169, time 126.22ms
iter 349880: loss 6.0761, time 127.61ms
iter 349890: loss 6.1885, time 125.76ms
iter 349900: loss 6.2328, time 126.23ms
iter 349910: loss 5.7786, time 125.71ms
iter 349920: loss 6.6129, time 125.66ms
iter 349930: loss 6.1788, time 125.49ms
iter 349940: loss 5.5780, time 125.56ms
iter 349950: loss 6.4192, time 126.15ms
iter 349960: loss 6.2008, time 125.81ms
iter 349970: loss 5.2040, time 125.77ms
iter 349980: loss 6.6491, time 125.38ms
iter 349990: loss 5.6874, time 128.13ms
step 350000: train loss 5.6522, val loss 5.5780
saving checkpoint to out-shakespeare-char
iter 350000: loss 5.7157, time 2882.66ms
iter 350010: loss 6.4524, time 125.71ms
iter 350020: loss 5.5397, time 125.77ms
iter 350030: loss 5.8205, time 125.09ms
iter 350040: loss 5.5455, time 125.33ms
iter 350050: loss 6.1265, time 125.57ms
iter 350060: loss 6.0317, time 125.61ms
iter 350070: loss 5.2202, time 125.76ms
iter 350080: loss 6.0684, time 125.53ms
iter 350090: loss 5.8719, time 125.39ms
iter 350100: loss 5.2059, time 125.69ms
iter 350110: loss 5.4879, time 127.96ms
iter 350120: loss 6.0441, time 124.54ms
iter 350130: loss 5.2169, time 125.50ms
iter 350140: loss 5.4945, time 125.56ms
iter 350150: loss 6.0097, time 125.51ms
iter 350160: loss 6.2290, time 125.58ms
iter 350170: loss 6.5283, time 125.43ms
iter 350180: loss 6.2749, time 125.63ms
iter 350190: loss 5.9307, time 125.53ms
iter 350200: loss 5.9117, time 124.36ms
iter 350210: loss 5.5267, time 125.64ms
iter 350220: loss 6.2134, time 128.42ms
iter 350230: loss 6.3942, time 125.45ms
iter 350240: loss 5.0287, time 125.62ms
step 350250: train loss 5.6043, val loss 5.6685
saving checkpoint to out-shakespeare-char
iter 350250: loss 5.8391, time 2873.14ms
iter 350260: loss 6.4227, time 125.44ms
iter 350270: loss 6.2709, time 125.46ms
iter 350280: loss 5.6723, time 125.73ms
iter 350290: loss 5.9094, time 128.08ms
iter 350300: loss 4.7295, time 125.23ms
iter 350310: loss 6.5420, time 125.54ms
iter 350320: loss 5.9843, time 124.61ms
iter 350330: loss 5.9062, time 125.85ms
iter 350340: loss 6.0775, time 125.58ms
iter 350350: loss 6.9222, time 125.38ms
iter 350360: loss 6.4809, time 125.58ms
iter 350370: loss 5.3538, time 125.77ms
iter 350380: loss 6.6109, time 125.45ms
iter 350390: loss 6.2360, time 125.74ms
iter 350400: loss 5.7966, time 127.34ms
iter 350410: loss 6.0843, time 125.34ms
iter 350420: loss 6.4209, time 124.14ms
iter 350430: loss 5.7724, time 124.51ms
iter 350440: loss 6.2774, time 124.27ms
iter 350450: loss 6.3331, time 125.15ms
iter 350460: loss 6.5473, time 124.76ms
iter 350470: loss 5.9066, time 124.44ms
iter 350480: loss 5.9590, time 124.22ms
iter 350490: loss 4.9460, time 125.19ms
step 350500: train loss 5.5857, val loss 5.6076
saving checkpoint to out-shakespeare-char
iter 350500: loss 6.2216, time 2901.66ms
iter 350510: loss 5.9647, time 127.84ms
iter 350520: loss 5.8119, time 126.12ms
iter 350530: loss 5.1439, time 126.27ms
iter 350540: loss 5.5707, time 125.92ms
iter 350550: loss 5.5797, time 126.03ms
iter 350560: loss 5.8561, time 125.98ms
iter 350570: loss 5.9844, time 124.90ms
iter 350580: loss 6.4799, time 126.28ms
iter 350590: loss 6.3687, time 126.06ms
iter 350600: loss 4.9825, time 125.88ms
iter 350610: loss 5.8686, time 126.20ms
iter 350620: loss 5.5290, time 128.50ms
iter 350630: loss 6.2349, time 125.83ms
iter 350640: loss 6.4717, time 126.31ms
iter 350650: loss 6.3184, time 126.20ms
iter 350660: loss 5.6673, time 125.90ms
iter 350670: loss 6.2581, time 125.99ms
iter 350680: loss 6.2066, time 125.84ms
iter 350690: loss 6.1149, time 126.07ms
iter 350700: loss 6.0497, time 125.81ms
iter 350710: loss 6.3142, time 126.03ms
iter 350720: loss 5.5164, time 126.04ms
iter 350730: loss 6.4000, time 128.36ms
iter 350740: loss 5.5668, time 125.93ms
step 350750: train loss 5.5632, val loss 5.6005
saving checkpoint to out-shakespeare-char
iter 350750: loss 6.3742, time 2889.75ms
iter 350760: loss 6.6752, time 124.63ms
iter 350770: loss 6.5806, time 127.59ms
iter 350780: loss 6.2449, time 125.29ms
iter 350790: loss 5.5259, time 125.20ms
iter 350800: loss 5.7759, time 125.45ms
iter 350810: loss 6.5410, time 125.20ms
iter 350820: loss 5.4438, time 125.69ms
iter 350830: loss 5.5122, time 125.80ms
iter 350840: loss 5.4905, time 125.80ms
iter 350850: loss 5.8564, time 125.73ms
iter 350860: loss 5.7596, time 125.30ms
iter 350870: loss 5.5742, time 125.40ms
iter 350880: loss 6.0981, time 127.67ms
iter 350890: loss 6.5864, time 125.36ms
iter 350900: loss 6.0881, time 125.28ms
iter 350910: loss 5.8494, time 125.55ms
iter 350920: loss 5.8322, time 125.34ms
iter 350930: loss 5.9947, time 125.62ms
iter 350940: loss 5.6415, time 125.19ms
iter 350950: loss 5.8323, time 125.34ms
iter 350960: loss 5.7081, time 125.71ms
iter 350970: loss 5.8008, time 125.35ms
iter 350980: loss 5.7996, time 125.26ms
iter 350990: loss 6.2127, time 125.45ms
step 351000: train loss 5.5552, val loss 5.5718
saving checkpoint to out-shakespeare-char
iter 351000: loss 6.2419, time 2895.01ms
iter 351010: loss 6.3484, time 125.09ms
iter 351020: loss 6.0531, time 125.32ms
iter 351030: loss 5.6293, time 125.37ms
iter 351040: loss 5.8897, time 125.09ms
iter 351050: loss 5.5643, time 125.34ms
iter 351060: loss 5.4743, time 125.26ms
iter 351070: loss 6.3017, time 127.70ms
iter 351080: loss 5.3883, time 125.11ms
iter 351090: loss 6.0194, time 125.05ms
iter 351100: loss 6.1181, time 125.97ms
iter 351110: loss 5.8609, time 125.17ms
iter 351120: loss 6.7353, time 125.09ms
iter 351130: loss 5.9395, time 125.08ms
iter 351140: loss 6.1985, time 125.30ms
iter 351150: loss 6.0168, time 125.54ms
iter 351160: loss 6.2187, time 126.04ms
iter 351170: loss 5.7117, time 125.33ms
iter 351180: loss 5.1095, time 128.58ms
iter 351190: loss 6.9508, time 125.28ms
iter 351200: loss 6.4790, time 125.14ms
iter 351210: loss 6.1045, time 126.22ms
iter 351220: loss 5.8040, time 125.58ms
iter 351230: loss 5.6596, time 125.27ms
iter 351240: loss 6.2301, time 125.87ms
step 351250: train loss 5.5882, val loss 5.6025
saving checkpoint to out-shakespeare-char
iter 351250: loss 6.1499, time 2884.08ms
iter 351260: loss 5.8414, time 128.53ms
iter 351270: loss 6.0365, time 125.56ms
iter 351280: loss 5.6692, time 124.81ms
iter 351290: loss 6.0317, time 125.78ms
iter 351300: loss 6.1875, time 128.08ms
iter 351310: loss 5.5288, time 125.60ms
iter 351320: loss 5.8288, time 125.49ms
iter 351330: loss 5.6026, time 125.93ms
iter 351340: loss 6.2245, time 125.49ms
iter 351350: loss 6.2324, time 125.63ms
iter 351360: loss 6.4009, time 125.38ms
iter 351370: loss 5.5089, time 127.89ms
iter 351380: loss 6.0197, time 125.87ms
iter 351390: loss 6.4957, time 126.01ms
iter 351400: loss 5.8777, time 125.60ms
iter 351410: loss 6.1769, time 125.65ms
iter 351420: loss 5.9468, time 125.59ms
iter 351430: loss 6.7332, time 125.66ms
iter 351440: loss 6.0656, time 125.62ms
iter 351450: loss 5.5972, time 125.48ms
iter 351460: loss 5.8140, time 125.49ms
iter 351470: loss 5.9348, time 126.00ms
iter 351480: loss 6.6403, time 127.98ms
iter 351490: loss 6.1555, time 126.79ms
step 351500: train loss 5.6030, val loss 5.6097
saving checkpoint to out-shakespeare-char
iter 351500: loss 5.2456, time 2883.31ms
iter 351510: loss 6.2882, time 125.26ms
iter 351520: loss 5.7967, time 125.78ms
iter 351530: loss 6.2579, time 125.99ms
iter 351540: loss 5.7990, time 125.67ms
iter 351550: loss 5.7576, time 125.82ms
iter 351560: loss 5.5927, time 127.12ms
iter 351570: loss 5.7951, time 125.52ms
iter 351580: loss 5.9345, time 125.63ms
iter 351590: loss 5.2995, time 125.96ms
iter 351600: loss 6.0490, time 128.02ms
iter 351610: loss 5.9892, time 124.81ms
iter 351620: loss 5.2142, time 125.63ms
iter 351630: loss 5.5189, time 125.92ms
iter 351640: loss 5.4763, time 128.63ms
iter 351650: loss 5.5519, time 125.93ms
iter 351660: loss 5.9207, time 125.68ms
iter 351670: loss 6.0609, time 125.66ms
iter 351680: loss 5.8788, time 125.70ms
iter 351690: loss 5.8888, time 124.83ms
iter 351700: loss 6.0640, time 125.61ms
iter 351710: loss 5.9409, time 126.77ms
iter 351720: loss 6.1131, time 124.55ms
iter 351730: loss 5.6482, time 125.99ms
iter 351740: loss 6.5580, time 126.24ms
step 351750: train loss 5.6267, val loss 5.6419
saving checkpoint to out-shakespeare-char
iter 351750: loss 5.8450, time 2881.54ms
iter 351760: loss 6.4736, time 125.28ms
iter 351770: loss 5.7763, time 125.21ms
iter 351780: loss 6.1856, time 124.90ms
iter 351790: loss 6.3447, time 125.08ms
iter 351800: loss 6.6121, time 128.26ms
iter 351810: loss 6.2833, time 124.81ms
iter 351820: loss 6.0652, time 125.18ms
iter 351830: loss 6.0449, time 125.47ms
iter 351840: loss 5.4807, time 127.61ms
iter 351850: loss 6.0497, time 125.58ms
iter 351860: loss 6.5450, time 126.38ms
iter 351870: loss 5.4841, time 125.60ms
iter 351880: loss 5.9573, time 125.93ms
iter 351890: loss 5.7083, time 125.46ms
iter 351900: loss 5.7956, time 125.56ms
iter 351910: loss 6.8879, time 125.63ms
iter 351920: loss 6.0992, time 125.41ms
iter 351930: loss 6.1636, time 125.54ms
iter 351940: loss 5.5065, time 125.56ms
iter 351950: loss 5.8968, time 127.98ms
iter 351960: loss 6.2547, time 125.76ms
iter 351970: loss 5.7221, time 125.67ms
iter 351980: loss 5.9038, time 124.82ms
iter 351990: loss 6.6644, time 127.96ms
step 352000: train loss 5.5989, val loss 5.6435
saving checkpoint to out-shakespeare-char
iter 352000: loss 6.0699, time 2897.38ms
iter 352010: loss 5.9676, time 124.71ms
iter 352020: loss 5.3555, time 124.93ms
iter 352030: loss 6.0995, time 127.65ms
iter 352040: loss 5.9796, time 125.11ms
iter 352050: loss 5.6285, time 125.14ms
iter 352060: loss 5.8428, time 124.83ms
iter 352070: loss 5.9092, time 125.35ms
iter 352080: loss 6.1104, time 125.22ms
iter 352090: loss 6.0044, time 125.36ms
iter 352100: loss 5.1630, time 125.30ms
iter 352110: loss 6.4211, time 125.08ms
iter 352120: loss 5.9515, time 124.95ms
iter 352130: loss 5.7972, time 124.96ms
iter 352140: loss 6.3564, time 127.78ms
iter 352150: loss 6.3841, time 124.96ms
iter 352160: loss 6.2574, time 125.08ms
iter 352170: loss 6.1662, time 125.88ms
iter 352180: loss 5.4530, time 125.04ms
iter 352190: loss 6.3250, time 125.44ms
iter 352200: loss 6.3958, time 125.06ms
iter 352210: loss 5.6900, time 125.69ms
iter 352220: loss 6.1517, time 125.10ms
iter 352230: loss 6.0582, time 125.61ms
iter 352240: loss 5.6132, time 125.34ms
step 352250: train loss 5.5849, val loss 5.5679
saving checkpoint to out-shakespeare-char
iter 352250: loss 6.2566, time 2888.21ms
iter 352260: loss 6.3256, time 125.27ms
iter 352270: loss 5.8706, time 125.29ms
iter 352280: loss 5.4704, time 125.34ms
iter 352290: loss 5.7987, time 125.56ms
iter 352300: loss 6.3532, time 125.20ms
iter 352310: loss 6.1901, time 125.31ms
iter 352320: loss 6.2258, time 125.61ms
iter 352330: loss 6.3600, time 127.67ms
iter 352340: loss 6.5453, time 125.47ms
iter 352350: loss 5.4236, time 124.93ms
iter 352360: loss 6.1413, time 125.73ms
iter 352370: loss 6.4318, time 125.28ms
iter 352380: loss 5.3769, time 125.64ms
iter 352390: loss 5.7918, time 126.08ms
iter 352400: loss 6.2716, time 125.61ms
iter 352410: loss 5.8746, time 126.16ms
iter 352420: loss 5.9742, time 125.67ms
iter 352430: loss 6.2013, time 125.34ms
iter 352440: loss 5.0076, time 128.47ms
iter 352450: loss 5.4728, time 125.87ms
iter 352460: loss 5.4751, time 125.42ms
iter 352470: loss 6.0840, time 125.72ms
iter 352480: loss 6.2660, time 125.64ms
iter 352490: loss 5.6993, time 125.39ms
step 352500: train loss 5.6217, val loss 5.5974
saving checkpoint to out-shakespeare-char
iter 352500: loss 6.0109, time 2893.35ms
iter 352510: loss 6.7414, time 125.49ms
iter 352520: loss 5.6596, time 125.33ms
iter 352530: loss 5.8069, time 125.60ms
iter 352540: loss 5.7104, time 125.64ms
iter 352550: loss 5.5154, time 123.91ms
iter 352560: loss 6.1366, time 121.36ms
iter 352570: loss 6.6274, time 122.56ms
iter 352580: loss 5.6912, time 121.77ms
iter 352590: loss 6.0951, time 122.95ms
iter 352600: loss 6.0170, time 121.03ms
iter 352610: loss 6.6732, time 122.72ms
iter 352620: loss 6.6644, time 122.44ms
iter 352630: loss 5.6746, time 121.44ms
iter 352640: loss 5.8101, time 121.18ms
iter 352650: loss 5.5315, time 122.18ms
iter 352660: loss 5.5024, time 122.23ms
iter 352670: loss 6.3242, time 123.37ms
iter 352680: loss 6.2273, time 120.51ms
iter 352690: loss 6.4492, time 121.96ms
iter 352700: loss 6.5011, time 122.00ms
iter 352710: loss 5.5914, time 123.09ms
iter 352720: loss 6.5695, time 122.55ms
iter 352730: loss 5.2770, time 123.23ms
iter 352740: loss 6.3273, time 121.69ms
step 352750: train loss 5.5569, val loss 5.5691
saving checkpoint to out-shakespeare-char
iter 352750: loss 6.0987, time 2884.78ms
iter 352760: loss 6.0736, time 121.94ms
iter 352770: loss 5.8104, time 122.27ms
iter 352780: loss 5.7230, time 121.81ms
iter 352790: loss 5.8739, time 121.56ms
iter 352800: loss 6.3220, time 121.70ms
iter 352810: loss 5.9999, time 121.79ms
iter 352820: loss 6.1597, time 121.72ms
iter 352830: loss 6.0186, time 120.45ms
iter 352840: loss 6.0428, time 121.99ms
iter 352850: loss 6.6901, time 121.79ms
iter 352860: loss 6.0835, time 120.97ms
iter 352870: loss 6.0546, time 120.96ms
iter 352880: loss 6.4874, time 121.03ms
iter 352890: loss 6.5696, time 121.86ms
iter 352900: loss 5.5527, time 121.78ms
iter 352910: loss 5.9802, time 121.73ms
iter 352920: loss 6.4695, time 120.47ms
iter 352930: loss 5.7275, time 121.97ms
iter 352940: loss 6.0090, time 121.48ms
iter 352950: loss 6.2823, time 122.20ms
iter 352960: loss 6.0785, time 122.15ms
iter 352970: loss 6.4843, time 122.14ms
iter 352980: loss 5.3106, time 122.12ms
iter 352990: loss 5.8457, time 122.09ms
step 353000: train loss 5.5610, val loss 5.5910
saving checkpoint to out-shakespeare-char
iter 353000: loss 5.8713, time 2891.00ms
iter 353010: loss 5.8000, time 122.23ms
iter 353020: loss 5.9416, time 121.98ms
iter 353030: loss 5.7114, time 122.32ms
iter 353040: loss 4.9956, time 122.41ms
iter 353050: loss 5.3438, time 121.74ms
iter 353060: loss 5.4777, time 122.15ms
iter 353070: loss 6.0208, time 121.91ms
iter 353080: loss 6.1517, time 121.76ms
iter 353090: loss 6.4008, time 122.44ms
iter 353100: loss 5.9708, time 121.86ms
iter 353110: loss 6.1054, time 121.98ms
iter 353120: loss 5.0250, time 122.05ms
iter 353130: loss 6.6916, time 122.22ms
iter 353140: loss 5.7476, time 121.12ms
iter 353150: loss 5.3308, time 121.50ms
iter 353160: loss 5.6207, time 122.25ms
iter 353170: loss 5.2544, time 121.96ms
iter 353180: loss 4.9285, time 121.94ms
iter 353190: loss 6.4542, time 122.51ms
iter 353200: loss 5.8408, time 120.77ms
iter 353210: loss 6.2021, time 122.02ms
iter 353220: loss 6.3768, time 120.41ms
iter 353230: loss 5.9357, time 121.28ms
iter 353240: loss 6.1703, time 122.75ms
step 353250: train loss 5.6128, val loss 5.6398
saving checkpoint to out-shakespeare-char
iter 353250: loss 6.3764, time 2893.62ms
iter 353260: loss 5.8902, time 124.85ms
iter 353270: loss 6.2940, time 125.08ms
iter 353280: loss 5.4089, time 125.05ms
iter 353290: loss 6.3172, time 125.35ms
iter 353300: loss 6.1524, time 125.65ms
iter 353310: loss 5.6400, time 124.95ms
iter 353320: loss 5.6272, time 125.76ms
iter 353330: loss 5.8584, time 125.97ms
iter 353340: loss 5.5059, time 127.56ms
iter 353350: loss 6.9544, time 125.50ms
iter 353360: loss 5.9499, time 124.87ms
iter 353370: loss 6.1092, time 125.91ms
iter 353380: loss 6.0780, time 127.66ms
iter 353390: loss 6.1471, time 124.59ms
iter 353400: loss 6.0217, time 125.35ms
iter 353410: loss 6.5428, time 125.70ms
iter 353420: loss 6.3172, time 125.54ms
iter 353430: loss 6.3543, time 125.79ms
iter 353440: loss 5.7694, time 125.35ms
iter 353450: loss 6.1565, time 124.83ms
iter 353460: loss 5.5378, time 126.70ms
iter 353470: loss 6.0621, time 125.47ms
iter 353480: loss 6.0913, time 125.75ms
iter 353490: loss 5.7868, time 125.64ms
step 353500: train loss 5.6521, val loss 5.6128
saving checkpoint to out-shakespeare-char
iter 353500: loss 5.9188, time 2876.13ms
iter 353510: loss 6.2742, time 126.04ms
iter 353520: loss 5.1868, time 126.38ms
iter 353530: loss 5.8340, time 127.09ms
iter 353540: loss 6.2281, time 125.96ms
iter 353550: loss 5.6589, time 126.33ms
iter 353560: loss 6.1275, time 126.16ms
iter 353570: loss 5.8539, time 125.77ms
iter 353580: loss 6.2529, time 125.94ms
iter 353590: loss 5.8428, time 125.88ms
iter 353600: loss 5.5009, time 129.23ms
iter 353610: loss 6.5332, time 125.18ms
iter 353620: loss 6.6932, time 124.96ms
iter 353630: loss 6.5153, time 125.62ms
iter 353640: loss 5.4629, time 125.11ms
iter 353650: loss 5.3321, time 125.47ms
iter 353660: loss 6.8143, time 125.12ms
iter 353670: loss 5.5999, time 125.04ms
iter 353680: loss 5.9447, time 125.02ms
iter 353690: loss 6.3011, time 125.48ms
iter 353700: loss 6.5483, time 125.27ms
iter 353710: loss 6.3532, time 127.58ms
iter 353720: loss 5.7208, time 125.11ms
iter 353730: loss 5.3620, time 125.27ms
iter 353740: loss 6.3095, time 125.65ms
step 353750: train loss 5.5977, val loss 5.5272
saving checkpoint to out-shakespeare-char
iter 353750: loss 6.1304, time 2897.37ms
iter 353760: loss 5.3831, time 121.65ms
iter 353770: loss 6.1159, time 122.75ms
iter 353780: loss 5.5304, time 121.62ms
iter 353790: loss 6.0115, time 122.55ms
iter 353800: loss 6.2474, time 121.62ms
iter 353810: loss 5.7762, time 121.66ms
iter 353820: loss 5.1124, time 121.43ms
iter 353830: loss 6.0670, time 122.45ms
iter 353840: loss 5.6663, time 121.44ms
iter 353850: loss 6.1661, time 122.65ms
iter 353860: loss 5.3045, time 121.65ms
iter 353870: loss 5.4594, time 122.29ms
iter 353880: loss 6.1551, time 121.37ms
iter 353890: loss 5.7648, time 122.44ms
iter 353900: loss 5.8287, time 121.36ms
iter 353910: loss 6.1394, time 122.58ms
iter 353920: loss 5.9529, time 121.33ms
iter 353930: loss 6.3730, time 123.32ms
iter 353940: loss 6.7053, time 121.86ms
iter 353950: loss 6.4896, time 122.84ms
iter 353960: loss 5.8678, time 122.00ms
iter 353970: loss 5.6936, time 123.65ms
iter 353980: loss 6.1314, time 126.72ms
iter 353990: loss 5.9092, time 124.50ms
step 354000: train loss 5.5403, val loss 5.6058
saving checkpoint to out-shakespeare-char
iter 354000: loss 6.0094, time 2903.07ms
iter 354010: loss 6.5503, time 125.70ms
iter 354020: loss 5.9503, time 125.14ms
iter 354030: loss 5.8854, time 125.59ms
iter 354040: loss 5.5047, time 125.28ms
iter 354050: loss 6.3958, time 125.47ms
iter 354060: loss 5.5567, time 125.65ms
iter 354070: loss 6.5081, time 125.75ms
iter 354080: loss 6.7393, time 125.37ms
iter 354090: loss 5.6304, time 124.93ms
iter 354100: loss 5.1559, time 126.24ms
iter 354110: loss 6.3889, time 126.01ms
iter 354120: loss 6.1091, time 126.19ms
iter 354130: loss 6.3919, time 128.30ms
iter 354140: loss 5.7097, time 126.14ms
iter 354150: loss 6.3954, time 123.99ms
iter 354160: loss 6.4395, time 126.02ms
iter 354170: loss 7.1310, time 125.36ms
iter 354180: loss 6.3591, time 125.88ms
iter 354190: loss 6.5212, time 125.87ms
iter 354200: loss 5.6129, time 124.05ms
iter 354210: loss 5.9338, time 125.45ms
iter 354220: loss 6.3733, time 125.43ms
iter 354230: loss 5.8980, time 125.82ms
iter 354240: loss 5.1884, time 128.20ms
step 354250: train loss 5.6087, val loss 5.5801
saving checkpoint to out-shakespeare-char
iter 354250: loss 7.0183, time 2868.60ms
iter 354260: loss 5.7926, time 125.68ms
iter 354270: loss 6.1807, time 125.92ms
iter 354280: loss 5.6164, time 123.03ms
iter 354290: loss 6.2254, time 122.15ms
iter 354300: loss 5.5006, time 122.96ms
iter 354310: loss 5.6846, time 122.09ms
iter 354320: loss 6.2075, time 121.44ms
iter 354330: loss 5.4950, time 122.62ms
iter 354340: loss 6.6177, time 122.06ms
iter 354350: loss 6.3875, time 122.13ms
iter 354360: loss 6.6457, time 121.18ms
iter 354370: loss 6.3596, time 122.51ms
iter 354380: loss 5.8330, time 121.50ms
iter 354390: loss 5.3943, time 122.53ms
iter 354400: loss 5.8074, time 122.45ms
iter 354410: loss 6.1563, time 121.36ms
iter 354420: loss 6.2212, time 121.26ms
iter 354430: loss 5.8765, time 121.43ms
iter 354440: loss 6.0512, time 121.46ms
iter 354450: loss 5.9773, time 121.66ms
iter 354460: loss 6.3142, time 121.49ms
iter 354470: loss 6.0851, time 121.48ms
iter 354480: loss 5.7336, time 121.45ms
iter 354490: loss 5.8313, time 121.52ms
step 354500: train loss 5.5405, val loss 5.5572
saving checkpoint to out-shakespeare-char
iter 354500: loss 6.3606, time 2883.41ms
iter 354510: loss 5.7859, time 121.80ms
iter 354520: loss 6.2610, time 128.78ms
iter 354530: loss 6.5762, time 125.74ms
iter 354540: loss 6.1831, time 126.05ms
iter 354550: loss 5.8503, time 126.38ms
iter 354560: loss 6.1141, time 128.31ms
iter 354570: loss 6.2023, time 126.26ms
iter 354580: loss 5.0592, time 126.48ms
iter 354590: loss 6.7908, time 126.15ms
iter 354600: loss 6.1572, time 125.78ms
iter 354610: loss 6.4219, time 125.97ms
iter 354620: loss 5.9048, time 126.31ms
iter 354630: loss 6.7260, time 126.18ms
iter 354640: loss 6.2130, time 125.86ms
iter 354650: loss 6.1533, time 125.86ms
iter 354660: loss 5.5645, time 126.11ms
iter 354670: loss 6.0963, time 128.29ms
iter 354680: loss 5.8908, time 125.76ms
iter 354690: loss 6.0063, time 126.03ms
iter 354700: loss 5.8663, time 125.28ms
iter 354710: loss 6.0976, time 125.66ms
iter 354720: loss 6.0286, time 126.04ms
iter 354730: loss 5.8932, time 126.04ms
iter 354740: loss 6.0585, time 126.69ms
step 354750: train loss 5.5402, val loss 5.6576
saving checkpoint to out-shakespeare-char
iter 354750: loss 6.1178, time 2874.56ms
iter 354760: loss 5.5073, time 125.14ms
iter 354770: loss 5.8555, time 125.02ms
iter 354780: loss 6.0469, time 125.06ms
iter 354790: loss 5.1133, time 127.63ms
iter 354800: loss 5.6925, time 124.91ms
iter 354810: loss 6.7900, time 125.01ms
iter 354820: loss 7.2397, time 125.28ms
iter 354830: loss 6.3282, time 125.07ms
iter 354840: loss 6.1905, time 126.08ms
iter 354850: loss 6.1271, time 124.91ms
iter 354860: loss 5.6842, time 125.10ms
iter 354870: loss 5.6167, time 124.91ms
iter 354880: loss 5.5067, time 125.34ms
iter 354890: loss 5.9599, time 124.76ms
iter 354900: loss 6.1793, time 127.14ms
iter 354910: loss 5.3754, time 124.88ms
iter 354920: loss 6.0051, time 125.50ms
iter 354930: loss 6.0317, time 125.31ms
iter 354940: loss 5.6395, time 124.98ms
iter 354950: loss 6.3050, time 125.07ms
iter 354960: loss 6.7135, time 124.83ms
iter 354970: loss 5.6702, time 125.12ms
iter 354980: loss 5.9555, time 124.82ms
iter 354990: loss 6.2539, time 125.26ms
step 355000: train loss 5.6429, val loss 5.6204
saving checkpoint to out-shakespeare-char
iter 355000: loss 6.9046, time 2884.43ms
iter 355010: loss 6.5388, time 125.35ms
iter 355020: loss 6.1798, time 125.27ms
iter 355030: loss 6.3829, time 127.50ms
iter 355040: loss 6.7178, time 124.96ms
iter 355050: loss 6.4662, time 124.32ms
iter 355060: loss 6.1060, time 125.10ms
iter 355070: loss 5.8271, time 125.08ms
iter 355080: loss 6.3380, time 125.52ms
iter 355090: loss 6.1194, time 125.12ms
iter 355100: loss 5.5105, time 125.57ms
iter 355110: loss 6.7450, time 125.59ms
iter 355120: loss 6.3659, time 125.61ms
iter 355130: loss 5.8011, time 126.02ms
iter 355140: loss 6.1387, time 127.66ms
iter 355150: loss 6.2483, time 125.77ms
iter 355160: loss 6.1107, time 126.54ms
iter 355170: loss 5.1604, time 126.03ms
iter 355180: loss 5.6722, time 125.53ms
iter 355190: loss 6.0483, time 126.25ms
iter 355200: loss 5.9162, time 125.97ms
iter 355210: loss 5.8953, time 126.33ms
iter 355220: loss 5.7651, time 126.05ms
iter 355230: loss 5.0128, time 125.91ms
iter 355240: loss 5.7925, time 126.10ms
step 355250: train loss 5.5717, val loss 5.5912
saving checkpoint to out-shakespeare-char
iter 355250: loss 5.8270, time 2904.45ms
iter 355260: loss 6.6886, time 125.37ms
iter 355270: loss 5.6411, time 125.23ms
iter 355280: loss 5.3503, time 125.73ms
iter 355290: loss 6.9130, time 127.98ms
iter 355300: loss 5.9241, time 125.29ms
iter 355310: loss 6.5201, time 125.19ms
iter 355320: loss 5.9512, time 125.69ms
iter 355330: loss 5.8386, time 131.17ms
iter 355340: loss 6.0446, time 128.16ms
iter 355350: loss 6.2163, time 126.65ms
iter 355360: loss 6.1719, time 125.92ms
iter 355370: loss 6.8463, time 124.86ms
iter 355380: loss 5.8015, time 125.23ms
iter 355390: loss 5.7859, time 125.10ms
iter 355400: loss 5.5201, time 125.64ms
iter 355410: loss 6.7729, time 125.54ms
iter 355420: loss 6.3745, time 125.06ms
iter 355430: loss 6.2063, time 126.11ms
iter 355440: loss 5.4171, time 126.18ms
iter 355450: loss 5.9893, time 126.20ms
iter 355460: loss 5.9446, time 125.89ms
iter 355470: loss 6.6049, time 125.94ms
iter 355480: loss 6.2175, time 128.29ms
iter 355490: loss 5.9880, time 126.06ms
step 355500: train loss 5.6402, val loss 5.6070
saving checkpoint to out-shakespeare-char
iter 355500: loss 5.9890, time 2885.26ms
iter 355510: loss 6.0195, time 125.72ms
iter 355520: loss 5.9148, time 125.82ms
iter 355530: loss 6.5505, time 125.76ms
iter 355540: loss 5.5974, time 126.16ms
iter 355550: loss 5.5265, time 125.84ms
iter 355560: loss 5.7977, time 125.25ms
iter 355570: loss 6.9807, time 126.03ms
iter 355580: loss 6.3489, time 125.75ms
iter 355590: loss 6.5737, time 126.32ms
iter 355600: loss 5.2763, time 125.79ms
iter 355610: loss 6.4496, time 127.26ms
iter 355620: loss 6.1632, time 125.97ms
iter 355630: loss 5.6865, time 127.98ms
iter 355640: loss 5.9878, time 126.39ms
iter 355650: loss 6.4983, time 127.17ms
iter 355660: loss 6.7783, time 125.81ms
iter 355670: loss 5.5880, time 125.84ms
iter 355680: loss 4.8490, time 125.58ms
iter 355690: loss 6.3175, time 126.80ms
iter 355700: loss 5.6090, time 125.96ms
iter 355710: loss 6.5527, time 125.95ms
iter 355720: loss 5.7911, time 125.01ms
iter 355730: loss 5.4080, time 126.37ms
iter 355740: loss 6.0154, time 127.83ms
step 355750: train loss 5.5934, val loss 5.5990
saving checkpoint to out-shakespeare-char
iter 355750: loss 5.6879, time 2886.74ms
iter 355760: loss 5.3354, time 125.41ms
iter 355770: loss 6.1500, time 125.13ms
iter 355780: loss 5.4216, time 123.92ms
iter 355790: loss 6.0229, time 125.33ms
iter 355800: loss 6.8665, time 125.25ms
iter 355810: loss 5.7823, time 125.22ms
iter 355820: loss 5.4047, time 124.28ms
iter 355830: loss 6.1087, time 125.06ms
iter 355840: loss 5.8621, time 124.81ms
iter 355850: loss 5.8455, time 125.10ms
iter 355860: loss 5.3156, time 127.29ms
iter 355870: loss 6.4063, time 125.04ms
iter 355880: loss 6.4895, time 125.01ms
iter 355890: loss 5.8380, time 126.17ms
iter 355900: loss 5.7841, time 125.30ms
iter 355910: loss 5.4936, time 125.15ms
iter 355920: loss 5.7726, time 125.13ms
iter 355930: loss 5.9917, time 125.33ms
iter 355940: loss 6.0311, time 124.42ms
iter 355950: loss 5.9341, time 125.68ms
iter 355960: loss 6.1275, time 125.33ms
iter 355970: loss 6.3976, time 127.53ms
iter 355980: loss 6.2719, time 125.00ms
iter 355990: loss 6.0152, time 124.98ms
step 356000: train loss 5.5430, val loss 5.5947
saving checkpoint to out-shakespeare-char
iter 356000: loss 5.7078, time 2874.10ms
iter 356010: loss 6.4280, time 127.39ms
iter 356020: loss 6.4433, time 123.73ms
iter 356030: loss 6.5439, time 124.17ms
iter 356040: loss 5.5981, time 123.35ms
iter 356050: loss 6.1285, time 123.76ms
iter 356060: loss 6.5928, time 125.11ms
iter 356070: loss 5.5181, time 125.10ms
iter 356080: loss 6.2922, time 127.30ms
iter 356090: loss 6.0037, time 124.97ms
iter 356100: loss 5.8949, time 124.64ms
iter 356110: loss 5.0923, time 125.42ms
iter 356120: loss 6.4667, time 125.17ms
iter 356130: loss 5.7677, time 125.11ms
iter 356140: loss 6.6477, time 125.72ms
iter 356150: loss 5.9954, time 125.57ms
iter 356160: loss 6.3407, time 125.12ms
iter 356170: loss 5.7237, time 125.15ms
iter 356180: loss 5.7902, time 125.60ms
iter 356190: loss 5.0836, time 127.28ms
iter 356200: loss 4.9349, time 125.17ms
iter 356210: loss 5.8291, time 124.87ms
iter 356220: loss 6.0433, time 125.57ms
iter 356230: loss 5.7626, time 126.48ms
iter 356240: loss 5.5713, time 124.77ms
step 356250: train loss 5.5757, val loss 5.5929
saving checkpoint to out-shakespeare-char
iter 356250: loss 5.6020, time 2886.01ms
iter 356260: loss 5.9504, time 122.08ms
iter 356270: loss 6.3220, time 123.27ms
iter 356280: loss 5.3373, time 122.10ms
iter 356290: loss 5.4317, time 123.15ms
iter 356300: loss 6.4610, time 121.76ms
iter 356310: loss 5.2761, time 123.07ms
iter 356320: loss 5.9694, time 122.00ms
iter 356330: loss 6.0903, time 123.11ms
iter 356340: loss 6.0559, time 121.99ms
iter 356350: loss 6.1717, time 122.83ms
iter 356360: loss 6.5675, time 122.10ms
iter 356370: loss 5.3709, time 123.50ms
iter 356380: loss 5.4390, time 122.10ms
iter 356390: loss 6.4820, time 123.26ms
iter 356400: loss 6.3795, time 121.93ms
iter 356410: loss 5.8739, time 123.41ms
iter 356420: loss 6.2758, time 121.97ms
iter 356430: loss 6.1032, time 123.45ms
iter 356440: loss 6.6142, time 121.95ms
iter 356450: loss 6.7297, time 123.49ms
iter 356460: loss 6.2615, time 121.86ms
iter 356470: loss 5.9359, time 123.26ms
iter 356480: loss 5.3217, time 121.99ms
iter 356490: loss 5.3646, time 123.14ms
step 356500: train loss 5.6272, val loss 5.6390
saving checkpoint to out-shakespeare-char
iter 356500: loss 5.8882, time 2901.52ms
iter 356510: loss 5.8708, time 125.47ms
iter 356520: loss 5.3466, time 125.92ms
iter 356530: loss 5.6189, time 125.78ms
iter 356540: loss 5.5703, time 125.43ms
iter 356550: loss 5.5019, time 125.29ms
iter 356560: loss 5.2330, time 125.90ms
iter 356570: loss 6.1198, time 125.37ms
iter 356580: loss 6.2571, time 125.24ms
iter 356590: loss 5.7193, time 125.73ms
iter 356600: loss 5.9073, time 125.37ms
iter 356610: loss 5.7035, time 127.71ms
iter 356620: loss 5.8452, time 125.05ms
iter 356630: loss 5.9474, time 125.49ms
iter 356640: loss 5.7552, time 125.38ms
iter 356650: loss 6.2543, time 125.96ms
iter 356660: loss 5.4061, time 125.39ms
iter 356670: loss 6.3551, time 125.49ms
iter 356680: loss 6.6495, time 125.83ms
iter 356690: loss 6.0154, time 125.17ms
iter 356700: loss 6.0439, time 125.88ms
iter 356710: loss 6.1075, time 125.36ms
iter 356720: loss 6.5988, time 128.45ms
iter 356730: loss 5.9490, time 125.64ms
iter 356740: loss 6.4979, time 126.02ms
step 356750: train loss 5.6009, val loss 5.6067
saving checkpoint to out-shakespeare-char
iter 356750: loss 5.8644, time 2884.77ms
iter 356760: loss 6.5019, time 127.25ms
iter 356770: loss 5.9446, time 126.12ms
iter 356780: loss 5.2568, time 124.79ms
iter 356790: loss 5.8147, time 125.88ms
iter 356800: loss 6.1658, time 125.02ms
iter 356810: loss 6.1614, time 124.95ms
iter 356820: loss 6.3740, time 125.13ms
iter 356830: loss 6.5040, time 125.76ms
iter 356840: loss 5.9033, time 124.75ms
iter 356850: loss 6.2845, time 124.81ms
iter 356860: loss 6.8093, time 125.37ms
iter 356870: loss 5.3175, time 127.36ms
iter 356880: loss 5.3428, time 125.33ms
iter 356890: loss 5.2518, time 125.01ms
iter 356900: loss 6.3691, time 124.93ms
iter 356910: loss 5.7796, time 124.86ms
iter 356920: loss 6.1040, time 125.19ms
iter 356930: loss 5.4739, time 125.44ms
iter 356940: loss 6.2067, time 125.66ms
iter 356950: loss 6.1986, time 124.72ms
iter 356960: loss 5.8565, time 124.72ms
iter 356970: loss 5.4365, time 124.80ms
iter 356980: loss 5.4666, time 127.62ms
iter 356990: loss 5.7411, time 124.95ms
step 357000: train loss 5.5697, val loss 5.5914
saving checkpoint to out-shakespeare-char
iter 357000: loss 6.1504, time 2872.30ms
iter 357010: loss 5.9168, time 125.93ms
iter 357020: loss 6.1754, time 125.78ms
iter 357030: loss 6.3038, time 125.48ms
iter 357040: loss 5.8326, time 125.59ms
iter 357050: loss 5.9305, time 125.68ms
iter 357060: loss 6.3610, time 125.39ms
iter 357070: loss 6.2325, time 125.35ms
iter 357080: loss 5.4290, time 125.35ms
iter 357090: loss 5.8610, time 128.33ms
iter 357100: loss 5.8448, time 125.43ms
iter 357110: loss 5.9079, time 125.26ms
iter 357120: loss 5.8840, time 125.28ms
iter 357130: loss 5.5633, time 125.47ms
iter 357140: loss 5.9938, time 125.12ms
iter 357150: loss 5.9792, time 125.86ms
iter 357160: loss 6.3655, time 125.80ms
iter 357170: loss 5.8734, time 125.37ms
iter 357180: loss 6.2111, time 124.81ms
iter 357190: loss 6.4238, time 125.49ms
iter 357200: loss 5.6424, time 125.43ms
iter 357210: loss 5.8114, time 125.94ms
iter 357220: loss 5.7481, time 125.18ms
iter 357230: loss 5.5296, time 124.98ms
iter 357240: loss 5.6124, time 125.42ms
step 357250: train loss 5.5897, val loss 5.5863
saving checkpoint to out-shakespeare-char
iter 357250: loss 6.5636, time 2875.64ms
iter 357260: loss 5.8842, time 125.91ms
iter 357270: loss 6.3948, time 125.88ms
iter 357280: loss 5.4274, time 123.99ms
iter 357290: loss 6.5214, time 127.63ms
iter 357300: loss 7.0735, time 124.95ms
iter 357310: loss 5.8750, time 125.99ms
iter 357320: loss 5.1896, time 123.76ms
iter 357330: loss 5.4259, time 125.77ms
iter 357340: loss 6.3004, time 125.15ms
iter 357350: loss 6.0369, time 125.75ms
iter 357360: loss 5.5647, time 125.39ms
iter 357370: loss 6.2042, time 125.36ms
iter 357380: loss 5.9197, time 125.80ms
iter 357390: loss 5.5347, time 125.95ms
iter 357400: loss 5.8897, time 128.22ms
iter 357410: loss 6.2378, time 125.69ms
iter 357420: loss 6.7432, time 126.77ms
iter 357430: loss 5.5051, time 126.00ms
iter 357440: loss 5.7090, time 125.95ms
iter 357450: loss 5.4527, time 126.18ms
iter 357460: loss 6.3447, time 125.83ms
iter 357470: loss 6.2254, time 125.97ms
iter 357480: loss 6.0428, time 125.93ms
iter 357490: loss 5.4726, time 126.85ms
step 357500: train loss 5.5744, val loss 5.5332
saving checkpoint to out-shakespeare-char
iter 357500: loss 5.6528, time 2889.54ms
iter 357510: loss 5.6126, time 125.27ms
iter 357520: loss 6.4549, time 125.71ms
iter 357530: loss 6.9094, time 125.30ms
iter 357540: loss 6.5111, time 126.01ms
iter 357550: loss 6.2070, time 127.24ms
iter 357560: loss 6.2095, time 125.00ms
iter 357570: loss 6.3228, time 124.65ms
iter 357580: loss 5.6638, time 124.45ms
iter 357590: loss 5.6904, time 125.12ms
iter 357600: loss 6.0369, time 125.47ms
iter 357610: loss 5.9869, time 125.07ms
iter 357620: loss 5.9900, time 125.21ms
iter 357630: loss 5.0814, time 126.23ms
iter 357640: loss 5.6906, time 124.95ms
iter 357650: loss 6.0557, time 124.53ms
iter 357660: loss 5.6681, time 124.82ms
iter 357670: loss 6.8878, time 125.53ms
iter 357680: loss 5.5021, time 124.06ms
iter 357690: loss 6.5446, time 124.66ms
iter 357700: loss 6.0753, time 124.87ms
iter 357710: loss 5.6999, time 124.18ms
iter 357720: loss 5.9985, time 124.76ms
iter 357730: loss 5.7205, time 125.10ms
iter 357740: loss 5.4672, time 124.28ms
step 357750: train loss 5.5864, val loss 5.6446
saving checkpoint to out-shakespeare-char
iter 357750: loss 5.7312, time 2884.70ms
iter 357760: loss 5.6316, time 123.02ms
iter 357770: loss 6.4976, time 125.16ms
iter 357780: loss 6.3421, time 124.90ms
iter 357790: loss 6.0455, time 124.96ms
iter 357800: loss 5.9762, time 125.07ms
iter 357810: loss 6.0616, time 125.05ms
iter 357820: loss 6.4577, time 127.59ms
iter 357830: loss 6.1287, time 125.20ms
iter 357840: loss 6.3193, time 125.04ms
iter 357850: loss 5.5909, time 126.29ms
iter 357860: loss 5.2668, time 125.52ms
iter 357870: loss 5.6727, time 125.57ms
iter 357880: loss 6.4620, time 125.54ms
iter 357890: loss 6.0936, time 124.37ms
iter 357900: loss 6.0710, time 125.39ms
iter 357910: loss 5.8488, time 125.38ms
iter 357920: loss 6.1103, time 125.43ms
iter 357930: loss 5.7827, time 127.72ms
iter 357940: loss 6.3222, time 124.48ms
iter 357950: loss 6.0909, time 125.54ms
iter 357960: loss 5.6099, time 124.65ms
iter 357970: loss 6.4241, time 125.45ms
iter 357980: loss 6.4476, time 125.38ms
iter 357990: loss 5.7122, time 125.87ms
step 358000: train loss 5.5947, val loss 5.6242
saving checkpoint to out-shakespeare-char
iter 358000: loss 5.6491, time 2868.20ms
iter 358010: loss 6.6813, time 124.62ms
iter 358020: loss 5.6947, time 125.75ms
iter 358030: loss 5.7326, time 125.44ms
iter 358040: loss 5.9504, time 125.63ms
iter 358050: loss 6.2635, time 125.17ms
iter 358060: loss 6.1684, time 124.85ms
iter 358070: loss 6.1910, time 125.18ms
iter 358080: loss 6.0736, time 125.07ms
iter 358090: loss 6.2246, time 125.20ms
iter 358100: loss 6.9004, time 127.70ms
iter 358110: loss 6.3891, time 124.39ms
iter 358120: loss 6.4723, time 125.12ms
iter 358130: loss 6.0906, time 125.00ms
iter 358140: loss 5.9627, time 125.18ms
iter 358150: loss 6.2018, time 126.70ms
iter 358160: loss 5.9739, time 124.98ms
iter 358170: loss 5.8787, time 125.51ms
iter 358180: loss 5.8200, time 125.21ms
iter 358190: loss 5.6125, time 125.19ms
iter 358200: loss 6.2763, time 125.39ms
iter 358210: loss 6.7211, time 127.70ms
iter 358220: loss 6.4667, time 125.03ms
iter 358230: loss 5.4953, time 125.58ms
iter 358240: loss 6.0773, time 125.35ms
step 358250: train loss 5.6487, val loss 5.6338
saving checkpoint to out-shakespeare-char
iter 358250: loss 5.9726, time 2864.58ms
iter 358260: loss 5.5371, time 125.91ms
iter 358270: loss 6.4891, time 125.86ms
iter 358280: loss 6.7045, time 125.68ms
iter 358290: loss 6.5029, time 125.77ms
iter 358300: loss 5.8162, time 125.46ms
iter 358310: loss 5.7791, time 125.65ms
iter 358320: loss 5.8133, time 125.53ms
iter 358330: loss 5.4851, time 122.50ms
iter 358340: loss 6.1900, time 123.50ms
iter 358350: loss 5.9779, time 122.01ms
iter 358360: loss 6.0722, time 123.24ms
iter 358370: loss 6.3395, time 121.34ms
iter 358380: loss 5.6063, time 122.90ms
iter 358390: loss 5.5247, time 122.10ms
iter 358400: loss 6.0076, time 122.93ms
iter 358410: loss 5.9372, time 121.90ms
iter 358420: loss 5.9272, time 123.36ms
iter 358430: loss 6.2846, time 121.17ms
iter 358440: loss 5.2961, time 123.06ms
iter 358450: loss 5.5245, time 121.56ms
iter 358460: loss 6.5452, time 122.86ms
iter 358470: loss 6.8867, time 122.02ms
iter 358480: loss 6.1329, time 123.15ms
iter 358490: loss 5.9334, time 121.86ms
step 358500: train loss 5.6287, val loss 5.6471
saving checkpoint to out-shakespeare-char
iter 358500: loss 5.5616, time 2886.17ms
iter 358510: loss 6.3056, time 125.90ms
iter 358520: loss 5.8683, time 125.79ms
iter 358530: loss 5.8880, time 126.07ms
iter 358540: loss 6.6191, time 125.77ms
iter 358550: loss 6.1330, time 126.05ms
iter 358560: loss 5.7259, time 125.68ms
iter 358570: loss 5.9259, time 126.37ms
iter 358580: loss 6.0528, time 125.91ms
iter 358590: loss 6.3301, time 125.58ms
iter 358600: loss 5.6821, time 126.51ms
iter 358610: loss 6.0175, time 128.10ms
iter 358620: loss 5.8197, time 125.89ms
iter 358630: loss 5.6292, time 125.86ms
iter 358640: loss 5.7037, time 125.77ms
iter 358650: loss 5.6492, time 125.90ms
iter 358660: loss 5.9118, time 125.62ms
iter 358670: loss 5.9754, time 125.89ms
iter 358680: loss 5.5006, time 126.01ms
iter 358690: loss 6.4561, time 125.95ms
iter 358700: loss 6.0227, time 127.33ms
iter 358710: loss 5.4162, time 127.49ms
iter 358720: loss 6.4739, time 128.08ms
iter 358730: loss 5.7454, time 125.65ms
iter 358740: loss 5.7988, time 125.72ms
step 358750: train loss 5.5594, val loss 5.6088
saving checkpoint to out-shakespeare-char
iter 358750: loss 5.2713, time 2884.07ms
iter 358760: loss 6.6550, time 126.14ms
iter 358770: loss 5.2911, time 126.15ms
iter 358780: loss 6.1883, time 125.72ms
iter 358790: loss 5.1429, time 125.83ms
iter 358800: loss 6.4445, time 125.82ms
iter 358810: loss 4.8975, time 125.86ms
iter 358820: loss 6.0360, time 125.83ms
iter 358830: loss 6.2375, time 124.86ms
iter 358840: loss 5.4453, time 125.84ms
iter 358850: loss 5.8546, time 125.86ms
iter 358860: loss 6.2361, time 126.26ms
iter 358870: loss 6.8157, time 128.10ms
iter 358880: loss 6.7167, time 125.71ms
iter 358890: loss 6.2757, time 125.66ms
iter 358900: loss 6.1448, time 125.93ms
iter 358910: loss 6.1915, time 125.97ms
iter 358920: loss 5.8992, time 125.88ms
iter 358930: loss 5.8279, time 125.90ms
iter 358940: loss 5.8968, time 125.05ms
iter 358950: loss 5.7532, time 125.63ms
iter 358960: loss 6.0379, time 125.86ms
iter 358970: loss 5.9459, time 125.83ms
iter 358980: loss 6.3281, time 128.24ms
iter 358990: loss 5.8837, time 125.89ms
step 359000: train loss 5.5857, val loss 5.6189
saving checkpoint to out-shakespeare-char
iter 359000: loss 5.9445, time 2891.01ms
iter 359010: loss 5.9788, time 125.46ms
iter 359020: loss 4.4949, time 125.04ms
iter 359030: loss 6.2207, time 124.82ms
iter 359040: loss 5.7787, time 125.06ms
iter 359050: loss 6.2790, time 124.62ms
iter 359060: loss 5.5593, time 124.95ms
iter 359070: loss 5.9798, time 127.58ms
iter 359080: loss 6.1656, time 125.33ms
iter 359090: loss 5.7883, time 125.43ms
iter 359100: loss 6.0601, time 125.12ms
iter 359110: loss 6.1459, time 124.74ms
iter 359120: loss 5.6841, time 125.19ms
iter 359130: loss 5.7050, time 125.03ms
iter 359140: loss 5.8857, time 125.32ms
iter 359150: loss 5.9295, time 125.37ms
iter 359160: loss 5.7789, time 125.39ms
iter 359170: loss 6.2812, time 125.46ms
iter 359180: loss 6.4749, time 127.48ms
iter 359190: loss 6.5120, time 124.96ms
iter 359200: loss 6.1394, time 124.88ms
iter 359210: loss 6.3975, time 125.07ms
iter 359220: loss 6.3543, time 125.53ms
iter 359230: loss 5.4853, time 124.91ms
iter 359240: loss 6.1438, time 123.19ms
step 359250: train loss 5.5893, val loss 5.5864
saving checkpoint to out-shakespeare-char
iter 359250: loss 5.5634, time 2872.57ms
iter 359260: loss 5.7630, time 121.72ms
iter 359270: loss 6.5500, time 121.99ms
iter 359280: loss 5.2583, time 121.51ms
iter 359290: loss 6.2956, time 123.31ms
iter 359300: loss 6.6510, time 121.67ms
iter 359310: loss 6.2662, time 123.15ms
iter 359320: loss 6.3015, time 121.53ms
iter 359330: loss 5.4189, time 122.84ms
iter 359340: loss 5.7070, time 121.46ms
iter 359350: loss 6.1625, time 122.94ms
iter 359360: loss 5.9179, time 121.53ms
iter 359370: loss 6.3803, time 122.76ms
iter 359380: loss 5.2921, time 121.39ms
iter 359390: loss 6.0881, time 122.30ms
iter 359400: loss 6.2277, time 121.69ms
iter 359410: loss 5.7777, time 122.83ms
iter 359420: loss 6.0914, time 121.41ms
iter 359430: loss 5.5989, time 123.13ms
iter 359440: loss 5.8882, time 121.66ms
iter 359450: loss 5.8109, time 122.98ms
iter 359460: loss 6.3978, time 121.29ms
iter 359470: loss 6.4577, time 122.81ms
iter 359480: loss 6.2992, time 121.28ms
iter 359490: loss 5.8025, time 122.70ms
step 359500: train loss 5.5365, val loss 5.5652
saving checkpoint to out-shakespeare-char
iter 359500: loss 5.7921, time 2899.13ms
iter 359510: loss 5.0631, time 121.96ms
iter 359520: loss 5.8894, time 121.65ms
iter 359530: loss 6.5549, time 121.62ms
iter 359540: loss 5.5770, time 121.93ms
iter 359550: loss 5.6902, time 121.88ms
iter 359560: loss 5.6747, time 121.63ms
iter 359570: loss 5.3560, time 121.48ms
iter 359580: loss 5.8595, time 121.59ms
iter 359590: loss 6.2854, time 122.15ms
iter 359600: loss 5.8595, time 122.28ms
iter 359610: loss 6.0470, time 122.60ms
iter 359620: loss 6.0488, time 121.66ms
iter 359630: loss 6.2059, time 121.52ms
iter 359640: loss 6.2810, time 121.94ms
iter 359650: loss 4.8194, time 121.58ms
iter 359660: loss 5.8983, time 121.46ms
iter 359670: loss 6.2538, time 121.79ms
iter 359680: loss 5.9902, time 121.69ms
iter 359690: loss 6.2371, time 121.97ms
iter 359700: loss 6.2248, time 121.22ms
iter 359710: loss 6.8098, time 121.39ms
iter 359720: loss 6.8616, time 121.59ms
iter 359730: loss 6.1786, time 121.75ms
iter 359740: loss 5.9924, time 121.50ms
step 359750: train loss 5.6046, val loss 5.6075
saving checkpoint to out-shakespeare-char
iter 359750: loss 6.2631, time 2904.99ms
iter 359760: loss 6.4433, time 126.43ms
iter 359770: loss 5.5628, time 128.06ms
iter 359780: loss 6.5008, time 125.97ms
iter 359790: loss 5.8675, time 125.67ms
iter 359800: loss 6.1856, time 126.66ms
iter 359810: loss 6.3332, time 128.34ms
iter 359820: loss 6.3659, time 125.67ms
iter 359830: loss 6.1969, time 125.88ms
iter 359840: loss 6.1293, time 125.44ms
iter 359850: loss 5.4254, time 128.20ms
iter 359860: loss 5.8197, time 125.79ms
iter 359870: loss 6.7193, time 125.95ms
iter 359880: loss 6.1623, time 126.87ms
iter 359890: loss 6.2851, time 128.40ms
iter 359900: loss 5.9700, time 125.87ms
iter 359910: loss 5.7240, time 125.72ms
iter 359920: loss 5.8831, time 125.00ms
iter 359930: loss 6.2185, time 125.83ms
iter 359940: loss 6.5894, time 126.39ms
iter 359950: loss 5.8055, time 125.62ms
iter 359960: loss 5.3831, time 126.33ms
iter 359970: loss 5.8272, time 125.82ms
iter 359980: loss 6.6890, time 125.80ms
iter 359990: loss 5.8031, time 125.71ms
step 360000: train loss 5.5949, val loss 5.6221
saving checkpoint to out-shakespeare-char
iter 360000: loss 5.6465, time 2890.26ms
iter 360010: loss 5.9534, time 126.10ms
iter 360020: loss 5.1477, time 125.87ms
iter 360030: loss 6.3663, time 126.16ms
iter 360040: loss 5.7350, time 128.26ms
iter 360050: loss 6.1641, time 125.65ms
iter 360060: loss 5.7870, time 125.51ms
iter 360070: loss 6.2453, time 125.75ms
iter 360080: loss 6.0612, time 125.93ms
iter 360090: loss 6.0189, time 126.15ms
iter 360100: loss 5.9686, time 126.33ms
iter 360110: loss 6.2155, time 125.87ms
iter 360120: loss 5.5501, time 125.66ms
iter 360130: loss 5.7425, time 125.03ms
iter 360140: loss 5.6162, time 126.04ms
iter 360150: loss 6.1603, time 125.60ms
iter 360160: loss 6.3654, time 127.79ms
iter 360170: loss 6.6480, time 125.24ms
iter 360180: loss 6.5885, time 125.01ms
iter 360190: loss 6.3227, time 126.05ms
iter 360200: loss 6.0227, time 125.21ms
iter 360210: loss 5.2291, time 125.71ms
iter 360220: loss 5.5615, time 125.59ms
iter 360230: loss 5.2300, time 125.33ms
iter 360240: loss 5.7416, time 125.91ms
step 360250: train loss 5.5848, val loss 5.6605
saving checkpoint to out-shakespeare-char
iter 360250: loss 5.7623, time 2915.07ms
iter 360260: loss 5.8253, time 125.91ms
iter 360270: loss 6.4096, time 126.07ms
iter 360280: loss 5.7083, time 125.85ms
iter 360290: loss 6.6160, time 125.88ms
iter 360300: loss 5.8871, time 126.11ms
iter 360310: loss 6.2703, time 128.55ms
iter 360320: loss 6.0616, time 125.82ms
iter 360330: loss 5.4323, time 125.54ms
iter 360340: loss 5.9749, time 125.98ms
iter 360350: loss 5.8337, time 126.02ms
iter 360360: loss 5.5682, time 125.71ms
iter 360370: loss 5.8422, time 126.04ms
iter 360380: loss 6.3537, time 125.79ms
iter 360390: loss 5.3759, time 126.04ms
iter 360400: loss 5.9193, time 126.07ms
iter 360410: loss 6.4167, time 125.96ms
iter 360420: loss 5.8568, time 128.07ms
iter 360430: loss 5.8170, time 125.72ms
iter 360440: loss 6.1590, time 125.87ms
iter 360450: loss 6.0765, time 125.91ms
iter 360460: loss 6.3063, time 125.62ms
iter 360470: loss 6.2751, time 120.17ms
iter 360480: loss 6.9967, time 121.66ms
iter 360490: loss 6.3239, time 119.49ms
step 360500: train loss 5.5948, val loss 5.6269
saving checkpoint to out-shakespeare-char
iter 360500: loss 5.9104, time 2871.10ms
iter 360510: loss 5.4122, time 121.11ms
iter 360520: loss 6.0109, time 120.93ms
iter 360530: loss 5.8319, time 120.99ms
iter 360540: loss 6.2860, time 125.33ms
iter 360550: loss 5.7805, time 125.83ms
iter 360560: loss 6.4112, time 125.03ms
iter 360570: loss 5.9013, time 125.38ms
iter 360580: loss 6.2304, time 125.65ms
iter 360590: loss 6.2926, time 125.79ms
iter 360600: loss 6.2835, time 125.58ms
iter 360610: loss 5.6187, time 125.21ms
iter 360620: loss 6.2226, time 124.70ms
iter 360630: loss 6.0237, time 124.39ms
iter 360640: loss 5.8620, time 125.31ms
iter 360650: loss 5.5459, time 126.13ms
iter 360660: loss 6.2771, time 125.32ms
iter 360670: loss 5.7944, time 125.58ms
iter 360680: loss 6.4471, time 124.42ms
iter 360690: loss 6.5221, time 125.33ms
iter 360700: loss 6.6608, time 125.36ms
iter 360710: loss 5.6594, time 127.58ms
iter 360720: loss 5.5036, time 125.25ms
iter 360730: loss 6.0984, time 125.30ms
iter 360740: loss 6.0541, time 125.52ms
step 360750: train loss 5.5707, val loss 5.5761
saving checkpoint to out-shakespeare-char
iter 360750: loss 5.0786, time 2900.56ms
iter 360760: loss 5.8257, time 125.03ms
iter 360770: loss 5.8630, time 124.70ms
iter 360780: loss 5.8100, time 125.21ms
iter 360790: loss 5.8530, time 124.98ms
iter 360800: loss 5.3990, time 125.17ms
iter 360810: loss 5.7111, time 124.70ms
iter 360820: loss 5.9212, time 125.08ms
iter 360830: loss 6.3555, time 124.94ms
iter 360840: loss 5.9486, time 125.25ms
iter 360850: loss 6.2589, time 124.97ms
iter 360860: loss 6.2679, time 127.60ms
iter 360870: loss 6.4965, time 125.07ms
iter 360880: loss 5.4816, time 124.27ms
iter 360890: loss 6.0131, time 125.24ms
iter 360900: loss 6.2775, time 124.06ms
iter 360910: loss 6.0108, time 125.08ms
iter 360920: loss 5.7609, time 125.03ms
iter 360930: loss 6.2865, time 127.32ms
iter 360940: loss 5.0251, time 125.11ms
iter 360950: loss 6.0012, time 124.87ms
iter 360960: loss 6.2526, time 124.94ms
iter 360970: loss 5.9340, time 125.01ms
iter 360980: loss 5.6389, time 124.59ms
iter 360990: loss 6.4905, time 125.21ms
step 361000: train loss 5.6157, val loss 5.6235
saving checkpoint to out-shakespeare-char
iter 361000: loss 5.6405, time 2884.41ms
iter 361010: loss 6.2368, time 121.55ms
iter 361020: loss 5.5501, time 122.50ms
iter 361030: loss 5.3383, time 121.44ms
iter 361040: loss 5.2489, time 122.73ms
iter 361050: loss 5.5509, time 121.59ms
iter 361060: loss 6.0276, time 122.57ms
iter 361070: loss 6.6043, time 122.39ms
iter 361080: loss 5.6290, time 122.70ms
iter 361090: loss 5.6646, time 121.33ms
iter 361100: loss 6.6278, time 122.53ms
iter 361110: loss 5.8098, time 121.50ms
iter 361120: loss 5.9557, time 122.45ms
iter 361130: loss 5.7452, time 121.51ms
iter 361140: loss 5.9098, time 122.51ms
iter 361150: loss 5.6727, time 121.41ms
iter 361160: loss 6.0197, time 121.74ms
iter 361170: loss 5.6595, time 121.43ms
iter 361180: loss 5.6747, time 122.86ms
iter 361190: loss 6.2504, time 121.43ms
iter 361200: loss 6.1177, time 122.57ms
iter 361210: loss 6.0570, time 121.42ms
iter 361220: loss 5.8792, time 122.81ms
iter 361230: loss 5.7596, time 121.09ms
iter 361240: loss 6.3215, time 122.87ms
step 361250: train loss 5.6365, val loss 5.6314
saving checkpoint to out-shakespeare-char
iter 361250: loss 6.1904, time 2873.54ms
iter 361260: loss 5.0311, time 121.60ms
iter 361270: loss 6.4016, time 121.55ms
iter 361280: loss 5.8813, time 121.32ms
iter 361290: loss 6.6140, time 121.40ms
iter 361300: loss 5.2974, time 121.38ms
iter 361310: loss 6.4129, time 121.99ms
iter 361320: loss 5.9925, time 121.38ms
iter 361330: loss 5.8712, time 121.59ms
iter 361340: loss 6.5161, time 121.36ms
iter 361350: loss 6.2760, time 121.49ms
iter 361360: loss 6.0980, time 121.37ms
iter 361370: loss 5.8371, time 121.73ms
iter 361380: loss 5.1255, time 121.83ms
iter 361390: loss 6.2179, time 121.05ms
iter 361400: loss 6.0917, time 121.66ms
iter 361410: loss 5.7481, time 121.53ms
iter 361420: loss 6.1206, time 121.40ms
iter 361430: loss 6.2711, time 121.22ms
iter 361440: loss 5.8249, time 122.90ms
iter 361450: loss 6.3082, time 121.54ms
iter 361460: loss 5.9922, time 121.46ms
iter 361470: loss 6.1141, time 121.47ms
iter 361480: loss 5.7586, time 120.94ms
iter 361490: loss 6.0992, time 121.77ms
step 361500: train loss 5.5762, val loss 5.5712
saving checkpoint to out-shakespeare-char
iter 361500: loss 5.6233, time 2888.50ms
iter 361510: loss 5.5137, time 124.36ms
iter 361520: loss 6.2009, time 121.55ms
iter 361530: loss 5.7729, time 123.81ms
iter 361540: loss 5.8333, time 121.62ms
iter 361550: loss 5.4396, time 123.79ms
iter 361560: loss 6.1993, time 121.60ms
iter 361570: loss 5.9549, time 123.70ms
iter 361580: loss 5.9110, time 121.54ms
iter 361590: loss 5.8555, time 123.65ms
iter 361600: loss 6.3032, time 121.52ms
iter 361610: loss 6.2740, time 124.06ms
iter 361620: loss 5.7842, time 121.07ms
iter 361630: loss 5.9626, time 123.95ms
iter 361640: loss 6.2708, time 121.51ms
iter 361650: loss 6.5704, time 122.85ms
iter 361660: loss 6.7643, time 121.72ms
iter 361670: loss 5.9668, time 123.99ms
iter 361680: loss 6.0918, time 120.68ms
iter 361690: loss 6.5083, time 124.00ms
iter 361700: loss 6.1343, time 121.49ms
iter 361710: loss 5.9014, time 122.40ms
iter 361720: loss 5.9572, time 121.96ms
iter 361730: loss 5.6410, time 123.64ms
iter 361740: loss 5.7611, time 122.05ms
step 361750: train loss 5.6367, val loss 5.5515
saving checkpoint to out-shakespeare-char
iter 361750: loss 5.5868, time 2901.36ms
iter 361760: loss 6.2443, time 125.54ms
iter 361770: loss 6.3639, time 126.03ms
iter 361780: loss 6.5962, time 125.28ms
iter 361790: loss 6.6771, time 127.54ms
iter 361800: loss 6.0124, time 125.57ms
iter 361810: loss 5.2793, time 125.88ms
iter 361820: loss 6.4641, time 125.72ms
iter 361830: loss 6.5331, time 127.94ms
iter 361840: loss 6.2812, time 125.01ms
iter 361850: loss 6.2819, time 125.91ms
iter 361860: loss 6.1247, time 125.90ms
iter 361870: loss 6.2382, time 126.19ms
iter 361880: loss 6.2741, time 124.36ms
iter 361890: loss 5.5649, time 126.07ms
iter 361900: loss 5.9297, time 126.28ms
iter 361910: loss 5.9119, time 125.80ms
iter 361920: loss 6.0682, time 126.10ms
iter 361930: loss 6.0509, time 126.45ms
iter 361940: loss 5.6536, time 128.63ms
iter 361950: loss 5.9971, time 126.22ms
iter 361960: loss 6.2957, time 125.75ms
iter 361970: loss 6.2793, time 125.83ms
iter 361980: loss 5.5410, time 128.55ms
iter 361990: loss 6.2231, time 125.25ms
step 362000: train loss 5.6362, val loss 5.5849
saving checkpoint to out-shakespeare-char
iter 362000: loss 6.1228, time 2893.69ms
iter 362010: loss 6.2059, time 122.61ms
iter 362020: loss 5.5273, time 121.46ms
iter 362030: loss 6.3365, time 122.35ms
iter 362040: loss 6.1121, time 121.02ms
iter 362050: loss 5.3978, time 120.82ms
iter 362060: loss 5.5731, time 120.69ms
iter 362070: loss 5.6647, time 121.72ms
iter 362080: loss 6.0396, time 121.49ms
iter 362090: loss 6.2045, time 122.76ms
iter 362100: loss 6.3638, time 121.81ms
iter 362110: loss 5.2960, time 123.25ms
iter 362120: loss 6.7321, time 121.14ms
iter 362130: loss 5.8047, time 121.83ms
iter 362140: loss 6.1022, time 121.52ms
iter 362150: loss 6.0104, time 122.68ms
iter 362160: loss 6.3749, time 121.44ms
iter 362170: loss 5.5756, time 122.84ms
iter 362180: loss 6.1772, time 121.71ms
iter 362190: loss 5.1021, time 123.30ms
iter 362200: loss 6.2168, time 121.73ms
iter 362210: loss 5.7018, time 122.48ms
iter 362220: loss 6.2511, time 121.53ms
iter 362230: loss 5.9548, time 122.93ms
iter 362240: loss 6.2128, time 121.41ms
step 362250: train loss 5.5722, val loss 5.6320
saving checkpoint to out-shakespeare-char
iter 362250: loss 5.5906, time 2897.00ms
iter 362260: loss 6.0784, time 128.31ms
iter 362270: loss 5.7182, time 125.25ms
iter 362280: loss 6.4720, time 125.46ms
iter 362290: loss 6.0877, time 125.85ms
iter 362300: loss 6.0349, time 124.26ms
iter 362310: loss 6.1290, time 121.95ms
iter 362320: loss 5.7473, time 121.46ms
iter 362330: loss 5.6227, time 121.73ms
iter 362340: loss 6.4396, time 122.02ms
iter 362350: loss 5.8365, time 121.76ms
iter 362360: loss 6.2749, time 121.15ms
iter 362370: loss 5.7245, time 121.72ms
iter 362380: loss 6.4544, time 122.48ms
iter 362390: loss 6.1464, time 121.50ms
iter 362400: loss 5.6297, time 122.15ms
iter 362410: loss 5.3558, time 121.70ms
iter 362420: loss 6.0585, time 121.67ms
iter 362430: loss 6.3523, time 121.75ms
iter 362440: loss 5.3723, time 121.75ms
iter 362450: loss 5.8328, time 121.11ms
iter 362460: loss 6.3056, time 121.78ms
iter 362470: loss 6.3037, time 121.98ms
iter 362480: loss 5.8094, time 121.60ms
iter 362490: loss 6.1530, time 121.72ms
step 362500: train loss 5.5786, val loss 5.5950
saving checkpoint to out-shakespeare-char
iter 362500: loss 5.5137, time 2897.86ms
iter 362510: loss 5.7305, time 121.82ms
iter 362520: loss 6.1918, time 121.65ms
iter 362530: loss 6.9927, time 121.65ms
iter 362540: loss 6.4401, time 121.39ms
iter 362550: loss 6.0307, time 120.89ms
iter 362560: loss 6.1601, time 121.49ms
iter 362570: loss 6.0264, time 121.64ms
iter 362580: loss 6.2505, time 119.83ms
iter 362590: loss 6.2263, time 121.12ms
iter 362600: loss 6.3172, time 122.64ms
iter 362610: loss 5.6760, time 120.86ms
iter 362620: loss 6.4903, time 122.95ms
iter 362630: loss 6.3736, time 122.24ms
iter 362640: loss 6.1274, time 122.90ms
iter 362650: loss 6.4051, time 121.60ms
iter 362660: loss 5.2318, time 122.93ms
iter 362670: loss 5.6529, time 121.75ms
iter 362680: loss 5.6301, time 123.43ms
iter 362690: loss 6.2923, time 121.74ms
iter 362700: loss 6.0084, time 122.77ms
iter 362710: loss 5.9983, time 121.67ms
iter 362720: loss 5.6940, time 123.24ms
iter 362730: loss 6.2476, time 121.55ms
iter 362740: loss 7.3295, time 122.92ms
step 362750: train loss 5.6027, val loss 5.5642
saving checkpoint to out-shakespeare-char
iter 362750: loss 6.0053, time 2898.10ms
iter 362760: loss 4.8072, time 121.92ms
iter 362770: loss 5.1382, time 122.14ms
iter 362780: loss 5.1541, time 121.68ms
iter 362790: loss 5.2124, time 122.07ms
iter 362800: loss 5.8336, time 121.77ms
iter 362810: loss 5.9048, time 121.58ms
iter 362820: loss 5.5193, time 122.11ms
iter 362830: loss 6.0446, time 121.90ms
iter 362840: loss 6.9459, time 121.87ms
iter 362850: loss 5.5901, time 121.86ms
iter 362860: loss 5.1995, time 122.04ms
iter 362870: loss 6.2568, time 121.59ms
iter 362880: loss 6.2496, time 122.50ms
iter 362890: loss 6.1942, time 121.65ms
iter 362900: loss 5.7978, time 121.79ms
iter 362910: loss 6.5953, time 121.77ms
iter 362920: loss 6.0235, time 121.49ms
iter 362930: loss 6.2062, time 121.90ms
iter 362940: loss 6.4651, time 121.65ms
iter 362950: loss 5.3771, time 121.76ms
iter 362960: loss 5.7215, time 121.62ms
iter 362970: loss 6.2597, time 121.54ms
iter 362980: loss 5.4237, time 122.71ms
iter 362990: loss 6.3763, time 121.61ms
step 363000: train loss 5.5655, val loss 5.6190
saving checkpoint to out-shakespeare-char
iter 363000: loss 5.0140, time 2897.59ms
iter 363010: loss 6.0890, time 121.73ms
iter 363020: loss 6.3969, time 121.46ms
iter 363030: loss 5.8730, time 121.63ms
iter 363040: loss 6.3967, time 121.79ms
iter 363050: loss 5.5786, time 121.73ms
iter 363060: loss 5.9662, time 120.51ms
iter 363070: loss 6.0894, time 122.02ms
iter 363080: loss 5.5688, time 121.40ms
iter 363090: loss 6.0597, time 122.02ms
iter 363100: loss 6.2405, time 121.67ms
iter 363110: loss 5.8604, time 122.15ms
iter 363120: loss 6.8967, time 121.70ms
iter 363130: loss 5.9661, time 121.72ms
iter 363140: loss 6.1605, time 121.59ms
iter 363150: loss 6.3134, time 121.76ms
iter 363160: loss 5.8740, time 121.54ms
iter 363170: loss 6.1214, time 121.84ms
iter 363180: loss 6.7115, time 121.50ms
iter 363190: loss 5.5973, time 121.73ms
iter 363200: loss 6.5410, time 121.61ms
iter 363210: loss 6.0779, time 121.99ms
iter 363220: loss 6.2591, time 121.90ms
iter 363230: loss 5.9400, time 121.89ms
iter 363240: loss 6.6368, time 120.88ms
step 363250: train loss 5.5695, val loss 5.5514
saving checkpoint to out-shakespeare-char
iter 363250: loss 6.0500, time 2899.13ms
iter 363260: loss 5.0558, time 125.27ms
iter 363270: loss 5.7807, time 125.35ms
iter 363280: loss 5.4398, time 125.43ms
iter 363290: loss 6.4953, time 125.24ms
iter 363300: loss 5.9118, time 125.43ms
iter 363310: loss 6.3983, time 125.47ms
iter 363320: loss 5.8431, time 125.02ms
iter 363330: loss 6.2468, time 125.41ms
iter 363340: loss 5.3639, time 127.65ms
iter 363350: loss 5.5467, time 125.26ms
iter 363360: loss 6.3321, time 125.37ms
iter 363370: loss 6.1333, time 125.71ms
iter 363380: loss 6.7542, time 125.40ms
iter 363390: loss 5.7817, time 125.21ms
iter 363400: loss 6.4248, time 125.33ms
iter 363410: loss 5.9583, time 125.43ms
iter 363420: loss 5.8486, time 125.28ms
iter 363430: loss 5.2471, time 125.49ms
iter 363440: loss 6.0744, time 125.54ms
iter 363450: loss 6.2644, time 126.53ms
iter 363460: loss 5.4577, time 125.70ms
iter 363470: loss 5.7309, time 125.36ms
iter 363480: loss 6.5255, time 125.40ms
iter 363490: loss 5.6834, time 121.61ms
step 363500: train loss 5.6064, val loss 5.5910
saving checkpoint to out-shakespeare-char
iter 363500: loss 6.2619, time 2875.75ms
iter 363510: loss 5.4552, time 125.66ms
iter 363520: loss 6.6771, time 125.69ms
iter 363530: loss 5.8665, time 125.29ms
iter 363540: loss 6.0390, time 125.42ms
iter 363550: loss 6.0560, time 125.56ms
iter 363560: loss 5.8773, time 125.75ms
iter 363570: loss 5.9550, time 125.44ms
iter 363580: loss 6.5370, time 125.38ms
iter 363590: loss 5.7723, time 125.24ms
iter 363600: loss 6.2837, time 125.75ms
iter 363610: loss 6.2239, time 125.40ms
iter 363620: loss 6.8050, time 125.40ms
iter 363630: loss 6.5324, time 125.37ms
iter 363640: loss 6.1518, time 125.21ms
iter 363650: loss 5.6049, time 125.54ms
iter 363660: loss 6.0479, time 127.78ms
iter 363670: loss 5.7295, time 125.18ms
iter 363680: loss 6.3776, time 125.87ms
iter 363690: loss 6.2898, time 126.03ms
iter 363700: loss 6.0176, time 124.98ms
iter 363710: loss 5.9506, time 125.42ms
iter 363720: loss 6.4285, time 126.24ms
iter 363730: loss 6.2999, time 125.33ms
iter 363740: loss 6.0827, time 125.68ms
step 363750: train loss 5.5717, val loss 5.6118
saving checkpoint to out-shakespeare-char
iter 363750: loss 6.8861, time 2871.96ms
iter 363760: loss 5.5133, time 127.98ms
iter 363770: loss 5.5225, time 125.26ms
iter 363780: loss 6.2867, time 124.47ms
iter 363790: loss 5.7929, time 125.49ms
iter 363800: loss 6.2391, time 125.40ms
iter 363810: loss 5.9639, time 125.49ms
iter 363820: loss 5.2796, time 125.20ms
iter 363830: loss 6.3557, time 125.35ms
iter 363840: loss 5.9920, time 125.27ms
iter 363850: loss 6.7035, time 125.19ms
iter 363860: loss 5.8240, time 125.55ms
iter 363870: loss 6.5444, time 127.51ms
iter 363880: loss 6.2121, time 125.35ms
iter 363890: loss 5.5796, time 125.30ms
iter 363900: loss 6.4610, time 125.44ms
iter 363910: loss 6.4270, time 125.65ms
iter 363920: loss 5.4627, time 125.00ms
iter 363930: loss 6.3814, time 125.41ms
iter 363940: loss 6.3353, time 125.43ms
iter 363950: loss 5.0663, time 125.12ms
iter 363960: loss 5.7942, time 125.35ms
iter 363970: loss 6.1393, time 125.21ms
iter 363980: loss 6.6279, time 127.99ms
iter 363990: loss 6.5564, time 125.28ms
step 364000: train loss 5.6154, val loss 5.5885
saving checkpoint to out-shakespeare-char
iter 364000: loss 5.7996, time 2882.04ms
iter 364010: loss 5.8608, time 125.25ms
iter 364020: loss 6.0676, time 125.26ms
iter 364030: loss 6.3943, time 127.81ms
iter 364040: loss 5.6781, time 125.26ms
iter 364050: loss 6.2115, time 125.43ms
iter 364060: loss 5.6263, time 124.95ms
iter 364070: loss 5.6762, time 124.49ms
iter 364080: loss 6.9485, time 125.59ms
iter 364090: loss 6.1616, time 125.23ms
iter 364100: loss 6.1260, time 125.34ms
iter 364110: loss 5.6410, time 126.36ms
iter 364120: loss 6.3343, time 125.21ms
iter 364130: loss 6.0962, time 125.94ms
iter 364140: loss 6.1544, time 127.70ms
iter 364150: loss 6.7296, time 125.23ms
iter 364160: loss 5.7221, time 125.81ms
iter 364170: loss 5.3312, time 125.44ms
iter 364180: loss 6.4525, time 125.08ms
iter 364190: loss 5.3281, time 124.55ms
iter 364200: loss 5.7102, time 125.57ms
iter 364210: loss 6.0760, time 128.03ms
iter 364220: loss 6.3288, time 125.16ms
iter 364230: loss 6.0705, time 125.10ms
iter 364240: loss 5.9328, time 125.09ms
step 364250: train loss 5.6224, val loss 5.5844
saving checkpoint to out-shakespeare-char
iter 364250: loss 5.9658, time 2861.24ms
iter 364260: loss 7.0528, time 125.67ms
iter 364270: loss 5.4840, time 125.73ms
iter 364280: loss 6.0959, time 125.38ms
iter 364290: loss 5.7163, time 128.17ms
iter 364300: loss 6.2295, time 125.32ms
iter 364310: loss 5.6780, time 126.26ms
iter 364320: loss 5.8575, time 125.37ms
iter 364330: loss 5.6155, time 125.11ms
iter 364340: loss 6.9840, time 125.02ms
iter 364350: loss 5.4001, time 125.44ms
iter 364360: loss 5.6404, time 127.34ms
iter 364370: loss 6.1958, time 125.27ms
iter 364380: loss 6.3049, time 125.23ms
iter 364390: loss 5.9631, time 125.55ms
iter 364400: loss 5.6965, time 125.21ms
iter 364410: loss 5.7870, time 125.33ms
iter 364420: loss 6.0089, time 125.17ms
iter 364430: loss 6.0141, time 125.35ms
iter 364440: loss 5.9403, time 124.88ms
iter 364450: loss 6.0690, time 125.20ms
iter 364460: loss 5.8105, time 125.32ms
iter 364470: loss 5.9507, time 127.83ms
iter 364480: loss 6.5564, time 125.31ms
iter 364490: loss 6.2438, time 125.20ms
step 364500: train loss 5.5563, val loss 5.6022
saving checkpoint to out-shakespeare-char
iter 364500: loss 6.0634, time 2868.24ms
iter 364510: loss 5.9823, time 125.19ms
iter 364520: loss 5.9335, time 125.45ms
iter 364530: loss 5.3708, time 126.22ms
iter 364540: loss 5.2490, time 126.37ms
iter 364550: loss 6.5753, time 125.89ms
iter 364560: loss 6.6666, time 124.83ms
iter 364570: loss 5.5645, time 125.87ms
iter 364580: loss 5.3928, time 125.14ms
iter 364590: loss 5.9254, time 126.26ms
iter 364600: loss 6.7083, time 125.16ms
iter 364610: loss 6.0451, time 125.46ms
iter 364620: loss 5.3706, time 125.67ms
iter 364630: loss 6.4154, time 125.56ms
iter 364640: loss 6.5826, time 124.67ms
iter 364650: loss 5.6914, time 127.90ms
iter 364660: loss 6.3571, time 126.08ms
iter 364670: loss 5.5866, time 125.13ms
iter 364680: loss 5.4845, time 124.86ms
iter 364690: loss 5.2247, time 125.46ms
iter 364700: loss 5.8553, time 125.93ms
iter 364710: loss 6.5328, time 125.71ms
iter 364720: loss 6.2102, time 126.21ms
iter 364730: loss 6.3423, time 126.02ms
iter 364740: loss 5.3631, time 125.88ms
step 364750: train loss 5.5902, val loss 5.6057
saving checkpoint to out-shakespeare-char
iter 364750: loss 6.2730, time 2893.06ms
iter 364760: loss 5.7917, time 126.21ms
iter 364770: loss 6.3839, time 125.62ms
iter 364780: loss 6.0664, time 125.59ms
iter 364790: loss 5.7498, time 125.94ms
iter 364800: loss 5.7872, time 127.56ms
iter 364810: loss 5.8721, time 126.07ms
iter 364820: loss 5.9297, time 126.35ms
iter 364830: loss 7.0773, time 126.25ms
iter 364840: loss 5.7870, time 128.33ms
iter 364850: loss 5.7303, time 125.23ms
iter 364860: loss 5.8890, time 125.39ms
iter 364870: loss 6.4522, time 125.71ms
iter 364880: loss 5.8845, time 124.99ms
iter 364890: loss 5.8496, time 125.56ms
iter 364900: loss 6.4900, time 125.46ms
iter 364910: loss 6.3880, time 126.02ms
iter 364920: loss 5.4132, time 124.96ms
iter 364930: loss 6.4755, time 125.93ms
iter 364940: loss 6.2155, time 125.22ms
iter 364950: loss 6.1619, time 127.38ms
iter 364960: loss 5.8616, time 125.74ms
iter 364970: loss 6.3079, time 125.18ms
iter 364980: loss 5.4868, time 125.04ms
iter 364990: loss 5.8767, time 125.50ms
step 365000: train loss 5.6099, val loss 5.6002
saving checkpoint to out-shakespeare-char
iter 365000: loss 5.8288, time 2884.64ms
iter 365010: loss 5.3175, time 125.16ms
iter 365020: loss 6.3655, time 125.29ms
iter 365030: loss 6.5411, time 125.06ms
iter 365040: loss 6.4544, time 125.32ms
iter 365050: loss 6.2803, time 124.90ms
iter 365060: loss 5.9397, time 125.29ms
iter 365070: loss 5.7983, time 127.40ms
iter 365080: loss 5.9415, time 125.54ms
iter 365090: loss 5.4198, time 125.20ms
iter 365100: loss 6.1389, time 125.56ms
iter 365110: loss 5.6537, time 125.03ms
iter 365120: loss 5.3552, time 125.15ms
iter 365130: loss 5.6119, time 125.09ms
iter 365140: loss 6.2579, time 124.94ms
iter 365150: loss 5.5377, time 124.71ms
iter 365160: loss 6.1683, time 125.07ms
iter 365170: loss 6.0681, time 124.78ms
iter 365180: loss 5.8113, time 127.62ms
iter 365190: loss 6.0302, time 124.80ms
iter 365200: loss 5.6108, time 124.73ms
iter 365210: loss 5.6173, time 125.49ms
iter 365220: loss 6.0244, time 125.08ms
iter 365230: loss 6.7491, time 124.99ms
iter 365240: loss 5.8482, time 124.39ms
step 365250: train loss 5.5690, val loss 5.5811
saving checkpoint to out-shakespeare-char
iter 365250: loss 6.4670, time 2867.40ms
iter 365260: loss 5.6504, time 127.98ms
iter 365270: loss 6.2267, time 124.94ms
iter 365280: loss 6.0150, time 124.28ms
iter 365290: loss 5.3341, time 125.30ms
iter 365300: loss 6.3118, time 125.16ms
iter 365310: loss 5.9774, time 125.39ms
iter 365320: loss 5.4337, time 125.05ms
iter 365330: loss 6.3356, time 124.98ms
iter 365340: loss 5.0335, time 127.37ms
iter 365350: loss 6.2132, time 125.22ms
iter 365360: loss 5.9588, time 124.52ms
iter 365370: loss 5.9592, time 124.99ms
iter 365380: loss 6.3043, time 124.75ms
iter 365390: loss 5.6821, time 125.16ms
iter 365400: loss 5.8380, time 124.72ms
iter 365410: loss 6.1282, time 125.37ms
iter 365420: loss 5.8468, time 124.14ms
iter 365430: loss 6.2119, time 125.91ms
iter 365440: loss 6.3826, time 124.42ms
iter 365450: loss 6.2872, time 127.25ms
iter 365460: loss 6.0852, time 124.84ms
iter 365470: loss 5.7929, time 125.50ms
iter 365480: loss 6.4167, time 124.59ms
iter 365490: loss 5.7964, time 124.97ms
step 365500: train loss 5.5817, val loss 5.5745
saving checkpoint to out-shakespeare-char
iter 365500: loss 5.9232, time 2861.96ms
iter 365510: loss 5.8452, time 125.85ms
iter 365520: loss 5.0637, time 125.92ms
iter 365530: loss 5.7072, time 125.75ms
iter 365540: loss 6.1558, time 126.20ms
iter 365550: loss 5.6982, time 126.27ms
iter 365560: loss 5.2127, time 128.32ms
iter 365570: loss 6.1882, time 125.94ms
iter 365580: loss 5.8831, time 126.00ms
iter 365590: loss 6.1582, time 125.81ms
iter 365600: loss 5.7294, time 128.29ms
iter 365610: loss 6.3049, time 125.76ms
iter 365620: loss 6.0213, time 125.86ms
iter 365630: loss 6.3611, time 125.79ms
iter 365640: loss 6.0938, time 125.81ms
iter 365650: loss 5.8415, time 125.72ms
iter 365660: loss 5.6382, time 125.90ms
iter 365670: loss 6.5550, time 125.93ms
iter 365680: loss 6.5445, time 125.79ms
iter 365690: loss 6.7564, time 125.78ms
iter 365700: loss 6.3847, time 126.13ms
iter 365710: loss 5.5229, time 128.28ms
iter 365720: loss 5.3543, time 125.76ms
iter 365730: loss 6.6688, time 125.86ms
iter 365740: loss 6.3113, time 125.81ms
step 365750: train loss 5.6257, val loss 5.5614
saving checkpoint to out-shakespeare-char
iter 365750: loss 4.9607, time 2895.31ms
iter 365760: loss 6.3793, time 126.29ms
iter 365770: loss 5.6252, time 125.48ms
iter 365780: loss 5.8166, time 125.74ms
iter 365790: loss 5.9036, time 125.59ms
iter 365800: loss 5.8601, time 125.69ms
iter 365810: loss 6.7570, time 125.51ms
iter 365820: loss 5.9710, time 126.01ms
iter 365830: loss 6.2149, time 125.94ms
iter 365840: loss 5.4505, time 126.31ms
iter 365850: loss 6.0935, time 125.83ms
iter 365860: loss 4.9562, time 128.26ms
iter 365870: loss 5.3642, time 125.62ms
iter 365880: loss 5.3570, time 125.81ms
iter 365890: loss 6.0867, time 125.63ms
iter 365900: loss 5.7062, time 125.76ms
iter 365910: loss 5.7838, time 127.56ms
iter 365920: loss 5.5358, time 125.59ms
iter 365930: loss 5.9649, time 126.00ms
iter 365940: loss 5.1155, time 125.85ms
iter 365950: loss 6.4232, time 125.78ms
iter 365960: loss 5.6500, time 125.87ms
iter 365970: loss 6.6351, time 128.18ms
iter 365980: loss 6.0677, time 125.91ms
iter 365990: loss 5.3491, time 125.76ms
step 366000: train loss 5.5815, val loss 5.6354
saving checkpoint to out-shakespeare-char
iter 366000: loss 5.8419, time 2882.27ms
iter 366010: loss 5.5627, time 128.38ms
iter 366020: loss 5.8540, time 119.77ms
iter 366030: loss 5.9443, time 120.81ms
iter 366040: loss 6.0618, time 119.70ms
iter 366050: loss 6.1651, time 121.14ms
iter 366060: loss 5.9957, time 120.21ms
iter 366070: loss 6.3981, time 121.30ms
iter 366080: loss 6.1641, time 119.65ms
iter 366090: loss 6.6033, time 120.85ms
iter 366100: loss 6.2341, time 119.77ms
iter 366110: loss 5.5274, time 120.67ms
iter 366120: loss 5.8756, time 121.04ms
iter 366130: loss 6.2253, time 120.58ms
iter 366140: loss 6.1097, time 120.80ms
iter 366150: loss 5.8686, time 120.79ms
iter 366160: loss 6.4500, time 119.66ms
iter 366170: loss 6.2231, time 122.01ms
iter 366180: loss 5.7093, time 119.59ms
iter 366190: loss 5.2567, time 120.67ms
iter 366200: loss 5.5982, time 119.64ms
iter 366210: loss 5.6392, time 120.66ms
iter 366220: loss 5.1258, time 119.75ms
iter 366230: loss 6.4377, time 120.76ms
iter 366240: loss 5.9337, time 119.67ms
step 366250: train loss 5.6142, val loss 5.5885
saving checkpoint to out-shakespeare-char
iter 366250: loss 6.5447, time 2883.43ms
iter 366260: loss 5.7352, time 119.64ms
iter 366270: loss 6.3297, time 119.58ms
iter 366280: loss 5.7364, time 119.89ms
iter 366290: loss 5.8358, time 119.74ms
iter 366300: loss 6.2997, time 120.95ms
iter 366310: loss 6.7706, time 125.85ms
iter 366320: loss 6.1730, time 124.86ms
iter 366330: loss 6.0413, time 125.18ms
iter 366340: loss 5.8644, time 121.94ms
iter 366350: loss 5.2764, time 122.30ms
iter 366360: loss 6.1484, time 121.56ms
iter 366370: loss 6.4978, time 122.76ms
iter 366380: loss 6.0130, time 121.74ms
iter 366390: loss 5.6874, time 122.71ms
iter 366400: loss 5.9629, time 121.61ms
iter 366410: loss 6.5226, time 122.74ms
iter 366420: loss 5.9368, time 120.68ms
iter 366430: loss 5.5282, time 122.62ms
iter 366440: loss 5.4479, time 121.55ms
iter 366450: loss 6.2693, time 122.57ms
iter 366460: loss 6.1344, time 121.53ms
iter 366470: loss 7.1180, time 122.68ms
iter 366480: loss 5.7033, time 121.52ms
iter 366490: loss 6.5264, time 122.70ms
step 366500: train loss 5.5605, val loss 5.6112
saving checkpoint to out-shakespeare-char
iter 366500: loss 5.6658, time 2881.51ms
iter 366510: loss 5.4775, time 121.73ms
iter 366520: loss 6.0827, time 122.13ms
iter 366530: loss 5.2805, time 121.50ms
iter 366540: loss 6.1958, time 121.60ms
iter 366550: loss 6.4706, time 121.59ms
iter 366560: loss 5.3083, time 121.58ms
iter 366570: loss 6.2874, time 121.49ms
iter 366580: loss 5.9855, time 121.65ms
iter 366590: loss 5.6799, time 122.63ms
iter 366600: loss 5.8491, time 121.56ms
iter 366610: loss 6.0600, time 121.56ms
iter 366620: loss 6.3280, time 121.16ms
iter 366630: loss 6.3979, time 121.70ms
iter 366640: loss 6.1030, time 121.67ms
iter 366650: loss 6.7012, time 121.44ms
iter 366660: loss 5.5805, time 121.49ms
iter 366670: loss 5.5975, time 121.17ms
iter 366680: loss 6.3076, time 121.52ms
iter 366690: loss 6.1480, time 121.55ms
iter 366700: loss 6.2666, time 121.47ms
iter 366710: loss 5.2582, time 121.99ms
iter 366720: loss 6.1563, time 121.59ms
iter 366730: loss 5.9614, time 121.70ms
iter 366740: loss 5.7541, time 121.56ms
step 366750: train loss 5.6156, val loss 5.6078
saving checkpoint to out-shakespeare-char
iter 366750: loss 6.0976, time 2895.59ms
iter 366760: loss 6.3781, time 122.95ms
iter 366770: loss 6.4512, time 121.39ms
iter 366780: loss 6.3227, time 123.29ms
iter 366790: loss 5.2843, time 121.57ms
iter 366800: loss 5.8750, time 122.66ms
iter 366810: loss 5.8281, time 121.55ms
iter 366820: loss 6.2432, time 123.01ms
iter 366830: loss 6.1881, time 121.97ms
iter 366840: loss 5.9549, time 122.98ms
iter 366850: loss 6.2227, time 121.47ms
iter 366860: loss 5.9382, time 122.80ms
iter 366870: loss 5.9833, time 121.57ms
iter 366880: loss 5.6279, time 123.18ms
iter 366890: loss 6.4192, time 121.29ms
iter 366900: loss 6.1074, time 123.44ms
iter 366910: loss 5.0403, time 121.40ms
iter 366920: loss 5.4477, time 122.82ms
iter 366930: loss 6.1268, time 122.03ms
iter 366940: loss 5.9133, time 122.75ms
iter 366950: loss 6.1319, time 121.97ms
iter 366960: loss 5.5553, time 123.02ms
iter 366970: loss 5.8659, time 121.55ms
iter 366980: loss 5.8519, time 122.87ms
iter 366990: loss 6.0342, time 121.63ms
step 367000: train loss 5.5565, val loss 5.5933
saving checkpoint to out-shakespeare-char
iter 367000: loss 5.6261, time 2894.09ms
iter 367010: loss 6.0484, time 125.74ms
iter 367020: loss 5.3579, time 128.00ms
iter 367030: loss 6.5873, time 125.60ms
iter 367040: loss 5.7395, time 125.40ms
iter 367050: loss 5.6803, time 124.79ms
iter 367060: loss 5.7050, time 125.82ms
iter 367070: loss 6.2970, time 125.76ms
iter 367080: loss 5.9094, time 125.59ms
iter 367090: loss 6.2099, time 125.60ms
iter 367100: loss 6.1044, time 125.57ms
iter 367110: loss 6.3834, time 125.99ms
iter 367120: loss 6.3359, time 125.36ms
iter 367130: loss 5.6102, time 128.09ms
iter 367140: loss 5.2990, time 125.38ms
iter 367150: loss 6.0451, time 126.16ms
iter 367160: loss 6.0445, time 124.67ms
iter 367170: loss 6.6375, time 125.42ms
iter 367180: loss 6.4720, time 125.06ms
iter 367190: loss 5.9321, time 125.38ms
iter 367200: loss 7.4200, time 127.43ms
iter 367210: loss 5.6074, time 126.26ms
iter 367220: loss 5.5011, time 125.32ms
iter 367230: loss 5.7210, time 126.09ms
iter 367240: loss 6.3552, time 125.61ms
step 367250: train loss 5.5561, val loss 5.5410
saving checkpoint to out-shakespeare-char
iter 367250: loss 5.9592, time 2896.07ms
iter 367260: loss 6.1356, time 121.78ms
iter 367270: loss 5.4006, time 121.55ms
iter 367280: loss 6.3669, time 122.03ms
iter 367290: loss 5.9307, time 121.44ms
iter 367300: loss 4.9656, time 121.72ms
iter 367310: loss 5.9134, time 121.48ms
iter 367320: loss 5.5403, time 122.02ms
iter 367330: loss 5.6744, time 121.46ms
iter 367340: loss 5.0993, time 121.42ms
iter 367350: loss 5.5399, time 121.59ms
iter 367360: loss 6.1328, time 121.52ms
iter 367370: loss 5.6899, time 121.39ms
iter 367380: loss 5.4860, time 121.29ms
iter 367390: loss 5.9401, time 121.54ms
iter 367400: loss 6.3497, time 121.28ms
iter 367410: loss 6.0349, time 121.46ms
iter 367420: loss 6.5157, time 121.58ms
iter 367430: loss 6.6094, time 121.57ms
iter 367440: loss 5.6905, time 121.47ms
iter 367450: loss 6.6176, time 121.52ms
iter 367460: loss 6.2201, time 121.56ms
iter 367470: loss 6.0433, time 121.38ms
iter 367480: loss 5.9668, time 121.36ms
iter 367490: loss 5.7260, time 121.56ms
step 367500: train loss 5.6249, val loss 5.5719
saving checkpoint to out-shakespeare-char
iter 367500: loss 5.8043, time 2886.06ms
iter 367510: loss 5.6000, time 122.73ms
iter 367520: loss 6.5381, time 121.94ms
iter 367530: loss 6.6714, time 122.67ms
iter 367540: loss 5.8708, time 121.66ms
iter 367550: loss 6.4371, time 122.65ms
iter 367560: loss 5.0815, time 121.74ms
iter 367570: loss 6.8618, time 123.11ms
iter 367580: loss 5.9950, time 121.59ms
iter 367590: loss 5.4616, time 122.54ms
iter 367600: loss 6.4459, time 121.53ms
iter 367610: loss 5.8517, time 122.61ms
iter 367620: loss 5.8822, time 121.12ms
iter 367630: loss 5.6451, time 123.27ms
iter 367640: loss 6.7075, time 121.92ms
iter 367650: loss 6.1512, time 122.60ms
iter 367660: loss 6.0439, time 120.89ms
iter 367670: loss 6.2176, time 122.60ms
iter 367680: loss 5.6292, time 121.48ms
iter 367690: loss 6.2694, time 122.09ms
iter 367700: loss 6.2668, time 121.42ms
iter 367710: loss 5.8226, time 122.65ms
iter 367720: loss 6.0187, time 121.38ms
iter 367730: loss 6.0232, time 122.56ms
iter 367740: loss 6.5915, time 121.58ms
step 367750: train loss 5.5982, val loss 5.5546
saving checkpoint to out-shakespeare-char
iter 367750: loss 6.7390, time 2890.48ms
iter 367760: loss 5.7043, time 121.91ms
iter 367770: loss 5.8341, time 122.30ms
iter 367780: loss 5.6448, time 121.83ms
iter 367790: loss 5.5636, time 121.66ms
iter 367800: loss 5.5289, time 121.68ms
iter 367810: loss 5.8234, time 122.88ms
iter 367820: loss 5.4353, time 121.62ms
iter 367830: loss 5.9670, time 121.04ms
iter 367840: loss 5.9842, time 121.71ms
iter 367850: loss 5.1418, time 121.61ms
iter 367860: loss 5.7080, time 121.68ms
iter 367870: loss 5.8341, time 121.79ms
iter 367880: loss 6.1129, time 121.66ms
iter 367890: loss 6.0430, time 122.58ms
iter 367900: loss 6.2050, time 121.65ms
iter 367910: loss 5.7107, time 121.61ms
iter 367920: loss 5.4436, time 121.57ms
iter 367930: loss 5.4982, time 121.41ms
iter 367940: loss 5.6224, time 121.67ms
iter 367950: loss 6.0651, time 121.54ms
iter 367960: loss 5.5723, time 121.62ms
iter 367970: loss 6.4560, time 121.63ms
iter 367980: loss 6.1690, time 121.56ms
iter 367990: loss 5.8915, time 121.44ms
step 368000: train loss 5.6380, val loss 5.5852
saving checkpoint to out-shakespeare-char
iter 368000: loss 5.9576, time 2877.28ms
iter 368010: loss 6.1301, time 121.60ms
iter 368020: loss 6.5980, time 121.88ms
iter 368030: loss 6.5604, time 121.13ms
iter 368040: loss 5.8443, time 121.56ms
iter 368050: loss 6.3004, time 121.67ms
iter 368060: loss 5.7704, time 121.61ms
iter 368070: loss 6.7574, time 121.55ms
iter 368080: loss 6.4953, time 121.99ms
iter 368090: loss 5.9015, time 121.58ms
iter 368100: loss 5.9267, time 121.61ms
iter 368110: loss 5.9507, time 121.54ms
iter 368120: loss 6.0926, time 121.62ms
iter 368130: loss 5.8921, time 121.43ms
iter 368140: loss 5.7716, time 121.50ms
iter 368150: loss 5.5298, time 121.53ms
iter 368160: loss 5.9728, time 121.56ms
iter 368170: loss 5.9509, time 120.89ms
iter 368180: loss 5.1751, time 120.40ms
iter 368190: loss 5.9481, time 121.47ms
iter 368200: loss 6.5113, time 121.15ms
iter 368210: loss 5.9036, time 121.62ms
iter 368220: loss 6.0106, time 121.92ms
iter 368230: loss 6.2164, time 122.48ms
iter 368240: loss 5.0009, time 121.58ms
step 368250: train loss 5.5719, val loss 5.6153
saving checkpoint to out-shakespeare-char
iter 368250: loss 6.3822, time 2885.30ms
iter 368260: loss 5.4108, time 122.23ms
iter 368270: loss 5.4009, time 121.51ms
iter 368280: loss 5.0989, time 121.68ms
iter 368290: loss 6.4555, time 121.28ms
iter 368300: loss 5.1960, time 120.77ms
iter 368310: loss 5.9934, time 122.51ms
iter 368320: loss 5.8811, time 121.38ms
iter 368330: loss 5.3968, time 121.48ms
iter 368340: loss 5.9584, time 121.51ms
iter 368350: loss 4.8267, time 122.76ms
iter 368360: loss 5.3310, time 122.02ms
iter 368370: loss 5.7901, time 121.57ms
iter 368380: loss 6.4239, time 121.64ms
iter 368390: loss 5.9087, time 120.52ms
iter 368400: loss 6.1921, time 121.17ms
iter 368410: loss 5.6856, time 122.03ms
iter 368420: loss 6.0596, time 121.25ms
iter 368430: loss 5.8526, time 121.52ms
iter 368440: loss 5.3785, time 121.52ms
iter 368450: loss 5.9724, time 121.44ms
iter 368460: loss 6.1237, time 122.01ms
iter 368470: loss 5.9373, time 121.55ms
iter 368480: loss 5.5952, time 121.39ms
iter 368490: loss 5.7764, time 120.75ms
step 368500: train loss 5.6202, val loss 5.6099
saving checkpoint to out-shakespeare-char
iter 368500: loss 5.4236, time 2896.36ms
iter 368510: loss 6.2113, time 125.74ms
iter 368520: loss 5.5319, time 126.01ms
iter 368530: loss 5.7452, time 125.68ms
iter 368540: loss 5.6959, time 127.95ms
iter 368550: loss 5.4132, time 125.70ms
iter 368560: loss 5.3731, time 125.40ms
iter 368570: loss 5.7495, time 125.52ms
iter 368580: loss 6.3289, time 125.51ms
iter 368590: loss 5.7233, time 124.98ms
iter 368600: loss 5.3441, time 125.16ms
iter 368610: loss 5.7422, time 125.02ms
iter 368620: loss 6.0005, time 125.65ms
iter 368630: loss 5.8091, time 125.72ms
iter 368640: loss 6.1873, time 125.66ms
iter 368650: loss 6.5069, time 128.08ms
iter 368660: loss 5.5997, time 125.82ms
iter 368670: loss 6.0455, time 125.50ms
iter 368680: loss 6.1387, time 125.50ms
iter 368690: loss 6.1322, time 126.13ms
iter 368700: loss 5.9921, time 125.43ms
iter 368710: loss 5.5758, time 125.60ms
iter 368720: loss 6.4965, time 125.63ms
iter 368730: loss 6.2264, time 125.51ms
iter 368740: loss 6.0543, time 125.52ms
step 368750: train loss 5.6089, val loss 5.6151
saving checkpoint to out-shakespeare-char
iter 368750: loss 6.2478, time 2897.55ms
iter 368760: loss 6.1541, time 127.44ms
iter 368770: loss 6.5245, time 125.51ms
iter 368780: loss 5.7044, time 125.96ms
iter 368790: loss 5.3859, time 125.78ms
iter 368800: loss 5.8471, time 125.88ms
iter 368810: loss 5.8927, time 125.15ms
iter 368820: loss 6.5611, time 125.19ms
iter 368830: loss 6.6222, time 125.79ms
iter 368840: loss 5.8421, time 125.21ms
iter 368850: loss 6.2887, time 125.28ms
iter 368860: loss 5.4763, time 125.53ms
iter 368870: loss 6.1641, time 127.61ms
iter 368880: loss 6.4575, time 125.31ms
iter 368890: loss 6.1141, time 125.27ms
iter 368900: loss 5.7079, time 125.29ms
iter 368910: loss 6.1961, time 125.36ms
iter 368920: loss 6.0269, time 125.37ms
iter 368930: loss 6.3788, time 125.38ms
iter 368940: loss 5.3349, time 125.78ms
iter 368950: loss 5.8419, time 125.13ms
iter 368960: loss 5.9028, time 125.35ms
iter 368970: loss 5.9130, time 125.35ms
iter 368980: loss 6.3969, time 127.69ms
iter 368990: loss 5.6888, time 121.60ms
step 369000: train loss 5.5778, val loss 5.5923
saving checkpoint to out-shakespeare-char
iter 369000: loss 5.9745, time 2911.96ms
iter 369010: loss 5.7355, time 121.81ms
iter 369020: loss 5.7581, time 123.12ms
iter 369030: loss 5.4924, time 121.63ms
iter 369040: loss 5.6062, time 122.37ms
iter 369050: loss 6.1439, time 121.81ms
iter 369060: loss 5.7900, time 123.21ms
iter 369070: loss 6.2326, time 122.11ms
iter 369080: loss 5.6762, time 122.85ms
iter 369090: loss 5.4513, time 121.80ms
iter 369100: loss 6.3581, time 122.38ms
iter 369110: loss 6.3662, time 121.74ms
iter 369120: loss 6.2603, time 123.34ms
iter 369130: loss 5.7711, time 121.62ms
iter 369140: loss 5.5560, time 123.00ms
iter 369150: loss 6.2436, time 121.93ms
iter 369160: loss 6.7433, time 123.01ms
iter 369170: loss 6.0428, time 121.83ms
iter 369180: loss 6.2760, time 123.72ms
iter 369190: loss 5.8185, time 122.06ms
iter 369200: loss 6.3438, time 123.21ms
iter 369210: loss 6.4547, time 121.85ms
iter 369220: loss 5.9306, time 122.60ms
iter 369230: loss 5.9386, time 122.08ms
iter 369240: loss 6.1848, time 122.96ms
step 369250: train loss 5.6070, val loss 5.5896
saving checkpoint to out-shakespeare-char
iter 369250: loss 5.7173, time 2902.66ms
iter 369260: loss 6.5957, time 121.57ms
iter 369270: loss 5.9927, time 121.67ms
iter 369280: loss 6.4667, time 121.55ms
iter 369290: loss 5.7375, time 121.50ms
iter 369300: loss 6.0728, time 121.64ms
iter 369310: loss 6.2815, time 121.58ms
iter 369320: loss 5.8119, time 121.56ms
iter 369330: loss 6.9998, time 121.44ms
iter 369340: loss 5.7618, time 121.54ms
iter 369350: loss 6.4373, time 121.39ms
iter 369360: loss 5.5888, time 121.49ms
iter 369370: loss 6.1223, time 121.54ms
iter 369380: loss 5.7982, time 121.53ms
iter 369390: loss 5.6861, time 121.48ms
iter 369400: loss 6.0679, time 121.54ms
iter 369410: loss 6.1732, time 121.40ms
iter 369420: loss 6.0599, time 121.44ms
iter 369430: loss 5.7049, time 121.54ms
iter 369440: loss 6.0827, time 121.72ms
iter 369450: loss 6.2651, time 122.01ms
iter 369460: loss 5.7346, time 121.73ms
iter 369470: loss 5.8007, time 121.81ms
iter 369480: loss 5.9553, time 121.60ms
iter 369490: loss 5.6090, time 121.43ms
step 369500: train loss 5.5760, val loss 5.5932
saving checkpoint to out-shakespeare-char
iter 369500: loss 5.7464, time 2894.83ms
iter 369510: loss 6.5354, time 125.36ms
iter 369520: loss 6.1202, time 125.70ms
iter 369530: loss 5.1513, time 125.42ms
iter 369540: loss 5.9650, time 125.56ms
iter 369550: loss 5.9086, time 125.75ms
iter 369560: loss 5.7216, time 125.77ms
iter 369570: loss 7.1101, time 124.44ms
iter 369580: loss 6.6272, time 124.66ms
iter 369590: loss 5.5846, time 124.77ms
iter 369600: loss 5.6916, time 126.24ms
iter 369610: loss 5.9649, time 124.40ms
iter 369620: loss 6.4379, time 124.70ms
iter 369630: loss 6.3416, time 124.92ms
iter 369640: loss 6.3680, time 125.04ms
iter 369650: loss 6.5359, time 124.98ms
iter 369660: loss 6.1095, time 124.41ms
iter 369670: loss 5.9907, time 127.80ms
iter 369680: loss 5.9430, time 125.32ms
iter 369690: loss 5.8874, time 125.09ms
iter 369700: loss 6.2149, time 125.33ms
iter 369710: loss 5.6968, time 125.36ms
iter 369720: loss 6.0258, time 125.03ms
iter 369730: loss 6.5166, time 125.14ms
iter 369740: loss 5.6161, time 125.20ms
step 369750: train loss 5.5982, val loss 5.5948
saving checkpoint to out-shakespeare-char
iter 369750: loss 6.1261, time 2890.03ms
iter 369760: loss 6.2588, time 125.55ms
iter 369770: loss 5.4906, time 125.31ms
iter 369780: loss 5.4294, time 125.65ms
iter 369790: loss 5.8320, time 125.71ms
iter 369800: loss 5.9654, time 125.92ms
iter 369810: loss 6.0580, time 125.69ms
iter 369820: loss 6.0048, time 125.66ms
iter 369830: loss 6.0000, time 125.49ms
iter 369840: loss 5.4443, time 125.46ms
iter 369850: loss 6.0229, time 124.91ms
iter 369860: loss 6.5316, time 128.28ms
iter 369870: loss 5.4072, time 125.43ms
iter 369880: loss 5.9601, time 125.61ms
iter 369890: loss 5.4647, time 125.71ms
iter 369900: loss 6.0948, time 125.53ms
iter 369910: loss 5.8908, time 125.17ms
iter 369920: loss 6.2053, time 125.69ms
iter 369930: loss 6.4570, time 125.73ms
iter 369940: loss 6.5878, time 125.60ms
iter 369950: loss 6.0029, time 125.45ms
iter 369960: loss 6.8070, time 125.21ms
iter 369970: loss 6.1040, time 125.73ms
iter 369980: loss 6.0511, time 125.29ms
iter 369990: loss 5.6109, time 125.60ms
step 370000: train loss 5.6375, val loss 5.5984
saving checkpoint to out-shakespeare-char
iter 370000: loss 5.7125, time 2887.40ms
iter 370010: loss 6.0821, time 125.24ms
iter 370020: loss 5.5315, time 125.80ms
iter 370030: loss 6.3112, time 125.60ms
iter 370040: loss 5.7040, time 125.98ms
iter 370050: loss 6.9697, time 127.99ms
iter 370060: loss 6.2753, time 125.79ms
iter 370070: loss 6.3838, time 125.65ms
iter 370080: loss 6.7328, time 125.82ms
iter 370090: loss 5.7549, time 125.20ms
iter 370100: loss 6.0648, time 126.08ms
iter 370110: loss 5.8692, time 125.58ms
iter 370120: loss 5.4175, time 128.33ms
iter 370130: loss 5.7516, time 125.82ms
iter 370140: loss 5.7908, time 125.34ms
iter 370150: loss 6.2547, time 126.04ms
iter 370160: loss 5.7108, time 125.50ms
iter 370170: loss 5.9532, time 125.69ms
iter 370180: loss 6.6689, time 125.70ms
iter 370190: loss 6.0765, time 128.35ms
iter 370200: loss 5.3931, time 125.57ms
iter 370210: loss 6.0266, time 125.63ms
iter 370220: loss 5.9554, time 125.37ms
iter 370230: loss 6.5207, time 125.57ms
iter 370240: loss 6.4115, time 124.77ms
step 370250: train loss 5.5643, val loss 5.6314
saving checkpoint to out-shakespeare-char
iter 370250: loss 5.5678, time 2859.69ms
iter 370260: loss 5.5211, time 125.46ms
iter 370270: loss 5.6331, time 126.10ms
iter 370280: loss 5.9660, time 125.15ms
iter 370290: loss 6.0162, time 125.42ms
iter 370300: loss 6.5898, time 125.55ms
iter 370310: loss 6.5093, time 127.78ms
iter 370320: loss 6.0198, time 125.28ms
iter 370330: loss 5.5233, time 125.22ms
iter 370340: loss 5.7363, time 126.29ms
iter 370350: loss 5.8098, time 125.62ms
iter 370360: loss 5.7356, time 125.95ms
iter 370370: loss 5.8956, time 125.98ms
iter 370380: loss 5.7224, time 125.93ms
iter 370390: loss 6.3154, time 125.86ms
iter 370400: loss 6.5099, time 126.25ms
iter 370410: loss 5.2597, time 126.26ms
iter 370420: loss 6.6600, time 127.73ms
iter 370430: loss 6.0914, time 125.95ms
iter 370440: loss 6.1143, time 126.09ms
iter 370450: loss 6.2735, time 125.86ms
iter 370460: loss 6.3990, time 126.02ms
iter 370470: loss 6.1955, time 125.86ms
iter 370480: loss 6.6018, time 125.79ms
iter 370490: loss 6.7201, time 126.12ms
step 370500: train loss 5.5505, val loss 5.5887
saving checkpoint to out-shakespeare-char
iter 370500: loss 6.2482, time 2864.81ms
iter 370510: loss 5.7391, time 121.88ms
iter 370520: loss 5.8815, time 120.92ms
iter 370530: loss 6.2441, time 121.75ms
iter 370540: loss 5.8873, time 121.68ms
iter 370550: loss 5.7090, time 121.92ms
iter 370560: loss 6.2894, time 122.07ms
iter 370570: loss 6.0746, time 121.18ms
iter 370580: loss 5.8824, time 121.56ms
iter 370590: loss 5.7711, time 121.51ms
iter 370600: loss 6.1231, time 122.33ms
iter 370610: loss 6.1781, time 121.72ms
iter 370620: loss 6.1297, time 121.29ms
iter 370630: loss 5.9779, time 121.67ms
iter 370640: loss 6.2493, time 121.88ms
iter 370650: loss 5.8600, time 122.10ms
iter 370660: loss 6.3386, time 123.87ms
iter 370670: loss 6.0470, time 121.64ms
iter 370680: loss 5.8487, time 124.07ms
iter 370690: loss 5.5970, time 122.01ms
iter 370700: loss 5.7111, time 124.39ms
iter 370710: loss 5.3973, time 121.62ms
iter 370720: loss 5.5417, time 123.99ms
iter 370730: loss 5.9603, time 121.62ms
iter 370740: loss 5.3209, time 124.26ms
step 370750: train loss 5.6112, val loss 5.5663
saving checkpoint to out-shakespeare-char
iter 370750: loss 5.1836, time 2892.32ms
iter 370760: loss 5.7690, time 121.47ms
iter 370770: loss 5.6413, time 121.62ms
iter 370780: loss 5.7135, time 121.77ms
iter 370790: loss 5.5343, time 121.62ms
iter 370800: loss 6.0006, time 121.51ms
iter 370810: loss 6.0362, time 121.72ms
iter 370820: loss 6.5573, time 121.59ms
iter 370830: loss 6.1409, time 121.52ms
iter 370840: loss 6.1832, time 121.50ms
iter 370850: loss 6.3393, time 121.64ms
iter 370860: loss 7.1485, time 121.57ms
iter 370870: loss 6.3782, time 121.89ms
iter 370880: loss 6.1871, time 121.81ms
iter 370890: loss 5.2737, time 121.48ms
iter 370900: loss 6.4399, time 121.61ms
iter 370910: loss 5.9234, time 121.46ms
iter 370920: loss 6.2035, time 121.59ms
iter 370930: loss 6.3102, time 121.56ms
iter 370940: loss 5.2022, time 121.41ms
iter 370950: loss 6.0327, time 121.61ms
iter 370960: loss 5.6446, time 121.68ms
iter 370970: loss 6.0983, time 121.61ms
iter 370980: loss 5.6091, time 121.36ms
iter 370990: loss 5.5358, time 121.70ms
step 371000: train loss 5.5968, val loss 5.5852
saving checkpoint to out-shakespeare-char
iter 371000: loss 5.7665, time 2890.07ms
iter 371010: loss 6.2927, time 121.65ms
iter 371020: loss 5.9751, time 121.65ms
iter 371030: loss 6.3242, time 122.92ms
iter 371040: loss 6.1104, time 121.53ms
iter 371050: loss 5.8415, time 122.74ms
iter 371060: loss 5.9546, time 122.07ms
iter 371070: loss 5.6505, time 122.76ms
iter 371080: loss 5.6143, time 121.53ms
iter 371090: loss 6.3003, time 122.63ms
iter 371100: loss 6.1111, time 121.96ms
iter 371110: loss 6.1735, time 123.13ms
iter 371120: loss 5.3857, time 121.36ms
iter 371130: loss 6.0935, time 122.68ms
iter 371140: loss 5.9266, time 121.56ms
iter 371150: loss 5.9928, time 122.90ms
iter 371160: loss 6.3400, time 121.22ms
iter 371170: loss 6.2836, time 122.85ms
iter 371180: loss 6.4638, time 121.49ms
iter 371190: loss 5.8124, time 122.61ms
iter 371200: loss 6.3705, time 121.66ms
iter 371210: loss 5.9108, time 122.73ms
iter 371220: loss 5.8052, time 121.56ms
iter 371230: loss 6.0854, time 122.24ms
iter 371240: loss 6.0843, time 121.47ms
step 371250: train loss 5.6197, val loss 5.5920
saving checkpoint to out-shakespeare-char
iter 371250: loss 6.6601, time 2891.66ms
iter 371260: loss 6.7243, time 124.11ms
iter 371270: loss 6.0374, time 121.49ms
iter 371280: loss 5.6203, time 123.74ms
iter 371290: loss 5.5681, time 121.56ms
iter 371300: loss 5.7134, time 123.88ms
iter 371310: loss 5.6606, time 121.75ms
iter 371320: loss 6.2407, time 124.10ms
iter 371330: loss 5.5440, time 121.52ms
iter 371340: loss 5.5647, time 124.02ms
iter 371350: loss 6.2324, time 121.50ms
iter 371360: loss 6.5612, time 123.93ms
iter 371370: loss 5.4139, time 121.53ms
iter 371380: loss 6.0747, time 124.01ms
iter 371390: loss 6.3447, time 120.50ms
iter 371400: loss 4.8490, time 123.80ms
iter 371410: loss 6.1762, time 121.43ms
iter 371420: loss 5.9358, time 124.05ms
iter 371430: loss 6.5980, time 121.71ms
iter 371440: loss 6.2435, time 123.94ms
iter 371450: loss 7.0218, time 121.43ms
iter 371460: loss 6.3995, time 124.02ms
iter 371470: loss 6.0641, time 121.53ms
iter 371480: loss 6.2228, time 123.80ms
iter 371490: loss 5.8170, time 121.62ms
step 371500: train loss 5.5454, val loss 5.5791
saving checkpoint to out-shakespeare-char
iter 371500: loss 5.8264, time 2895.16ms
iter 371510: loss 6.2263, time 122.15ms
iter 371520: loss 6.1507, time 121.66ms
iter 371530: loss 5.3433, time 121.63ms
iter 371540: loss 5.9069, time 121.17ms
iter 371550: loss 5.6728, time 121.56ms
iter 371560: loss 5.7919, time 121.53ms
iter 371570: loss 6.4585, time 121.60ms
iter 371580: loss 6.6042, time 121.80ms
iter 371590: loss 5.8719, time 121.69ms
iter 371600: loss 6.3235, time 121.55ms
iter 371610: loss 6.1295, time 121.57ms
iter 371620: loss 5.1560, time 121.22ms
iter 371630: loss 5.9486, time 121.89ms
iter 371640: loss 6.2672, time 121.58ms
iter 371650: loss 5.7287, time 121.51ms
iter 371660: loss 6.4485, time 122.12ms
iter 371670: loss 5.6234, time 121.73ms
iter 371680: loss 5.6978, time 123.28ms
iter 371690: loss 5.8081, time 122.03ms
iter 371700: loss 5.8726, time 121.18ms
iter 371710: loss 5.8598, time 121.56ms
iter 371720: loss 5.5106, time 121.76ms
iter 371730: loss 6.4055, time 121.64ms
iter 371740: loss 6.0821, time 121.56ms
step 371750: train loss 5.5796, val loss 5.5590
saving checkpoint to out-shakespeare-char
iter 371750: loss 6.0748, time 2877.43ms
iter 371760: loss 5.7157, time 121.61ms
iter 371770: loss 5.6799, time 123.83ms
iter 371780: loss 6.8768, time 121.34ms
iter 371790: loss 5.7617, time 123.75ms
iter 371800: loss 5.6610, time 121.11ms
iter 371810: loss 6.5338, time 123.68ms
iter 371820: loss 6.4705, time 121.62ms
iter 371830: loss 5.8264, time 123.75ms
iter 371840: loss 6.7135, time 121.78ms
iter 371850: loss 5.7673, time 123.80ms
iter 371860: loss 6.1703, time 121.08ms
iter 371870: loss 5.5323, time 123.87ms
iter 371880: loss 5.1405, time 121.62ms
iter 371890: loss 4.9169, time 123.91ms
iter 371900: loss 5.8108, time 120.87ms
iter 371910: loss 5.7087, time 123.56ms
iter 371920: loss 5.8167, time 121.77ms
iter 371930: loss 6.1182, time 124.12ms
iter 371940: loss 5.3804, time 121.68ms
iter 371950: loss 6.2193, time 123.88ms
iter 371960: loss 5.1635, time 121.00ms
iter 371970: loss 6.1844, time 122.91ms
iter 371980: loss 5.6657, time 121.70ms
iter 371990: loss 5.4425, time 123.50ms
step 372000: train loss 5.5774, val loss 5.5950
saving checkpoint to out-shakespeare-char
iter 372000: loss 6.0712, time 2876.87ms
iter 372010: loss 5.7379, time 124.09ms
iter 372020: loss 6.0931, time 121.97ms
iter 372030: loss 6.2688, time 124.12ms
iter 372040: loss 5.4270, time 121.34ms
iter 372050: loss 6.4464, time 124.11ms
iter 372060: loss 6.0889, time 121.82ms
iter 372070: loss 5.9881, time 124.02ms
iter 372080: loss 6.8974, time 121.72ms
iter 372090: loss 6.1579, time 123.96ms
iter 372100: loss 5.9830, time 122.29ms
iter 372110: loss 5.8726, time 124.15ms
iter 372120: loss 6.2134, time 121.77ms
iter 372130: loss 6.2333, time 123.26ms
iter 372140: loss 6.3400, time 121.11ms
iter 372150: loss 6.4734, time 124.01ms
iter 372160: loss 6.2709, time 122.13ms
iter 372170: loss 5.5719, time 124.01ms
iter 372180: loss 5.3910, time 121.82ms
iter 372190: loss 5.7810, time 124.10ms
iter 372200: loss 5.8822, time 121.98ms
iter 372210: loss 5.9493, time 124.10ms
iter 372220: loss 5.8823, time 121.68ms
iter 372230: loss 5.9826, time 124.58ms
iter 372240: loss 6.7518, time 122.03ms
step 372250: train loss 5.5747, val loss 5.6002
saving checkpoint to out-shakespeare-char
iter 372250: loss 5.9077, time 2906.43ms
iter 372260: loss 5.9340, time 121.68ms
iter 372270: loss 5.8430, time 122.78ms
iter 372280: loss 5.6557, time 121.60ms
iter 372290: loss 6.0565, time 122.67ms
iter 372300: loss 6.6803, time 121.63ms
iter 372310: loss 6.4526, time 122.79ms
iter 372320: loss 5.9768, time 120.79ms
iter 372330: loss 6.0664, time 123.31ms
iter 372340: loss 6.3468, time 121.59ms
iter 372350: loss 6.0654, time 123.04ms
iter 372360: loss 6.0338, time 121.36ms
iter 372370: loss 6.9036, time 122.52ms
iter 372380: loss 6.3501, time 121.55ms
iter 372390: loss 5.9649, time 123.12ms
iter 372400: loss 5.7788, time 121.85ms
iter 372410: loss 5.3621, time 122.68ms
iter 372420: loss 5.8293, time 121.56ms
iter 372430: loss 6.4052, time 122.74ms
iter 372440: loss 6.1121, time 121.55ms
iter 372450: loss 5.4316, time 122.88ms
iter 372460: loss 5.8453, time 121.77ms
iter 372470: loss 5.9313, time 122.76ms
iter 372480: loss 5.6951, time 121.52ms
iter 372490: loss 6.0304, time 122.66ms
step 372500: train loss 5.5824, val loss 5.6210
saving checkpoint to out-shakespeare-char
iter 372500: loss 5.9511, time 2890.81ms
iter 372510: loss 5.8240, time 125.29ms
iter 372520: loss 6.0486, time 125.22ms
iter 372530: loss 5.9100, time 125.58ms
iter 372540: loss 6.0749, time 125.90ms
iter 372550: loss 6.3595, time 125.06ms
iter 372560: loss 6.2237, time 125.58ms
iter 372570: loss 6.5523, time 126.74ms
iter 372580: loss 5.6666, time 125.87ms
iter 372590: loss 6.1660, time 126.41ms
iter 372600: loss 6.1912, time 124.74ms
iter 372610: loss 6.1986, time 125.09ms
iter 372620: loss 5.4929, time 125.07ms
iter 372630: loss 5.8259, time 125.40ms
iter 372640: loss 6.1733, time 125.88ms
iter 372650: loss 5.8020, time 125.39ms
iter 372660: loss 5.9333, time 125.46ms
iter 372670: loss 5.7972, time 124.86ms
iter 372680: loss 5.3428, time 125.66ms
iter 372690: loss 6.2558, time 125.64ms
iter 372700: loss 5.9575, time 126.00ms
iter 372710: loss 5.8630, time 127.06ms
iter 372720: loss 5.7504, time 125.65ms
iter 372730: loss 6.0676, time 125.72ms
iter 372740: loss 5.8519, time 125.78ms
step 372750: train loss 5.6282, val loss 5.5713
saving checkpoint to out-shakespeare-char
iter 372750: loss 5.5244, time 2909.98ms
iter 372760: loss 5.9685, time 125.25ms
iter 372770: loss 6.2331, time 125.10ms
iter 372780: loss 6.3257, time 125.17ms
iter 372790: loss 6.0666, time 125.13ms
iter 372800: loss 5.8900, time 125.94ms
iter 372810: loss 5.6144, time 125.69ms
iter 372820: loss 5.8032, time 125.34ms
iter 372830: loss 5.9235, time 124.62ms
iter 372840: loss 5.3778, time 125.58ms
iter 372850: loss 5.6930, time 124.87ms
iter 372860: loss 5.8760, time 127.99ms
iter 372870: loss 6.0995, time 125.59ms
iter 372880: loss 6.9264, time 126.18ms
iter 372890: loss 5.4923, time 125.95ms
iter 372900: loss 5.7223, time 128.15ms
iter 372910: loss 6.2769, time 125.28ms
iter 372920: loss 5.7303, time 125.62ms
iter 372930: loss 5.7119, time 125.84ms
iter 372940: loss 6.2101, time 126.59ms
iter 372950: loss 5.9653, time 125.66ms
iter 372960: loss 6.9210, time 125.54ms
iter 372970: loss 6.0275, time 125.84ms
iter 372980: loss 6.4186, time 125.57ms
iter 372990: loss 6.5907, time 125.50ms
step 373000: train loss 5.5940, val loss 5.6491
saving checkpoint to out-shakespeare-char
iter 373000: loss 6.6612, time 2886.04ms
iter 373010: loss 5.6042, time 125.56ms
iter 373020: loss 6.4358, time 125.39ms
iter 373030: loss 5.5971, time 125.36ms
iter 373040: loss 6.2902, time 124.24ms
iter 373050: loss 5.1863, time 127.48ms
iter 373060: loss 6.1338, time 125.06ms
iter 373070: loss 5.9201, time 125.12ms
iter 373080: loss 6.0417, time 125.13ms
iter 373090: loss 5.9480, time 125.34ms
iter 373100: loss 6.3859, time 125.14ms
iter 373110: loss 5.8919, time 125.28ms
iter 373120: loss 5.2671, time 125.35ms
iter 373130: loss 6.2005, time 126.07ms
iter 373140: loss 6.1569, time 125.18ms
iter 373150: loss 6.2624, time 125.27ms
iter 373160: loss 5.7792, time 127.80ms
iter 373170: loss 5.6011, time 125.20ms
iter 373180: loss 6.2197, time 125.19ms
iter 373190: loss 5.5710, time 125.37ms
iter 373200: loss 5.7263, time 125.58ms
iter 373210: loss 6.1053, time 124.01ms
iter 373220: loss 5.8697, time 125.10ms
iter 373230: loss 5.7530, time 125.58ms
iter 373240: loss 6.4280, time 125.33ms
step 373250: train loss 5.6080, val loss 5.5828
saving checkpoint to out-shakespeare-char
iter 373250: loss 5.5363, time 2880.24ms
iter 373260: loss 6.1695, time 120.56ms
iter 373270: loss 6.9783, time 121.57ms
iter 373280: loss 5.5990, time 121.71ms
iter 373290: loss 7.0065, time 121.61ms
iter 373300: loss 6.3789, time 120.95ms
iter 373310: loss 5.8862, time 121.56ms
iter 373320: loss 6.4459, time 121.03ms
iter 373330: loss 6.3084, time 121.55ms
iter 373340: loss 5.5983, time 121.66ms
iter 373350: loss 5.5879, time 121.44ms
iter 373360: loss 6.3867, time 121.54ms
iter 373370: loss 5.5358, time 121.56ms
iter 373380: loss 5.8355, time 121.35ms
iter 373390: loss 6.1788, time 121.74ms
iter 373400: loss 6.1774, time 122.47ms
iter 373410: loss 5.9907, time 121.46ms
iter 373420: loss 6.0054, time 121.62ms
iter 373430: loss 5.9109, time 121.51ms
iter 373440: loss 6.0191, time 121.24ms
iter 373450: loss 6.1805, time 121.32ms
iter 373460: loss 6.0569, time 121.62ms
iter 373470: loss 5.6884, time 121.61ms
iter 373480: loss 5.8140, time 121.62ms
iter 373490: loss 6.2365, time 120.19ms
step 373500: train loss 5.6018, val loss 5.5596
saving checkpoint to out-shakespeare-char
iter 373500: loss 5.8397, time 2916.15ms
iter 373510: loss 6.4499, time 122.91ms
iter 373520: loss 5.8712, time 121.37ms
iter 373530: loss 5.7191, time 122.26ms
iter 373540: loss 5.6388, time 121.21ms
iter 373550: loss 5.4979, time 122.48ms
iter 373560: loss 6.5679, time 121.33ms
iter 373570: loss 5.3687, time 122.51ms
iter 373580: loss 5.8375, time 121.29ms
iter 373590: loss 5.8801, time 122.50ms
iter 373600: loss 6.4709, time 121.58ms
iter 373610: loss 6.1815, time 122.28ms
iter 373620: loss 6.3881, time 121.27ms
iter 373630: loss 6.2788, time 122.51ms
iter 373640: loss 5.9448, time 121.29ms
iter 373650: loss 6.5905, time 122.33ms
iter 373660: loss 6.3834, time 121.27ms
iter 373670: loss 6.2667, time 123.74ms
iter 373680: loss 6.0610, time 121.46ms
iter 373690: loss 4.8067, time 123.05ms
iter 373700: loss 5.2610, time 121.22ms
iter 373710: loss 6.2647, time 121.19ms
iter 373720: loss 6.0219, time 121.65ms
iter 373730: loss 5.5278, time 119.01ms
iter 373740: loss 6.6448, time 121.49ms
step 373750: train loss 5.6185, val loss 5.6037
saving checkpoint to out-shakespeare-char
iter 373750: loss 6.1623, time 2895.04ms
iter 373760: loss 6.1242, time 121.76ms
iter 373770: loss 5.7159, time 122.89ms
iter 373780: loss 5.8432, time 122.95ms
iter 373790: loss 6.0138, time 122.95ms
iter 373800: loss 6.3327, time 122.20ms
iter 373810: loss 6.5695, time 123.03ms
iter 373820: loss 5.2724, time 121.82ms
iter 373830: loss 6.3751, time 120.97ms
iter 373840: loss 5.4472, time 121.92ms
iter 373850: loss 5.5592, time 122.03ms
iter 373860: loss 6.4353, time 121.72ms
iter 373870: loss 6.0490, time 123.30ms
iter 373880: loss 6.5500, time 122.19ms
iter 373890: loss 5.8795, time 122.92ms
iter 373900: loss 6.0947, time 121.88ms
iter 373910: loss 5.4056, time 123.56ms
iter 373920: loss 5.5748, time 120.53ms
iter 373930: loss 6.1576, time 123.65ms
iter 373940: loss 6.4107, time 121.86ms
iter 373950: loss 5.7966, time 122.74ms
iter 373960: loss 5.3312, time 122.22ms
iter 373970: loss 5.5658, time 122.97ms
iter 373980: loss 6.3776, time 125.62ms
iter 373990: loss 6.1282, time 125.56ms
step 374000: train loss 5.6399, val loss 5.5940
saving checkpoint to out-shakespeare-char
iter 374000: loss 6.1948, time 2912.11ms
iter 374010: loss 5.2590, time 125.24ms
iter 374020: loss 5.1664, time 127.57ms
iter 374030: loss 5.9137, time 125.27ms
iter 374040: loss 6.2299, time 125.29ms
iter 374050: loss 5.5508, time 125.46ms
iter 374060: loss 5.4918, time 125.34ms
iter 374070: loss 6.2121, time 125.05ms
iter 374080: loss 5.5163, time 125.29ms
iter 374090: loss 6.4752, time 125.14ms
iter 374100: loss 5.6866, time 125.23ms
iter 374110: loss 5.3623, time 125.31ms
iter 374120: loss 6.2403, time 124.31ms
iter 374130: loss 6.3697, time 124.07ms
iter 374140: loss 6.4885, time 124.71ms
iter 374150: loss 6.5360, time 125.35ms
iter 374160: loss 5.5679, time 125.83ms
iter 374170: loss 6.3674, time 124.33ms
iter 374180: loss 6.1783, time 125.36ms
iter 374190: loss 5.4660, time 125.35ms
iter 374200: loss 5.5804, time 127.67ms
iter 374210: loss 6.6444, time 123.14ms
iter 374220: loss 5.9184, time 125.15ms
iter 374230: loss 6.1672, time 125.50ms
iter 374240: loss 5.5240, time 125.78ms
step 374250: train loss 5.6123, val loss 5.5995
saving checkpoint to out-shakespeare-char
iter 374250: loss 5.4592, time 2901.51ms
iter 374260: loss 5.7971, time 120.71ms
iter 374270: loss 5.2560, time 121.69ms
iter 374280: loss 5.8333, time 121.32ms
iter 374290: loss 5.5288, time 121.49ms
iter 374300: loss 6.2991, time 121.90ms
iter 374310: loss 5.6914, time 121.46ms
iter 374320: loss 5.9635, time 121.36ms
iter 374330: loss 6.1984, time 121.32ms
iter 374340: loss 6.1009, time 121.50ms
iter 374350: loss 6.1359, time 122.22ms
iter 374360: loss 4.9879, time 121.48ms
iter 374370: loss 6.1350, time 121.39ms
iter 374380: loss 5.9005, time 121.52ms
iter 374390: loss 5.9086, time 121.61ms
iter 374400: loss 5.6408, time 121.71ms
iter 374410: loss 5.8982, time 121.42ms
iter 374420: loss 5.6129, time 121.40ms
iter 374430: loss 5.2288, time 120.68ms
iter 374440: loss 6.3087, time 121.65ms
iter 374450: loss 5.8699, time 121.54ms
iter 374460: loss 5.7097, time 121.58ms
iter 374470: loss 5.4671, time 121.57ms
iter 374480: loss 5.6632, time 121.48ms
iter 374490: loss 5.8185, time 121.73ms
step 374500: train loss 5.6180, val loss 5.5974
saving checkpoint to out-shakespeare-char
iter 374500: loss 6.0473, time 2896.18ms
iter 374510: loss 6.1812, time 122.70ms
iter 374520: loss 6.1480, time 121.63ms
iter 374530: loss 5.9352, time 121.76ms
iter 374540: loss 6.2509, time 121.58ms
iter 374550: loss 5.7662, time 122.59ms
iter 374560: loss 5.5882, time 121.50ms
iter 374570: loss 6.5629, time 122.56ms
iter 374580: loss 6.3478, time 121.46ms
iter 374590: loss 6.3822, time 122.68ms
iter 374600: loss 6.2864, time 121.69ms
iter 374610: loss 6.3687, time 122.69ms
iter 374620: loss 5.8556, time 121.69ms
iter 374630: loss 6.1733, time 122.70ms
iter 374640: loss 5.3807, time 121.57ms
iter 374650: loss 5.8250, time 123.04ms
iter 374660: loss 5.5470, time 121.50ms
iter 374670: loss 5.8704, time 122.79ms
iter 374680: loss 5.7515, time 121.57ms
iter 374690: loss 5.7083, time 122.90ms
iter 374700: loss 6.2423, time 121.60ms
iter 374710: loss 6.4875, time 123.15ms
iter 374720: loss 6.2980, time 121.56ms
iter 374730: loss 6.1111, time 122.85ms
iter 374740: loss 6.1273, time 121.66ms
step 374750: train loss 5.5486, val loss 5.5342
saving checkpoint to out-shakespeare-char
iter 374750: loss 5.6438, time 2885.34ms
iter 374760: loss 6.5900, time 121.55ms
iter 374770: loss 5.3093, time 120.57ms
iter 374780: loss 6.0410, time 121.17ms
iter 374790: loss 6.8373, time 121.53ms
iter 374800: loss 5.7803, time 121.51ms
iter 374810: loss 6.2757, time 121.61ms
iter 374820: loss 6.3673, time 121.37ms
iter 374830: loss 5.8164, time 121.50ms
iter 374840: loss 5.8545, time 121.63ms
iter 374850: loss 5.1414, time 121.53ms
iter 374860: loss 5.4280, time 121.53ms
iter 374870: loss 5.6187, time 121.52ms
iter 374880: loss 6.3017, time 120.66ms
iter 374890: loss 6.0294, time 121.62ms
iter 374900: loss 5.4811, time 121.49ms
iter 374910: loss 5.8757, time 121.20ms
iter 374920: loss 6.1115, time 121.41ms
iter 374930: loss 5.8763, time 121.29ms
iter 374940: loss 6.1936, time 121.44ms
iter 374950: loss 6.0363, time 121.26ms
iter 374960: loss 6.0760, time 121.49ms
iter 374970: loss 5.9766, time 121.29ms
iter 374980: loss 5.8346, time 120.90ms
iter 374990: loss 5.7173, time 121.41ms
step 375000: train loss 5.6237, val loss 5.5337
saving checkpoint to out-shakespeare-char
iter 375000: loss 5.8287, time 2875.11ms
iter 375010: loss 5.7072, time 121.42ms
iter 375020: loss 5.9191, time 121.74ms
iter 375030: loss 6.8812, time 121.35ms
iter 375040: loss 6.2165, time 122.25ms
iter 375050: loss 5.8365, time 121.49ms
iter 375060: loss 5.8575, time 122.49ms
iter 375070: loss 5.2637, time 121.50ms
iter 375080: loss 6.6271, time 122.62ms
iter 375090: loss 5.6815, time 121.41ms
iter 375100: loss 5.7480, time 122.60ms
iter 375110: loss 6.1526, time 121.50ms
iter 375120: loss 6.1004, time 122.58ms
iter 375130: loss 5.3804, time 121.53ms
iter 375140: loss 6.0381, time 122.61ms
iter 375150: loss 6.0506, time 121.39ms
iter 375160: loss 6.3142, time 122.64ms
iter 375170: loss 6.3664, time 121.56ms
iter 375180: loss 5.3649, time 122.55ms
iter 375190: loss 6.2163, time 121.55ms
iter 375200: loss 5.6965, time 122.26ms
iter 375210: loss 6.2646, time 121.42ms
iter 375220: loss 5.6005, time 122.65ms
iter 375230: loss 6.3596, time 122.01ms
iter 375240: loss 5.7750, time 123.05ms
step 375250: train loss 5.6054, val loss 5.6131
saving checkpoint to out-shakespeare-char
iter 375250: loss 5.6057, time 2877.35ms
iter 375260: loss 5.8086, time 125.89ms
iter 375270: loss 5.5954, time 126.05ms
iter 375280: loss 5.7499, time 127.97ms
iter 375290: loss 6.6262, time 125.96ms
iter 375300: loss 6.1073, time 126.06ms
iter 375310: loss 5.7678, time 125.92ms
iter 375320: loss 6.3016, time 125.82ms
iter 375330: loss 5.6131, time 125.89ms
iter 375340: loss 5.1551, time 125.43ms
iter 375350: loss 6.2244, time 125.64ms
iter 375360: loss 5.3746, time 125.81ms
iter 375370: loss 5.9487, time 125.67ms
iter 375380: loss 6.4059, time 125.86ms
iter 375390: loss 6.5078, time 128.13ms
iter 375400: loss 6.8020, time 126.23ms
iter 375410: loss 6.6727, time 124.78ms
iter 375420: loss 6.5891, time 125.61ms
iter 375430: loss 5.5101, time 125.74ms
iter 375440: loss 5.9123, time 125.63ms
iter 375450: loss 6.4129, time 125.71ms
iter 375460: loss 5.1230, time 125.56ms
iter 375470: loss 5.7965, time 125.91ms
iter 375480: loss 5.5479, time 125.74ms
iter 375490: loss 6.6555, time 125.26ms
step 375500: train loss 5.5744, val loss 5.5626
saving checkpoint to out-shakespeare-char
iter 375500: loss 5.9540, time 2880.93ms
iter 375510: loss 5.3205, time 125.97ms
iter 375520: loss 5.7065, time 124.84ms
iter 375530: loss 6.0659, time 125.74ms
iter 375540: loss 5.4755, time 124.68ms
iter 375550: loss 6.2811, time 128.00ms
iter 375560: loss 5.2527, time 125.38ms
iter 375570: loss 6.5528, time 125.24ms
iter 375580: loss 5.1260, time 125.72ms
iter 375590: loss 5.7582, time 125.54ms
iter 375600: loss 6.0214, time 125.78ms
iter 375610: loss 6.0678, time 125.16ms
iter 375620: loss 5.9790, time 125.33ms
iter 375630: loss 6.4928, time 124.97ms
iter 375640: loss 5.6166, time 124.89ms
iter 375650: loss 5.9086, time 125.21ms
iter 375660: loss 6.2203, time 127.18ms
iter 375670: loss 5.8218, time 125.09ms
iter 375680: loss 6.2716, time 125.11ms
iter 375690: loss 5.2717, time 125.69ms
iter 375700: loss 5.9413, time 125.57ms
iter 375710: loss 5.9386, time 124.75ms
iter 375720: loss 5.5289, time 124.94ms
iter 375730: loss 5.8796, time 125.20ms
iter 375740: loss 5.8622, time 124.15ms
step 375750: train loss 5.5861, val loss 5.5596
saving checkpoint to out-shakespeare-char
iter 375750: loss 6.0250, time 2880.51ms
iter 375760: loss 6.0253, time 125.35ms
iter 375770: loss 6.4967, time 125.26ms
iter 375780: loss 6.1014, time 124.17ms
iter 375790: loss 6.3275, time 125.00ms
iter 375800: loss 5.6803, time 124.93ms
iter 375810: loss 6.1016, time 127.55ms
iter 375820: loss 6.9144, time 124.98ms
iter 375830: loss 5.8247, time 124.98ms
iter 375840: loss 6.0883, time 125.44ms
iter 375850: loss 6.4080, time 127.18ms
iter 375860: loss 6.0175, time 125.63ms
iter 375870: loss 6.2772, time 125.04ms
iter 375880: loss 5.8804, time 123.74ms
iter 375890: loss 5.8199, time 125.02ms
iter 375900: loss 5.4405, time 125.57ms
iter 375910: loss 6.5156, time 125.31ms
iter 375920: loss 6.0356, time 125.44ms
iter 375930: loss 6.4844, time 125.70ms
iter 375940: loss 6.5108, time 125.89ms
iter 375950: loss 6.4850, time 125.40ms
iter 375960: loss 5.3918, time 125.51ms
iter 375970: loss 5.9346, time 125.79ms
iter 375980: loss 6.4265, time 127.98ms
iter 375990: loss 6.3287, time 124.86ms
step 376000: train loss 5.5961, val loss 5.5377
saving checkpoint to out-shakespeare-char
iter 376000: loss 6.3911, time 2900.00ms
iter 376010: loss 5.4188, time 125.37ms
iter 376020: loss 6.4395, time 125.34ms
iter 376030: loss 5.9970, time 124.62ms
iter 376040: loss 6.8716, time 124.80ms
iter 376050: loss 6.1611, time 125.68ms
iter 376060: loss 6.1335, time 125.55ms
iter 376070: loss 6.2026, time 125.45ms
iter 376080: loss 5.4842, time 125.52ms
iter 376090: loss 5.9110, time 125.88ms
iter 376100: loss 5.4785, time 128.01ms
iter 376110: loss 6.3534, time 124.91ms
iter 376120: loss 5.5676, time 125.95ms
iter 376130: loss 5.7604, time 126.50ms
iter 376140: loss 5.8968, time 125.89ms
iter 376150: loss 5.7897, time 124.33ms
iter 376160: loss 5.9514, time 126.04ms
iter 376170: loss 6.2857, time 126.02ms
iter 376180: loss 5.6892, time 125.61ms
iter 376190: loss 6.6774, time 125.74ms
iter 376200: loss 6.1311, time 126.13ms
iter 376210: loss 6.0196, time 126.03ms
iter 376220: loss 5.7077, time 126.15ms
iter 376230: loss 5.8723, time 125.87ms
iter 376240: loss 5.7356, time 125.68ms
step 376250: train loss 5.5905, val loss 5.6023
saving checkpoint to out-shakespeare-char
iter 376250: loss 5.6026, time 2903.60ms
iter 376260: loss 6.0748, time 125.44ms
iter 376270: loss 5.5919, time 125.20ms
iter 376280: loss 6.1072, time 124.26ms
iter 376290: loss 6.1912, time 125.32ms
iter 376300: loss 6.3743, time 124.49ms
iter 376310: loss 6.0595, time 125.55ms
iter 376320: loss 5.3784, time 124.45ms
iter 376330: loss 6.3830, time 125.81ms
iter 376340: loss 5.9311, time 124.58ms
iter 376350: loss 6.2306, time 125.05ms
iter 376360: loss 6.0043, time 127.46ms
iter 376370: loss 5.9500, time 124.68ms
iter 376380: loss 5.9882, time 125.90ms
iter 376390: loss 6.5109, time 126.16ms
iter 376400: loss 6.3938, time 125.36ms
iter 376410: loss 6.3641, time 125.91ms
iter 376420: loss 5.3476, time 126.16ms
iter 376430: loss 5.5297, time 126.13ms
iter 376440: loss 5.6441, time 124.94ms
iter 376450: loss 6.0263, time 126.24ms
iter 376460: loss 6.1362, time 125.73ms
iter 376470: loss 6.0046, time 124.65ms
iter 376480: loss 5.6195, time 126.45ms
iter 376490: loss 5.7325, time 125.92ms
step 376500: train loss 5.5811, val loss 5.5539
saving checkpoint to out-shakespeare-char
iter 376500: loss 5.2606, time 2881.40ms
iter 376510: loss 6.2122, time 124.44ms
iter 376520: loss 6.1939, time 127.71ms
iter 376530: loss 6.5196, time 125.71ms
iter 376540: loss 6.2784, time 125.37ms
iter 376550: loss 6.5709, time 124.45ms
iter 376560: loss 5.6724, time 125.48ms
iter 376570: loss 5.4915, time 125.30ms
iter 376580: loss 5.2907, time 125.62ms
iter 376590: loss 6.0782, time 124.81ms
iter 376600: loss 5.7688, time 125.47ms
iter 376610: loss 5.4047, time 125.62ms
iter 376620: loss 6.1028, time 125.35ms
iter 376630: loss 5.7876, time 126.89ms
iter 376640: loss 5.9181, time 125.71ms
iter 376650: loss 6.1914, time 125.51ms
iter 376660: loss 6.1337, time 125.29ms
iter 376670: loss 5.6469, time 125.84ms
iter 376680: loss 5.5496, time 125.27ms
iter 376690: loss 5.9001, time 125.90ms
iter 376700: loss 6.2694, time 125.59ms
iter 376710: loss 6.1084, time 125.62ms
iter 376720: loss 5.9896, time 124.94ms
iter 376730: loss 6.2987, time 125.34ms
iter 376740: loss 6.2456, time 125.93ms
step 376750: train loss 5.5833, val loss 5.5768
saving checkpoint to out-shakespeare-char
iter 376750: loss 5.7535, time 2896.94ms
iter 376760: loss 6.3872, time 125.59ms
iter 376770: loss 6.2043, time 124.63ms
iter 376780: loss 5.9264, time 125.97ms
iter 376790: loss 5.8626, time 125.52ms
iter 376800: loss 6.2142, time 125.52ms
iter 376810: loss 5.3595, time 124.84ms
iter 376820: loss 5.5916, time 125.58ms
iter 376830: loss 6.1894, time 124.99ms
iter 376840: loss 5.6616, time 125.27ms
iter 376850: loss 6.3022, time 125.65ms
iter 376860: loss 6.3828, time 127.49ms
iter 376870: loss 5.7052, time 125.87ms
iter 376880: loss 5.8576, time 125.61ms
iter 376890: loss 6.4850, time 125.25ms
iter 376900: loss 6.5315, time 125.20ms
iter 376910: loss 6.5084, time 125.47ms
iter 376920: loss 6.1978, time 125.88ms
iter 376930: loss 6.1197, time 125.65ms
iter 376940: loss 6.4433, time 126.09ms
iter 376950: loss 5.8482, time 125.63ms
iter 376960: loss 6.1274, time 125.42ms
iter 376970: loss 6.1586, time 128.13ms
iter 376980: loss 5.7876, time 125.47ms
iter 376990: loss 6.0431, time 125.07ms
step 377000: train loss 5.5738, val loss 5.6262
saving checkpoint to out-shakespeare-char
iter 377000: loss 5.6871, time 2861.03ms
iter 377010: loss 6.5597, time 124.34ms
iter 377020: loss 6.3286, time 125.30ms
iter 377030: loss 5.6044, time 125.41ms
iter 377040: loss 7.1710, time 125.33ms
iter 377050: loss 6.0499, time 124.40ms
iter 377060: loss 5.7503, time 125.48ms
iter 377070: loss 6.1435, time 125.64ms
iter 377080: loss 5.7117, time 123.92ms
iter 377090: loss 5.9755, time 125.82ms
iter 377100: loss 5.8541, time 125.42ms
iter 377110: loss 7.1047, time 125.71ms
iter 377120: loss 5.7688, time 124.91ms
iter 377130: loss 5.7184, time 125.47ms
iter 377140: loss 5.8765, time 125.52ms
iter 377150: loss 6.6032, time 124.67ms
iter 377160: loss 4.9782, time 126.39ms
iter 377170: loss 6.1945, time 125.61ms
iter 377180: loss 5.6411, time 125.53ms
iter 377190: loss 6.1181, time 126.02ms
iter 377200: loss 5.2595, time 125.70ms
iter 377210: loss 5.5275, time 125.67ms
iter 377220: loss 5.2335, time 125.85ms
iter 377230: loss 5.8565, time 127.93ms
iter 377240: loss 6.1239, time 125.47ms
step 377250: train loss 5.5502, val loss 5.5952
saving checkpoint to out-shakespeare-char
iter 377250: loss 5.5313, time 2888.72ms
iter 377260: loss 5.9423, time 126.20ms
iter 377270: loss 5.4480, time 126.17ms
iter 377280: loss 5.7945, time 126.09ms
iter 377290: loss 5.6089, time 125.97ms
iter 377300: loss 5.3380, time 125.94ms
iter 377310: loss 5.6282, time 125.70ms
iter 377320: loss 6.2624, time 125.77ms
iter 377330: loss 5.7999, time 125.48ms
iter 377340: loss 6.1369, time 125.94ms
iter 377350: loss 6.1793, time 127.19ms
iter 377360: loss 5.9181, time 126.05ms
iter 377370: loss 5.0758, time 125.75ms
iter 377380: loss 5.3130, time 125.25ms
iter 377390: loss 5.8729, time 125.98ms
iter 377400: loss 6.4637, time 125.90ms
iter 377410: loss 6.0766, time 125.78ms
iter 377420: loss 6.6716, time 125.23ms
iter 377430: loss 5.6032, time 126.28ms
iter 377440: loss 6.4267, time 125.83ms
iter 377450: loss 6.3753, time 124.98ms
iter 377460: loss 5.8052, time 125.07ms
iter 377470: loss 6.1277, time 124.91ms
iter 377480: loss 6.0002, time 125.17ms
iter 377490: loss 6.2774, time 125.28ms
step 377500: train loss 5.6172, val loss 5.5839
saving checkpoint to out-shakespeare-char
iter 377500: loss 7.2060, time 2886.59ms
iter 377510: loss 5.7288, time 125.57ms
iter 377520: loss 5.8635, time 128.55ms
iter 377530: loss 6.2219, time 127.47ms
iter 377540: loss 5.6825, time 125.99ms
iter 377550: loss 6.7023, time 126.40ms
iter 377560: loss 6.3393, time 128.45ms
iter 377570: loss 6.1466, time 125.64ms
iter 377580: loss 6.2018, time 125.66ms
iter 377590: loss 5.7818, time 125.71ms
iter 377600: loss 6.1268, time 127.89ms
iter 377610: loss 5.6423, time 125.49ms
iter 377620: loss 5.6916, time 124.68ms
iter 377630: loss 5.7714, time 125.79ms
iter 377640: loss 6.1531, time 125.84ms
iter 377650: loss 6.1156, time 125.00ms
iter 377660: loss 5.5079, time 124.98ms
iter 377670: loss 6.9325, time 127.60ms
iter 377680: loss 5.6615, time 124.82ms
iter 377690: loss 6.1088, time 125.27ms
iter 377700: loss 5.9415, time 125.50ms
iter 377710: loss 5.8704, time 125.16ms
iter 377720: loss 6.3484, time 125.19ms
iter 377730: loss 5.7137, time 124.98ms
iter 377740: loss 5.5259, time 125.32ms
step 377750: train loss 5.5986, val loss 5.5579
saving checkpoint to out-shakespeare-char
iter 377750: loss 6.1934, time 2886.37ms
iter 377760: loss 5.7338, time 125.10ms
iter 377770: loss 5.7994, time 125.01ms
iter 377780: loss 5.3333, time 125.02ms
iter 377790: loss 6.9828, time 125.21ms
iter 377800: loss 5.7378, time 123.84ms
iter 377810: loss 6.0116, time 125.05ms
iter 377820: loss 5.9331, time 125.01ms
iter 377830: loss 6.0660, time 125.16ms
iter 377840: loss 6.0012, time 124.92ms
iter 377850: loss 6.2427, time 125.18ms
iter 377860: loss 6.3877, time 124.56ms
iter 377870: loss 6.1861, time 125.62ms
iter 377880: loss 5.7162, time 124.31ms
iter 377890: loss 5.5840, time 124.96ms
iter 377900: loss 5.5855, time 127.49ms
iter 377910: loss 6.2242, time 124.81ms
iter 377920: loss 5.4571, time 124.12ms
iter 377930: loss 5.4164, time 125.18ms
iter 377940: loss 5.7081, time 124.38ms
iter 377950: loss 5.6086, time 125.18ms
iter 377960: loss 5.8549, time 125.48ms
iter 377970: loss 6.5135, time 127.73ms
iter 377980: loss 5.7755, time 125.12ms
iter 377990: loss 6.2247, time 125.05ms
step 378000: train loss 5.5825, val loss 5.6300
saving checkpoint to out-shakespeare-char
iter 378000: loss 5.9028, time 2912.02ms
iter 378010: loss 5.8717, time 125.38ms
iter 378020: loss 6.0739, time 127.65ms
iter 378030: loss 5.8176, time 125.02ms
iter 378040: loss 6.2380, time 125.82ms
iter 378050: loss 5.2495, time 125.28ms
iter 378060: loss 6.3642, time 125.34ms
iter 378070: loss 6.4143, time 125.35ms
iter 378080: loss 5.4096, time 125.00ms
iter 378090: loss 5.7549, time 124.43ms
iter 378100: loss 6.3526, time 125.57ms
iter 378110: loss 6.3001, time 125.31ms
iter 378120: loss 4.9901, time 125.13ms
iter 378130: loss 6.2452, time 127.41ms
iter 378140: loss 6.1122, time 124.83ms
iter 378150: loss 6.6935, time 125.55ms
iter 378160: loss 5.7267, time 125.52ms
iter 378170: loss 6.0223, time 125.08ms
iter 378180: loss 5.9702, time 125.56ms
iter 378190: loss 5.9618, time 125.37ms
iter 378200: loss 6.0094, time 125.38ms
iter 378210: loss 6.1453, time 125.15ms
iter 378220: loss 5.9683, time 124.17ms
iter 378230: loss 6.2279, time 125.31ms
iter 378240: loss 5.9043, time 126.33ms
step 378250: train loss 5.5683, val loss 5.5948
saving checkpoint to out-shakespeare-char
iter 378250: loss 5.8847, time 2889.93ms
iter 378260: loss 6.3189, time 124.90ms
iter 378270: loss 5.9657, time 125.11ms
iter 378280: loss 5.3172, time 125.69ms
iter 378290: loss 5.5282, time 126.40ms
iter 378300: loss 6.4048, time 126.02ms
iter 378310: loss 5.7151, time 125.88ms
iter 378320: loss 5.8507, time 124.01ms
iter 378330: loss 5.9598, time 124.99ms
iter 378340: loss 5.5806, time 126.26ms
iter 378350: loss 6.4403, time 125.33ms
iter 378360: loss 5.9530, time 126.77ms
iter 378370: loss 5.4276, time 124.68ms
iter 378380: loss 6.2806, time 124.94ms
iter 378390: loss 6.3002, time 125.14ms
iter 378400: loss 5.3880, time 125.17ms
iter 378410: loss 6.3388, time 125.37ms
iter 378420: loss 5.5075, time 124.95ms
iter 378430: loss 5.8966, time 125.17ms
iter 378440: loss 6.4392, time 125.20ms
iter 378450: loss 6.1814, time 125.54ms
iter 378460: loss 6.4729, time 125.08ms
iter 378470: loss 5.5549, time 127.95ms
iter 378480: loss 6.5902, time 126.30ms
iter 378490: loss 5.7058, time 126.27ms
step 378500: train loss 5.6120, val loss 5.5572
saving checkpoint to out-shakespeare-char
iter 378500: loss 6.0843, time 2890.24ms
iter 378510: loss 6.3862, time 122.11ms
iter 378520: loss 6.0811, time 121.32ms
iter 378530: loss 6.3838, time 123.18ms
iter 378540: loss 6.6563, time 119.99ms
iter 378550: loss 6.1963, time 120.18ms
iter 378560: loss 6.0068, time 125.71ms
iter 378570: loss 6.1607, time 126.59ms
iter 378580: loss 5.5076, time 126.16ms
iter 378590: loss 6.1727, time 125.76ms
iter 378600: loss 5.3432, time 125.88ms
iter 378610: loss 5.8800, time 125.63ms
iter 378620: loss 6.3185, time 128.66ms
iter 378630: loss 6.0200, time 125.28ms
iter 378640: loss 5.5645, time 125.59ms
iter 378650: loss 6.5871, time 126.03ms
iter 378660: loss 6.0432, time 128.28ms
iter 378670: loss 6.6083, time 125.82ms
iter 378680: loss 6.1142, time 125.85ms
iter 378690: loss 5.5930, time 125.63ms
iter 378700: loss 6.5892, time 125.79ms
iter 378710: loss 6.1222, time 126.17ms
iter 378720: loss 5.8899, time 125.89ms
iter 378730: loss 5.6964, time 126.34ms
iter 378740: loss 6.6673, time 125.62ms
step 378750: train loss 5.6076, val loss 5.6286
saving checkpoint to out-shakespeare-char
iter 378750: loss 5.7186, time 2878.93ms
iter 378760: loss 6.3824, time 125.96ms
iter 378770: loss 6.4957, time 125.86ms
iter 378780: loss 5.8224, time 124.57ms
iter 378790: loss 5.8987, time 125.81ms
iter 378800: loss 6.8537, time 125.64ms
iter 378810: loss 6.3530, time 127.76ms
iter 378820: loss 6.3179, time 125.59ms
iter 378830: loss 5.5235, time 125.27ms
iter 378840: loss 5.0507, time 125.66ms
iter 378850: loss 5.9796, time 128.19ms
iter 378860: loss 6.0786, time 124.94ms
iter 378870: loss 6.3567, time 125.47ms
iter 378880: loss 5.6592, time 125.44ms
iter 378890: loss 5.5266, time 125.82ms
iter 378900: loss 6.1676, time 125.08ms
iter 378910: loss 5.5962, time 125.43ms
iter 378920: loss 5.4074, time 125.43ms
iter 378930: loss 5.5221, time 127.84ms
iter 378940: loss 5.2441, time 125.67ms
iter 378950: loss 5.7989, time 125.28ms
iter 378960: loss 6.5403, time 125.74ms
iter 378970: loss 6.0346, time 125.90ms
iter 378980: loss 5.6344, time 125.66ms
iter 378990: loss 5.3041, time 125.56ms
step 379000: train loss 5.6160, val loss 5.5514
saving checkpoint to out-shakespeare-char
iter 379000: loss 6.2480, time 2867.97ms
iter 379010: loss 5.8157, time 121.51ms
iter 379020: loss 6.3122, time 122.04ms
iter 379030: loss 6.1336, time 121.27ms
iter 379040: loss 6.5412, time 121.42ms
iter 379050: loss 5.2478, time 120.69ms
iter 379060: loss 6.0798, time 121.55ms
iter 379070: loss 6.4767, time 121.57ms
iter 379080: loss 6.0238, time 121.58ms
iter 379090: loss 6.2180, time 122.16ms
iter 379100: loss 5.3483, time 121.50ms
iter 379110: loss 6.2458, time 119.22ms
iter 379120: loss 6.7004, time 121.45ms
iter 379130: loss 6.0315, time 121.32ms
iter 379140: loss 5.5748, time 121.49ms
iter 379150: loss 6.0384, time 121.67ms
iter 379160: loss 5.1537, time 121.64ms
iter 379170: loss 5.5909, time 121.47ms
iter 379180: loss 5.3283, time 121.57ms
iter 379190: loss 5.8051, time 121.61ms
iter 379200: loss 5.8296, time 121.59ms
iter 379210: loss 6.1936, time 121.59ms
iter 379220: loss 6.3356, time 121.30ms
iter 379230: loss 5.3495, time 121.54ms
iter 379240: loss 6.5086, time 120.89ms
step 379250: train loss 5.6075, val loss 5.5709
saving checkpoint to out-shakespeare-char
iter 379250: loss 7.0998, time 2892.67ms
iter 379260: loss 6.2435, time 122.21ms
iter 379270: loss 6.1848, time 121.62ms
iter 379280: loss 6.0704, time 121.83ms
iter 379290: loss 5.4239, time 121.63ms
iter 379300: loss 5.6373, time 120.84ms
iter 379310: loss 5.6242, time 121.85ms
iter 379320: loss 5.5342, time 121.58ms
iter 379330: loss 5.9329, time 121.58ms
iter 379340: loss 6.1122, time 121.53ms
iter 379350: loss 5.5698, time 121.26ms
iter 379360: loss 5.5912, time 121.59ms
iter 379370: loss 6.3186, time 121.60ms
iter 379380: loss 5.3388, time 121.37ms
iter 379390: loss 6.3176, time 121.67ms
iter 379400: loss 6.3786, time 121.58ms
iter 379410: loss 6.3404, time 122.09ms
iter 379420: loss 6.6012, time 121.64ms
iter 379430: loss 6.6696, time 121.66ms
iter 379440: loss 6.5198, time 121.79ms
iter 379450: loss 5.8468, time 121.70ms
iter 379460: loss 6.0852, time 121.65ms
iter 379470: loss 5.3185, time 121.59ms
iter 379480: loss 6.6718, time 120.76ms
iter 379490: loss 5.6909, time 121.50ms
step 379500: train loss 5.5794, val loss 5.5842
saving checkpoint to out-shakespeare-char
iter 379500: loss 6.3504, time 2887.82ms
iter 379510: loss 5.7588, time 120.93ms
iter 379520: loss 5.9855, time 121.89ms
iter 379530: loss 5.2819, time 121.68ms
iter 379540: loss 6.3542, time 121.34ms
iter 379550: loss 5.4345, time 121.65ms
iter 379560: loss 6.1373, time 121.56ms
iter 379570: loss 5.4074, time 121.39ms
iter 379580: loss 5.8142, time 121.42ms
iter 379590: loss 6.9607, time 120.22ms
iter 379600: loss 5.7936, time 121.36ms
iter 379610: loss 6.6088, time 121.44ms
iter 379620: loss 5.1641, time 121.44ms
iter 379630: loss 6.2670, time 121.54ms
iter 379640: loss 6.3620, time 121.51ms
iter 379650: loss 6.0169, time 121.51ms
iter 379660: loss 5.6158, time 121.64ms
iter 379670: loss 6.1351, time 121.53ms
iter 379680: loss 5.8120, time 121.02ms
iter 379690: loss 5.7456, time 121.26ms
iter 379700: loss 6.1379, time 122.07ms
iter 379710: loss 5.5914, time 121.84ms
iter 379720: loss 6.3702, time 121.80ms
iter 379730: loss 5.9709, time 121.60ms
iter 379740: loss 5.9515, time 122.20ms
step 379750: train loss 5.5997, val loss 5.5835
saving checkpoint to out-shakespeare-char
iter 379750: loss 5.6865, time 2898.68ms
iter 379760: loss 5.9607, time 121.29ms
iter 379770: loss 5.7412, time 121.60ms
iter 379780: loss 6.1195, time 121.09ms
iter 379790: loss 5.4535, time 122.30ms
iter 379800: loss 6.3639, time 121.92ms
iter 379810: loss 6.3632, time 121.66ms
iter 379820: loss 6.2253, time 121.26ms
iter 379830: loss 6.2586, time 121.09ms
iter 379840: loss 6.6321, time 121.78ms
iter 379850: loss 6.1950, time 121.62ms
iter 379860: loss 5.3915, time 121.84ms
iter 379870: loss 5.8077, time 121.64ms
iter 379880: loss 6.5250, time 121.52ms
iter 379890: loss 5.1279, time 121.01ms
iter 379900: loss 6.2532, time 121.90ms
iter 379910: loss 6.4759, time 122.22ms
iter 379920: loss 6.2718, time 121.56ms
iter 379930: loss 6.3239, time 122.33ms
iter 379940: loss 5.5801, time 122.03ms
iter 379950: loss 5.6850, time 121.38ms
iter 379960: loss 6.2097, time 121.91ms
iter 379970: loss 6.0417, time 121.47ms
iter 379980: loss 6.2222, time 122.17ms
iter 379990: loss 5.8725, time 121.69ms
step 380000: train loss 5.5815, val loss 5.5483
saving checkpoint to out-shakespeare-char
iter 380000: loss 6.1604, time 2885.58ms
iter 380010: loss 6.2151, time 125.60ms
iter 380020: loss 6.2463, time 125.10ms
iter 380030: loss 6.1678, time 126.14ms
iter 380040: loss 6.4540, time 125.71ms
iter 380050: loss 6.4901, time 125.62ms
iter 380060: loss 5.3680, time 125.26ms
iter 380070: loss 6.3351, time 125.53ms
iter 380080: loss 5.3680, time 125.69ms
iter 380090: loss 5.6386, time 125.85ms
iter 380100: loss 6.8226, time 128.04ms
iter 380110: loss 6.8829, time 125.53ms
iter 380120: loss 5.6146, time 125.49ms
iter 380130: loss 5.5449, time 125.81ms
iter 380140: loss 5.6590, time 127.75ms
iter 380150: loss 5.7288, time 124.53ms
iter 380160: loss 6.2373, time 124.15ms
iter 380170: loss 5.1740, time 125.19ms
iter 380180: loss 5.6738, time 124.54ms
iter 380190: loss 6.1395, time 125.75ms
iter 380200: loss 5.5436, time 125.64ms
iter 380210: loss 5.9322, time 124.82ms
iter 380220: loss 6.1782, time 124.70ms
iter 380230: loss 5.7717, time 125.51ms
iter 380240: loss 5.6718, time 126.77ms
step 380250: train loss 5.6105, val loss 5.5919
saving checkpoint to out-shakespeare-char
iter 380250: loss 6.2386, time 2896.19ms
iter 380260: loss 5.8444, time 121.55ms
iter 380270: loss 6.2763, time 121.34ms
iter 380280: loss 5.2135, time 121.56ms
iter 380290: loss 6.1681, time 122.20ms
iter 380300: loss 6.0708, time 122.09ms
iter 380310: loss 6.6412, time 122.02ms
iter 380320: loss 5.5449, time 121.58ms
iter 380330: loss 5.3221, time 123.14ms
iter 380340: loss 6.7348, time 121.54ms
iter 380350: loss 6.0373, time 122.62ms
iter 380360: loss 5.6320, time 121.47ms
iter 380370: loss 6.5206, time 122.66ms
iter 380380: loss 6.1305, time 121.66ms
iter 380390: loss 5.7948, time 122.54ms
iter 380400: loss 5.8792, time 121.26ms
iter 380410: loss 6.0254, time 122.48ms
iter 380420: loss 5.6017, time 121.46ms
iter 380430: loss 5.1805, time 122.85ms
iter 380440: loss 6.1569, time 121.52ms
iter 380450: loss 5.7910, time 122.43ms
iter 380460: loss 6.3132, time 121.75ms
iter 380470: loss 6.4302, time 122.69ms
iter 380480: loss 6.9839, time 121.50ms
iter 380490: loss 6.2970, time 122.62ms
step 380500: train loss 5.6033, val loss 5.5493
saving checkpoint to out-shakespeare-char
iter 380500: loss 5.8537, time 2879.42ms
iter 380510: loss 6.0186, time 121.56ms
iter 380520: loss 6.3131, time 121.49ms
iter 380530: loss 5.9057, time 119.27ms
iter 380540: loss 6.6585, time 120.01ms
iter 380550: loss 5.1569, time 122.31ms
iter 380560: loss 6.2396, time 120.45ms
iter 380570: loss 6.2878, time 121.65ms
iter 380580: loss 6.2869, time 121.64ms
iter 380590: loss 6.1480, time 121.47ms
iter 380600: loss 5.9100, time 121.68ms
iter 380610: loss 5.5390, time 121.35ms
iter 380620: loss 6.9442, time 121.45ms
iter 380630: loss 6.8986, time 121.29ms
iter 380640: loss 6.6653, time 121.67ms
iter 380650: loss 6.2868, time 121.84ms
iter 380660: loss 5.6318, time 121.44ms
iter 380670: loss 6.4348, time 122.69ms
iter 380680: loss 5.4845, time 122.70ms
iter 380690: loss 6.1655, time 124.98ms
iter 380700: loss 6.4473, time 125.09ms
iter 380710: loss 5.4025, time 125.33ms
iter 380720: loss 5.8130, time 125.01ms
iter 380730: loss 5.9285, time 124.92ms
iter 380740: loss 5.6993, time 124.95ms
step 380750: train loss 5.5728, val loss 5.5985
saving checkpoint to out-shakespeare-char
iter 380750: loss 5.5729, time 2883.72ms
iter 380760: loss 5.5352, time 126.44ms
iter 380770: loss 5.7305, time 126.92ms
iter 380780: loss 5.7963, time 125.43ms
iter 380790: loss 5.9997, time 125.20ms
iter 380800: loss 5.9896, time 125.22ms
iter 380810: loss 6.2789, time 125.27ms
iter 380820: loss 6.1114, time 128.18ms
iter 380830: loss 5.9086, time 125.05ms
iter 380840: loss 5.4443, time 125.63ms
iter 380850: loss 6.4627, time 125.33ms
iter 380860: loss 5.3459, time 126.79ms
iter 380870: loss 6.4415, time 125.79ms
iter 380880: loss 5.9076, time 125.29ms
iter 380890: loss 6.5265, time 125.30ms
iter 380900: loss 5.7804, time 125.30ms
iter 380910: loss 6.0491, time 127.32ms
iter 380920: loss 5.0126, time 125.43ms
iter 380930: loss 6.5464, time 126.22ms
iter 380940: loss 6.1480, time 128.19ms
iter 380950: loss 5.8864, time 125.70ms
iter 380960: loss 5.7774, time 125.25ms
iter 380970: loss 6.2205, time 126.82ms
iter 380980: loss 5.8018, time 125.94ms
iter 380990: loss 6.2478, time 125.81ms
step 381000: train loss 5.5725, val loss 5.5248
saving checkpoint to out-shakespeare-char
iter 381000: loss 6.0295, time 2886.73ms
iter 381010: loss 5.9249, time 125.68ms
iter 381020: loss 5.7684, time 125.63ms
iter 381030: loss 6.1424, time 126.26ms
iter 381040: loss 5.8418, time 125.83ms
iter 381050: loss 5.9534, time 125.98ms
iter 381060: loss 6.4066, time 126.52ms
iter 381070: loss 6.0241, time 125.35ms
iter 381080: loss 6.1720, time 125.86ms
iter 381090: loss 6.2517, time 128.50ms
iter 381100: loss 6.0533, time 125.77ms
iter 381110: loss 5.8594, time 125.60ms
iter 381120: loss 6.0834, time 125.46ms
iter 381130: loss 5.3763, time 124.65ms
iter 381140: loss 5.7583, time 125.86ms
iter 381150: loss 5.8030, time 125.66ms
iter 381160: loss 6.0353, time 125.63ms
iter 381170: loss 6.0347, time 128.10ms
iter 381180: loss 6.0426, time 125.00ms
iter 381190: loss 5.8878, time 125.89ms
iter 381200: loss 6.0628, time 125.70ms
iter 381210: loss 6.1101, time 125.72ms
iter 381220: loss 6.0962, time 125.99ms
iter 381230: loss 6.2902, time 125.77ms
iter 381240: loss 5.6378, time 125.79ms
step 381250: train loss 5.6105, val loss 5.6123
saving checkpoint to out-shakespeare-char
iter 381250: loss 6.0479, time 2880.21ms
iter 381260: loss 5.6802, time 126.10ms
iter 381270: loss 5.7223, time 126.32ms
iter 381280: loss 6.3134, time 126.32ms
iter 381290: loss 6.2577, time 124.60ms
iter 381300: loss 6.9251, time 126.02ms
iter 381310: loss 5.9990, time 126.25ms
iter 381320: loss 5.9532, time 126.06ms
iter 381330: loss 5.6507, time 124.60ms
iter 381340: loss 5.7937, time 125.89ms
iter 381350: loss 5.7663, time 128.24ms
iter 381360: loss 5.4716, time 125.63ms
iter 381370: loss 6.2612, time 125.32ms
iter 381380: loss 5.7006, time 125.50ms
iter 381390: loss 5.4168, time 125.33ms
iter 381400: loss 6.3030, time 125.67ms
iter 381410: loss 5.9791, time 125.30ms
iter 381420: loss 5.6736, time 125.70ms
iter 381430: loss 6.1788, time 125.25ms
iter 381440: loss 5.9239, time 125.17ms
iter 381450: loss 6.3885, time 125.42ms
iter 381460: loss 5.9770, time 125.59ms
iter 381470: loss 6.1195, time 125.37ms
iter 381480: loss 6.2073, time 125.05ms
iter 381490: loss 5.6923, time 124.43ms
step 381500: train loss 5.5012, val loss 5.5852
saving checkpoint to out-shakespeare-char
iter 381500: loss 6.2182, time 2904.44ms
iter 381510: loss 6.2479, time 127.60ms
iter 381520: loss 6.0823, time 125.34ms
iter 381530: loss 5.5636, time 125.11ms
iter 381540: loss 5.5650, time 125.35ms
iter 381550: loss 5.6535, time 128.13ms
iter 381560: loss 6.6785, time 124.45ms
iter 381570: loss 5.6385, time 125.54ms
iter 381580: loss 6.1928, time 125.83ms
iter 381590: loss 6.0825, time 125.09ms
iter 381600: loss 6.3111, time 125.67ms
iter 381610: loss 5.6177, time 125.61ms
iter 381620: loss 5.6789, time 125.50ms
iter 381630: loss 5.8917, time 125.30ms
iter 381640: loss 5.5786, time 125.66ms
iter 381650: loss 6.2101, time 127.11ms
iter 381660: loss 6.1619, time 126.15ms
iter 381670: loss 5.6766, time 127.03ms
iter 381680: loss 5.5660, time 125.70ms
iter 381690: loss 4.9546, time 125.77ms
iter 381700: loss 5.5943, time 125.58ms
iter 381710: loss 5.9202, time 125.56ms
iter 381720: loss 6.1212, time 125.64ms
iter 381730: loss 4.7757, time 128.12ms
iter 381740: loss 5.7982, time 125.06ms
step 381750: train loss 5.5245, val loss 5.5714
saving checkpoint to out-shakespeare-char
iter 381750: loss 5.9147, time 2902.28ms
iter 381760: loss 6.4354, time 125.37ms
iter 381770: loss 5.7603, time 125.44ms
iter 381780: loss 6.1479, time 127.75ms
iter 381790: loss 6.1734, time 126.24ms
iter 381800: loss 6.9427, time 125.28ms
iter 381810: loss 5.5327, time 124.42ms
iter 381820: loss 5.3249, time 125.11ms
iter 381830: loss 6.6451, time 125.31ms
iter 381840: loss 6.3162, time 125.26ms
iter 381850: loss 6.1003, time 124.48ms
iter 381860: loss 5.7743, time 125.38ms
iter 381870: loss 5.7509, time 125.53ms
iter 381880: loss 5.6904, time 125.69ms
iter 381890: loss 6.1896, time 126.20ms
iter 381900: loss 6.7171, time 125.35ms
iter 381910: loss 6.0091, time 124.95ms
iter 381920: loss 6.2802, time 125.34ms
iter 381930: loss 5.8302, time 125.06ms
iter 381940: loss 6.1645, time 125.13ms
iter 381950: loss 5.9463, time 124.22ms
iter 381960: loss 6.1239, time 125.48ms
iter 381970: loss 5.4478, time 127.47ms
iter 381980: loss 6.2636, time 125.36ms
iter 381990: loss 6.0252, time 124.05ms
step 382000: train loss 5.6559, val loss 5.5648
saving checkpoint to out-shakespeare-char
iter 382000: loss 5.9918, time 2888.05ms
iter 382010: loss 6.4499, time 125.77ms
iter 382020: loss 6.1227, time 125.60ms
iter 382030: loss 5.9194, time 125.59ms
iter 382040: loss 6.1055, time 125.61ms
iter 382050: loss 6.4915, time 125.84ms
iter 382060: loss 5.9755, time 125.66ms
iter 382070: loss 6.0988, time 125.46ms
iter 382080: loss 5.4732, time 125.87ms
iter 382090: loss 5.5188, time 128.17ms
iter 382100: loss 5.6548, time 125.65ms
iter 382110: loss 5.5626, time 125.84ms
iter 382120: loss 6.1186, time 127.45ms
iter 382130: loss 6.8717, time 125.11ms
iter 382140: loss 6.1828, time 125.12ms
iter 382150: loss 5.5617, time 125.82ms
iter 382160: loss 5.9556, time 125.11ms
iter 382170: loss 5.5571, time 125.30ms
iter 382180: loss 5.6974, time 125.56ms
iter 382190: loss 6.1477, time 127.63ms
iter 382200: loss 6.3196, time 125.12ms
iter 382210: loss 5.2285, time 124.89ms
iter 382220: loss 5.6644, time 125.46ms
iter 382230: loss 5.6284, time 124.82ms
iter 382240: loss 6.2504, time 125.09ms
step 382250: train loss 5.6115, val loss 5.5807
saving checkpoint to out-shakespeare-char
iter 382250: loss 5.8559, time 2902.23ms
iter 382260: loss 5.8983, time 125.43ms
iter 382270: loss 5.3531, time 125.07ms
iter 382280: loss 5.8930, time 124.70ms
iter 382290: loss 6.0600, time 125.27ms
iter 382300: loss 5.9080, time 125.33ms
iter 382310: loss 5.7645, time 125.25ms
iter 382320: loss 5.5671, time 125.71ms
iter 382330: loss 5.8638, time 124.50ms
iter 382340: loss 6.3348, time 125.24ms
iter 382350: loss 5.9728, time 125.51ms
iter 382360: loss 6.1476, time 125.66ms
iter 382370: loss 6.0172, time 125.80ms
iter 382380: loss 6.7160, time 125.37ms
iter 382390: loss 4.7188, time 125.61ms
iter 382400: loss 6.6124, time 125.73ms
iter 382410: loss 5.6793, time 125.97ms
iter 382420: loss 7.0228, time 124.87ms
iter 382430: loss 6.4651, time 125.60ms
iter 382440: loss 6.6569, time 124.78ms
iter 382450: loss 7.2878, time 127.93ms
iter 382460: loss 5.6983, time 125.93ms
iter 382470: loss 6.4877, time 125.35ms
iter 382480: loss 5.6511, time 125.57ms
iter 382490: loss 5.4171, time 128.15ms
step 382500: train loss 5.5800, val loss 5.6009
saving checkpoint to out-shakespeare-char
iter 382500: loss 5.4092, time 2893.79ms
iter 382510: loss 6.2533, time 125.56ms
iter 382520: loss 5.9150, time 124.80ms
iter 382530: loss 5.8039, time 125.60ms
iter 382540: loss 5.4397, time 128.24ms
iter 382550: loss 5.8901, time 125.52ms
iter 382560: loss 5.7369, time 125.95ms
iter 382570: loss 5.7995, time 125.93ms
iter 382580: loss 6.0330, time 125.91ms
iter 382590: loss 5.7611, time 125.72ms
iter 382600: loss 5.3121, time 125.51ms
iter 382610: loss 5.3907, time 125.76ms
iter 382620: loss 5.9897, time 125.65ms
iter 382630: loss 5.7075, time 125.71ms
iter 382640: loss 6.4463, time 125.67ms
iter 382650: loss 6.0599, time 125.73ms
iter 382660: loss 5.8135, time 125.80ms
iter 382670: loss 6.6203, time 125.51ms
iter 382680: loss 5.7082, time 125.66ms
iter 382690: loss 5.7082, time 128.20ms
iter 382700: loss 6.0207, time 125.48ms
iter 382710: loss 5.2997, time 125.61ms
iter 382720: loss 6.5283, time 125.68ms
iter 382730: loss 5.8645, time 125.98ms
iter 382740: loss 6.1345, time 124.84ms
step 382750: train loss 5.6321, val loss 5.6123
saving checkpoint to out-shakespeare-char
iter 382750: loss 5.3496, time 2914.96ms
iter 382760: loss 6.0444, time 126.16ms
iter 382770: loss 5.7146, time 125.71ms
iter 382780: loss 6.3173, time 126.08ms
iter 382790: loss 5.8535, time 124.90ms
iter 382800: loss 6.1907, time 125.27ms
iter 382810: loss 5.3417, time 125.46ms
iter 382820: loss 5.6397, time 125.24ms
iter 382830: loss 6.5072, time 125.02ms
iter 382840: loss 5.6827, time 125.20ms
iter 382850: loss 5.7917, time 124.98ms
iter 382860: loss 6.3642, time 125.33ms
iter 382870: loss 5.3792, time 125.29ms
iter 382880: loss 5.7452, time 126.63ms
iter 382890: loss 5.6144, time 126.30ms
iter 382900: loss 5.6598, time 124.53ms
iter 382910: loss 5.7836, time 125.17ms
iter 382920: loss 6.1438, time 125.44ms
iter 382930: loss 6.1188, time 125.04ms
iter 382940: loss 5.9301, time 125.52ms
iter 382950: loss 6.1218, time 125.57ms
iter 382960: loss 6.1813, time 126.64ms
iter 382970: loss 5.3863, time 125.75ms
iter 382980: loss 6.9041, time 125.48ms
iter 382990: loss 6.4509, time 125.78ms
step 383000: train loss 5.6031, val loss 5.5868
saving checkpoint to out-shakespeare-char
iter 383000: loss 5.0866, time 2884.51ms
iter 383010: loss 5.7046, time 125.53ms
iter 383020: loss 6.0946, time 125.15ms
iter 383030: loss 6.1024, time 125.15ms
iter 383040: loss 6.1651, time 125.35ms
iter 383050: loss 6.2411, time 127.88ms
iter 383060: loss 6.1002, time 124.54ms
iter 383070: loss 5.7043, time 125.72ms
iter 383080: loss 6.1466, time 125.55ms
iter 383090: loss 5.9968, time 125.28ms
iter 383100: loss 6.7765, time 125.15ms
iter 383110: loss 5.7191, time 125.19ms
iter 383120: loss 5.3190, time 125.31ms
iter 383130: loss 5.6800, time 125.13ms
iter 383140: loss 6.1982, time 125.03ms
iter 383150: loss 5.8668, time 125.67ms
iter 383160: loss 5.7525, time 125.31ms
iter 383170: loss 6.3989, time 125.01ms
iter 383180: loss 6.4284, time 124.83ms
iter 383190: loss 5.7201, time 125.17ms
iter 383200: loss 6.6271, time 124.64ms
iter 383210: loss 6.2047, time 125.08ms
iter 383220: loss 5.5367, time 124.87ms
iter 383230: loss 5.7869, time 125.28ms
iter 383240: loss 5.8142, time 127.57ms
step 383250: train loss 5.5563, val loss 5.5913
saving checkpoint to out-shakespeare-char
iter 383250: loss 6.6587, time 2891.86ms
iter 383260: loss 5.8578, time 121.39ms
iter 383270: loss 5.9746, time 123.70ms
iter 383280: loss 6.2761, time 121.45ms
iter 383290: loss 5.8998, time 124.00ms
iter 383300: loss 6.4123, time 121.52ms
iter 383310: loss 6.2700, time 123.23ms
iter 383320: loss 6.3054, time 121.46ms
iter 383330: loss 5.5556, time 123.58ms
iter 383340: loss 5.7929, time 121.32ms
iter 383350: loss 5.8979, time 124.08ms
iter 383360: loss 6.4784, time 120.75ms
iter 383370: loss 6.0335, time 123.60ms
iter 383380: loss 5.7813, time 121.38ms
iter 383390: loss 5.5404, time 123.69ms
iter 383400: loss 5.8537, time 121.45ms
iter 383410: loss 6.0000, time 123.62ms
iter 383420: loss 5.5092, time 121.51ms
iter 383430: loss 5.6789, time 123.56ms
iter 383440: loss 6.2069, time 121.46ms
iter 383450: loss 6.2022, time 123.61ms
iter 383460: loss 5.6876, time 121.34ms
iter 383470: loss 5.9592, time 123.79ms
iter 383480: loss 5.8900, time 121.49ms
iter 383490: loss 5.7558, time 123.48ms
step 383500: train loss 5.6327, val loss 5.5232
saving checkpoint to out-shakespeare-char
iter 383500: loss 6.7125, time 2892.41ms
iter 383510: loss 5.6886, time 121.76ms
iter 383520: loss 5.9177, time 121.62ms
iter 383530: loss 5.9281, time 121.43ms
iter 383540: loss 6.1770, time 121.65ms
iter 383550: loss 6.6242, time 121.64ms
iter 383560: loss 6.2420, time 121.56ms
iter 383570: loss 5.9724, time 121.40ms
iter 383580: loss 6.2512, time 121.53ms
iter 383590: loss 5.5113, time 121.56ms
iter 383600: loss 6.7980, time 121.59ms
iter 383610: loss 5.6842, time 121.65ms
iter 383620: loss 6.5115, time 121.44ms
iter 383630: loss 6.5789, time 121.50ms
iter 383640: loss 5.3873, time 121.56ms
iter 383650: loss 6.2152, time 121.48ms
iter 383660: loss 6.2556, time 121.49ms
iter 383670: loss 6.3203, time 121.45ms
iter 383680: loss 5.2345, time 121.53ms
iter 383690: loss 6.2225, time 121.74ms
iter 383700: loss 6.0727, time 121.58ms
iter 383710: loss 5.7467, time 122.00ms
iter 383720: loss 5.6102, time 121.56ms
iter 383730: loss 6.1997, time 122.97ms
iter 383740: loss 6.3010, time 121.47ms
step 383750: train loss 5.5928, val loss 5.5982
saving checkpoint to out-shakespeare-char
iter 383750: loss 5.8819, time 2878.51ms
iter 383760: loss 5.8557, time 121.76ms
iter 383770: loss 5.7772, time 121.37ms
iter 383780: loss 5.8527, time 121.47ms
iter 383790: loss 5.6942, time 121.49ms
iter 383800: loss 6.5528, time 121.54ms
iter 383810: loss 6.0083, time 121.54ms
iter 383820: loss 6.1243, time 121.67ms
iter 383830: loss 6.1999, time 121.59ms
iter 383840: loss 5.9836, time 121.46ms
iter 383850: loss 6.2168, time 121.48ms
iter 383860: loss 5.7702, time 121.55ms
iter 383870: loss 5.9118, time 121.34ms
iter 383880: loss 6.0161, time 120.51ms
iter 383890: loss 6.1682, time 121.49ms
iter 383900: loss 6.3701, time 121.68ms
iter 383910: loss 5.6289, time 121.49ms
iter 383920: loss 5.8243, time 121.00ms
iter 383930: loss 5.7627, time 121.49ms
iter 383940: loss 6.2607, time 121.53ms
iter 383950: loss 6.1001, time 121.41ms
iter 383960: loss 5.3338, time 122.37ms
iter 383970: loss 5.6015, time 121.61ms
iter 383980: loss 6.1367, time 121.31ms
iter 383990: loss 5.8940, time 121.37ms
step 384000: train loss 5.5440, val loss 5.6031
saving checkpoint to out-shakespeare-char
iter 384000: loss 5.9173, time 2863.03ms
iter 384010: loss 6.1362, time 125.77ms
iter 384020: loss 6.3119, time 125.29ms
iter 384030: loss 5.1651, time 125.33ms
iter 384040: loss 5.7392, time 127.97ms
iter 384050: loss 6.0132, time 125.36ms
iter 384060: loss 5.8576, time 126.10ms
iter 384070: loss 5.6453, time 125.68ms
iter 384080: loss 6.4434, time 125.64ms
iter 384090: loss 6.2812, time 125.45ms
iter 384100: loss 6.1559, time 125.50ms
iter 384110: loss 6.2203, time 125.49ms
iter 384120: loss 6.0328, time 125.18ms
iter 384130: loss 5.7954, time 125.19ms
iter 384140: loss 6.2917, time 125.32ms
iter 384150: loss 4.8566, time 127.90ms
iter 384160: loss 5.5739, time 125.25ms
iter 384170: loss 6.2105, time 125.44ms
iter 384180: loss 5.9095, time 125.63ms
iter 384190: loss 5.8476, time 125.35ms
iter 384200: loss 5.8498, time 125.21ms
iter 384210: loss 5.8065, time 125.51ms
iter 384220: loss 6.3759, time 125.80ms
iter 384230: loss 6.5447, time 125.55ms
iter 384240: loss 6.0094, time 125.37ms
step 384250: train loss 5.5742, val loss 5.6415
saving checkpoint to out-shakespeare-char
iter 384250: loss 5.8168, time 2875.51ms
iter 384260: loss 5.6952, time 128.11ms
iter 384270: loss 6.4637, time 124.64ms
iter 384280: loss 6.2454, time 125.69ms
iter 384290: loss 7.1010, time 125.83ms
iter 384300: loss 5.3954, time 125.52ms
iter 384310: loss 5.9865, time 125.49ms
iter 384320: loss 6.3160, time 125.51ms
iter 384330: loss 5.4366, time 125.92ms
iter 384340: loss 6.6693, time 125.41ms
iter 384350: loss 6.4493, time 124.95ms
iter 384360: loss 5.4160, time 125.34ms
iter 384370: loss 5.6632, time 127.78ms
iter 384380: loss 5.4415, time 125.54ms
iter 384390: loss 6.1697, time 125.40ms
iter 384400: loss 4.9488, time 125.46ms
iter 384410: loss 6.0609, time 127.83ms
iter 384420: loss 6.0493, time 125.43ms
iter 384430: loss 5.8607, time 125.41ms
iter 384440: loss 5.7267, time 125.73ms
iter 384450: loss 6.4326, time 125.40ms
iter 384460: loss 6.4542, time 125.44ms
iter 384470: loss 5.6932, time 125.71ms
iter 384480: loss 5.8017, time 125.50ms
iter 384490: loss 5.2167, time 125.23ms
step 384500: train loss 5.5411, val loss 5.5957
saving checkpoint to out-shakespeare-char
iter 384500: loss 5.5791, time 2896.37ms
iter 384510: loss 5.6816, time 125.89ms
iter 384520: loss 6.3971, time 125.82ms
iter 384530: loss 6.3500, time 125.62ms
iter 384540: loss 6.2288, time 124.67ms
iter 384550: loss 5.7416, time 125.42ms
iter 384560: loss 5.8397, time 127.66ms
iter 384570: loss 5.0754, time 125.95ms
iter 384580: loss 6.1740, time 125.69ms
iter 384590: loss 6.0647, time 125.85ms
iter 384600: loss 6.1784, time 125.27ms
iter 384610: loss 5.8916, time 125.58ms
iter 384620: loss 5.8494, time 125.73ms
iter 384630: loss 5.6860, time 125.69ms
iter 384640: loss 5.6379, time 125.47ms
iter 384650: loss 5.6437, time 125.30ms
iter 384660: loss 5.4534, time 125.70ms
iter 384670: loss 5.6448, time 127.84ms
iter 384680: loss 6.2346, time 125.57ms
iter 384690: loss 5.6442, time 124.79ms
iter 384700: loss 6.9841, time 126.05ms
iter 384710: loss 6.5135, time 128.34ms
iter 384720: loss 5.6541, time 125.82ms
iter 384730: loss 6.6504, time 125.84ms
iter 384740: loss 6.1887, time 126.15ms
step 384750: train loss 5.6004, val loss 5.6058
saving checkpoint to out-shakespeare-char
iter 384750: loss 6.3952, time 2898.74ms
iter 384760: loss 5.7877, time 125.78ms
iter 384770: loss 5.5623, time 125.00ms
iter 384780: loss 6.2785, time 124.38ms
iter 384790: loss 5.9705, time 124.35ms
iter 384800: loss 5.1331, time 124.94ms
iter 384810: loss 6.1654, time 125.28ms
iter 384820: loss 5.7114, time 124.21ms
iter 384830: loss 6.2360, time 124.07ms
iter 384840: loss 7.1293, time 124.49ms
iter 384850: loss 6.2913, time 125.60ms
iter 384860: loss 7.0540, time 127.83ms
iter 384870: loss 6.5272, time 125.82ms
iter 384880: loss 6.1242, time 125.25ms
iter 384890: loss 5.2980, time 124.72ms
iter 384900: loss 6.0425, time 125.63ms
iter 384910: loss 6.6315, time 125.45ms
iter 384920: loss 6.0282, time 125.21ms
iter 384930: loss 6.2925, time 125.59ms
iter 384940: loss 6.6333, time 125.05ms
iter 384950: loss 6.1911, time 125.73ms
iter 384960: loss 5.7850, time 121.93ms
iter 384970: loss 5.4341, time 121.39ms
iter 384980: loss 5.8500, time 121.88ms
iter 384990: loss 5.9623, time 121.62ms
step 385000: train loss 5.5806, val loss 5.5791
saving checkpoint to out-shakespeare-char
iter 385000: loss 5.4983, time 2896.16ms
iter 385010: loss 6.2046, time 126.19ms
iter 385020: loss 6.7119, time 125.39ms
iter 385030: loss 6.0338, time 125.88ms
iter 385040: loss 6.0762, time 126.23ms
iter 385050: loss 5.7550, time 127.79ms
iter 385060: loss 5.4263, time 125.25ms
iter 385070: loss 5.6722, time 125.64ms
iter 385080: loss 5.3888, time 126.08ms
iter 385090: loss 6.0182, time 125.06ms
iter 385100: loss 5.7539, time 124.89ms
iter 385110: loss 6.1359, time 124.79ms
iter 385120: loss 5.7687, time 125.03ms
iter 385130: loss 5.5994, time 124.54ms
iter 385140: loss 5.8121, time 124.14ms
iter 385150: loss 5.8363, time 125.00ms
iter 385160: loss 5.8519, time 127.77ms
iter 385170: loss 5.5803, time 125.12ms
iter 385180: loss 5.4365, time 124.17ms
iter 385190: loss 5.1649, time 124.74ms
iter 385200: loss 5.7205, time 125.42ms
iter 385210: loss 5.0777, time 124.85ms
iter 385220: loss 5.7664, time 125.10ms
iter 385230: loss 6.4227, time 124.06ms
iter 385240: loss 6.4009, time 125.60ms
step 385250: train loss 5.6004, val loss 5.5816
saving checkpoint to out-shakespeare-char
iter 385250: loss 6.3459, time 2893.61ms
iter 385260: loss 6.6164, time 126.01ms
iter 385270: loss 6.4320, time 127.88ms
iter 385280: loss 6.6638, time 125.05ms
iter 385290: loss 6.2721, time 125.08ms
iter 385300: loss 5.7421, time 123.39ms
iter 385310: loss 6.4224, time 125.25ms
iter 385320: loss 6.4321, time 124.20ms
iter 385330: loss 6.3343, time 125.11ms
iter 385340: loss 5.1154, time 124.30ms
iter 385350: loss 6.3847, time 125.65ms
iter 385360: loss 6.0068, time 125.42ms
iter 385370: loss 6.7021, time 123.56ms
iter 385380: loss 6.2647, time 125.38ms
iter 385390: loss 6.0215, time 125.49ms
iter 385400: loss 5.7467, time 125.25ms
iter 385410: loss 5.6766, time 123.92ms
iter 385420: loss 6.1291, time 125.11ms
iter 385430: loss 6.1268, time 120.43ms
iter 385440: loss 5.8917, time 122.13ms
iter 385450: loss 5.9642, time 122.67ms
iter 385460: loss 6.4709, time 123.38ms
iter 385470: loss 5.9222, time 121.59ms
iter 385480: loss 6.1987, time 122.14ms
iter 385490: loss 5.2165, time 122.06ms
step 385500: train loss 5.5941, val loss 5.6181
saving checkpoint to out-shakespeare-char
iter 385500: loss 5.7132, time 2898.91ms
iter 385510: loss 5.8987, time 122.10ms
iter 385520: loss 5.9146, time 121.31ms
iter 385530: loss 5.3554, time 126.59ms
iter 385540: loss 5.7687, time 126.81ms
iter 385550: loss 6.1537, time 125.84ms
iter 385560: loss 5.5876, time 127.94ms
iter 385570: loss 5.3036, time 125.60ms
iter 385580: loss 6.1740, time 125.23ms
iter 385590: loss 5.4527, time 128.26ms
iter 385600: loss 5.9321, time 124.91ms
iter 385610: loss 5.7760, time 125.48ms
iter 385620: loss 6.2517, time 123.71ms
iter 385630: loss 6.2904, time 125.85ms
iter 385640: loss 6.2200, time 127.74ms
iter 385650: loss 6.3665, time 121.79ms
iter 385660: loss 6.1234, time 122.64ms
iter 385670: loss 6.1078, time 121.47ms
iter 385680: loss 6.4389, time 122.21ms
iter 385690: loss 5.5251, time 121.49ms
iter 385700: loss 6.1857, time 122.64ms
iter 385710: loss 6.5023, time 121.53ms
iter 385720: loss 5.9078, time 121.47ms
iter 385730: loss 6.2124, time 121.25ms
iter 385740: loss 5.9954, time 121.61ms
step 385750: train loss 5.5243, val loss 5.5465
saving checkpoint to out-shakespeare-char
iter 385750: loss 5.6197, time 2883.61ms
iter 385760: loss 6.2321, time 122.93ms
iter 385770: loss 5.8525, time 121.43ms
iter 385780: loss 6.1557, time 122.76ms
iter 385790: loss 5.9815, time 125.23ms
iter 385800: loss 5.6109, time 125.11ms
iter 385810: loss 5.8554, time 127.76ms
iter 385820: loss 5.8964, time 124.94ms
iter 385830: loss 5.4178, time 124.43ms
iter 385840: loss 6.6241, time 125.06ms
iter 385850: loss 5.9061, time 124.95ms
iter 385860: loss 6.7916, time 124.85ms
iter 385870: loss 6.2644, time 125.09ms
iter 385880: loss 6.0072, time 125.09ms
iter 385890: loss 6.3470, time 127.12ms
iter 385900: loss 5.5495, time 124.98ms
iter 385910: loss 6.3116, time 124.93ms
iter 385920: loss 5.3980, time 124.89ms
iter 385930: loss 5.9136, time 124.97ms
iter 385940: loss 5.8028, time 124.80ms
iter 385950: loss 5.9349, time 127.48ms
iter 385960: loss 5.4410, time 125.26ms
iter 385970: loss 5.7682, time 126.89ms
iter 385980: loss 5.3864, time 125.59ms
iter 385990: loss 5.0932, time 126.10ms
step 386000: train loss 5.6037, val loss 5.6014
saving checkpoint to out-shakespeare-char
iter 386000: loss 5.3591, time 2867.29ms
iter 386010: loss 6.1779, time 125.16ms
iter 386020: loss 6.5812, time 124.79ms
iter 386030: loss 5.7035, time 124.73ms
iter 386040: loss 6.6150, time 125.33ms
iter 386050: loss 6.3561, time 127.35ms
iter 386060: loss 6.1412, time 124.79ms
iter 386070: loss 5.7065, time 124.77ms
iter 386080: loss 5.9214, time 125.05ms
iter 386090: loss 5.4267, time 124.59ms
iter 386100: loss 6.1143, time 126.16ms
iter 386110: loss 6.3790, time 125.51ms
iter 386120: loss 5.5688, time 125.23ms
iter 386130: loss 5.0113, time 125.08ms
iter 386140: loss 5.3950, time 125.33ms
iter 386150: loss 6.5358, time 124.43ms
iter 386160: loss 4.9445, time 126.58ms
iter 386170: loss 6.2835, time 127.55ms
iter 386180: loss 6.2690, time 125.30ms
iter 386190: loss 5.8563, time 124.47ms
iter 386200: loss 5.9048, time 125.38ms
iter 386210: loss 5.3796, time 125.39ms
iter 386220: loss 6.4932, time 127.46ms
iter 386230: loss 6.1196, time 125.95ms
iter 386240: loss 6.5832, time 124.66ms
step 386250: train loss 5.5961, val loss 5.6084
saving checkpoint to out-shakespeare-char
iter 386250: loss 6.0416, time 2913.84ms
iter 386260: loss 6.9886, time 126.22ms
iter 386270: loss 5.5503, time 126.91ms
iter 386280: loss 6.6078, time 126.15ms
iter 386290: loss 6.1301, time 125.71ms
iter 386300: loss 5.3800, time 125.72ms
iter 386310: loss 6.0976, time 125.87ms
iter 386320: loss 6.7053, time 125.65ms
iter 386330: loss 6.0190, time 126.15ms
iter 386340: loss 6.1153, time 126.01ms
iter 386350: loss 5.6462, time 126.40ms
iter 386360: loss 4.8397, time 125.83ms
iter 386370: loss 6.8466, time 125.14ms
iter 386380: loss 6.1603, time 125.53ms
iter 386390: loss 6.3248, time 127.76ms
iter 386400: loss 6.3251, time 126.26ms
iter 386410: loss 6.5763, time 127.87ms
iter 386420: loss 6.4519, time 126.33ms
iter 386430: loss 5.8269, time 124.20ms
iter 386440: loss 5.5576, time 129.84ms
iter 386450: loss 6.2607, time 125.69ms
iter 386460: loss 5.9287, time 126.19ms
iter 386470: loss 6.1815, time 127.91ms
iter 386480: loss 6.5063, time 125.16ms
iter 386490: loss 5.9018, time 128.95ms
step 386500: train loss 5.5544, val loss 5.5260
saving checkpoint to out-shakespeare-char
iter 386500: loss 5.8355, time 2924.58ms
iter 386510: loss 6.4283, time 125.34ms
iter 386520: loss 5.4684, time 125.13ms
iter 386530: loss 5.6628, time 125.23ms
iter 386540: loss 5.5675, time 126.01ms
iter 386550: loss 5.9778, time 125.29ms
iter 386560: loss 6.7523, time 127.84ms
iter 386570: loss 5.5720, time 127.54ms
iter 386580: loss 6.0564, time 126.13ms
iter 386590: loss 6.2937, time 125.50ms
iter 386600: loss 6.2286, time 125.85ms
iter 386610: loss 6.6091, time 126.18ms
iter 386620: loss 5.3306, time 126.02ms
iter 386630: loss 5.6474, time 126.04ms
iter 386640: loss 6.3041, time 128.44ms
iter 386650: loss 5.5298, time 126.43ms
iter 386660: loss 6.0129, time 125.99ms
iter 386670: loss 5.9438, time 128.82ms
iter 386680: loss 6.1348, time 124.54ms
iter 386690: loss 6.0235, time 122.33ms
iter 386700: loss 6.3157, time 124.29ms
iter 386710: loss 5.2111, time 122.59ms
iter 386720: loss 5.7602, time 125.67ms
iter 386730: loss 6.2904, time 125.95ms
iter 386740: loss 5.8730, time 127.20ms
step 386750: train loss 5.6210, val loss 5.5918
saving checkpoint to out-shakespeare-char
iter 386750: loss 5.8711, time 2905.68ms
iter 386760: loss 5.6603, time 126.03ms
iter 386770: loss 6.2220, time 125.88ms
iter 386780: loss 6.3716, time 125.94ms
iter 386790: loss 6.3602, time 125.83ms
iter 386800: loss 5.8729, time 126.51ms
iter 386810: loss 6.2372, time 126.53ms
iter 386820: loss 6.3229, time 127.99ms
iter 386830: loss 5.6469, time 125.62ms
iter 386840: loss 5.9815, time 126.00ms
iter 386850: loss 5.4993, time 125.74ms
iter 386860: loss 5.7951, time 125.62ms
iter 386870: loss 6.0317, time 126.95ms
iter 386880: loss 5.9388, time 125.92ms
iter 386890: loss 6.4022, time 124.79ms
iter 386900: loss 5.6117, time 125.95ms
iter 386910: loss 6.2764, time 127.36ms
iter 386920: loss 6.4007, time 124.96ms
iter 386930: loss 5.8901, time 125.48ms
iter 386940: loss 6.1487, time 124.79ms
iter 386950: loss 5.6553, time 125.26ms
iter 386960: loss 5.6709, time 124.28ms
iter 386970: loss 6.3352, time 125.33ms
iter 386980: loss 5.8776, time 126.40ms
iter 386990: loss 6.2552, time 124.96ms
step 387000: train loss 5.6242, val loss 5.5914
saving checkpoint to out-shakespeare-char
iter 387000: loss 5.3384, time 2898.37ms
iter 387010: loss 5.8394, time 121.68ms
iter 387020: loss 5.9365, time 121.70ms
iter 387030: loss 5.5782, time 122.12ms
iter 387040: loss 5.9319, time 121.70ms
iter 387050: loss 6.3459, time 122.88ms
iter 387060: loss 5.7494, time 121.65ms
iter 387070: loss 6.7970, time 121.74ms
iter 387080: loss 5.7388, time 121.65ms
iter 387090: loss 5.8950, time 121.44ms
iter 387100: loss 5.8803, time 121.57ms
iter 387110: loss 5.4096, time 121.60ms
iter 387120: loss 6.0239, time 121.42ms
iter 387130: loss 6.1493, time 121.01ms
iter 387140: loss 5.6714, time 121.86ms
iter 387150: loss 6.6756, time 121.66ms
iter 387160: loss 5.8045, time 122.45ms
iter 387170: loss 6.1929, time 121.74ms
iter 387180: loss 5.8676, time 121.65ms
iter 387190: loss 5.1695, time 122.84ms
iter 387200: loss 5.7619, time 121.15ms
iter 387210: loss 5.9352, time 121.47ms
iter 387220: loss 5.9818, time 120.96ms
iter 387230: loss 6.1872, time 121.53ms
iter 387240: loss 6.0302, time 121.73ms
step 387250: train loss 5.5811, val loss 5.5872
saving checkpoint to out-shakespeare-char
iter 387250: loss 5.9988, time 2890.31ms
iter 387260: loss 6.0887, time 126.05ms
iter 387270: loss 5.9519, time 125.70ms
iter 387280: loss 5.5548, time 125.71ms
iter 387290: loss 5.6742, time 125.65ms
iter 387300: loss 5.8885, time 125.83ms
iter 387310: loss 6.1170, time 127.06ms
iter 387320: loss 5.2412, time 125.74ms
iter 387330: loss 6.5714, time 125.64ms
iter 387340: loss 5.6446, time 125.92ms
iter 387350: loss 5.4594, time 125.73ms
iter 387360: loss 5.7108, time 125.62ms
iter 387370: loss 5.8687, time 125.71ms
iter 387380: loss 5.9584, time 125.53ms
iter 387390: loss 5.9014, time 125.76ms
iter 387400: loss 5.7036, time 128.74ms
iter 387410: loss 6.2229, time 127.09ms
iter 387420: loss 6.0075, time 126.13ms
iter 387430: loss 6.1201, time 125.87ms
iter 387440: loss 6.8686, time 126.05ms
iter 387450: loss 5.1321, time 125.80ms
iter 387460: loss 6.2282, time 125.48ms
iter 387470: loss 6.2393, time 125.81ms
iter 387480: loss 5.4678, time 127.69ms
iter 387490: loss 6.0913, time 125.92ms
step 387500: train loss 5.5350, val loss 5.6058
saving checkpoint to out-shakespeare-char
iter 387500: loss 5.6236, time 2908.59ms
iter 387510: loss 6.4477, time 125.52ms
iter 387520: loss 5.8642, time 126.85ms
iter 387530: loss 5.5950, time 125.65ms
iter 387540: loss 5.2558, time 126.21ms
iter 387550: loss 6.3464, time 125.80ms
iter 387560: loss 5.9165, time 127.09ms
iter 387570: loss 5.6089, time 124.45ms
iter 387580: loss 6.1429, time 126.86ms
iter 387590: loss 5.6064, time 125.69ms
iter 387600: loss 5.3242, time 125.09ms
iter 387610: loss 5.2641, time 127.27ms
iter 387620: loss 6.6192, time 125.86ms
iter 387630: loss 5.9022, time 124.81ms
iter 387640: loss 5.8539, time 125.55ms
iter 387650: loss 6.2288, time 125.85ms
iter 387660: loss 6.2518, time 125.59ms
iter 387670: loss 5.9262, time 125.30ms
iter 387680: loss 5.7624, time 125.44ms
iter 387690: loss 5.9048, time 125.60ms
iter 387700: loss 6.1468, time 127.66ms
iter 387710: loss 6.1276, time 125.80ms
iter 387720: loss 6.2778, time 123.92ms
iter 387730: loss 5.9399, time 125.64ms
iter 387740: loss 6.1141, time 125.91ms
step 387750: train loss 5.5444, val loss 5.6407
saving checkpoint to out-shakespeare-char
iter 387750: loss 6.6570, time 2882.38ms
iter 387760: loss 5.7414, time 124.74ms
iter 387770: loss 5.9955, time 127.68ms
iter 387780: loss 6.2089, time 126.12ms
iter 387790: loss 5.9880, time 128.11ms
iter 387800: loss 6.3257, time 125.98ms
iter 387810: loss 5.1960, time 125.91ms
iter 387820: loss 5.6824, time 124.99ms
iter 387830: loss 5.5695, time 125.94ms
iter 387840: loss 5.6757, time 125.87ms
iter 387850: loss 6.0344, time 125.93ms
iter 387860: loss 5.8348, time 125.53ms
iter 387870: loss 6.6526, time 128.39ms
iter 387880: loss 5.4826, time 125.55ms
iter 387890: loss 6.2495, time 125.41ms
iter 387900: loss 5.8782, time 125.97ms
iter 387910: loss 6.2591, time 127.75ms
iter 387920: loss 6.0362, time 127.73ms
iter 387930: loss 5.5990, time 125.25ms
iter 387940: loss 5.8497, time 125.51ms
iter 387950: loss 5.2451, time 124.74ms
iter 387960: loss 5.7089, time 125.43ms
iter 387970: loss 6.1745, time 125.80ms
iter 387980: loss 5.6605, time 125.61ms
iter 387990: loss 5.9935, time 125.50ms
step 388000: train loss 5.5787, val loss 5.6076
saving checkpoint to out-shakespeare-char
iter 388000: loss 6.0085, time 2893.76ms
iter 388010: loss 6.8219, time 121.44ms
iter 388020: loss 5.6823, time 123.99ms
iter 388030: loss 6.4244, time 121.98ms
iter 388040: loss 5.9494, time 123.87ms
iter 388050: loss 5.3417, time 121.68ms
iter 388060: loss 5.8228, time 123.88ms
iter 388070: loss 6.4924, time 121.71ms
iter 388080: loss 5.8259, time 121.65ms
iter 388090: loss 5.7319, time 122.56ms
iter 388100: loss 6.4493, time 121.69ms
iter 388110: loss 6.1794, time 123.86ms
iter 388120: loss 6.1870, time 121.70ms
iter 388130: loss 5.7557, time 122.77ms
iter 388140: loss 6.5068, time 121.87ms
iter 388150: loss 5.7458, time 123.11ms
iter 388160: loss 6.5412, time 121.95ms
iter 388170: loss 6.5910, time 123.35ms
iter 388180: loss 5.6510, time 122.86ms
iter 388190: loss 5.6189, time 122.07ms
iter 388200: loss 5.7816, time 122.74ms
iter 388210: loss 6.4384, time 121.66ms
iter 388220: loss 6.6238, time 122.82ms
iter 388230: loss 5.8413, time 121.67ms
iter 388240: loss 6.0408, time 122.77ms
step 388250: train loss 5.5638, val loss 5.5885
saving checkpoint to out-shakespeare-char
iter 388250: loss 6.3060, time 2894.92ms
iter 388260: loss 6.3668, time 120.92ms
iter 388270: loss 6.6225, time 124.20ms
iter 388280: loss 6.1288, time 121.69ms
iter 388290: loss 6.0604, time 123.99ms
iter 388300: loss 6.2580, time 122.04ms
iter 388310: loss 6.1076, time 124.00ms
iter 388320: loss 5.7655, time 121.65ms
iter 388330: loss 6.0706, time 123.62ms
iter 388340: loss 5.8270, time 121.78ms
iter 388350: loss 6.3132, time 121.40ms
iter 388360: loss 5.9604, time 121.92ms
iter 388370: loss 6.2317, time 121.80ms
iter 388380: loss 6.0958, time 122.09ms
iter 388390: loss 5.0015, time 121.96ms
iter 388400: loss 6.1909, time 122.87ms
iter 388410: loss 6.3170, time 121.24ms
iter 388420: loss 5.3367, time 121.65ms
iter 388430: loss 6.5472, time 122.42ms
iter 388440: loss 6.1054, time 122.16ms
iter 388450: loss 5.7978, time 121.91ms
iter 388460: loss 6.8147, time 121.40ms
iter 388470: loss 6.1706, time 121.73ms
iter 388480: loss 6.4327, time 122.28ms
iter 388490: loss 5.6798, time 121.72ms
step 388500: train loss 5.5660, val loss 5.6050
saving checkpoint to out-shakespeare-char
iter 388500: loss 5.1710, time 2898.27ms
iter 388510: loss 6.4652, time 121.91ms
iter 388520: loss 6.5669, time 123.66ms
iter 388530: loss 5.6225, time 121.77ms
iter 388540: loss 5.7401, time 123.53ms
iter 388550: loss 6.2258, time 122.15ms
iter 388560: loss 5.8291, time 123.12ms
iter 388570: loss 5.4601, time 122.23ms
iter 388580: loss 6.1535, time 122.81ms
iter 388590: loss 5.9785, time 121.66ms
iter 388600: loss 6.1306, time 123.20ms
iter 388610: loss 5.9040, time 121.71ms
iter 388620: loss 6.2794, time 121.74ms
iter 388630: loss 6.2483, time 120.91ms
iter 388640: loss 6.1983, time 121.78ms
iter 388650: loss 6.3043, time 122.08ms
iter 388660: loss 5.7819, time 121.40ms
iter 388670: loss 6.5463, time 121.78ms
iter 388680: loss 6.8594, time 121.66ms
iter 388690: loss 5.9096, time 121.65ms
iter 388700: loss 6.2537, time 121.77ms
iter 388710: loss 5.9395, time 121.61ms
iter 388720: loss 6.1609, time 121.52ms
iter 388730: loss 5.9546, time 121.60ms
iter 388740: loss 5.8923, time 121.59ms
step 388750: train loss 5.6053, val loss 5.5805
saving checkpoint to out-shakespeare-char
iter 388750: loss 5.1802, time 2889.74ms
iter 388760: loss 6.0969, time 121.68ms
iter 388770: loss 6.2939, time 123.68ms
iter 388780: loss 5.9124, time 121.63ms
iter 388790: loss 5.8239, time 123.67ms
iter 388800: loss 5.9748, time 121.50ms
iter 388810: loss 6.2230, time 123.70ms
iter 388820: loss 5.8324, time 121.47ms
iter 388830: loss 6.0177, time 123.90ms
iter 388840: loss 5.9394, time 120.49ms
iter 388850: loss 6.9429, time 123.71ms
iter 388860: loss 6.3111, time 121.50ms
iter 388870: loss 5.7156, time 123.60ms
iter 388880: loss 6.2786, time 121.58ms
iter 388890: loss 5.9704, time 123.28ms
iter 388900: loss 5.3103, time 121.40ms
iter 388910: loss 6.0727, time 123.69ms
iter 388920: loss 6.4154, time 121.53ms
iter 388930: loss 5.7317, time 123.99ms
iter 388940: loss 6.2798, time 121.84ms
iter 388950: loss 6.3623, time 123.69ms
iter 388960: loss 5.8249, time 121.38ms
iter 388970: loss 5.6592, time 123.55ms
iter 388980: loss 6.2991, time 121.45ms
iter 388990: loss 5.7257, time 123.71ms
step 389000: train loss 5.5406, val loss 5.5963
saving checkpoint to out-shakespeare-char
iter 389000: loss 5.6950, time 2890.22ms
iter 389010: loss 5.6325, time 123.98ms
iter 389020: loss 5.9485, time 121.67ms
iter 389030: loss 4.8244, time 125.98ms
iter 389040: loss 6.5671, time 125.47ms
iter 389050: loss 6.5185, time 125.80ms
iter 389060: loss 5.7285, time 125.82ms
iter 389070: loss 6.0978, time 128.36ms
iter 389080: loss 5.8376, time 125.06ms
iter 389090: loss 5.4893, time 125.79ms
iter 389100: loss 6.5232, time 126.95ms
iter 389110: loss 5.6588, time 128.27ms
iter 389120: loss 5.9926, time 125.54ms
iter 389130: loss 6.3795, time 125.86ms
iter 389140: loss 5.6478, time 125.76ms
iter 389150: loss 5.3173, time 125.37ms
iter 389160: loss 5.8019, time 125.60ms
iter 389170: loss 5.4590, time 125.42ms
iter 389180: loss 6.5938, time 125.68ms
iter 389190: loss 6.3507, time 125.46ms
iter 389200: loss 6.5621, time 125.53ms
iter 389210: loss 5.7768, time 125.97ms
iter 389220: loss 5.5911, time 128.21ms
iter 389230: loss 5.7728, time 125.70ms
iter 389240: loss 5.9745, time 125.78ms
step 389250: train loss 5.5996, val loss 5.6207
saving checkpoint to out-shakespeare-char
iter 389250: loss 6.0568, time 2895.17ms
iter 389260: loss 6.4294, time 121.94ms
iter 389270: loss 6.0946, time 121.58ms
iter 389280: loss 5.8553, time 122.27ms
iter 389290: loss 6.4604, time 121.62ms
iter 389300: loss 5.9710, time 121.66ms
iter 389310: loss 5.5593, time 121.91ms
iter 389320: loss 5.6569, time 121.57ms
iter 389330: loss 5.7773, time 121.64ms
iter 389340: loss 6.5574, time 122.17ms
iter 389350: loss 6.1833, time 121.64ms
iter 389360: loss 5.1599, time 121.77ms
iter 389370: loss 6.0664, time 121.63ms
iter 389380: loss 6.0936, time 120.92ms
iter 389390: loss 6.4250, time 121.54ms
iter 389400: loss 5.8898, time 121.66ms
iter 389410: loss 6.1626, time 121.82ms
iter 389420: loss 5.7361, time 121.61ms
iter 389430: loss 5.7363, time 121.84ms
iter 389440: loss 6.5445, time 120.59ms
iter 389450: loss 5.7864, time 121.54ms
iter 389460: loss 6.0125, time 121.55ms
iter 389470: loss 6.7837, time 121.63ms
iter 389480: loss 6.6950, time 121.78ms
iter 389490: loss 5.3872, time 122.18ms
step 389500: train loss 5.6046, val loss 5.6311
saving checkpoint to out-shakespeare-char
iter 389500: loss 6.5416, time 2893.95ms
iter 389510: loss 4.6836, time 121.64ms
iter 389520: loss 6.2403, time 121.69ms
iter 389530: loss 6.2696, time 121.64ms
iter 389540: loss 5.8997, time 121.77ms
iter 389550: loss 5.6063, time 121.73ms
iter 389560: loss 5.7274, time 122.24ms
iter 389570: loss 6.2543, time 121.81ms
iter 389580: loss 5.9391, time 121.65ms
iter 389590: loss 6.2433, time 121.71ms
iter 389600: loss 6.5916, time 121.73ms
iter 389610: loss 6.1280, time 121.46ms
iter 389620: loss 5.4866, time 121.67ms
iter 389630: loss 6.4599, time 121.63ms
iter 389640: loss 5.6865, time 121.79ms
iter 389650: loss 5.5110, time 120.83ms
iter 389660: loss 6.7722, time 121.61ms
iter 389670: loss 6.5773, time 120.94ms
iter 389680: loss 6.3907, time 121.75ms
iter 389690: loss 5.6319, time 121.54ms
iter 389700: loss 6.1814, time 121.87ms
iter 389710: loss 6.4652, time 121.70ms
iter 389720: loss 6.4307, time 121.68ms
iter 389730: loss 6.1124, time 121.64ms
iter 389740: loss 6.1968, time 121.70ms
step 389750: train loss 5.5754, val loss 5.6094
saving checkpoint to out-shakespeare-char
iter 389750: loss 7.0755, time 2900.70ms
iter 389760: loss 5.5650, time 123.77ms
iter 389770: loss 6.2981, time 121.78ms
iter 389780: loss 5.5672, time 123.75ms
iter 389790: loss 6.1496, time 121.59ms
iter 389800: loss 5.9611, time 123.76ms
iter 389810: loss 6.2197, time 121.08ms
iter 389820: loss 6.2236, time 123.70ms
iter 389830: loss 5.7741, time 121.55ms
iter 389840: loss 6.0975, time 124.05ms
iter 389850: loss 6.1801, time 121.52ms
iter 389860: loss 5.9683, time 123.53ms
iter 389870: loss 6.5571, time 120.60ms
iter 389880: loss 6.3057, time 123.87ms
iter 389890: loss 5.9282, time 121.96ms
iter 389900: loss 5.7155, time 123.76ms
iter 389910: loss 6.0913, time 121.55ms
iter 389920: loss 6.0297, time 123.72ms
iter 389930: loss 6.5669, time 121.47ms
iter 389940: loss 5.2122, time 123.60ms
iter 389950: loss 6.8424, time 121.90ms
iter 389960: loss 6.2296, time 123.77ms
iter 389970: loss 6.2835, time 121.34ms
iter 389980: loss 5.2863, time 123.43ms
iter 389990: loss 5.6955, time 121.38ms
step 390000: train loss 5.5995, val loss 5.5549
saving checkpoint to out-shakespeare-char
iter 390000: loss 6.3894, time 2895.84ms
iter 390010: loss 6.6461, time 121.72ms
iter 390020: loss 6.1842, time 123.14ms
iter 390030: loss 6.1791, time 121.65ms
iter 390040: loss 6.0078, time 122.64ms
iter 390050: loss 5.6274, time 121.73ms
iter 390060: loss 5.7955, time 122.90ms
iter 390070: loss 5.8382, time 121.66ms
iter 390080: loss 6.3511, time 122.68ms
iter 390090: loss 6.3482, time 121.79ms
iter 390100: loss 5.7953, time 122.80ms
iter 390110: loss 5.9851, time 121.56ms
iter 390120: loss 6.2512, time 122.75ms
iter 390130: loss 6.0237, time 121.74ms
iter 390140: loss 6.1099, time 122.81ms
iter 390150: loss 5.7909, time 121.62ms
iter 390160: loss 5.7072, time 122.86ms
iter 390170: loss 5.8934, time 121.69ms
iter 390180: loss 6.7986, time 122.81ms
iter 390190: loss 5.3062, time 121.65ms
iter 390200: loss 5.3602, time 122.97ms
iter 390210: loss 6.1463, time 121.76ms
iter 390220: loss 6.5236, time 123.14ms
iter 390230: loss 5.4027, time 120.65ms
iter 390240: loss 5.8201, time 122.86ms
step 390250: train loss 5.5936, val loss 5.5872
saving checkpoint to out-shakespeare-char
iter 390250: loss 6.0518, time 2888.35ms
iter 390260: loss 5.5408, time 121.96ms
iter 390270: loss 6.0310, time 120.76ms
iter 390280: loss 6.3206, time 121.65ms
iter 390290: loss 6.2696, time 121.78ms
iter 390300: loss 6.5419, time 121.71ms
iter 390310: loss 5.9443, time 121.63ms
iter 390320: loss 5.8971, time 121.97ms
iter 390330: loss 6.0862, time 121.67ms
iter 390340: loss 6.2437, time 121.63ms
iter 390350: loss 5.8021, time 121.76ms
iter 390360: loss 5.9687, time 121.52ms
iter 390370: loss 5.7193, time 121.59ms
iter 390380: loss 5.9195, time 122.82ms
iter 390390: loss 6.3122, time 121.72ms
iter 390400: loss 6.3225, time 121.75ms
iter 390410: loss 6.0290, time 121.66ms
iter 390420: loss 6.0691, time 121.60ms
iter 390430: loss 5.1713, time 121.61ms
iter 390440: loss 5.7590, time 121.66ms
iter 390450: loss 6.1624, time 121.76ms
iter 390460: loss 5.6847, time 121.56ms
iter 390470: loss 6.1283, time 122.04ms
iter 390480: loss 6.0095, time 120.76ms
iter 390490: loss 6.2364, time 121.96ms
step 390500: train loss 5.5669, val loss 5.5907
saving checkpoint to out-shakespeare-char
iter 390500: loss 6.2153, time 2912.60ms
iter 390510: loss 5.8919, time 125.84ms
iter 390520: loss 5.4169, time 124.62ms
iter 390530: loss 5.3538, time 125.35ms
iter 390540: loss 5.3508, time 125.15ms
iter 390550: loss 5.8962, time 125.06ms
iter 390560: loss 5.7902, time 121.77ms
iter 390570: loss 5.8323, time 121.32ms
iter 390580: loss 5.7292, time 121.57ms
iter 390590: loss 6.5735, time 121.54ms
iter 390600: loss 5.9299, time 121.60ms
iter 390610: loss 5.3754, time 121.63ms
iter 390620: loss 5.3960, time 120.93ms
iter 390630: loss 6.3021, time 121.53ms
iter 390640: loss 5.8148, time 121.31ms
iter 390650: loss 6.1770, time 121.58ms
iter 390660: loss 5.5634, time 121.69ms
iter 390670: loss 5.9020, time 121.60ms
iter 390680: loss 6.4413, time 121.70ms
iter 390690: loss 6.5176, time 121.21ms
iter 390700: loss 5.3109, time 121.66ms
iter 390710: loss 6.0871, time 121.52ms
iter 390720: loss 5.9847, time 121.52ms
iter 390730: loss 5.0380, time 121.60ms
iter 390740: loss 6.0461, time 122.08ms
step 390750: train loss 5.5419, val loss 5.6297
saving checkpoint to out-shakespeare-char
iter 390750: loss 6.3553, time 2883.57ms
iter 390760: loss 6.2566, time 123.88ms
iter 390770: loss 5.8666, time 121.76ms
iter 390780: loss 5.9069, time 123.88ms
iter 390790: loss 6.4127, time 121.73ms
iter 390800: loss 5.5539, time 123.94ms
iter 390810: loss 5.7703, time 121.58ms
iter 390820: loss 6.3810, time 123.90ms
iter 390830: loss 6.8462, time 121.58ms
iter 390840: loss 6.2690, time 123.85ms
iter 390850: loss 6.2407, time 121.51ms
iter 390860: loss 6.1302, time 123.37ms
iter 390870: loss 4.7604, time 121.60ms
iter 390880: loss 6.1108, time 124.40ms
iter 390890: loss 5.7968, time 121.33ms
iter 390900: loss 5.9159, time 123.31ms
iter 390910: loss 5.9196, time 121.34ms
iter 390920: loss 6.4973, time 124.37ms
iter 390930: loss 5.3683, time 121.62ms
iter 390940: loss 5.8651, time 123.85ms
iter 390950: loss 6.1514, time 121.83ms
iter 390960: loss 6.0500, time 123.89ms
iter 390970: loss 6.7053, time 121.22ms
iter 390980: loss 5.8521, time 122.34ms
iter 390990: loss 5.3302, time 121.16ms
step 391000: train loss 5.6057, val loss 5.5393
saving checkpoint to out-shakespeare-char
iter 391000: loss 5.6995, time 2890.04ms
iter 391010: loss 5.5426, time 121.87ms
iter 391020: loss 6.0439, time 123.60ms
iter 391030: loss 6.0869, time 121.75ms
iter 391040: loss 6.2843, time 123.05ms
iter 391050: loss 5.6413, time 120.80ms
iter 391060: loss 5.9444, time 123.31ms
iter 391070: loss 5.8960, time 121.78ms
iter 391080: loss 6.3787, time 122.86ms
iter 391090: loss 6.0902, time 121.77ms
iter 391100: loss 6.0136, time 123.03ms
iter 391110: loss 6.1772, time 121.94ms
iter 391120: loss 5.2517, time 122.81ms
iter 391130: loss 5.7424, time 121.95ms
iter 391140: loss 6.4310, time 122.87ms
iter 391150: loss 6.0111, time 121.61ms
iter 391160: loss 6.2649, time 123.17ms
iter 391170: loss 6.2231, time 121.89ms
iter 391180: loss 5.6457, time 122.78ms
iter 391190: loss 6.1580, time 121.41ms
iter 391200: loss 6.2707, time 122.82ms
iter 391210: loss 6.6102, time 121.78ms
iter 391220: loss 6.1860, time 122.75ms
iter 391230: loss 5.8536, time 121.73ms
iter 391240: loss 6.2023, time 122.70ms
step 391250: train loss 5.5813, val loss 5.6068
saving checkpoint to out-shakespeare-char
iter 391250: loss 4.7121, time 2901.68ms
iter 391260: loss 5.4147, time 125.47ms
iter 391270: loss 6.3065, time 125.19ms
iter 391280: loss 5.9771, time 127.60ms
iter 391290: loss 5.9881, time 125.19ms
iter 391300: loss 5.9693, time 125.28ms
iter 391310: loss 5.4427, time 124.75ms
iter 391320: loss 6.1535, time 124.42ms
iter 391330: loss 5.9172, time 125.20ms
iter 391340: loss 6.6504, time 124.83ms
iter 391350: loss 5.6982, time 124.20ms
iter 391360: loss 5.8517, time 124.79ms
iter 391370: loss 7.0401, time 124.57ms
iter 391380: loss 5.8534, time 125.35ms
iter 391390: loss 5.4449, time 126.77ms
iter 391400: loss 5.9975, time 124.39ms
iter 391410: loss 5.3538, time 124.71ms
iter 391420: loss 5.5868, time 125.38ms
iter 391430: loss 5.7747, time 125.52ms
iter 391440: loss 6.4428, time 125.06ms
iter 391450: loss 5.7743, time 126.18ms
iter 391460: loss 6.0586, time 128.02ms
iter 391470: loss 5.8121, time 125.16ms
iter 391480: loss 5.8647, time 124.41ms
iter 391490: loss 6.0269, time 125.47ms
step 391500: train loss 5.6086, val loss 5.6006
saving checkpoint to out-shakespeare-char
iter 391500: loss 6.0496, time 2886.78ms
iter 391510: loss 6.2027, time 121.78ms
iter 391520: loss 5.9978, time 124.06ms
iter 391530: loss 5.8694, time 121.68ms
iter 391540: loss 6.0166, time 124.41ms
iter 391550: loss 5.8586, time 121.30ms
iter 391560: loss 6.4728, time 123.60ms
iter 391570: loss 5.8662, time 121.63ms
iter 391580: loss 5.8114, time 123.88ms
iter 391590: loss 6.0302, time 121.69ms
iter 391600: loss 6.4361, time 124.05ms
iter 391610: loss 5.5938, time 121.42ms
iter 391620: loss 6.0896, time 123.73ms
iter 391630: loss 6.2639, time 121.93ms
iter 391640: loss 6.3072, time 123.12ms
iter 391650: loss 6.2029, time 121.79ms
iter 391660: loss 5.8325, time 123.35ms
iter 391670: loss 5.9845, time 121.81ms
iter 391680: loss 6.1636, time 122.04ms
iter 391690: loss 5.6625, time 121.80ms
iter 391700: loss 5.8157, time 123.01ms
iter 391710: loss 6.0516, time 120.94ms
iter 391720: loss 6.0827, time 122.82ms
iter 391730: loss 6.1427, time 121.79ms
iter 391740: loss 5.8292, time 123.22ms
step 391750: train loss 5.5413, val loss 5.6072
saving checkpoint to out-shakespeare-char
iter 391750: loss 5.6882, time 2898.28ms
iter 391760: loss 6.3590, time 125.21ms
iter 391770: loss 5.4319, time 125.33ms
iter 391780: loss 6.2958, time 127.59ms
iter 391790: loss 5.5072, time 124.41ms
iter 391800: loss 6.1254, time 125.19ms
iter 391810: loss 5.0241, time 125.22ms
iter 391820: loss 5.9148, time 125.31ms
iter 391830: loss 5.5159, time 125.77ms
iter 391840: loss 6.1465, time 125.96ms
iter 391850: loss 6.0821, time 125.06ms
iter 391860: loss 5.4653, time 123.30ms
iter 391870: loss 5.8077, time 124.78ms
iter 391880: loss 5.9237, time 127.52ms
iter 391890: loss 6.2359, time 125.95ms
iter 391900: loss 5.6607, time 125.30ms
iter 391910: loss 5.5310, time 125.64ms
iter 391920: loss 5.3806, time 127.61ms
iter 391930: loss 6.1629, time 125.72ms
iter 391940: loss 6.0189, time 125.60ms
iter 391950: loss 6.4865, time 125.59ms
iter 391960: loss 5.7691, time 127.78ms
iter 391970: loss 6.3437, time 125.37ms
iter 391980: loss 6.0510, time 125.58ms
iter 391990: loss 6.3896, time 125.52ms
step 392000: train loss 5.6006, val loss 5.5735
saving checkpoint to out-shakespeare-char
iter 392000: loss 5.9600, time 2899.16ms
iter 392010: loss 5.8815, time 125.76ms
iter 392020: loss 6.2420, time 125.57ms
iter 392030: loss 5.7392, time 126.28ms
iter 392040: loss 7.2953, time 128.01ms
iter 392050: loss 6.2953, time 125.58ms
iter 392060: loss 6.4728, time 125.39ms
iter 392070: loss 6.2395, time 125.59ms
iter 392080: loss 6.0028, time 125.40ms
iter 392090: loss 6.4280, time 125.84ms
iter 392100: loss 5.7284, time 125.81ms
iter 392110: loss 5.4660, time 125.59ms
iter 392120: loss 5.9946, time 125.66ms
iter 392130: loss 5.6158, time 125.49ms
iter 392140: loss 5.2009, time 125.65ms
iter 392150: loss 5.8669, time 125.68ms
iter 392160: loss 5.0308, time 125.71ms
iter 392170: loss 5.9779, time 124.74ms
iter 392180: loss 6.1627, time 125.67ms
iter 392190: loss 5.8115, time 126.82ms
iter 392200: loss 5.7838, time 125.53ms
iter 392210: loss 6.2248, time 126.79ms
iter 392220: loss 5.5835, time 125.39ms
iter 392230: loss 6.0714, time 125.65ms
iter 392240: loss 5.7384, time 125.49ms
step 392250: train loss 5.5674, val loss 5.5711
saving checkpoint to out-shakespeare-char
iter 392250: loss 6.5583, time 2889.76ms
iter 392260: loss 5.1709, time 126.29ms
iter 392270: loss 5.7547, time 125.45ms
iter 392280: loss 6.7140, time 125.43ms
iter 392290: loss 5.0618, time 125.42ms
iter 392300: loss 6.6318, time 125.59ms
iter 392310: loss 6.1632, time 125.90ms
iter 392320: loss 6.4058, time 125.61ms
iter 392330: loss 5.3719, time 125.61ms
iter 392340: loss 5.6280, time 121.92ms
iter 392350: loss 6.3480, time 121.64ms
iter 392360: loss 5.9900, time 121.83ms
iter 392370: loss 5.4940, time 121.61ms
iter 392380: loss 5.5691, time 121.84ms
iter 392390: loss 5.8841, time 120.96ms
iter 392400: loss 5.6422, time 121.81ms
iter 392410: loss 6.1304, time 122.86ms
iter 392420: loss 5.1159, time 121.78ms
iter 392430: loss 5.4520, time 121.64ms
iter 392440: loss 6.3518, time 125.92ms
iter 392450: loss 5.1947, time 125.54ms
iter 392460: loss 5.2353, time 125.75ms
iter 392470: loss 6.2289, time 125.48ms
iter 392480: loss 6.2164, time 125.47ms
iter 392490: loss 5.2624, time 125.47ms
step 392500: train loss 5.5955, val loss 5.6027
saving checkpoint to out-shakespeare-char
iter 392500: loss 6.1073, time 2887.81ms
iter 392510: loss 5.2170, time 125.39ms
iter 392520: loss 5.3660, time 125.82ms
iter 392530: loss 6.1625, time 125.14ms
iter 392540: loss 6.1077, time 125.69ms
iter 392550: loss 6.5158, time 125.89ms
iter 392560: loss 6.5229, time 127.42ms
iter 392570: loss 6.2683, time 125.45ms
iter 392580: loss 6.1381, time 124.64ms
iter 392590: loss 6.2094, time 126.42ms
iter 392600: loss 5.6709, time 125.60ms
iter 392610: loss 5.5493, time 125.64ms
iter 392620: loss 5.9832, time 125.12ms
iter 392630: loss 6.0915, time 124.42ms
iter 392640: loss 6.1805, time 125.55ms
iter 392650: loss 6.4627, time 125.49ms
iter 392660: loss 5.5078, time 125.50ms
iter 392670: loss 6.4041, time 125.59ms
iter 392680: loss 5.8828, time 125.74ms
iter 392690: loss 5.2953, time 125.70ms
iter 392700: loss 5.9233, time 125.27ms
iter 392710: loss 6.1858, time 128.36ms
iter 392720: loss 6.0805, time 126.02ms
iter 392730: loss 6.0180, time 125.60ms
iter 392740: loss 6.0481, time 125.66ms
step 392750: train loss 5.5377, val loss 5.6573
saving checkpoint to out-shakespeare-char
iter 392750: loss 7.0071, time 2894.53ms
iter 392760: loss 5.7204, time 121.91ms
iter 392770: loss 5.4365, time 122.96ms
iter 392780: loss 5.9381, time 121.98ms
iter 392790: loss 6.5098, time 123.64ms
iter 392800: loss 5.9857, time 121.88ms
iter 392810: loss 5.9343, time 123.01ms
iter 392820: loss 5.9345, time 121.89ms
iter 392830: loss 5.8994, time 123.16ms
iter 392840: loss 5.5223, time 120.45ms
iter 392850: loss 6.1059, time 122.94ms
iter 392860: loss 6.1508, time 122.21ms
iter 392870: loss 5.5417, time 122.95ms
iter 392880: loss 5.7795, time 125.52ms
iter 392890: loss 5.4406, time 125.83ms
iter 392900: loss 5.6686, time 125.69ms
iter 392910: loss 5.5370, time 128.30ms
iter 392920: loss 5.6748, time 125.82ms
iter 392930: loss 6.0485, time 125.83ms
iter 392940: loss 6.1001, time 125.73ms
iter 392950: loss 5.9503, time 126.08ms
iter 392960: loss 6.0851, time 125.66ms
iter 392970: loss 6.4784, time 125.59ms
iter 392980: loss 5.2254, time 124.71ms
iter 392990: loss 5.6986, time 124.72ms
step 393000: train loss 5.6106, val loss 5.6154
saving checkpoint to out-shakespeare-char
iter 393000: loss 5.9565, time 2861.86ms
iter 393010: loss 5.3331, time 121.79ms
iter 393020: loss 5.9024, time 121.07ms
iter 393030: loss 6.4316, time 121.47ms
iter 393040: loss 6.1265, time 121.52ms
iter 393050: loss 5.8598, time 121.49ms
iter 393060: loss 5.9924, time 121.56ms
iter 393070: loss 6.5627, time 121.57ms
iter 393080: loss 6.4675, time 121.65ms
iter 393090: loss 7.0810, time 120.61ms
iter 393100: loss 5.4319, time 121.55ms
iter 393110: loss 5.9252, time 121.53ms
iter 393120: loss 6.0877, time 121.64ms
iter 393130: loss 6.0934, time 121.31ms
iter 393140: loss 6.1275, time 121.54ms
iter 393150: loss 5.9515, time 121.53ms
iter 393160: loss 6.1051, time 122.20ms
iter 393170: loss 6.1686, time 121.57ms
iter 393180: loss 6.0955, time 121.56ms
iter 393190: loss 5.5903, time 122.02ms
iter 393200: loss 5.7352, time 121.55ms
iter 393210: loss 6.1284, time 121.45ms
iter 393220: loss 5.3509, time 121.66ms
iter 393230: loss 6.2229, time 121.13ms
iter 393240: loss 5.4733, time 121.34ms
step 393250: train loss 5.6131, val loss 5.6184
saving checkpoint to out-shakespeare-char
iter 393250: loss 6.4323, time 2890.86ms
iter 393260: loss 6.6219, time 124.25ms
iter 393270: loss 6.2201, time 121.65ms
iter 393280: loss 5.6717, time 123.74ms
iter 393290: loss 6.0771, time 121.28ms
iter 393300: loss 6.5685, time 128.53ms
iter 393310: loss 6.2310, time 125.36ms
iter 393320: loss 6.6233, time 125.79ms
iter 393330: loss 6.3932, time 125.73ms
iter 393340: loss 6.1901, time 126.78ms
iter 393350: loss 6.6381, time 125.52ms
iter 393360: loss 5.8898, time 125.59ms
iter 393370: loss 6.3623, time 125.88ms
iter 393380: loss 5.9650, time 125.12ms
iter 393390: loss 6.2057, time 125.43ms
iter 393400: loss 5.9682, time 125.47ms
iter 393410: loss 5.6528, time 128.12ms
iter 393420: loss 5.8337, time 127.11ms
iter 393430: loss 5.9344, time 125.28ms
iter 393440: loss 5.9445, time 125.39ms
iter 393450: loss 5.0581, time 125.53ms
iter 393460: loss 5.3810, time 125.09ms
iter 393470: loss 6.2106, time 125.29ms
iter 393480: loss 5.8819, time 125.29ms
iter 393490: loss 5.8157, time 127.48ms
step 393500: train loss 5.5808, val loss 5.5871
saving checkpoint to out-shakespeare-char
iter 393500: loss 5.7630, time 2903.66ms
iter 393510: loss 5.8106, time 125.29ms
iter 393520: loss 5.9967, time 125.74ms
iter 393530: loss 5.9699, time 125.53ms
iter 393540: loss 6.3499, time 125.47ms
iter 393550: loss 5.8660, time 125.17ms
iter 393560: loss 6.2198, time 125.55ms
iter 393570: loss 6.2074, time 125.29ms
iter 393580: loss 6.1984, time 127.15ms
iter 393590: loss 5.9540, time 125.61ms
iter 393600: loss 5.5378, time 125.81ms
iter 393610: loss 5.5959, time 125.61ms
iter 393620: loss 5.3538, time 124.93ms
iter 393630: loss 5.6829, time 125.80ms
iter 393640: loss 5.5576, time 126.20ms
iter 393650: loss 5.4197, time 124.25ms
iter 393660: loss 5.5783, time 125.85ms
iter 393670: loss 6.1187, time 125.21ms
iter 393680: loss 6.5725, time 125.44ms
iter 393690: loss 6.4964, time 125.70ms
iter 393700: loss 6.1206, time 127.71ms
iter 393710: loss 5.4675, time 125.34ms
iter 393720: loss 5.7655, time 125.61ms
iter 393730: loss 5.6023, time 125.59ms
iter 393740: loss 6.0773, time 125.84ms
step 393750: train loss 5.5639, val loss 5.5518
saving checkpoint to out-shakespeare-char
iter 393750: loss 5.9535, time 2876.74ms
iter 393760: loss 5.2682, time 121.73ms
iter 393770: loss 5.4846, time 121.33ms
iter 393780: loss 6.2737, time 121.69ms
iter 393790: loss 5.5875, time 121.54ms
iter 393800: loss 5.8202, time 122.94ms
iter 393810: loss 5.7382, time 121.66ms
iter 393820: loss 6.6844, time 121.64ms
iter 393830: loss 5.9894, time 121.59ms
iter 393840: loss 6.2384, time 121.53ms
iter 393850: loss 5.6899, time 121.62ms
iter 393860: loss 5.5961, time 121.64ms
iter 393870: loss 5.3567, time 122.36ms
iter 393880: loss 5.9346, time 121.51ms
iter 393890: loss 6.7314, time 122.08ms
iter 393900: loss 6.3799, time 121.51ms
iter 393910: loss 6.1289, time 119.62ms
iter 393920: loss 5.5985, time 121.68ms
iter 393930: loss 5.1310, time 121.48ms
iter 393940: loss 6.6322, time 121.53ms
iter 393950: loss 6.4120, time 121.41ms
iter 393960: loss 6.0073, time 121.45ms
iter 393970: loss 5.4881, time 121.38ms
iter 393980: loss 6.0874, time 121.50ms
iter 393990: loss 5.5828, time 121.46ms
step 394000: train loss 5.5758, val loss 5.6013
saving checkpoint to out-shakespeare-char
iter 394000: loss 6.2512, time 2886.75ms
iter 394010: loss 5.9837, time 121.46ms
iter 394020: loss 6.2210, time 125.91ms
iter 394030: loss 5.8228, time 125.95ms
iter 394040: loss 6.4715, time 125.69ms
iter 394050: loss 6.8103, time 125.67ms
iter 394060: loss 6.4222, time 124.08ms
iter 394070: loss 6.1536, time 125.81ms
iter 394080: loss 5.4799, time 125.72ms
iter 394090: loss 5.1982, time 125.69ms
iter 394100: loss 6.4427, time 125.84ms
iter 394110: loss 6.6605, time 121.45ms
iter 394120: loss 5.8464, time 121.37ms
iter 394130: loss 7.0337, time 121.54ms
iter 394140: loss 5.9147, time 121.39ms
iter 394150: loss 6.6917, time 121.47ms
iter 394160: loss 6.4499, time 121.44ms
iter 394170: loss 5.6753, time 121.20ms
iter 394180: loss 5.9588, time 122.92ms
iter 394190: loss 5.6179, time 122.63ms
iter 394200: loss 6.2725, time 121.54ms
iter 394210: loss 5.6468, time 122.50ms
iter 394220: loss 5.5643, time 120.19ms
iter 394230: loss 6.1039, time 122.77ms
iter 394240: loss 6.0214, time 121.69ms
step 394250: train loss 5.5894, val loss 5.5959
saving checkpoint to out-shakespeare-char
iter 394250: loss 5.8937, time 2867.43ms
iter 394260: loss 6.4747, time 121.69ms
iter 394270: loss 6.2260, time 121.75ms
iter 394280: loss 6.4593, time 121.85ms
iter 394290: loss 5.7872, time 121.98ms
iter 394300: loss 5.8713, time 122.50ms
iter 394310: loss 6.4193, time 121.83ms
iter 394320: loss 5.3911, time 122.10ms
iter 394330: loss 5.9330, time 121.31ms
iter 394340: loss 6.0505, time 123.21ms
iter 394350: loss 6.4328, time 122.14ms
iter 394360: loss 5.3423, time 121.51ms
iter 394370: loss 5.6620, time 121.91ms
iter 394380: loss 5.7010, time 121.80ms
iter 394390: loss 5.8336, time 125.12ms
iter 394400: loss 6.4023, time 121.99ms
iter 394410: loss 6.2729, time 121.86ms
iter 394420: loss 5.8101, time 121.75ms
iter 394430: loss 5.6863, time 121.87ms
iter 394440: loss 6.2404, time 123.22ms
iter 394450: loss 6.4014, time 121.89ms
iter 394460: loss 5.2813, time 121.81ms
iter 394470: loss 5.8902, time 122.73ms
iter 394480: loss 6.0258, time 121.93ms
iter 394490: loss 5.5505, time 122.28ms
step 394500: train loss 5.5917, val loss 5.5650
saving checkpoint to out-shakespeare-char
iter 394500: loss 5.6787, time 2903.91ms
iter 394510: loss 6.3427, time 121.67ms
iter 394520: loss 5.7425, time 121.28ms
iter 394530: loss 5.6193, time 121.28ms
iter 394540: loss 5.8959, time 121.28ms
iter 394550: loss 6.0790, time 120.53ms
iter 394560: loss 6.5061, time 121.43ms
iter 394570: loss 6.0787, time 121.29ms
iter 394580: loss 6.2598, time 121.08ms
iter 394590: loss 6.2278, time 121.54ms
iter 394600: loss 6.6866, time 122.42ms
iter 394610: loss 6.2859, time 121.06ms
iter 394620: loss 5.9473, time 121.25ms
iter 394630: loss 6.4838, time 121.44ms
iter 394640: loss 5.9855, time 121.78ms
iter 394650: loss 5.8562, time 122.10ms
iter 394660: loss 6.0582, time 121.50ms
iter 394670: loss 6.5151, time 121.32ms
iter 394680: loss 5.5761, time 122.15ms
iter 394690: loss 6.1893, time 121.60ms
iter 394700: loss 6.0572, time 121.52ms
iter 394710: loss 5.4518, time 121.32ms
iter 394720: loss 6.1344, time 122.61ms
iter 394730: loss 5.8757, time 121.48ms
iter 394740: loss 6.0567, time 121.36ms
step 394750: train loss 5.5825, val loss 5.5899
saving checkpoint to out-shakespeare-char
iter 394750: loss 5.6592, time 2886.62ms
iter 394760: loss 6.1082, time 121.27ms
iter 394770: loss 6.4596, time 121.08ms
iter 394780: loss 6.5511, time 122.74ms
iter 394790: loss 5.6961, time 121.31ms
iter 394800: loss 5.4737, time 122.08ms
iter 394810: loss 5.7009, time 121.43ms
iter 394820: loss 6.8261, time 122.16ms
iter 394830: loss 6.0585, time 121.76ms
iter 394840: loss 6.0945, time 121.97ms
iter 394850: loss 5.6504, time 121.83ms
iter 394860: loss 5.3485, time 121.94ms
iter 394870: loss 6.4615, time 121.55ms
iter 394880: loss 6.4291, time 122.68ms
iter 394890: loss 6.2807, time 122.57ms
iter 394900: loss 6.0226, time 122.01ms
iter 394910: loss 6.6208, time 121.82ms
iter 394920: loss 6.0197, time 122.29ms
iter 394930: loss 5.7363, time 123.11ms
iter 394940: loss 5.7170, time 121.83ms
iter 394950: loss 5.9760, time 122.99ms
iter 394960: loss 5.6106, time 121.69ms
iter 394970: loss 6.6005, time 121.92ms
iter 394980: loss 6.5661, time 121.83ms
iter 394990: loss 6.0530, time 123.05ms
step 395000: train loss 5.5980, val loss 5.6098
saving checkpoint to out-shakespeare-char
iter 395000: loss 6.2631, time 2872.17ms
iter 395010: loss 6.5461, time 121.86ms
iter 395020: loss 6.0748, time 122.04ms
iter 395030: loss 6.3092, time 120.63ms
iter 395040: loss 6.2386, time 123.14ms
iter 395050: loss 6.4987, time 121.89ms
iter 395060: loss 6.5973, time 123.06ms
iter 395070: loss 5.5569, time 121.78ms
iter 395080: loss 6.4248, time 123.69ms
iter 395090: loss 6.0395, time 122.23ms
iter 395100: loss 6.2389, time 122.63ms
iter 395110: loss 5.5721, time 122.02ms
iter 395120: loss 4.9121, time 123.03ms
iter 395130: loss 6.1514, time 121.81ms
iter 395140: loss 6.5754, time 122.97ms
iter 395150: loss 6.1378, time 121.91ms
iter 395160: loss 6.7038, time 122.56ms
iter 395170: loss 5.3259, time 121.88ms
iter 395180: loss 6.4891, time 122.48ms
iter 395190: loss 5.9925, time 122.94ms
iter 395200: loss 6.0847, time 121.98ms
iter 395210: loss 6.6171, time 122.73ms
iter 395220: loss 5.9979, time 121.76ms
iter 395230: loss 5.9172, time 121.94ms
iter 395240: loss 6.3075, time 121.73ms
step 395250: train loss 5.5635, val loss 5.5397
saving checkpoint to out-shakespeare-char
iter 395250: loss 6.5952, time 2907.26ms
iter 395260: loss 5.7534, time 123.06ms
iter 395270: loss 6.0074, time 121.74ms
iter 395280: loss 5.9964, time 123.12ms
iter 395290: loss 5.3701, time 121.89ms
iter 395300: loss 6.1084, time 123.28ms
iter 395310: loss 5.9473, time 121.86ms
iter 395320: loss 6.4319, time 122.93ms
iter 395330: loss 5.9121, time 121.82ms
iter 395340: loss 5.8381, time 122.97ms
iter 395350: loss 5.2567, time 123.05ms
iter 395360: loss 6.0684, time 122.01ms
iter 395370: loss 6.2553, time 123.09ms
iter 395380: loss 6.1167, time 122.40ms
iter 395390: loss 5.9239, time 123.04ms
iter 395400: loss 5.7786, time 121.90ms
iter 395410: loss 6.4923, time 122.88ms
iter 395420: loss 6.2795, time 122.09ms
iter 395430: loss 5.7692, time 123.28ms
iter 395440: loss 5.5154, time 121.77ms
iter 395450: loss 6.3055, time 123.31ms
iter 395460: loss 5.2018, time 121.88ms
iter 395470: loss 6.3460, time 121.78ms
iter 395480: loss 6.4993, time 121.79ms
iter 395490: loss 6.2600, time 121.54ms
step 395500: train loss 5.5807, val loss 5.6130
saving checkpoint to out-shakespeare-char
iter 395500: loss 6.6338, time 2893.65ms
iter 395510: loss 5.8570, time 121.54ms
iter 395520: loss 5.5139, time 121.91ms
iter 395530: loss 5.3944, time 121.90ms
iter 395540: loss 5.5456, time 122.84ms
iter 395550: loss 6.0860, time 121.79ms
iter 395560: loss 5.9636, time 122.08ms
iter 395570: loss 5.7276, time 121.83ms
iter 395580: loss 6.3218, time 121.98ms
iter 395590: loss 5.5273, time 121.55ms
iter 395600: loss 6.4979, time 121.68ms
iter 395610: loss 6.1811, time 119.37ms
iter 395620: loss 5.9550, time 122.85ms
iter 395630: loss 6.2058, time 122.19ms
iter 395640: loss 6.3761, time 123.14ms
iter 395650: loss 6.0036, time 121.86ms
iter 395660: loss 5.9499, time 120.99ms
iter 395670: loss 6.6717, time 119.75ms
iter 395680: loss 5.8520, time 120.69ms
iter 395690: loss 5.5466, time 119.84ms
iter 395700: loss 6.8660, time 120.59ms
iter 395710: loss 5.6865, time 121.98ms
iter 395720: loss 5.9931, time 122.66ms
iter 395730: loss 5.6149, time 122.41ms
iter 395740: loss 6.5644, time 120.98ms
step 395750: train loss 5.5795, val loss 5.6262
saving checkpoint to out-shakespeare-char
iter 395750: loss 6.6032, time 2873.38ms
iter 395760: loss 6.0241, time 120.91ms
iter 395770: loss 5.6325, time 122.23ms
iter 395780: loss 5.4752, time 123.22ms
iter 395790: loss 6.7546, time 122.89ms
iter 395800: loss 6.1888, time 123.01ms
iter 395810: loss 5.8893, time 121.85ms
iter 395820: loss 6.5531, time 123.03ms
iter 395830: loss 6.1230, time 121.91ms
iter 395840: loss 5.8642, time 122.93ms
iter 395850: loss 5.8618, time 120.84ms
iter 395860: loss 6.7331, time 122.22ms
iter 395870: loss 5.5557, time 121.87ms
iter 395880: loss 6.1805, time 123.09ms
iter 395890: loss 6.0781, time 123.14ms
iter 395900: loss 6.2379, time 121.35ms
iter 395910: loss 6.0566, time 122.31ms
iter 395920: loss 5.5871, time 121.75ms
iter 395930: loss 5.1552, time 121.88ms
iter 395940: loss 6.2313, time 121.40ms
iter 395950: loss 6.4332, time 121.39ms
iter 395960: loss 5.9653, time 122.90ms
iter 395970: loss 6.8010, time 121.96ms
iter 395980: loss 5.9092, time 121.83ms
iter 395990: loss 6.0291, time 121.97ms
step 396000: train loss 5.5517, val loss 5.6209
saving checkpoint to out-shakespeare-char
iter 396000: loss 6.5151, time 2895.08ms
iter 396010: loss 5.6307, time 122.11ms
iter 396020: loss 5.2780, time 123.08ms
iter 396030: loss 6.1456, time 121.97ms
iter 396040: loss 6.6326, time 122.63ms
iter 396050: loss 6.7737, time 123.09ms
iter 396060: loss 5.7677, time 121.93ms
iter 396070: loss 6.6584, time 121.87ms
iter 396080: loss 6.2839, time 122.22ms
iter 396090: loss 6.0083, time 121.77ms
iter 396100: loss 5.9232, time 121.95ms
iter 396110: loss 5.7688, time 122.16ms
iter 396120: loss 5.4867, time 122.11ms
iter 396130: loss 5.8337, time 121.99ms
iter 396140: loss 6.4828, time 121.85ms
iter 396150: loss 5.7824, time 121.84ms
iter 396160: loss 6.0984, time 123.31ms
iter 396170: loss 5.7921, time 124.09ms
iter 396180: loss 5.8516, time 121.91ms
iter 396190: loss 5.5649, time 124.05ms
iter 396200: loss 6.1035, time 121.92ms
iter 396210: loss 6.3516, time 124.13ms
iter 396220: loss 6.2585, time 121.96ms
iter 396230: loss 5.6012, time 123.97ms
iter 396240: loss 5.4565, time 122.16ms
step 396250: train loss 5.5800, val loss 5.5597
saving checkpoint to out-shakespeare-char
iter 396250: loss 5.4774, time 2895.53ms
iter 396260: loss 6.9268, time 122.00ms
iter 396270: loss 5.8118, time 121.33ms
iter 396280: loss 4.8355, time 121.04ms
iter 396290: loss 5.9793, time 120.81ms
iter 396300: loss 5.7734, time 121.15ms
iter 396310: loss 6.1476, time 121.07ms
iter 396320: loss 6.0909, time 122.16ms
iter 396330: loss 5.7331, time 121.37ms
iter 396340: loss 5.8771, time 121.08ms
iter 396350: loss 6.1999, time 120.86ms
iter 396360: loss 5.7168, time 120.96ms
iter 396370: loss 5.3493, time 121.25ms
iter 396380: loss 5.4595, time 121.07ms
iter 396390: loss 5.7404, time 121.15ms
iter 396400: loss 5.8937, time 121.29ms
iter 396410: loss 5.7508, time 121.15ms
iter 396420: loss 5.2551, time 120.64ms
iter 396430: loss 5.5636, time 119.69ms
iter 396440: loss 6.4602, time 120.84ms
iter 396450: loss 5.9412, time 122.05ms
iter 396460: loss 6.4290, time 121.90ms
iter 396470: loss 5.6757, time 121.45ms
iter 396480: loss 6.1147, time 121.89ms
iter 396490: loss 7.0733, time 122.15ms
step 396500: train loss 5.5844, val loss 5.5831
saving checkpoint to out-shakespeare-char
iter 396500: loss 6.1278, time 2871.90ms
iter 396510: loss 6.1206, time 121.21ms
iter 396520: loss 6.1976, time 130.19ms
iter 396530: loss 6.1771, time 120.83ms
iter 396540: loss 6.6087, time 121.85ms
iter 396550: loss 6.0331, time 123.24ms
iter 396560: loss 6.1954, time 121.68ms
iter 396570: loss 5.8618, time 123.40ms
iter 396580: loss 6.0901, time 120.83ms
iter 396590: loss 6.0255, time 122.88ms
iter 396600: loss 5.2359, time 121.27ms
iter 396610: loss 5.9869, time 121.49ms
iter 396620: loss 6.4130, time 121.41ms
iter 396630: loss 6.0621, time 121.56ms
iter 396640: loss 5.9048, time 121.55ms
iter 396650: loss 6.2795, time 121.54ms
iter 396660: loss 5.6150, time 121.25ms
iter 396670: loss 6.1404, time 121.93ms
iter 396680: loss 5.9257, time 122.06ms
iter 396690: loss 6.1684, time 122.32ms
iter 396700: loss 5.4177, time 122.26ms
iter 396710: loss 6.1161, time 122.61ms
iter 396720: loss 5.9632, time 121.84ms
iter 396730: loss 6.7158, time 122.03ms
iter 396740: loss 5.4812, time 122.65ms
step 396750: train loss 5.5537, val loss 5.6041
saving checkpoint to out-shakespeare-char
iter 396750: loss 5.5628, time 2895.26ms
iter 396760: loss 5.9442, time 125.63ms
iter 396770: loss 5.5684, time 125.91ms
iter 396780: loss 5.7177, time 126.07ms
iter 396790: loss 6.4753, time 125.80ms
iter 396800: loss 5.4756, time 125.87ms
iter 396810: loss 5.6334, time 125.58ms
iter 396820: loss 6.2189, time 125.73ms
iter 396830: loss 6.5971, time 125.30ms
iter 396840: loss 6.7163, time 125.59ms
iter 396850: loss 5.7020, time 125.66ms
iter 396860: loss 6.2719, time 125.91ms
iter 396870: loss 6.3612, time 126.05ms
iter 396880: loss 6.0775, time 125.75ms
iter 396890: loss 5.5072, time 125.79ms
iter 396900: loss 5.9273, time 125.33ms
iter 396910: loss 6.9577, time 125.96ms
iter 396920: loss 6.6077, time 125.25ms
iter 396930: loss 5.9051, time 125.78ms
iter 396940: loss 5.4449, time 125.38ms
iter 396950: loss 5.9168, time 128.30ms
iter 396960: loss 6.1955, time 124.72ms
iter 396970: loss 5.6042, time 125.63ms
iter 396980: loss 6.0140, time 125.30ms
iter 396990: loss 6.0379, time 126.42ms
step 397000: train loss 5.5636, val loss 5.5970
saving checkpoint to out-shakespeare-char
iter 397000: loss 5.4375, time 2882.68ms
iter 397010: loss 6.2616, time 122.04ms
iter 397020: loss 5.4239, time 121.73ms
iter 397030: loss 5.5378, time 121.82ms
iter 397040: loss 5.9279, time 121.62ms
iter 397050: loss 6.3216, time 122.55ms
iter 397060: loss 6.1148, time 121.58ms
iter 397070: loss 5.4230, time 122.62ms
iter 397080: loss 5.9697, time 121.43ms
iter 397090: loss 5.6202, time 123.64ms
iter 397100: loss 5.5516, time 121.52ms
iter 397110: loss 5.7958, time 122.79ms
iter 397120: loss 6.1564, time 121.20ms
iter 397130: loss 5.6325, time 122.20ms
iter 397140: loss 5.7801, time 121.48ms
iter 397150: loss 6.2412, time 122.56ms
iter 397160: loss 5.2861, time 121.49ms
iter 397170: loss 5.1888, time 122.56ms
iter 397180: loss 6.2846, time 121.25ms
iter 397190: loss 5.9243, time 122.89ms
iter 397200: loss 6.8625, time 121.00ms
iter 397210: loss 5.0857, time 122.57ms
iter 397220: loss 5.5418, time 120.69ms
iter 397230: loss 6.1755, time 123.80ms
iter 397240: loss 5.9386, time 121.42ms
step 397250: train loss 5.5987, val loss 5.5433
saving checkpoint to out-shakespeare-char
iter 397250: loss 6.4890, time 2884.11ms
iter 397260: loss 6.5665, time 121.56ms
iter 397270: loss 6.9618, time 122.57ms
iter 397280: loss 5.4751, time 121.42ms
iter 397290: loss 5.9373, time 121.57ms
iter 397300: loss 6.1185, time 121.62ms
iter 397310: loss 5.8637, time 122.48ms
iter 397320: loss 6.1640, time 121.38ms
iter 397330: loss 5.9311, time 122.65ms
iter 397340: loss 5.6377, time 121.35ms
iter 397350: loss 5.4915, time 122.47ms
iter 397360: loss 5.5911, time 121.62ms
iter 397370: loss 5.2514, time 122.55ms
iter 397380: loss 6.1505, time 121.60ms
iter 397390: loss 5.5895, time 122.61ms
iter 397400: loss 6.3118, time 121.46ms
iter 397410: loss 6.3593, time 121.59ms
iter 397420: loss 6.1193, time 120.69ms
iter 397430: loss 5.3444, time 122.96ms
iter 397440: loss 5.6445, time 121.29ms
iter 397450: loss 5.7596, time 122.40ms
iter 397460: loss 5.4719, time 121.38ms
iter 397470: loss 6.0130, time 123.36ms
iter 397480: loss 5.7434, time 121.42ms
iter 397490: loss 5.6106, time 122.66ms
step 397500: train loss 5.5502, val loss 5.5037
saving checkpoint to out-shakespeare-char
iter 397500: loss 7.0595, time 2884.90ms
iter 397510: loss 6.1742, time 120.71ms
iter 397520: loss 5.7288, time 121.54ms
iter 397530: loss 5.7106, time 121.37ms
iter 397540: loss 6.3765, time 121.48ms
iter 397550: loss 5.9637, time 121.40ms
iter 397560: loss 5.9847, time 121.57ms
iter 397570: loss 6.5133, time 121.50ms
iter 397580: loss 6.1925, time 122.74ms
iter 397590: loss 5.8950, time 121.84ms
iter 397600: loss 5.7358, time 121.40ms
iter 397610: loss 6.4891, time 121.46ms
iter 397620: loss 5.8103, time 121.65ms
iter 397630: loss 6.1533, time 121.59ms
iter 397640: loss 4.9307, time 121.62ms
iter 397650: loss 6.1467, time 121.48ms
iter 397660: loss 6.5663, time 121.20ms
iter 397670: loss 6.1154, time 120.60ms
iter 397680: loss 5.9540, time 124.24ms
iter 397690: loss 6.2573, time 121.57ms
iter 397700: loss 6.5535, time 123.72ms
iter 397710: loss 5.3387, time 123.03ms
iter 397720: loss 6.3647, time 122.13ms
iter 397730: loss 6.2132, time 121.73ms
iter 397740: loss 5.6670, time 122.16ms
step 397750: train loss 5.5986, val loss 5.5514
saving checkpoint to out-shakespeare-char
iter 397750: loss 5.7222, time 2887.22ms
iter 397760: loss 5.4964, time 121.99ms
iter 397770: loss 6.5297, time 121.81ms
iter 397780: loss 6.2235, time 121.46ms
iter 397790: loss 6.2946, time 122.80ms
iter 397800: loss 5.8189, time 122.00ms
iter 397810: loss 6.3074, time 121.85ms
iter 397820: loss 5.3105, time 121.87ms
iter 397830: loss 6.3452, time 121.91ms
iter 397840: loss 5.0990, time 122.93ms
iter 397850: loss 6.7343, time 122.56ms
iter 397860: loss 6.3676, time 123.20ms
iter 397870: loss 5.7761, time 121.95ms
iter 397880: loss 6.5805, time 122.92ms
iter 397890: loss 6.1075, time 121.82ms
iter 397900: loss 5.8301, time 123.44ms
iter 397910: loss 6.3948, time 122.07ms
iter 397920: loss 6.0861, time 123.23ms
iter 397930: loss 5.8046, time 121.75ms
iter 397940: loss 5.3510, time 121.68ms
iter 397950: loss 5.8822, time 122.25ms
iter 397960: loss 6.1903, time 121.73ms
iter 397970: loss 5.8601, time 121.91ms
iter 397980: loss 5.8265, time 122.04ms
iter 397990: loss 5.9218, time 121.79ms
step 398000: train loss 5.5884, val loss 5.6037
saving checkpoint to out-shakespeare-char
iter 398000: loss 6.0018, time 2896.54ms
iter 398010: loss 5.5831, time 121.54ms
iter 398020: loss 6.2943, time 121.47ms
iter 398030: loss 6.0675, time 121.61ms
iter 398040: loss 6.1289, time 121.35ms
iter 398050: loss 5.8763, time 121.26ms
iter 398060: loss 5.8634, time 121.69ms
iter 398070: loss 6.0139, time 121.46ms
iter 398080: loss 6.5465, time 121.30ms
iter 398090: loss 6.0531, time 120.29ms
iter 398100: loss 5.5900, time 121.26ms
iter 398110: loss 5.9437, time 121.51ms
iter 398120: loss 6.9459, time 121.38ms
iter 398130: loss 6.1346, time 121.73ms
iter 398140: loss 6.7186, time 121.73ms
iter 398150: loss 6.1213, time 122.10ms
iter 398160: loss 6.2940, time 121.54ms
iter 398170: loss 6.3002, time 121.66ms
iter 398180: loss 6.4393, time 121.77ms
iter 398190: loss 6.2814, time 122.28ms
iter 398200: loss 6.0085, time 121.56ms
iter 398210: loss 6.0335, time 121.85ms
iter 398220: loss 6.2275, time 121.78ms
iter 398230: loss 6.1971, time 121.52ms
iter 398240: loss 6.0058, time 121.65ms
step 398250: train loss 5.5484, val loss 5.5893
saving checkpoint to out-shakespeare-char
iter 398250: loss 6.1654, time 2887.26ms
iter 398260: loss 5.9323, time 121.73ms
iter 398270: loss 5.8501, time 120.92ms
iter 398280: loss 6.0308, time 121.68ms
iter 398290: loss 6.5785, time 121.60ms
iter 398300: loss 6.6194, time 121.67ms
iter 398310: loss 6.4283, time 121.61ms
iter 398320: loss 5.7444, time 122.29ms
iter 398330: loss 6.1903, time 121.31ms
iter 398340: loss 5.8997, time 121.68ms
iter 398350: loss 6.5769, time 120.21ms
iter 398360: loss 5.8170, time 121.55ms
iter 398370: loss 6.0488, time 121.38ms
iter 398380: loss 6.2874, time 121.61ms
iter 398390: loss 5.3220, time 121.49ms
iter 398400: loss 6.1966, time 121.60ms
iter 398410: loss 6.2220, time 121.29ms
iter 398420: loss 6.2173, time 121.47ms
iter 398430: loss 6.0937, time 121.35ms
iter 398440: loss 6.1095, time 121.49ms
iter 398450: loss 5.5987, time 121.50ms
iter 398460: loss 5.5667, time 121.85ms
iter 398470: loss 6.4140, time 121.51ms
iter 398480: loss 5.2011, time 121.57ms
iter 398490: loss 5.8771, time 121.37ms
step 398500: train loss 5.5674, val loss 5.6210
saving checkpoint to out-shakespeare-char
iter 398500: loss 5.9337, time 2891.63ms
iter 398510: loss 6.1276, time 121.44ms
iter 398520: loss 6.1465, time 122.98ms
iter 398530: loss 5.3236, time 121.49ms
iter 398540: loss 6.5109, time 123.75ms
iter 398550: loss 5.9656, time 121.42ms
iter 398560: loss 5.7411, time 123.97ms
iter 398570: loss 6.2485, time 121.40ms
iter 398580: loss 6.4519, time 124.15ms
iter 398590: loss 5.8818, time 121.27ms
iter 398600: loss 5.1490, time 123.60ms
iter 398610: loss 6.6681, time 121.66ms
iter 398620: loss 5.9846, time 123.61ms
iter 398630: loss 5.6172, time 121.47ms
iter 398640: loss 5.7575, time 123.70ms
iter 398650: loss 6.2514, time 121.48ms
iter 398660: loss 6.3595, time 123.70ms
iter 398670: loss 6.2276, time 121.59ms
iter 398680: loss 5.6784, time 123.64ms
iter 398690: loss 5.3869, time 121.57ms
iter 398700: loss 6.2323, time 123.72ms
iter 398710: loss 5.9549, time 121.64ms
iter 398720: loss 6.5235, time 123.79ms
iter 398730: loss 5.9056, time 121.49ms
iter 398740: loss 6.2482, time 123.70ms
step 398750: train loss 5.5579, val loss 5.5903
saving checkpoint to out-shakespeare-char
iter 398750: loss 6.0212, time 2897.93ms
iter 398760: loss 5.6028, time 125.86ms
iter 398770: loss 5.2666, time 127.44ms
iter 398780: loss 5.9706, time 125.04ms
iter 398790: loss 6.2807, time 125.03ms
iter 398800: loss 5.7666, time 125.14ms
iter 398810: loss 5.8887, time 125.26ms
iter 398820: loss 6.0280, time 125.37ms
iter 398830: loss 6.3291, time 125.31ms
iter 398840: loss 5.7706, time 128.21ms
iter 398850: loss 6.6619, time 125.63ms
iter 398860: loss 6.0844, time 125.77ms
iter 398870: loss 5.8054, time 125.60ms
iter 398880: loss 6.5095, time 128.67ms
iter 398890: loss 5.2426, time 125.36ms
iter 398900: loss 5.9216, time 125.66ms
iter 398910: loss 6.3993, time 125.71ms
iter 398920: loss 6.1997, time 125.53ms
iter 398930: loss 5.9726, time 125.53ms
iter 398940: loss 6.3458, time 126.07ms
iter 398950: loss 6.0959, time 125.65ms
iter 398960: loss 6.0658, time 125.39ms
iter 398970: loss 6.1192, time 125.63ms
iter 398980: loss 5.7410, time 125.90ms
iter 398990: loss 6.0944, time 128.58ms
step 399000: train loss 5.5780, val loss 5.6063
saving checkpoint to out-shakespeare-char
iter 399000: loss 6.2851, time 2894.16ms
iter 399010: loss 6.1516, time 121.50ms
iter 399020: loss 6.3995, time 122.75ms
iter 399030: loss 6.5340, time 121.71ms
iter 399040: loss 6.1848, time 121.66ms
iter 399050: loss 5.8618, time 122.09ms
iter 399060: loss 5.3535, time 122.76ms
iter 399070: loss 6.1857, time 121.56ms
iter 399080: loss 6.0109, time 123.00ms
iter 399090: loss 6.0096, time 121.76ms
iter 399100: loss 5.6138, time 122.88ms
iter 399110: loss 6.1749, time 122.79ms
iter 399120: loss 5.6314, time 122.75ms
iter 399130: loss 6.0292, time 121.71ms
iter 399140: loss 5.5534, time 122.80ms
iter 399150: loss 5.4725, time 121.53ms
iter 399160: loss 5.9291, time 122.79ms
iter 399170: loss 5.9170, time 121.59ms
iter 399180: loss 5.9698, time 122.79ms
iter 399190: loss 6.1257, time 121.56ms
iter 399200: loss 7.0772, time 122.49ms
iter 399210: loss 5.9769, time 121.64ms
iter 399220: loss 6.0354, time 123.02ms
iter 399230: loss 6.3960, time 121.34ms
iter 399240: loss 6.1471, time 122.81ms
step 399250: train loss 5.6002, val loss 5.5933
saving checkpoint to out-shakespeare-char
iter 399250: loss 6.3768, time 2914.31ms
iter 399260: loss 5.5370, time 125.60ms
iter 399270: loss 6.5897, time 125.81ms
iter 399280: loss 5.6074, time 125.31ms
iter 399290: loss 5.8844, time 125.23ms
iter 399300: loss 5.7327, time 125.27ms
iter 399310: loss 5.9743, time 127.60ms
iter 399320: loss 5.7946, time 125.12ms
iter 399330: loss 5.5362, time 125.27ms
iter 399340: loss 6.1805, time 125.40ms
iter 399350: loss 5.6722, time 125.39ms
iter 399360: loss 6.1045, time 125.21ms
iter 399370: loss 6.1901, time 125.22ms
iter 399380: loss 5.4694, time 125.19ms
iter 399390: loss 6.2315, time 125.13ms
iter 399400: loss 6.1085, time 126.41ms
iter 399410: loss 6.0792, time 125.38ms
iter 399420: loss 5.8805, time 127.81ms
iter 399430: loss 5.9013, time 124.66ms
iter 399440: loss 6.5769, time 125.02ms
iter 399450: loss 5.8503, time 125.87ms
iter 399460: loss 5.9539, time 125.52ms
iter 399470: loss 5.9632, time 125.05ms
iter 399480: loss 5.7752, time 127.76ms
iter 399490: loss 6.1544, time 125.11ms
step 399500: train loss 5.6025, val loss 5.5971
saving checkpoint to out-shakespeare-char
iter 399500: loss 5.5455, time 2868.04ms
iter 399510: loss 5.6248, time 125.97ms
iter 399520: loss 6.5426, time 127.86ms
iter 399530: loss 6.2781, time 125.28ms
iter 399540: loss 5.7285, time 125.27ms
iter 399550: loss 6.0805, time 125.31ms
iter 399560: loss 6.0849, time 125.38ms
iter 399570: loss 6.1994, time 124.43ms
iter 399580: loss 5.9022, time 124.45ms
iter 399590: loss 5.2238, time 124.19ms
iter 399600: loss 5.8009, time 124.56ms
iter 399610: loss 6.2155, time 125.15ms
iter 399620: loss 6.4898, time 125.29ms
iter 399630: loss 5.9893, time 127.46ms
iter 399640: loss 6.3835, time 125.90ms
iter 399650: loss 5.9778, time 125.08ms
iter 399660: loss 6.3601, time 125.54ms
iter 399670: loss 5.4683, time 125.29ms
iter 399680: loss 5.6820, time 125.35ms
iter 399690: loss 6.0870, time 124.62ms
iter 399700: loss 6.2107, time 122.12ms
iter 399710: loss 6.4935, time 121.35ms
iter 399720: loss 5.8949, time 122.79ms
iter 399730: loss 6.3129, time 121.78ms
iter 399740: loss 5.5770, time 123.02ms
step 399750: train loss 5.5981, val loss 5.5863
saving checkpoint to out-shakespeare-char
iter 399750: loss 6.2283, time 2894.70ms
iter 399760: loss 5.6563, time 126.07ms
iter 399770: loss 5.6119, time 126.20ms
iter 399780: loss 6.1030, time 128.18ms
iter 399790: loss 6.2690, time 125.51ms
iter 399800: loss 5.5935, time 125.46ms
iter 399810: loss 6.2065, time 125.47ms
iter 399820: loss 6.0145, time 125.64ms
iter 399830: loss 6.2162, time 125.85ms
iter 399840: loss 5.5924, time 125.79ms
iter 399850: loss 5.8638, time 125.75ms
iter 399860: loss 5.8132, time 125.60ms
iter 399870: loss 5.7930, time 125.56ms
iter 399880: loss 5.1937, time 125.91ms
iter 399890: loss 6.3364, time 128.18ms
iter 399900: loss 5.9118, time 126.17ms
iter 399910: loss 6.1415, time 125.54ms
iter 399920: loss 5.3124, time 126.08ms
iter 399930: loss 6.1262, time 125.09ms
iter 399940: loss 5.8410, time 125.49ms
iter 399950: loss 5.4736, time 125.50ms
iter 399960: loss 6.0073, time 128.62ms
iter 399970: loss 6.5217, time 124.71ms
iter 399980: loss 5.7691, time 125.24ms
iter 399990: loss 5.5681, time 125.28ms
step 400000: train loss 5.5699, val loss 5.6285
saving checkpoint to out-shakespeare-char
iter 400000: loss 6.5451, time 2901.59ms
iter 400010: loss 6.4806, time 125.99ms
iter 400020: loss 5.2834, time 124.38ms
iter 400030: loss 5.9463, time 126.12ms
iter 400040: loss 6.3839, time 126.70ms
iter 400050: loss 6.9231, time 125.23ms
iter 400060: loss 6.0445, time 125.22ms
iter 400070: loss 6.3081, time 125.21ms
iter 400080: loss 5.7698, time 125.18ms
iter 400090: loss 6.2219, time 125.14ms
iter 400100: loss 6.0181, time 125.29ms
iter 400110: loss 6.5917, time 125.29ms
iter 400120: loss 6.1545, time 125.05ms
iter 400130: loss 5.7018, time 125.19ms
iter 400140: loss 5.5700, time 125.45ms
iter 400150: loss 5.5556, time 127.53ms
iter 400160: loss 6.3441, time 125.09ms
iter 400170: loss 5.9776, time 124.60ms
iter 400180: loss 6.3002, time 125.68ms
iter 400190: loss 5.5182, time 125.10ms
iter 400200: loss 5.4416, time 125.11ms
iter 400210: loss 6.6548, time 125.27ms
iter 400220: loss 5.9551, time 125.42ms
iter 400230: loss 6.0999, time 125.40ms
iter 400240: loss 5.7744, time 124.84ms
step 400250: train loss 5.5895, val loss 5.5780
saving checkpoint to out-shakespeare-char
iter 400250: loss 5.7673, time 2888.94ms
iter 400260: loss 6.2490, time 123.55ms
iter 400270: loss 6.4631, time 121.09ms
iter 400280: loss 6.2120, time 123.70ms
iter 400290: loss 6.0930, time 120.67ms
iter 400300: loss 5.6638, time 123.60ms
iter 400310: loss 6.0287, time 121.32ms
iter 400320: loss 5.6290, time 124.03ms
iter 400330: loss 6.0209, time 121.58ms
iter 400340: loss 5.7919, time 123.54ms
iter 400350: loss 6.2438, time 121.40ms
iter 400360: loss 5.9547, time 123.58ms
iter 400370: loss 5.8719, time 121.31ms
iter 400380: loss 5.4633, time 123.51ms
iter 400390: loss 5.8115, time 121.57ms
iter 400400: loss 6.3062, time 123.67ms
iter 400410: loss 6.2453, time 121.14ms
iter 400420: loss 5.9388, time 123.61ms
iter 400430: loss 5.5764, time 121.44ms
iter 400440: loss 5.9949, time 123.24ms
iter 400450: loss 5.8599, time 121.46ms
iter 400460: loss 6.3409, time 124.19ms
iter 400470: loss 6.4248, time 121.55ms
iter 400480: loss 5.6831, time 123.68ms
iter 400490: loss 5.8280, time 121.67ms
step 400500: train loss 5.5678, val loss 5.5730
saving checkpoint to out-shakespeare-char
iter 400500: loss 5.8072, time 2912.15ms
iter 400510: loss 5.8465, time 121.09ms
iter 400520: loss 5.8224, time 121.16ms
iter 400530: loss 6.4767, time 121.41ms
iter 400540: loss 6.1583, time 121.45ms
iter 400550: loss 5.7427, time 121.58ms
iter 400560: loss 6.2292, time 121.51ms
iter 400570: loss 5.9674, time 121.82ms
iter 400580: loss 5.5360, time 121.50ms
iter 400590: loss 6.1974, time 122.22ms
iter 400600: loss 5.9266, time 121.84ms
iter 400610: loss 5.9437, time 121.06ms
iter 400620: loss 5.8226, time 120.03ms
iter 400630: loss 6.1937, time 120.98ms
iter 400640: loss 5.6255, time 121.50ms
iter 400650: loss 6.1703, time 122.13ms
iter 400660: loss 5.7861, time 121.63ms
iter 400670: loss 6.8245, time 121.43ms
iter 400680: loss 6.1299, time 122.60ms
iter 400690: loss 5.5545, time 121.43ms
iter 400700: loss 6.0791, time 121.35ms
iter 400710: loss 5.7021, time 121.55ms
iter 400720: loss 5.8300, time 121.57ms
iter 400730: loss 6.1306, time 121.54ms
iter 400740: loss 6.0846, time 121.33ms
step 400750: train loss 5.5744, val loss 5.6172
saving checkpoint to out-shakespeare-char
iter 400750: loss 6.1458, time 2896.27ms
iter 400760: loss 6.2695, time 124.37ms
iter 400770: loss 5.7955, time 124.59ms
iter 400780: loss 6.3163, time 124.84ms
iter 400790: loss 6.4424, time 124.19ms
iter 400800: loss 5.7295, time 124.80ms
iter 400810: loss 6.8270, time 124.69ms
iter 400820: loss 5.7052, time 127.41ms
iter 400830: loss 5.7626, time 124.45ms
iter 400840: loss 6.3941, time 125.96ms
iter 400850: loss 6.8814, time 124.36ms
iter 400860: loss 6.8136, time 125.78ms
iter 400870: loss 6.3532, time 125.59ms
iter 400880: loss 5.8334, time 124.20ms
iter 400890: loss 5.8508, time 125.27ms
iter 400900: loss 5.7251, time 124.12ms
iter 400910: loss 5.8100, time 126.24ms
iter 400920: loss 5.7533, time 125.55ms
iter 400930: loss 6.2111, time 125.45ms
iter 400940: loss 5.1942, time 125.48ms
iter 400950: loss 5.7524, time 125.68ms
iter 400960: loss 6.0292, time 127.41ms
iter 400970: loss 5.8064, time 125.31ms
iter 400980: loss 6.4441, time 125.92ms
iter 400990: loss 6.0135, time 125.31ms
step 401000: train loss 5.6142, val loss 5.6456
saving checkpoint to out-shakespeare-char
iter 401000: loss 6.3163, time 2879.55ms
iter 401010: loss 5.9171, time 125.99ms
iter 401020: loss 5.4450, time 125.86ms
iter 401030: loss 6.1487, time 125.92ms
iter 401040: loss 5.6890, time 128.53ms
iter 401050: loss 6.2622, time 125.54ms
iter 401060: loss 6.8004, time 125.75ms
iter 401070: loss 6.0656, time 126.21ms
iter 401080: loss 6.5934, time 125.17ms
iter 401090: loss 5.7768, time 125.44ms
iter 401100: loss 6.1445, time 126.21ms
iter 401110: loss 5.8305, time 126.23ms
iter 401120: loss 6.2309, time 125.62ms
iter 401130: loss 5.9013, time 123.64ms
iter 401140: loss 5.4554, time 124.97ms
iter 401150: loss 5.7339, time 124.92ms
iter 401160: loss 5.7112, time 124.84ms
iter 401170: loss 5.9736, time 124.31ms
iter 401180: loss 5.8794, time 126.64ms
iter 401190: loss 5.2426, time 124.76ms
iter 401200: loss 5.7817, time 124.67ms
iter 401210: loss 6.2175, time 124.42ms
iter 401220: loss 5.9854, time 124.10ms
iter 401230: loss 6.0152, time 125.06ms
iter 401240: loss 5.7462, time 124.75ms
step 401250: train loss 5.6249, val loss 5.5655
saving checkpoint to out-shakespeare-char
iter 401250: loss 5.9985, time 2881.03ms
iter 401260: loss 5.6165, time 124.04ms
iter 401270: loss 6.4037, time 124.80ms
iter 401280: loss 5.7861, time 124.79ms
iter 401290: loss 5.0819, time 124.80ms
iter 401300: loss 5.7877, time 124.35ms
iter 401310: loss 5.7847, time 124.35ms
iter 401320: loss 6.2139, time 127.56ms
iter 401330: loss 5.8991, time 124.72ms
iter 401340: loss 5.8310, time 124.42ms
iter 401350: loss 6.0306, time 124.90ms
iter 401360: loss 5.2538, time 124.85ms
iter 401370: loss 5.8439, time 125.45ms
iter 401380: loss 5.3945, time 124.82ms
iter 401390: loss 5.6088, time 126.87ms
iter 401400: loss 6.2505, time 125.26ms
iter 401410: loss 5.7740, time 125.07ms
iter 401420: loss 6.2510, time 125.27ms
iter 401430: loss 5.7030, time 124.89ms
iter 401440: loss 5.9148, time 124.80ms
iter 401450: loss 6.3109, time 124.81ms
iter 401460: loss 5.5728, time 124.96ms
iter 401470: loss 6.4492, time 124.75ms
iter 401480: loss 5.2903, time 124.85ms
iter 401490: loss 5.5896, time 124.86ms
step 401500: train loss 5.5483, val loss 5.5308
saving checkpoint to out-shakespeare-char
iter 401500: loss 6.4911, time 2878.89ms
iter 401510: loss 5.4131, time 125.14ms
iter 401520: loss 5.8007, time 124.86ms
iter 401530: loss 6.4850, time 124.85ms
iter 401540: loss 5.9980, time 124.90ms
iter 401550: loss 5.7833, time 124.74ms
iter 401560: loss 5.8465, time 124.17ms
iter 401570: loss 6.4917, time 124.70ms
iter 401580: loss 5.9574, time 127.37ms
iter 401590: loss 6.2626, time 124.19ms
iter 401600: loss 6.2612, time 124.70ms
iter 401610: loss 5.9292, time 124.98ms
iter 401620: loss 6.2917, time 124.94ms
iter 401630: loss 5.7057, time 123.86ms
iter 401640: loss 6.3304, time 124.46ms
iter 401650: loss 5.5885, time 124.99ms
iter 401660: loss 5.7554, time 124.76ms
iter 401670: loss 6.1889, time 123.78ms
iter 401680: loss 6.1247, time 124.72ms
iter 401690: loss 6.0209, time 127.13ms
iter 401700: loss 5.6049, time 125.06ms
iter 401710: loss 5.6975, time 124.41ms
iter 401720: loss 5.7763, time 123.51ms
iter 401730: loss 6.4977, time 124.83ms
iter 401740: loss 5.7672, time 125.50ms
step 401750: train loss 5.6237, val loss 5.6020
saving checkpoint to out-shakespeare-char
iter 401750: loss 5.7327, time 2870.33ms
iter 401760: loss 5.4968, time 121.49ms
iter 401770: loss 5.7541, time 119.79ms
iter 401780: loss 5.0962, time 120.75ms
iter 401790: loss 5.8235, time 120.85ms
iter 401800: loss 5.4681, time 120.66ms
iter 401810: loss 6.4766, time 119.65ms
iter 401820: loss 6.0261, time 121.56ms
iter 401830: loss 6.6388, time 121.37ms
iter 401840: loss 5.7203, time 122.61ms
iter 401850: loss 6.0406, time 121.39ms
iter 401860: loss 5.6985, time 123.90ms
iter 401870: loss 6.1456, time 121.25ms
iter 401880: loss 6.5286, time 123.13ms
iter 401890: loss 6.7192, time 120.73ms
iter 401900: loss 6.6063, time 122.63ms
iter 401910: loss 5.7111, time 121.47ms
iter 401920: loss 6.2680, time 122.78ms
iter 401930: loss 6.1148, time 121.65ms
iter 401940: loss 6.5668, time 122.74ms
iter 401950: loss 5.8533, time 120.98ms
iter 401960: loss 6.5664, time 123.07ms
iter 401970: loss 5.9918, time 121.33ms
iter 401980: loss 5.6235, time 122.75ms
iter 401990: loss 6.5512, time 121.53ms
step 402000: train loss 5.5816, val loss 5.5741
saving checkpoint to out-shakespeare-char
iter 402000: loss 5.7949, time 2891.96ms
iter 402010: loss 6.1648, time 127.66ms
iter 402020: loss 6.5522, time 124.91ms
iter 402030: loss 5.9040, time 125.13ms
iter 402040: loss 5.8981, time 125.07ms
iter 402050: loss 6.2067, time 124.39ms
iter 402060: loss 5.5684, time 124.94ms
iter 402070: loss 5.2248, time 125.27ms
iter 402080: loss 5.6443, time 125.72ms
iter 402090: loss 5.8777, time 124.86ms
iter 402100: loss 6.0362, time 124.89ms
iter 402110: loss 5.9979, time 125.70ms
iter 402120: loss 6.0074, time 124.62ms
iter 402130: loss 6.3574, time 124.07ms
iter 402140: loss 5.6777, time 124.94ms
iter 402150: loss 6.0493, time 125.22ms
iter 402160: loss 5.7138, time 124.97ms
iter 402170: loss 7.0996, time 124.98ms
iter 402180: loss 6.0557, time 124.72ms
iter 402190: loss 5.4594, time 127.24ms
iter 402200: loss 5.6647, time 124.69ms
iter 402210: loss 6.2845, time 125.34ms
iter 402220: loss 6.0221, time 125.20ms
iter 402230: loss 5.4833, time 125.12ms
iter 402240: loss 5.7843, time 125.16ms
step 402250: train loss 5.5706, val loss 5.6517
saving checkpoint to out-shakespeare-char
iter 402250: loss 6.2820, time 2891.61ms
iter 402260: loss 6.0499, time 124.19ms
iter 402270: loss 5.3570, time 124.77ms
iter 402280: loss 6.9765, time 125.46ms
iter 402290: loss 5.9139, time 125.02ms
iter 402300: loss 5.9159, time 125.21ms
iter 402310: loss 5.5044, time 125.69ms
iter 402320: loss 5.5917, time 124.95ms
iter 402330: loss 6.6916, time 125.84ms
iter 402340: loss 6.2286, time 132.02ms
iter 402350: loss 7.0385, time 125.79ms
iter 402360: loss 4.9739, time 125.26ms
iter 402370: loss 5.6986, time 125.18ms
iter 402380: loss 5.9380, time 125.05ms
iter 402390: loss 5.1359, time 125.10ms
iter 402400: loss 5.7132, time 125.41ms
iter 402410: loss 5.8081, time 124.55ms
iter 402420: loss 5.4725, time 125.06ms
iter 402430: loss 6.5235, time 124.99ms
iter 402440: loss 6.6111, time 125.39ms
iter 402450: loss 6.9672, time 126.73ms
iter 402460: loss 5.8954, time 125.15ms
iter 402470: loss 6.2534, time 123.75ms
iter 402480: loss 6.0714, time 125.23ms
iter 402490: loss 6.1401, time 123.90ms
step 402500: train loss 5.5633, val loss 5.5805
saving checkpoint to out-shakespeare-char
iter 402500: loss 5.8282, time 2876.11ms
iter 402510: loss 5.9145, time 125.42ms
iter 402520: loss 6.2336, time 126.09ms
iter 402530: loss 5.8744, time 125.02ms
iter 402540: loss 5.6865, time 125.12ms
iter 402550: loss 6.1042, time 125.25ms
iter 402560: loss 5.3944, time 125.18ms
iter 402570: loss 5.9489, time 125.05ms
iter 402580: loss 5.9136, time 124.01ms
iter 402590: loss 6.8555, time 127.12ms
iter 402600: loss 5.2609, time 125.69ms
iter 402610: loss 6.5755, time 127.36ms
iter 402620: loss 5.4495, time 125.77ms
iter 402630: loss 5.9504, time 125.02ms
iter 402640: loss 6.2180, time 125.13ms
iter 402650: loss 6.7762, time 125.35ms
iter 402660: loss 5.5617, time 124.99ms
iter 402670: loss 6.2860, time 123.91ms
iter 402680: loss 6.3517, time 126.63ms
iter 402690: loss 5.6123, time 125.07ms
iter 402700: loss 5.8767, time 123.93ms
iter 402710: loss 6.4460, time 125.10ms
iter 402720: loss 5.7500, time 124.60ms
iter 402730: loss 5.7069, time 125.02ms
iter 402740: loss 6.0309, time 127.44ms
step 402750: train loss 5.5566, val loss 5.5928
saving checkpoint to out-shakespeare-char
iter 402750: loss 6.2976, time 2883.71ms
iter 402760: loss 5.7505, time 125.32ms
iter 402770: loss 6.1437, time 124.64ms
iter 402780: loss 6.9419, time 126.97ms
iter 402790: loss 5.7133, time 125.36ms
iter 402800: loss 6.8026, time 125.33ms
iter 402810: loss 6.2743, time 125.54ms
iter 402820: loss 6.4221, time 127.46ms
iter 402830: loss 5.9321, time 127.98ms
iter 402840: loss 6.0825, time 125.05ms
iter 402850: loss 5.8160, time 125.17ms
iter 402860: loss 6.7982, time 126.58ms
iter 402870: loss 6.5153, time 124.69ms
iter 402880: loss 5.6214, time 125.24ms
iter 402890: loss 6.7592, time 125.10ms
iter 402900: loss 5.5764, time 124.55ms
iter 402910: loss 6.1262, time 124.89ms
iter 402920: loss 5.4102, time 125.31ms
iter 402930: loss 6.4162, time 125.22ms
iter 402940: loss 5.7127, time 124.61ms
iter 402950: loss 5.5804, time 125.24ms
iter 402960: loss 6.1139, time 125.51ms
iter 402970: loss 6.0549, time 125.68ms
iter 402980: loss 6.1757, time 126.89ms
iter 402990: loss 5.2331, time 125.03ms
step 403000: train loss 5.5597, val loss 5.5944
saving checkpoint to out-shakespeare-char
iter 403000: loss 5.6720, time 2907.96ms
iter 403010: loss 5.9169, time 125.81ms
iter 403020: loss 5.5775, time 125.39ms
iter 403030: loss 5.9785, time 125.31ms
iter 403040: loss 5.7308, time 125.60ms
iter 403050: loss 6.4312, time 125.52ms
iter 403060: loss 6.5802, time 127.42ms
iter 403070: loss 6.0728, time 126.16ms
iter 403080: loss 6.2454, time 125.47ms
iter 403090: loss 5.9495, time 125.47ms
iter 403100: loss 5.9985, time 125.38ms
iter 403110: loss 6.4174, time 123.88ms
iter 403120: loss 5.9046, time 125.53ms
iter 403130: loss 6.8196, time 125.53ms
iter 403140: loss 5.5552, time 125.66ms
iter 403150: loss 5.8728, time 125.60ms
iter 403160: loss 6.0345, time 125.53ms
iter 403170: loss 5.3025, time 125.53ms
iter 403180: loss 5.6932, time 124.27ms
iter 403190: loss 5.8446, time 125.21ms
iter 403200: loss 5.6288, time 124.89ms
iter 403210: loss 6.5832, time 125.00ms
iter 403220: loss 6.3393, time 124.97ms
iter 403230: loss 5.8300, time 127.51ms
iter 403240: loss 6.0405, time 125.02ms
step 403250: train loss 5.5761, val loss 5.5425
saving checkpoint to out-shakespeare-char
iter 403250: loss 6.9548, time 2887.62ms
iter 403260: loss 5.5065, time 125.36ms
iter 403270: loss 5.8225, time 125.04ms
iter 403280: loss 6.4191, time 125.04ms
iter 403290: loss 5.7910, time 124.91ms
iter 403300: loss 5.3912, time 124.24ms
iter 403310: loss 5.5558, time 126.36ms
iter 403320: loss 5.8675, time 124.94ms
iter 403330: loss 4.8545, time 125.23ms
iter 403340: loss 6.2696, time 125.07ms
iter 403350: loss 5.7413, time 125.22ms
iter 403360: loss 6.3420, time 125.06ms
iter 403370: loss 5.7092, time 125.56ms
iter 403380: loss 5.7554, time 125.24ms
iter 403390: loss 5.1805, time 127.52ms
iter 403400: loss 5.8067, time 124.77ms
iter 403410: loss 6.2073, time 126.65ms
iter 403420: loss 5.8081, time 124.94ms
iter 403430: loss 6.3973, time 125.17ms
iter 403440: loss 5.7737, time 125.11ms
iter 403450: loss 5.0147, time 124.79ms
iter 403460: loss 5.6694, time 125.32ms
iter 403470: loss 5.8615, time 125.09ms
iter 403480: loss 5.3511, time 125.52ms
iter 403490: loss 5.8046, time 124.75ms
step 403500: train loss 5.5753, val loss 5.5901
saving checkpoint to out-shakespeare-char
iter 403500: loss 6.3919, time 2900.17ms
iter 403510: loss 5.9206, time 125.45ms
iter 403520: loss 5.7992, time 125.55ms
iter 403530: loss 6.0654, time 124.72ms
iter 403540: loss 6.3299, time 125.64ms
iter 403550: loss 5.2200, time 125.33ms
iter 403560: loss 5.8998, time 125.41ms
iter 403570: loss 5.7387, time 125.92ms
iter 403580: loss 5.5120, time 125.54ms
iter 403590: loss 5.9889, time 127.94ms
iter 403600: loss 6.3109, time 125.63ms
iter 403610: loss 5.3716, time 125.24ms
iter 403620: loss 5.4453, time 125.50ms
iter 403630: loss 5.8265, time 125.42ms
iter 403640: loss 5.5195, time 125.73ms
iter 403650: loss 6.2456, time 125.68ms
iter 403660: loss 6.0873, time 126.34ms
iter 403670: loss 5.7779, time 126.87ms
iter 403680: loss 6.0456, time 124.86ms
iter 403690: loss 6.0150, time 125.42ms
iter 403700: loss 5.8010, time 126.01ms
iter 403710: loss 5.8585, time 127.86ms
iter 403720: loss 5.9565, time 125.53ms
iter 403730: loss 6.1290, time 124.69ms
iter 403740: loss 6.5695, time 125.54ms
step 403750: train loss 5.5710, val loss 5.5504
saving checkpoint to out-shakespeare-char
iter 403750: loss 5.7573, time 2876.58ms
iter 403760: loss 5.5397, time 121.32ms
iter 403770: loss 6.2412, time 122.48ms
iter 403780: loss 6.0587, time 121.29ms
iter 403790: loss 5.9640, time 122.66ms
iter 403800: loss 5.7724, time 121.73ms
iter 403810: loss 5.7501, time 122.31ms
iter 403820: loss 5.3683, time 122.62ms
iter 403830: loss 5.6293, time 121.38ms
iter 403840: loss 6.8181, time 121.28ms
iter 403850: loss 5.8774, time 121.74ms
iter 403860: loss 5.9670, time 121.33ms
iter 403870: loss 5.9817, time 121.25ms
iter 403880: loss 6.5187, time 121.39ms
iter 403890: loss 6.0954, time 121.72ms
iter 403900: loss 5.5129, time 124.62ms
iter 403910: loss 5.8344, time 125.10ms
iter 403920: loss 5.8598, time 125.25ms
iter 403930: loss 5.5141, time 124.62ms
iter 403940: loss 6.0018, time 124.98ms
iter 403950: loss 5.7942, time 124.79ms
iter 403960: loss 7.1745, time 124.83ms
iter 403970: loss 5.9097, time 125.06ms
iter 403980: loss 5.5462, time 124.97ms
iter 403990: loss 5.8369, time 125.16ms
step 404000: train loss 5.5651, val loss 5.5961
saving checkpoint to out-shakespeare-char
iter 404000: loss 5.8949, time 2880.55ms
iter 404010: loss 6.1020, time 121.25ms
iter 404020: loss 6.1966, time 123.15ms
iter 404030: loss 5.9787, time 121.33ms
iter 404040: loss 6.1301, time 123.06ms
iter 404050: loss 5.6035, time 121.83ms
iter 404060: loss 5.9101, time 122.77ms
iter 404070: loss 7.1431, time 120.57ms
iter 404080: loss 5.9518, time 121.50ms
iter 404090: loss 5.8312, time 121.89ms
iter 404100: loss 5.8581, time 120.54ms
iter 404110: loss 6.5355, time 121.67ms
iter 404120: loss 6.3104, time 120.96ms
iter 404130: loss 6.1524, time 120.37ms
iter 404140: loss 5.9253, time 120.69ms
iter 404150: loss 6.2410, time 120.46ms
iter 404160: loss 5.8830, time 121.28ms
iter 404170: loss 5.5385, time 121.42ms
iter 404180: loss 5.8125, time 121.45ms
iter 404190: loss 6.5144, time 122.73ms
iter 404200: loss 5.7954, time 119.83ms
iter 404210: loss 6.3005, time 120.94ms
iter 404220: loss 6.4891, time 119.57ms
iter 404230: loss 6.3358, time 123.14ms
iter 404240: loss 6.0607, time 121.67ms
step 404250: train loss 5.5667, val loss 5.5790
saving checkpoint to out-shakespeare-char
iter 404250: loss 6.6583, time 2874.96ms
iter 404260: loss 5.8136, time 125.89ms
iter 404270: loss 5.3388, time 125.62ms
iter 404280: loss 4.9414, time 125.35ms
iter 404290: loss 5.5189, time 125.58ms
iter 404300: loss 6.0855, time 125.51ms
iter 404310: loss 6.0754, time 125.98ms
iter 404320: loss 6.2013, time 125.54ms
iter 404330: loss 6.1786, time 125.00ms
iter 404340: loss 5.3364, time 125.75ms
iter 404350: loss 6.1050, time 125.77ms
iter 404360: loss 6.1987, time 125.56ms
iter 404370: loss 5.1332, time 125.49ms
iter 404380: loss 6.2027, time 125.58ms
iter 404390: loss 5.8704, time 125.75ms
iter 404400: loss 6.0946, time 125.85ms
iter 404410: loss 6.0384, time 125.77ms
iter 404420: loss 6.2212, time 125.81ms
iter 404430: loss 6.2456, time 125.70ms
iter 404440: loss 5.5855, time 125.82ms
iter 404450: loss 6.1711, time 126.15ms
iter 404460: loss 5.8745, time 125.60ms
iter 404470: loss 6.7459, time 125.85ms
iter 404480: loss 5.3864, time 125.24ms
iter 404490: loss 5.7307, time 125.70ms
step 404500: train loss 5.5833, val loss 5.5920
saving checkpoint to out-shakespeare-char
iter 404500: loss 4.8646, time 2864.98ms
iter 404510: loss 5.3496, time 124.73ms
iter 404520: loss 5.8481, time 125.26ms
iter 404530: loss 5.8858, time 125.22ms
iter 404540: loss 6.1773, time 125.67ms
iter 404550: loss 5.5394, time 125.50ms
iter 404560: loss 6.0398, time 125.64ms
iter 404570: loss 5.9005, time 126.87ms
iter 404580: loss 6.6409, time 126.32ms
iter 404590: loss 5.8908, time 125.78ms
iter 404600: loss 6.6991, time 125.36ms
iter 404610: loss 6.1320, time 126.96ms
iter 404620: loss 6.6000, time 125.57ms
iter 404630: loss 5.7902, time 125.69ms
iter 404640: loss 6.0266, time 125.60ms
iter 404650: loss 5.2998, time 125.51ms
iter 404660: loss 5.9327, time 125.96ms
iter 404670: loss 6.2074, time 125.59ms
iter 404680: loss 6.1367, time 125.55ms
iter 404690: loss 5.9978, time 124.85ms
iter 404700: loss 5.7734, time 125.40ms
iter 404710: loss 5.5552, time 126.79ms
iter 404720: loss 6.0026, time 125.73ms
iter 404730: loss 6.2741, time 125.71ms
iter 404740: loss 6.4927, time 125.61ms
step 404750: train loss 5.5541, val loss 5.5618
saving checkpoint to out-shakespeare-char
iter 404750: loss 5.7214, time 2888.52ms
iter 404760: loss 6.1640, time 125.76ms
iter 404770: loss 6.1645, time 126.92ms
iter 404780: loss 5.9260, time 124.96ms
iter 404790: loss 6.0077, time 126.46ms
iter 404800: loss 6.1765, time 125.52ms
iter 404810: loss 5.8880, time 126.19ms
iter 404820: loss 6.3075, time 125.01ms
iter 404830: loss 5.5674, time 126.51ms
iter 404840: loss 5.7186, time 125.25ms
iter 404850: loss 6.4275, time 127.26ms
iter 404860: loss 5.8125, time 127.33ms
iter 404870: loss 5.9491, time 126.60ms
iter 404880: loss 5.6565, time 126.94ms
iter 404890: loss 5.6446, time 125.77ms
iter 404900: loss 6.3705, time 128.12ms
iter 404910: loss 5.4250, time 125.33ms
iter 404920: loss 5.8097, time 125.15ms
iter 404930: loss 6.2447, time 125.13ms
iter 404940: loss 5.1579, time 125.07ms
iter 404950: loss 6.8535, time 125.16ms
iter 404960: loss 6.1618, time 125.04ms
iter 404970: loss 5.7195, time 125.16ms
iter 404980: loss 5.7154, time 125.21ms
iter 404990: loss 6.1354, time 125.29ms
step 405000: train loss 5.5751, val loss 5.5775
saving checkpoint to out-shakespeare-char
iter 405000: loss 5.1560, time 2914.74ms
iter 405010: loss 6.8077, time 128.36ms
iter 405020: loss 5.5077, time 125.09ms
iter 405030: loss 5.8627, time 125.09ms
iter 405040: loss 5.7555, time 125.01ms
iter 405050: loss 5.5523, time 125.01ms
iter 405060: loss 5.6231, time 125.47ms
iter 405070: loss 6.2168, time 124.93ms
iter 405080: loss 5.7076, time 124.97ms
iter 405090: loss 6.1385, time 124.75ms
iter 405100: loss 5.9818, time 124.93ms
iter 405110: loss 6.6442, time 124.77ms
iter 405120: loss 5.4771, time 125.41ms
iter 405130: loss 5.4335, time 123.82ms
iter 405140: loss 5.5439, time 124.67ms
iter 405150: loss 5.7027, time 124.91ms
iter 405160: loss 5.9924, time 125.16ms
iter 405170: loss 6.1666, time 124.74ms
iter 405180: loss 5.9517, time 124.39ms
iter 405190: loss 6.0256, time 124.71ms
iter 405200: loss 6.4250, time 125.01ms
iter 405210: loss 5.8843, time 124.86ms
iter 405220: loss 5.7932, time 126.17ms
iter 405230: loss 6.8664, time 126.18ms
iter 405240: loss 6.0302, time 125.75ms
step 405250: train loss 5.5841, val loss 5.5720
saving checkpoint to out-shakespeare-char
iter 405250: loss 6.4827, time 2894.46ms
iter 405260: loss 5.6007, time 126.53ms
iter 405270: loss 6.0675, time 124.82ms
iter 405280: loss 6.0041, time 124.79ms
iter 405290: loss 4.8037, time 125.08ms
iter 405300: loss 5.8768, time 125.27ms
iter 405310: loss 5.3102, time 125.22ms
iter 405320: loss 5.7243, time 125.59ms
iter 405330: loss 6.4083, time 125.36ms
iter 405340: loss 6.2159, time 125.86ms
iter 405350: loss 6.4745, time 124.73ms
iter 405360: loss 5.7270, time 126.95ms
iter 405370: loss 5.6017, time 126.89ms
iter 405380: loss 5.3849, time 125.62ms
iter 405390: loss 5.4838, time 126.54ms
iter 405400: loss 6.7564, time 125.32ms
iter 405410: loss 6.4066, time 128.51ms
iter 405420: loss 6.0115, time 124.96ms
iter 405430: loss 5.7355, time 125.69ms
iter 405440: loss 6.4029, time 124.94ms
iter 405450: loss 6.6260, time 125.99ms
iter 405460: loss 6.1922, time 125.48ms
iter 405470: loss 5.8711, time 125.87ms
iter 405480: loss 6.5737, time 125.91ms
iter 405490: loss 6.2721, time 124.74ms
step 405500: train loss 5.5878, val loss 5.6336
saving checkpoint to out-shakespeare-char
iter 405500: loss 5.5553, time 2866.93ms
iter 405510: loss 6.2836, time 125.52ms
iter 405520: loss 6.6069, time 125.92ms
iter 405530: loss 5.9773, time 125.57ms
iter 405540: loss 5.8266, time 124.76ms
iter 405550: loss 6.0640, time 127.26ms
iter 405560: loss 6.2245, time 125.76ms
iter 405570: loss 6.1626, time 127.21ms
iter 405580: loss 5.7605, time 125.99ms
iter 405590: loss 6.4092, time 127.05ms
iter 405600: loss 5.9083, time 125.40ms
iter 405610: loss 5.4955, time 126.37ms
iter 405620: loss 6.0754, time 126.05ms
iter 405630: loss 6.2307, time 125.89ms
iter 405640: loss 5.3912, time 126.52ms
iter 405650: loss 5.8029, time 125.95ms
iter 405660: loss 6.3401, time 126.62ms
iter 405670: loss 6.0352, time 125.90ms
iter 405680: loss 6.0855, time 127.02ms
iter 405690: loss 5.2004, time 125.52ms
iter 405700: loss 5.7170, time 126.86ms
iter 405710: loss 5.9698, time 125.19ms
iter 405720: loss 5.6386, time 127.15ms
iter 405730: loss 6.1609, time 126.36ms
iter 405740: loss 6.4553, time 125.77ms
step 405750: train loss 5.5757, val loss 5.6048
saving checkpoint to out-shakespeare-char
iter 405750: loss 5.8532, time 2870.84ms
iter 405760: loss 5.4252, time 125.80ms
iter 405770: loss 6.2155, time 125.16ms
iter 405780: loss 5.7294, time 121.50ms
iter 405790: loss 5.4859, time 120.40ms
iter 405800: loss 5.2096, time 121.14ms
iter 405810: loss 6.2218, time 121.51ms
iter 405820: loss 6.0744, time 121.98ms
iter 405830: loss 5.5275, time 121.66ms
iter 405840: loss 6.0685, time 121.59ms
iter 405850: loss 6.2963, time 121.65ms
iter 405860: loss 6.7396, time 121.56ms
iter 405870: loss 5.7276, time 122.14ms
iter 405880: loss 5.1865, time 121.83ms
iter 405890: loss 6.2537, time 122.70ms
iter 405900: loss 5.8576, time 121.59ms
iter 405910: loss 6.2072, time 121.56ms
iter 405920: loss 5.5474, time 121.87ms
iter 405930: loss 5.4272, time 121.71ms
iter 405940: loss 5.8814, time 121.54ms
iter 405950: loss 5.9161, time 121.70ms
iter 405960: loss 5.1850, time 122.34ms
iter 405970: loss 5.5314, time 121.84ms
iter 405980: loss 5.7231, time 122.85ms
iter 405990: loss 5.8800, time 121.73ms
step 406000: train loss 5.5816, val loss 5.5677
saving checkpoint to out-shakespeare-char
iter 406000: loss 6.3200, time 2899.31ms
iter 406010: loss 5.9309, time 126.12ms
iter 406020: loss 6.2964, time 122.85ms
iter 406030: loss 6.0105, time 121.72ms
iter 406040: loss 5.5073, time 121.83ms
iter 406050: loss 6.8366, time 122.77ms
iter 406060: loss 5.6523, time 121.86ms
iter 406070: loss 6.4260, time 121.82ms
iter 406080: loss 5.9667, time 121.85ms
iter 406090: loss 6.3574, time 121.88ms
iter 406100: loss 5.7672, time 121.56ms
iter 406110: loss 6.1058, time 122.18ms
iter 406120: loss 5.8868, time 121.80ms
iter 406130: loss 6.7096, time 121.72ms
iter 406140: loss 6.6333, time 121.58ms
iter 406150: loss 6.0741, time 121.49ms
iter 406160: loss 6.2445, time 120.81ms
iter 406170: loss 6.4588, time 121.79ms
iter 406180: loss 6.4825, time 122.20ms
iter 406190: loss 6.1974, time 121.55ms
iter 406200: loss 6.5633, time 122.11ms
iter 406210: loss 6.5272, time 121.72ms
iter 406220: loss 6.5236, time 121.58ms
iter 406230: loss 6.9673, time 121.81ms
iter 406240: loss 5.3254, time 121.83ms
step 406250: train loss 5.5749, val loss 5.6032
saving checkpoint to out-shakespeare-char
iter 406250: loss 5.7908, time 2896.72ms
iter 406260: loss 5.7689, time 119.25ms
iter 406270: loss 5.5431, time 119.19ms
iter 406280: loss 5.7114, time 122.45ms
iter 406290: loss 6.0717, time 121.86ms
iter 406300: loss 6.2868, time 122.25ms
iter 406310: loss 6.2508, time 122.75ms
iter 406320: loss 5.9903, time 121.77ms
iter 406330: loss 5.9067, time 121.89ms
iter 406340: loss 5.8120, time 120.43ms
iter 406350: loss 5.9386, time 121.11ms
iter 406360: loss 5.8342, time 121.97ms
iter 406370: loss 6.2733, time 121.59ms
iter 406380: loss 5.9950, time 121.41ms
iter 406390: loss 5.8952, time 121.90ms
iter 406400: loss 5.5922, time 121.47ms
iter 406410: loss 4.9984, time 121.83ms
iter 406420: loss 6.0397, time 122.74ms
iter 406430: loss 6.3140, time 121.10ms
iter 406440: loss 5.9080, time 121.87ms
iter 406450: loss 6.1476, time 121.58ms
iter 406460: loss 5.3068, time 121.24ms
iter 406470: loss 5.5729, time 121.53ms
iter 406480: loss 6.2296, time 120.75ms
iter 406490: loss 6.7306, time 121.15ms
step 406500: train loss 5.5773, val loss 5.5488
saving checkpoint to out-shakespeare-char
iter 406500: loss 6.1693, time 2886.60ms
iter 406510: loss 5.7844, time 122.74ms
iter 406520: loss 5.5309, time 121.66ms
iter 406530: loss 5.6781, time 121.93ms
iter 406540: loss 6.3713, time 122.13ms
iter 406550: loss 5.5197, time 122.21ms
iter 406560: loss 5.7397, time 122.84ms
iter 406570: loss 6.4090, time 123.01ms
iter 406580: loss 6.4399, time 121.43ms
iter 406590: loss 6.5575, time 121.78ms
iter 406600: loss 5.4086, time 121.22ms
iter 406610: loss 6.0964, time 119.94ms
iter 406620: loss 5.6892, time 121.99ms
iter 406630: loss 5.9626, time 122.75ms
iter 406640: loss 6.5783, time 122.13ms
iter 406650: loss 6.4037, time 122.43ms
iter 406660: loss 6.1357, time 122.39ms
iter 406670: loss 5.5218, time 122.92ms
iter 406680: loss 6.2707, time 127.81ms
iter 406690: loss 5.6711, time 125.94ms
iter 406700: loss 6.1436, time 125.80ms
iter 406710: loss 5.4996, time 125.05ms
iter 406720: loss 6.2487, time 125.91ms
iter 406730: loss 6.4099, time 125.96ms
iter 406740: loss 6.0345, time 126.25ms
step 406750: train loss 5.5839, val loss 5.6101
saving checkpoint to out-shakespeare-char
iter 406750: loss 6.4733, time 2923.79ms
iter 406760: loss 5.7835, time 125.63ms
iter 406770: loss 6.2533, time 125.53ms
iter 406780: loss 5.3979, time 125.85ms
iter 406790: loss 5.5069, time 125.89ms
iter 406800: loss 6.4958, time 126.76ms
iter 406810: loss 5.8515, time 125.31ms
iter 406820: loss 7.2356, time 124.77ms
iter 406830: loss 6.1962, time 126.91ms
iter 406840: loss 5.4578, time 125.33ms
iter 406850: loss 6.3537, time 126.61ms
iter 406860: loss 6.5754, time 125.10ms
iter 406870: loss 5.9854, time 125.30ms
iter 406880: loss 6.2455, time 125.48ms
iter 406890: loss 5.6457, time 125.51ms
iter 406900: loss 6.0196, time 125.18ms
iter 406910: loss 6.2602, time 125.65ms
iter 406920: loss 5.8594, time 125.75ms
iter 406930: loss 6.1517, time 125.16ms
iter 406940: loss 6.8529, time 125.85ms
iter 406950: loss 7.1859, time 125.24ms
iter 406960: loss 5.5250, time 125.22ms
iter 406970: loss 6.0488, time 125.51ms
iter 406980: loss 4.8173, time 125.42ms
iter 406990: loss 5.6022, time 125.40ms
step 407000: train loss 5.6015, val loss 5.5959
saving checkpoint to out-shakespeare-char
iter 407000: loss 5.9379, time 2889.10ms
iter 407010: loss 6.0450, time 120.53ms
iter 407020: loss 6.1809, time 121.60ms
iter 407030: loss 5.9535, time 121.81ms
iter 407040: loss 6.4508, time 121.58ms
iter 407050: loss 5.8736, time 122.43ms
iter 407060: loss 5.6502, time 121.00ms
iter 407070: loss 6.2911, time 121.76ms
iter 407080: loss 7.1882, time 121.61ms
iter 407090: loss 5.4102, time 121.73ms
iter 407100: loss 5.6541, time 121.54ms
iter 407110: loss 5.9870, time 125.15ms
iter 407120: loss 5.2342, time 126.49ms
iter 407130: loss 5.6856, time 125.55ms
iter 407140: loss 6.1781, time 126.10ms
iter 407150: loss 6.1906, time 125.11ms
iter 407160: loss 6.0609, time 126.36ms
iter 407170: loss 6.1912, time 125.33ms
iter 407180: loss 5.0287, time 126.06ms
iter 407190: loss 5.5393, time 125.11ms
iter 407200: loss 6.4920, time 126.44ms
iter 407210: loss 5.9512, time 126.51ms
iter 407220: loss 5.6761, time 126.24ms
iter 407230: loss 6.9277, time 124.63ms
iter 407240: loss 5.9831, time 126.32ms
step 407250: train loss 5.5595, val loss 5.5272
saving checkpoint to out-shakespeare-char
iter 407250: loss 5.9194, time 2897.63ms
iter 407260: loss 5.5309, time 126.59ms
iter 407270: loss 5.5889, time 124.48ms
iter 407280: loss 6.0838, time 126.68ms
iter 407290: loss 5.8763, time 125.26ms
iter 407300: loss 6.3483, time 126.50ms
iter 407310: loss 6.3363, time 125.37ms
iter 407320: loss 6.3228, time 126.53ms
iter 407330: loss 6.5704, time 125.32ms
iter 407340: loss 6.7623, time 126.51ms
iter 407350: loss 6.0142, time 124.37ms
iter 407360: loss 6.8467, time 126.69ms
iter 407370: loss 6.0139, time 127.45ms
iter 407380: loss 6.6538, time 124.60ms
iter 407390: loss 7.1987, time 126.48ms
iter 407400: loss 6.2585, time 125.08ms
iter 407410: loss 6.2137, time 126.00ms
iter 407420: loss 5.3964, time 125.12ms
iter 407430: loss 5.9998, time 126.66ms
iter 407440: loss 5.8801, time 125.85ms
iter 407450: loss 5.6144, time 125.54ms
iter 407460: loss 6.0791, time 125.07ms
iter 407470: loss 6.1523, time 125.25ms
iter 407480: loss 5.3670, time 125.72ms
iter 407490: loss 6.2504, time 124.50ms
step 407500: train loss 5.6041, val loss 5.6238
saving checkpoint to out-shakespeare-char
iter 407500: loss 6.7828, time 2906.40ms
iter 407510: loss 5.3939, time 121.37ms
iter 407520: loss 6.5141, time 122.39ms
iter 407530: loss 6.3405, time 121.32ms
iter 407540: loss 5.6035, time 122.22ms
iter 407550: loss 6.4253, time 123.89ms
iter 407560: loss 5.7641, time 121.61ms
iter 407570: loss 5.8109, time 121.55ms
iter 407580: loss 6.1758, time 121.37ms
iter 407590: loss 5.7561, time 121.31ms
iter 407600: loss 4.8928, time 121.35ms
iter 407610: loss 6.0129, time 121.47ms
iter 407620: loss 6.0676, time 122.56ms
iter 407630: loss 6.0450, time 122.57ms
iter 407640: loss 6.0325, time 122.13ms
iter 407650: loss 4.8337, time 121.39ms
iter 407660: loss 6.4278, time 121.26ms
iter 407670: loss 5.5174, time 123.94ms
iter 407680: loss 5.8975, time 121.31ms
iter 407690: loss 6.3496, time 121.57ms
iter 407700: loss 6.0838, time 121.48ms
iter 407710: loss 6.0873, time 121.50ms
iter 407720: loss 5.3123, time 121.36ms
iter 407730: loss 5.8836, time 121.22ms
iter 407740: loss 6.1846, time 122.66ms
step 407750: train loss 5.5782, val loss 5.6310
saving checkpoint to out-shakespeare-char
iter 407750: loss 6.5508, time 2911.62ms
iter 407760: loss 5.9847, time 121.71ms
iter 407770: loss 5.7789, time 121.72ms
iter 407780: loss 5.7515, time 123.36ms
iter 407790: loss 6.0210, time 121.97ms
iter 407800: loss 6.1806, time 120.29ms
iter 407810: loss 5.9984, time 122.76ms
iter 407820: loss 5.0130, time 119.40ms
iter 407830: loss 6.1886, time 121.82ms
iter 407840: loss 6.5824, time 121.94ms
iter 407850: loss 5.8688, time 119.13ms
iter 407860: loss 6.3033, time 124.58ms
iter 407870: loss 5.5054, time 121.72ms
iter 407880: loss 5.1469, time 120.99ms
iter 407890: loss 5.9963, time 119.91ms
iter 407900: loss 6.1389, time 122.19ms
iter 407910: loss 6.3210, time 121.78ms
iter 407920: loss 5.3745, time 121.78ms
iter 407930: loss 5.7181, time 123.01ms
iter 407940: loss 5.6315, time 122.04ms
iter 407950: loss 6.2059, time 122.87ms
iter 407960: loss 5.9269, time 121.67ms
iter 407970: loss 5.6956, time 121.82ms
iter 407980: loss 6.2964, time 123.07ms
iter 407990: loss 6.1637, time 121.85ms
step 408000: train loss 5.5828, val loss 5.5758
saving checkpoint to out-shakespeare-char
iter 408000: loss 6.1257, time 2908.63ms
iter 408010: loss 6.0958, time 121.72ms
iter 408020: loss 6.2210, time 122.09ms
iter 408030: loss 5.8613, time 124.48ms
iter 408040: loss 5.3936, time 121.57ms
iter 408050: loss 6.1465, time 121.73ms
iter 408060: loss 5.6815, time 121.89ms
iter 408070: loss 5.3114, time 120.92ms
iter 408080: loss 6.1425, time 121.64ms
iter 408090: loss 5.9061, time 121.83ms
iter 408100: loss 6.0762, time 123.54ms
iter 408110: loss 6.3393, time 121.55ms
iter 408120: loss 5.2539, time 121.84ms
iter 408130: loss 6.0003, time 123.66ms
iter 408140: loss 6.4128, time 126.18ms
iter 408150: loss 6.0675, time 126.09ms
iter 408160: loss 5.8916, time 121.75ms
iter 408170: loss 6.7876, time 121.97ms
iter 408180: loss 6.5324, time 122.06ms
iter 408190: loss 6.2471, time 123.18ms
iter 408200: loss 6.1992, time 120.65ms
iter 408210: loss 6.3971, time 121.87ms
iter 408220: loss 6.1130, time 121.88ms
iter 408230: loss 5.9501, time 123.01ms
iter 408240: loss 5.5984, time 123.04ms
step 408250: train loss 5.5772, val loss 5.5932
saving checkpoint to out-shakespeare-char
iter 408250: loss 5.9092, time 2897.22ms
iter 408260: loss 5.9980, time 123.29ms
iter 408270: loss 6.2776, time 121.52ms
iter 408280: loss 5.8179, time 121.93ms
iter 408290: loss 6.2286, time 122.84ms
iter 408300: loss 7.0483, time 121.46ms
iter 408310: loss 5.5382, time 121.88ms
iter 408320: loss 5.8764, time 124.37ms
iter 408330: loss 6.2492, time 121.89ms
iter 408340: loss 5.9605, time 121.48ms
iter 408350: loss 5.9752, time 121.82ms
iter 408360: loss 5.7851, time 121.78ms
iter 408370: loss 5.7594, time 121.90ms
iter 408380: loss 5.5650, time 122.25ms
iter 408390: loss 6.0802, time 123.42ms
iter 408400: loss 5.6488, time 121.92ms
iter 408410: loss 5.8769, time 121.95ms
iter 408420: loss 6.1438, time 122.93ms
iter 408430: loss 5.6545, time 121.79ms
iter 408440: loss 5.9987, time 121.75ms
iter 408450: loss 5.5492, time 123.30ms
iter 408460: loss 6.1354, time 119.70ms
iter 408470: loss 5.9149, time 120.03ms
iter 408480: loss 5.2568, time 119.91ms
iter 408490: loss 6.1582, time 120.90ms
step 408500: train loss 5.5668, val loss 5.5696
saving checkpoint to out-shakespeare-char
iter 408500: loss 5.7454, time 2902.64ms
iter 408510: loss 5.4829, time 119.63ms
iter 408520: loss 5.5622, time 119.93ms
iter 408530: loss 5.6827, time 119.44ms
iter 408540: loss 6.3115, time 119.80ms
iter 408550: loss 6.6621, time 120.60ms
iter 408560: loss 6.1516, time 121.12ms
iter 408570: loss 5.9576, time 119.57ms
iter 408580: loss 5.5954, time 119.56ms
iter 408590: loss 5.7651, time 120.74ms
iter 408600: loss 5.9290, time 120.17ms
iter 408610: loss 6.2863, time 120.57ms
iter 408620: loss 6.5317, time 121.61ms
iter 408630: loss 6.7343, time 121.48ms
iter 408640: loss 6.4122, time 121.48ms
iter 408650: loss 6.5726, time 121.50ms
iter 408660: loss 6.5006, time 123.16ms
iter 408670: loss 5.6289, time 119.53ms
iter 408680: loss 6.1941, time 119.70ms
iter 408690: loss 5.5898, time 120.67ms
iter 408700: loss 6.1989, time 120.48ms
iter 408710: loss 6.3658, time 119.58ms
iter 408720: loss 5.8523, time 122.26ms
iter 408730: loss 6.4003, time 119.32ms
iter 408740: loss 5.9539, time 120.74ms
step 408750: train loss 5.5934, val loss 5.5650
saving checkpoint to out-shakespeare-char
iter 408750: loss 5.8499, time 2902.20ms
iter 408760: loss 6.4147, time 125.41ms
iter 408770: loss 5.3631, time 125.13ms
iter 408780: loss 5.6458, time 125.40ms
iter 408790: loss 5.8576, time 125.59ms
iter 408800: loss 5.7847, time 126.07ms
iter 408810: loss 6.0682, time 125.17ms
iter 408820: loss 5.4362, time 125.60ms
iter 408830: loss 5.8014, time 121.68ms
iter 408840: loss 5.9496, time 122.71ms
iter 408850: loss 5.3567, time 121.53ms
iter 408860: loss 5.5657, time 121.73ms
iter 408870: loss 5.5700, time 123.99ms
iter 408880: loss 5.9222, time 121.55ms
iter 408890: loss 5.9550, time 121.71ms
iter 408900: loss 5.1963, time 121.46ms
iter 408910: loss 6.1691, time 121.07ms
iter 408920: loss 5.9031, time 121.52ms
iter 408930: loss 6.1968, time 121.44ms
iter 408940: loss 6.0788, time 122.61ms
iter 408950: loss 5.8342, time 121.52ms
iter 408960: loss 6.7439, time 121.48ms
iter 408970: loss 5.8612, time 122.62ms
iter 408980: loss 5.9569, time 121.52ms
iter 408990: loss 5.7032, time 121.87ms
step 409000: train loss 5.6003, val loss 5.5913
saving checkpoint to out-shakespeare-char
iter 409000: loss 5.8681, time 2926.61ms
iter 409010: loss 5.9496, time 125.88ms
iter 409020: loss 5.8999, time 125.94ms
iter 409030: loss 6.1516, time 125.94ms
iter 409040: loss 6.2163, time 126.43ms
iter 409050: loss 6.2537, time 125.87ms
iter 409060: loss 5.8066, time 125.96ms
iter 409070: loss 5.6257, time 126.21ms
iter 409080: loss 5.6720, time 125.79ms
iter 409090: loss 5.7421, time 126.09ms
iter 409100: loss 6.3088, time 128.53ms
iter 409110: loss 5.7039, time 125.85ms
iter 409120: loss 6.2289, time 127.42ms
iter 409130: loss 5.6491, time 125.84ms
iter 409140: loss 5.4352, time 128.17ms
iter 409150: loss 5.7401, time 125.93ms
iter 409160: loss 5.7151, time 128.17ms
iter 409170: loss 6.0126, time 124.78ms
iter 409180: loss 5.3019, time 125.81ms
iter 409190: loss 5.7155, time 125.08ms
iter 409200: loss 5.4535, time 125.58ms
iter 409210: loss 6.7052, time 125.62ms
iter 409220: loss 5.3175, time 125.54ms
iter 409230: loss 5.8609, time 125.27ms
iter 409240: loss 5.8313, time 126.00ms
step 409250: train loss 5.5342, val loss 5.6098
saving checkpoint to out-shakespeare-char
iter 409250: loss 5.7761, time 2893.49ms
iter 409260: loss 5.5903, time 123.27ms
iter 409270: loss 6.0969, time 122.07ms
iter 409280: loss 5.5615, time 122.33ms
iter 409290: loss 5.8223, time 123.30ms
iter 409300: loss 6.2213, time 120.11ms
iter 409310: loss 5.9171, time 121.93ms
iter 409320: loss 6.3847, time 124.57ms
iter 409330: loss 6.4463, time 122.87ms
iter 409340: loss 5.8223, time 124.73ms
iter 409350: loss 5.4975, time 121.06ms
iter 409360: loss 5.3980, time 121.95ms
iter 409370: loss 6.0449, time 121.84ms
iter 409380: loss 6.2702, time 122.27ms
iter 409390: loss 5.9775, time 121.26ms
iter 409400: loss 6.3039, time 122.16ms
iter 409410: loss 5.6728, time 122.95ms
iter 409420: loss 6.5913, time 121.91ms
iter 409430: loss 6.2556, time 121.78ms
iter 409440: loss 5.7406, time 123.05ms
iter 409450: loss 6.5004, time 120.91ms
iter 409460: loss 6.2946, time 123.47ms
iter 409470: loss 6.1884, time 121.95ms
iter 409480: loss 5.9584, time 121.74ms
iter 409490: loss 6.2630, time 122.03ms
step 409500: train loss 5.5427, val loss 5.5752
saving checkpoint to out-shakespeare-char
iter 409500: loss 6.5148, time 2891.82ms
iter 409510: loss 5.9486, time 122.90ms
iter 409520: loss 6.0793, time 124.10ms
iter 409530: loss 5.8853, time 121.80ms
iter 409540: loss 6.2936, time 121.63ms
iter 409550: loss 5.9830, time 121.51ms
iter 409560: loss 6.1956, time 121.48ms
iter 409570: loss 6.2088, time 121.89ms
iter 409580: loss 5.7616, time 121.50ms
iter 409590: loss 5.9767, time 122.88ms
iter 409600: loss 5.1982, time 121.36ms
iter 409610: loss 5.5105, time 121.35ms
iter 409620: loss 5.9321, time 122.71ms
iter 409630: loss 5.9303, time 122.70ms
iter 409640: loss 5.8893, time 121.45ms
iter 409650: loss 6.4786, time 122.07ms
iter 409660: loss 6.1669, time 121.45ms
iter 409670: loss 6.1563, time 121.69ms
iter 409680: loss 5.8443, time 122.64ms
iter 409690: loss 6.4243, time 121.53ms
iter 409700: loss 5.5918, time 121.48ms
iter 409710: loss 6.0458, time 122.61ms
iter 409720: loss 6.2767, time 121.47ms
iter 409730: loss 6.0233, time 121.52ms
iter 409740: loss 5.4072, time 124.01ms
step 409750: train loss 5.5693, val loss 5.5505
saving checkpoint to out-shakespeare-char
iter 409750: loss 5.5283, time 2902.51ms
iter 409760: loss 5.4814, time 123.72ms
iter 409770: loss 6.2517, time 121.51ms
iter 409780: loss 5.9362, time 121.74ms
iter 409790: loss 6.2237, time 121.47ms
iter 409800: loss 5.3803, time 122.73ms
iter 409810: loss 6.3705, time 122.75ms
iter 409820: loss 6.0417, time 121.05ms
iter 409830: loss 6.0567, time 121.66ms
iter 409840: loss 6.1583, time 121.50ms
iter 409850: loss 6.5577, time 122.78ms
iter 409860: loss 6.0198, time 122.66ms
iter 409870: loss 5.8586, time 121.34ms
iter 409880: loss 6.2764, time 122.41ms
iter 409890: loss 6.0905, time 124.22ms
iter 409900: loss 5.8412, time 121.43ms
iter 409910: loss 6.0404, time 121.50ms
iter 409920: loss 5.3039, time 121.42ms
iter 409930: loss 5.9919, time 122.74ms
iter 409940: loss 5.3717, time 122.84ms
iter 409950: loss 5.2173, time 122.64ms
iter 409960: loss 6.2553, time 121.47ms
iter 409970: loss 6.0457, time 121.19ms
iter 409980: loss 5.9342, time 124.15ms
iter 409990: loss 5.8497, time 121.02ms
step 410000: train loss 5.5729, val loss 5.5890
saving checkpoint to out-shakespeare-char
iter 410000: loss 6.0845, time 2876.83ms
iter 410010: loss 5.8870, time 124.72ms
iter 410020: loss 5.5981, time 126.15ms
iter 410030: loss 6.3280, time 125.21ms
iter 410040: loss 6.2019, time 125.58ms
iter 410050: loss 5.5157, time 125.25ms
iter 410060: loss 6.1197, time 126.02ms
iter 410070: loss 5.4573, time 124.64ms
iter 410080: loss 6.4899, time 124.79ms
iter 410090: loss 5.5653, time 125.17ms
iter 410100: loss 5.9525, time 125.72ms
iter 410110: loss 6.4463, time 125.47ms
iter 410120: loss 7.6033, time 125.55ms
iter 410130: loss 5.8677, time 125.42ms
iter 410140: loss 5.3127, time 125.14ms
iter 410150: loss 6.7971, time 125.55ms
iter 410160: loss 5.9683, time 124.85ms
iter 410170: loss 5.7160, time 125.48ms
iter 410180: loss 5.7900, time 124.81ms
iter 410190: loss 5.2597, time 125.56ms
iter 410200: loss 5.8627, time 125.49ms
iter 410210: loss 6.4832, time 126.95ms
iter 410220: loss 6.0559, time 125.78ms
iter 410230: loss 5.3342, time 125.60ms
iter 410240: loss 6.1793, time 125.55ms
step 410250: train loss 5.5953, val loss 5.5739
saving checkpoint to out-shakespeare-char
iter 410250: loss 5.7294, time 2919.50ms
iter 410260: loss 6.2700, time 127.24ms
iter 410270: loss 6.2148, time 129.56ms
iter 410280: loss 6.4163, time 126.11ms
iter 410290: loss 5.8927, time 125.75ms
iter 410300: loss 5.7099, time 125.81ms
iter 410310: loss 6.0102, time 125.91ms
iter 410320: loss 5.6307, time 125.90ms
iter 410330: loss 6.0843, time 125.75ms
iter 410340: loss 6.3717, time 124.35ms
iter 410350: loss 5.5073, time 122.07ms
iter 410360: loss 5.7268, time 122.04ms
iter 410370: loss 6.0262, time 123.02ms
iter 410380: loss 5.0654, time 121.94ms
iter 410390: loss 5.7775, time 122.98ms
iter 410400: loss 6.3097, time 122.05ms
iter 410410: loss 5.9307, time 122.28ms
iter 410420: loss 5.8559, time 124.32ms
iter 410430: loss 5.7881, time 121.82ms
iter 410440: loss 5.7313, time 121.85ms
iter 410450: loss 6.1941, time 122.35ms
iter 410460: loss 6.3825, time 121.78ms
iter 410470: loss 5.4507, time 122.00ms
iter 410480: loss 6.0395, time 122.30ms
iter 410490: loss 6.3346, time 119.89ms
step 410500: train loss 5.6230, val loss 5.5513
saving checkpoint to out-shakespeare-char
iter 410500: loss 6.1591, time 2904.79ms
iter 410510: loss 5.9616, time 121.92ms
iter 410520: loss 5.8602, time 121.93ms
iter 410530: loss 6.3309, time 121.74ms
iter 410540: loss 5.1890, time 121.93ms
iter 410550: loss 6.1044, time 122.96ms
iter 410560: loss 5.5949, time 121.83ms
iter 410570: loss 6.0242, time 121.88ms
iter 410580: loss 5.8928, time 123.77ms
iter 410590: loss 5.6965, time 121.77ms
iter 410600: loss 5.6515, time 121.77ms
iter 410610: loss 5.8926, time 124.43ms
iter 410620: loss 6.4913, time 121.99ms
iter 410630: loss 5.9827, time 121.78ms
iter 410640: loss 6.2445, time 121.83ms
iter 410650: loss 5.8929, time 121.96ms
iter 410660: loss 6.4711, time 120.92ms
iter 410670: loss 5.7605, time 121.91ms
iter 410680: loss 6.1418, time 122.89ms
iter 410690: loss 5.5786, time 121.90ms
iter 410700: loss 6.7279, time 122.12ms
iter 410710: loss 5.8820, time 122.89ms
iter 410720: loss 6.1421, time 121.76ms
iter 410730: loss 6.0433, time 121.53ms
iter 410740: loss 5.2155, time 124.19ms
step 410750: train loss 5.5883, val loss 5.5711
saving checkpoint to out-shakespeare-char
iter 410750: loss 6.0029, time 2917.07ms
iter 410760: loss 5.7599, time 127.57ms
iter 410770: loss 5.7998, time 125.40ms
iter 410780: loss 6.1845, time 126.50ms
iter 410790: loss 5.5980, time 125.40ms
iter 410800: loss 6.6740, time 126.01ms
iter 410810: loss 6.4470, time 125.85ms
iter 410820: loss 5.9614, time 125.95ms
iter 410830: loss 5.3081, time 125.68ms
iter 410840: loss 6.0166, time 125.73ms
iter 410850: loss 5.6000, time 125.82ms
iter 410860: loss 5.6946, time 125.54ms
iter 410870: loss 5.9950, time 125.92ms
iter 410880: loss 6.1405, time 125.85ms
iter 410890: loss 5.7958, time 125.88ms
iter 410900: loss 6.1122, time 126.17ms
iter 410910: loss 6.6301, time 125.93ms
iter 410920: loss 5.1934, time 125.88ms
iter 410930: loss 5.6439, time 125.79ms
iter 410940: loss 5.7017, time 125.83ms
iter 410950: loss 5.7511, time 125.77ms
iter 410960: loss 6.5159, time 125.65ms
iter 410970: loss 6.3478, time 125.66ms
iter 410980: loss 5.6970, time 125.60ms
iter 410990: loss 5.7197, time 125.45ms
step 411000: train loss 5.5905, val loss 5.5554
saving checkpoint to out-shakespeare-char
iter 411000: loss 6.6872, time 2881.43ms
iter 411010: loss 6.0137, time 125.75ms
iter 411020: loss 6.3305, time 125.61ms
iter 411030: loss 6.1184, time 125.90ms
iter 411040: loss 6.7382, time 125.52ms
iter 411050: loss 5.5413, time 125.87ms
iter 411060: loss 6.0009, time 125.46ms
iter 411070: loss 5.7404, time 125.62ms
iter 411080: loss 6.2366, time 125.59ms
iter 411090: loss 6.2900, time 125.63ms
iter 411100: loss 6.5324, time 125.54ms
iter 411110: loss 5.7621, time 125.49ms
iter 411120: loss 6.4113, time 126.48ms
iter 411130: loss 5.8098, time 125.64ms
iter 411140: loss 5.8954, time 126.83ms
iter 411150: loss 5.6716, time 125.02ms
iter 411160: loss 5.7006, time 125.94ms
iter 411170: loss 6.4391, time 125.87ms
iter 411180: loss 5.8875, time 126.02ms
iter 411190: loss 6.5351, time 125.90ms
iter 411200: loss 5.4382, time 126.01ms
iter 411210: loss 5.9681, time 125.04ms
iter 411220: loss 5.3833, time 125.60ms
iter 411230: loss 5.7808, time 125.65ms
iter 411240: loss 6.0525, time 125.75ms
step 411250: train loss 5.5900, val loss 5.5805
saving checkpoint to out-shakespeare-char
iter 411250: loss 5.9630, time 2898.00ms
iter 411260: loss 5.3240, time 127.27ms
iter 411270: loss 5.7541, time 125.37ms
iter 411280: loss 6.1852, time 128.12ms
iter 411290: loss 6.2304, time 127.20ms
iter 411300: loss 5.9562, time 126.74ms
iter 411310: loss 5.6250, time 125.71ms
iter 411320: loss 6.2542, time 125.48ms
iter 411330: loss 6.4816, time 125.37ms
iter 411340: loss 5.9252, time 125.44ms
iter 411350: loss 6.5070, time 125.21ms
iter 411360: loss 5.2371, time 124.71ms
iter 411370: loss 6.5654, time 124.77ms
iter 411380: loss 6.7065, time 124.92ms
iter 411390: loss 6.9419, time 125.56ms
iter 411400: loss 5.5802, time 125.38ms
iter 411410: loss 5.5076, time 125.59ms
iter 411420: loss 6.0032, time 124.78ms
iter 411430: loss 6.6816, time 125.24ms
iter 411440: loss 5.4491, time 126.17ms
iter 411450: loss 5.8470, time 125.53ms
iter 411460: loss 6.2017, time 125.70ms
iter 411470: loss 6.4076, time 125.47ms
iter 411480: loss 5.8375, time 125.45ms
iter 411490: loss 6.1597, time 125.25ms
step 411500: train loss 5.5789, val loss 5.5445
saving checkpoint to out-shakespeare-char
iter 411500: loss 6.1042, time 2878.30ms
iter 411510: loss 5.5015, time 125.23ms
iter 411520: loss 6.2325, time 124.72ms
iter 411530: loss 6.5820, time 125.25ms
iter 411540: loss 6.2675, time 124.93ms
iter 411550: loss 6.8489, time 125.60ms
iter 411560: loss 6.3630, time 125.69ms
iter 411570: loss 5.5357, time 125.22ms
iter 411580: loss 6.1306, time 125.76ms
iter 411590: loss 6.0827, time 125.09ms
iter 411600: loss 6.1809, time 125.27ms
iter 411610: loss 5.8498, time 125.26ms
iter 411620: loss 6.6005, time 125.23ms
iter 411630: loss 5.7645, time 124.50ms
iter 411640: loss 5.6178, time 125.59ms
iter 411650: loss 5.9328, time 125.04ms
iter 411660: loss 6.3909, time 125.74ms
iter 411670: loss 5.7152, time 124.93ms
iter 411680: loss 5.9834, time 125.19ms
iter 411690: loss 5.9054, time 124.86ms
iter 411700: loss 6.0103, time 125.43ms
iter 411710: loss 6.5582, time 125.38ms
iter 411720: loss 5.4078, time 125.27ms
iter 411730: loss 6.1983, time 124.92ms
iter 411740: loss 6.2805, time 125.77ms
step 411750: train loss 5.5800, val loss 5.6337
saving checkpoint to out-shakespeare-char
iter 411750: loss 6.2240, time 2868.09ms
iter 411760: loss 5.2687, time 125.40ms
iter 411770: loss 5.7822, time 125.32ms
iter 411780: loss 5.9943, time 125.38ms
iter 411790: loss 5.9747, time 125.30ms
iter 411800: loss 6.3174, time 125.48ms
iter 411810: loss 5.5606, time 126.33ms
iter 411820: loss 5.6708, time 125.88ms
iter 411830: loss 6.1179, time 125.78ms
iter 411840: loss 5.6144, time 125.38ms
iter 411850: loss 5.6553, time 125.39ms
iter 411860: loss 6.1576, time 125.26ms
iter 411870: loss 6.3317, time 125.52ms
iter 411880: loss 5.8962, time 124.81ms
iter 411890: loss 5.9958, time 125.45ms
iter 411900: loss 6.0112, time 125.68ms
iter 411910: loss 5.6753, time 125.38ms
iter 411920: loss 5.5533, time 124.53ms
iter 411930: loss 5.8690, time 125.59ms
iter 411940: loss 6.1668, time 125.80ms
iter 411950: loss 5.7486, time 125.30ms
iter 411960: loss 5.1751, time 125.20ms
iter 411970: loss 6.6820, time 125.59ms
iter 411980: loss 6.2250, time 125.58ms
iter 411990: loss 5.3467, time 125.15ms
step 412000: train loss 5.5879, val loss 5.6090
saving checkpoint to out-shakespeare-char
iter 412000: loss 5.5340, time 2887.33ms
iter 412010: loss 6.2535, time 126.31ms
iter 412020: loss 6.3741, time 125.59ms
iter 412030: loss 6.0061, time 126.76ms
iter 412040: loss 6.0411, time 126.08ms
iter 412050: loss 5.5902, time 125.56ms
iter 412060: loss 5.7719, time 125.69ms
iter 412070: loss 5.8059, time 126.18ms
iter 412080: loss 5.8224, time 125.38ms
iter 412090: loss 5.5138, time 125.53ms
iter 412100: loss 5.5953, time 125.48ms
iter 412110: loss 6.0178, time 125.36ms
iter 412120: loss 6.1994, time 125.74ms
iter 412130: loss 6.2464, time 125.30ms
iter 412140: loss 6.5406, time 125.58ms
iter 412150: loss 5.7458, time 125.25ms
iter 412160: loss 5.4343, time 125.35ms
iter 412170: loss 5.5796, time 125.15ms
iter 412180: loss 6.3825, time 125.46ms
iter 412190: loss 5.5529, time 125.68ms
iter 412200: loss 5.5784, time 125.79ms
iter 412210: loss 6.2113, time 125.13ms
iter 412220: loss 5.9507, time 125.60ms
iter 412230: loss 5.9830, time 125.71ms
iter 412240: loss 6.1787, time 125.41ms
step 412250: train loss 5.6206, val loss 5.5835
saving checkpoint to out-shakespeare-char
iter 412250: loss 5.5534, time 2874.93ms
iter 412260: loss 6.1328, time 124.88ms
iter 412270: loss 5.5632, time 124.95ms
iter 412280: loss 6.3439, time 125.46ms
iter 412290: loss 6.0724, time 124.93ms
iter 412300: loss 6.3336, time 125.01ms
iter 412310: loss 6.0092, time 125.18ms
iter 412320: loss 5.8180, time 124.85ms
iter 412330: loss 6.7173, time 125.15ms
iter 412340: loss 6.1843, time 125.32ms
iter 412350: loss 5.4683, time 125.02ms
iter 412360: loss 6.5463, time 125.49ms
iter 412370: loss 6.5957, time 124.98ms
iter 412380: loss 6.4844, time 124.14ms
iter 412390: loss 5.7162, time 125.08ms
iter 412400: loss 5.6861, time 124.60ms
iter 412410: loss 5.9846, time 125.10ms
iter 412420: loss 5.6279, time 124.90ms
iter 412430: loss 6.0927, time 125.00ms
iter 412440: loss 6.3494, time 125.37ms
iter 412450: loss 5.0891, time 124.91ms
iter 412460: loss 6.6020, time 124.12ms
iter 412470: loss 6.6592, time 125.03ms
iter 412480: loss 5.8645, time 124.93ms
iter 412490: loss 6.1522, time 124.66ms
step 412500: train loss 5.6148, val loss 5.6339
saving checkpoint to out-shakespeare-char
iter 412500: loss 6.4698, time 2890.38ms
iter 412510: loss 5.8119, time 125.72ms
iter 412520: loss 6.7032, time 125.06ms
iter 412530: loss 6.2203, time 125.21ms
iter 412540: loss 6.3058, time 125.06ms
iter 412550: loss 5.9500, time 125.08ms
iter 412560: loss 5.6884, time 125.26ms
iter 412570: loss 5.4656, time 125.15ms
iter 412580: loss 6.3548, time 125.82ms
iter 412590: loss 6.2666, time 126.23ms
iter 412600: loss 5.7866, time 125.62ms
iter 412610: loss 5.5938, time 125.99ms
iter 412620: loss 5.5633, time 125.51ms
iter 412630: loss 5.6920, time 125.92ms
iter 412640: loss 6.1973, time 125.60ms
iter 412650: loss 7.3691, time 125.67ms
iter 412660: loss 5.7007, time 125.69ms
iter 412670: loss 5.5601, time 125.31ms
iter 412680: loss 5.4404, time 125.39ms
iter 412690: loss 6.0540, time 125.88ms
iter 412700: loss 5.7772, time 126.10ms
iter 412710: loss 5.6978, time 126.04ms
iter 412720: loss 6.1840, time 125.63ms
iter 412730: loss 6.4911, time 125.53ms
iter 412740: loss 5.2500, time 125.60ms
step 412750: train loss 5.5976, val loss 5.5655
saving checkpoint to out-shakespeare-char
iter 412750: loss 6.4968, time 2899.17ms
iter 412760: loss 5.9585, time 121.50ms
iter 412770: loss 5.8751, time 122.64ms
iter 412780: loss 5.5412, time 122.40ms
iter 412790: loss 5.5212, time 122.02ms
iter 412800: loss 7.1256, time 124.39ms
iter 412810: loss 5.0358, time 121.78ms
iter 412820: loss 6.1364, time 121.66ms
iter 412830: loss 5.7309, time 121.45ms
iter 412840: loss 5.8621, time 122.68ms
iter 412850: loss 5.8311, time 121.56ms
iter 412860: loss 5.4324, time 121.59ms
iter 412870: loss 6.4335, time 122.80ms
iter 412880: loss 6.2408, time 121.41ms
iter 412890: loss 6.1606, time 121.65ms
iter 412900: loss 6.1885, time 124.21ms
iter 412910: loss 5.9751, time 121.82ms
iter 412920: loss 6.3604, time 122.05ms
iter 412930: loss 5.3021, time 121.57ms
iter 412940: loss 6.1459, time 121.65ms
iter 412950: loss 6.0719, time 121.62ms
iter 412960: loss 5.9523, time 122.07ms
iter 412970: loss 5.7798, time 123.75ms
iter 412980: loss 5.8301, time 121.75ms
iter 412990: loss 6.5802, time 121.41ms
step 413000: train loss 5.5621, val loss 5.6075
saving checkpoint to out-shakespeare-char
iter 413000: loss 5.7583, time 2918.15ms
iter 413010: loss 6.0336, time 121.67ms
iter 413020: loss 6.0033, time 121.85ms
iter 413030: loss 6.6497, time 122.90ms
iter 413040: loss 5.2815, time 121.79ms
iter 413050: loss 6.4126, time 121.79ms
iter 413060: loss 5.8191, time 124.39ms
iter 413070: loss 5.5891, time 121.72ms
iter 413080: loss 5.7421, time 121.89ms
iter 413090: loss 5.6473, time 121.15ms
iter 413100: loss 5.6858, time 121.08ms
iter 413110: loss 5.6531, time 121.55ms
iter 413120: loss 6.4806, time 121.06ms
iter 413130: loss 5.7917, time 122.94ms
iter 413140: loss 6.3124, time 121.81ms
iter 413150: loss 6.0057, time 121.80ms
iter 413160: loss 6.0514, time 122.41ms
iter 413170: loss 5.6806, time 121.71ms
iter 413180: loss 5.2576, time 121.83ms
iter 413190: loss 5.9468, time 124.35ms
iter 413200: loss 5.9545, time 121.75ms
iter 413210: loss 5.1455, time 121.65ms
iter 413220: loss 6.0095, time 121.76ms
iter 413230: loss 5.9363, time 121.78ms
iter 413240: loss 5.9702, time 121.76ms
step 413250: train loss 5.5868, val loss 5.5799
saving checkpoint to out-shakespeare-char
iter 413250: loss 6.0295, time 2908.40ms
iter 413260: loss 6.9009, time 122.91ms
iter 413270: loss 6.4160, time 121.75ms
iter 413280: loss 5.4030, time 122.20ms
iter 413290: loss 6.3751, time 124.74ms
iter 413300: loss 5.9562, time 122.18ms
iter 413310: loss 6.0518, time 121.80ms
iter 413320: loss 5.7700, time 121.52ms
iter 413330: loss 5.7650, time 121.58ms
iter 413340: loss 5.8273, time 121.54ms
iter 413350: loss 6.6636, time 121.68ms
iter 413360: loss 5.9744, time 121.94ms
iter 413370: loss 6.3502, time 121.47ms
iter 413380: loss 6.1264, time 121.53ms
iter 413390: loss 6.2201, time 122.83ms
iter 413400: loss 6.5803, time 121.60ms
iter 413410: loss 5.8726, time 121.59ms
iter 413420: loss 5.8149, time 124.61ms
iter 413430: loss 6.1841, time 122.59ms
iter 413440: loss 5.6108, time 121.89ms
iter 413450: loss 5.6399, time 121.56ms
iter 413460: loss 6.1804, time 121.58ms
iter 413470: loss 5.9977, time 120.98ms
iter 413480: loss 5.8898, time 121.51ms
iter 413490: loss 5.5859, time 121.74ms
step 413500: train loss 5.6092, val loss 5.5859
saving checkpoint to out-shakespeare-char
iter 413500: loss 6.4657, time 2898.99ms
iter 413510: loss 6.1138, time 125.22ms
iter 413520: loss 5.5101, time 125.79ms
iter 413530: loss 5.3070, time 128.11ms
iter 413540: loss 6.2000, time 126.97ms
iter 413550: loss 5.6585, time 125.17ms
iter 413560: loss 6.3350, time 126.23ms
iter 413570: loss 6.4726, time 125.63ms
iter 413580: loss 6.0538, time 125.25ms
iter 413590: loss 5.5894, time 125.57ms
iter 413600: loss 6.0316, time 125.53ms
iter 413610: loss 5.6293, time 125.12ms
iter 413620: loss 6.5715, time 125.34ms
iter 413630: loss 5.7208, time 125.29ms
iter 413640: loss 6.9908, time 125.10ms
iter 413650: loss 6.3485, time 123.89ms
iter 413660: loss 6.3455, time 124.90ms
iter 413670: loss 6.0748, time 125.41ms
iter 413680: loss 6.1762, time 126.44ms
iter 413690: loss 5.7355, time 126.95ms
iter 413700: loss 5.9970, time 125.32ms
iter 413710: loss 6.6059, time 125.66ms
iter 413720: loss 6.0342, time 125.58ms
iter 413730: loss 5.6256, time 126.43ms
iter 413740: loss 6.8085, time 125.65ms
step 413750: train loss 5.5560, val loss 5.6105
saving checkpoint to out-shakespeare-char
iter 413750: loss 6.0632, time 2885.66ms
iter 413760: loss 6.5439, time 125.18ms
iter 413770: loss 6.4313, time 125.59ms
iter 413780: loss 6.1097, time 125.95ms
iter 413790: loss 5.4924, time 125.53ms
iter 413800: loss 5.6783, time 125.34ms
iter 413810: loss 5.5046, time 125.31ms
iter 413820: loss 5.7384, time 125.47ms
iter 413830: loss 6.1484, time 125.63ms
iter 413840: loss 5.9444, time 125.66ms
iter 413850: loss 6.2624, time 125.49ms
iter 413860: loss 6.0354, time 125.52ms
iter 413870: loss 6.3990, time 125.60ms
iter 413880: loss 5.8228, time 126.16ms
iter 413890: loss 6.2332, time 124.74ms
iter 413900: loss 5.8131, time 125.61ms
iter 413910: loss 6.2893, time 125.67ms
iter 413920: loss 6.1383, time 124.46ms
iter 413930: loss 6.2637, time 124.89ms
iter 413940: loss 5.3832, time 124.92ms
iter 413950: loss 6.8995, time 127.45ms
iter 413960: loss 6.0729, time 125.18ms
iter 413970: loss 6.3628, time 127.75ms
iter 413980: loss 6.0042, time 125.28ms
iter 413990: loss 5.5597, time 124.63ms
step 414000: train loss 5.5858, val loss 5.5919
saving checkpoint to out-shakespeare-char
iter 414000: loss 6.0001, time 2909.15ms
iter 414010: loss 5.9572, time 124.52ms
iter 414020: loss 6.3111, time 124.11ms
iter 414030: loss 5.4840, time 125.34ms
iter 414040: loss 6.5209, time 124.95ms
iter 414050: loss 5.9099, time 124.97ms
iter 414060: loss 6.3126, time 124.21ms
iter 414070: loss 5.8863, time 125.06ms
iter 414080: loss 6.0663, time 125.04ms
iter 414090: loss 6.4167, time 126.13ms
iter 414100: loss 6.2332, time 124.89ms
iter 414110: loss 6.5720, time 125.31ms
iter 414120: loss 6.1093, time 124.13ms
iter 414130: loss 6.0670, time 125.08ms
iter 414140: loss 6.0949, time 124.01ms
iter 414150: loss 6.0400, time 125.25ms
iter 414160: loss 5.1790, time 125.42ms
iter 414170: loss 6.3110, time 124.78ms
iter 414180: loss 5.5078, time 124.89ms
iter 414190: loss 6.1336, time 126.86ms
iter 414200: loss 5.6068, time 124.99ms
iter 414210: loss 6.2235, time 124.99ms
iter 414220: loss 6.6928, time 124.91ms
iter 414230: loss 5.7277, time 126.38ms
iter 414240: loss 5.8609, time 126.27ms
step 414250: train loss 5.5816, val loss 5.5416
saving checkpoint to out-shakespeare-char
iter 414250: loss 5.7274, time 2891.40ms
iter 414260: loss 6.5914, time 126.02ms
iter 414270: loss 6.1230, time 124.66ms
iter 414280: loss 6.3120, time 126.93ms
iter 414290: loss 5.5696, time 125.15ms
iter 414300: loss 5.1947, time 125.66ms
iter 414310: loss 5.3606, time 125.32ms
iter 414320: loss 6.0763, time 124.80ms
iter 414330: loss 6.0832, time 125.06ms
iter 414340: loss 6.4488, time 125.09ms
iter 414350: loss 6.0627, time 125.81ms
iter 414360: loss 6.3887, time 124.91ms
iter 414370: loss 5.9911, time 125.66ms
iter 414380: loss 6.1470, time 127.78ms
iter 414390: loss 6.6825, time 126.97ms
iter 414400: loss 5.7503, time 125.71ms
iter 414410: loss 5.8714, time 125.95ms
iter 414420: loss 5.1524, time 126.78ms
iter 414430: loss 5.6388, time 125.88ms
iter 414440: loss 6.0153, time 126.93ms
iter 414450: loss 5.8538, time 125.78ms
iter 414460: loss 6.1820, time 125.67ms
iter 414470: loss 6.4752, time 125.01ms
iter 414480: loss 6.1286, time 126.68ms
iter 414490: loss 5.6910, time 125.91ms
step 414500: train loss 5.5342, val loss 5.5882
saving checkpoint to out-shakespeare-char
iter 414500: loss 6.0059, time 2894.76ms
iter 414510: loss 6.8659, time 125.95ms
iter 414520: loss 6.0051, time 125.20ms
iter 414530: loss 6.0441, time 124.66ms
iter 414540: loss 6.1480, time 125.87ms
iter 414550: loss 5.6073, time 127.65ms
iter 414560: loss 6.0221, time 128.41ms
iter 414570: loss 6.4809, time 125.93ms
iter 414580: loss 5.5051, time 128.73ms
iter 414590: loss 5.6653, time 126.03ms
iter 414600: loss 6.3137, time 128.80ms
iter 414610: loss 5.2313, time 125.79ms
iter 414620: loss 6.2002, time 127.89ms
iter 414630: loss 6.2787, time 126.27ms
iter 414640: loss 5.9801, time 125.41ms
iter 414650: loss 6.4950, time 127.01ms
iter 414660: loss 6.2703, time 127.28ms
iter 414670: loss 6.9066, time 126.03ms
iter 414680: loss 6.0599, time 125.56ms
iter 414690: loss 6.2495, time 125.80ms
iter 414700: loss 6.0616, time 125.97ms
iter 414710: loss 6.1259, time 126.27ms
iter 414720: loss 5.6449, time 126.45ms
iter 414730: loss 6.2462, time 125.66ms
iter 414740: loss 6.8415, time 125.80ms
step 414750: train loss 5.5974, val loss 5.5242
saving checkpoint to out-shakespeare-char
iter 414750: loss 5.0099, time 2915.53ms
iter 414760: loss 6.6071, time 126.79ms
iter 414770: loss 6.2584, time 125.74ms
iter 414780: loss 6.0535, time 124.97ms
iter 414790: loss 6.1613, time 125.70ms
iter 414800: loss 5.5213, time 124.00ms
iter 414810: loss 6.3157, time 123.85ms
iter 414820: loss 6.4084, time 125.81ms
iter 414830: loss 6.1491, time 125.09ms
iter 414840: loss 6.0369, time 124.23ms
iter 414850: loss 5.7211, time 124.22ms
iter 414860: loss 5.7113, time 124.57ms
iter 414870: loss 6.1546, time 125.31ms
iter 414880: loss 5.8170, time 124.24ms
iter 414890: loss 6.5855, time 123.98ms
iter 414900: loss 5.4403, time 124.94ms
iter 414910: loss 6.1320, time 124.55ms
iter 414920: loss 6.4636, time 124.33ms
iter 414930: loss 5.6552, time 126.14ms
iter 414940: loss 5.8751, time 123.82ms
iter 414950: loss 5.8859, time 124.60ms
iter 414960: loss 5.4490, time 124.86ms
iter 414970: loss 5.8248, time 124.97ms
iter 414980: loss 6.0692, time 125.13ms
iter 414990: loss 5.5974, time 126.01ms
step 415000: train loss 5.5664, val loss 5.5574
saving checkpoint to out-shakespeare-char
iter 415000: loss 5.5336, time 2868.17ms
iter 415010: loss 6.3725, time 121.62ms
iter 415020: loss 5.8704, time 123.97ms
iter 415030: loss 5.8424, time 121.30ms
iter 415040: loss 6.1060, time 122.16ms
iter 415050: loss 6.1702, time 121.88ms
iter 415060: loss 5.9694, time 121.48ms
iter 415070: loss 5.6561, time 122.00ms
iter 415080: loss 6.3493, time 121.51ms
iter 415090: loss 6.0147, time 121.91ms
iter 415100: loss 5.2636, time 121.39ms
iter 415110: loss 6.1644, time 121.50ms
iter 415120: loss 5.4554, time 122.76ms
iter 415130: loss 6.1353, time 122.12ms
iter 415140: loss 6.2637, time 121.15ms
iter 415150: loss 6.8064, time 122.22ms
iter 415160: loss 5.6049, time 121.38ms
iter 415170: loss 6.4312, time 121.35ms
iter 415180: loss 6.1317, time 121.43ms
iter 415190: loss 5.8544, time 121.43ms
iter 415200: loss 6.1566, time 121.67ms
iter 415210: loss 6.5941, time 121.10ms
iter 415220: loss 5.9918, time 122.57ms
iter 415230: loss 6.0289, time 120.67ms
iter 415240: loss 6.4389, time 121.44ms
step 415250: train loss 5.5867, val loss 5.5983
saving checkpoint to out-shakespeare-char
iter 415250: loss 5.9800, time 2877.67ms
iter 415260: loss 5.0526, time 121.79ms
iter 415270: loss 5.9625, time 124.33ms
iter 415280: loss 6.3573, time 120.75ms
iter 415290: loss 6.4183, time 121.01ms
iter 415300: loss 6.3984, time 121.60ms
iter 415310: loss 5.4825, time 121.89ms
iter 415320: loss 5.9664, time 121.27ms
iter 415330: loss 6.0899, time 121.44ms
iter 415340: loss 6.1024, time 121.45ms
iter 415350: loss 6.0775, time 121.51ms
iter 415360: loss 5.8931, time 123.45ms
iter 415370: loss 5.7470, time 121.50ms
iter 415380: loss 5.0474, time 121.12ms
iter 415390: loss 5.5514, time 121.42ms
iter 415400: loss 6.1332, time 121.38ms
iter 415410: loss 5.9672, time 121.44ms
iter 415420: loss 5.7371, time 121.42ms
iter 415430: loss 6.4838, time 121.73ms
iter 415440: loss 6.3168, time 121.58ms
iter 415450: loss 6.5555, time 122.49ms
iter 415460: loss 6.1483, time 121.45ms
iter 415470: loss 6.0707, time 120.63ms
iter 415480: loss 6.2007, time 124.09ms
iter 415490: loss 6.2151, time 121.35ms
step 415500: train loss 5.6070, val loss 5.6000
saving checkpoint to out-shakespeare-char
iter 415500: loss 6.0149, time 2899.62ms
iter 415510: loss 5.8991, time 121.38ms
iter 415520: loss 5.9362, time 121.72ms
iter 415530: loss 5.9452, time 121.38ms
iter 415540: loss 5.2288, time 120.81ms
iter 415550: loss 6.7658, time 123.03ms
iter 415560: loss 5.6999, time 121.48ms
iter 415570: loss 5.9561, time 120.69ms
iter 415580: loss 6.0360, time 124.09ms
iter 415590: loss 5.9276, time 121.61ms
iter 415600: loss 6.6466, time 121.48ms
iter 415610: loss 6.1144, time 121.90ms
iter 415620: loss 6.3593, time 121.58ms
iter 415630: loss 5.3965, time 121.48ms
iter 415640: loss 5.8812, time 121.49ms
iter 415650: loss 5.7892, time 121.45ms
iter 415660: loss 5.7000, time 121.56ms
iter 415670: loss 5.7926, time 121.50ms
iter 415680: loss 5.7487, time 123.06ms
iter 415690: loss 5.8992, time 121.54ms
iter 415700: loss 5.6786, time 121.34ms
iter 415710: loss 5.7912, time 123.46ms
iter 415720: loss 5.7928, time 121.54ms
iter 415730: loss 5.6754, time 122.72ms
iter 415740: loss 5.7057, time 121.34ms
step 415750: train loss 5.5864, val loss 5.5360
saving checkpoint to out-shakespeare-char
iter 415750: loss 6.1786, time 2894.43ms
iter 415760: loss 6.0028, time 121.60ms
iter 415770: loss 6.1887, time 122.87ms
iter 415780: loss 6.0692, time 121.28ms
iter 415790: loss 5.6737, time 122.63ms
iter 415800: loss 5.7949, time 121.36ms
iter 415810: loss 5.1183, time 122.06ms
iter 415820: loss 5.5498, time 121.40ms
iter 415830: loss 5.4803, time 121.41ms
iter 415840: loss 6.1783, time 121.33ms
iter 415850: loss 5.9959, time 121.45ms
iter 415860: loss 5.9333, time 122.54ms
iter 415870: loss 6.2019, time 121.79ms
iter 415880: loss 5.9375, time 121.47ms
iter 415890: loss 6.7620, time 122.52ms
iter 415900: loss 5.6571, time 122.17ms
iter 415910: loss 6.0168, time 122.82ms
iter 415920: loss 6.1928, time 121.41ms
iter 415930: loss 5.6758, time 121.62ms
iter 415940: loss 5.8657, time 121.53ms
iter 415950: loss 5.6364, time 122.93ms
iter 415960: loss 6.1022, time 121.45ms
iter 415970: loss 6.7335, time 121.62ms
iter 415980: loss 6.3055, time 122.42ms
iter 415990: loss 5.2035, time 121.44ms
step 416000: train loss 5.5758, val loss 5.5772
saving checkpoint to out-shakespeare-char
iter 416000: loss 5.6547, time 2894.50ms
iter 416010: loss 5.8834, time 121.57ms
iter 416020: loss 5.5061, time 122.46ms
iter 416030: loss 6.8713, time 121.59ms
iter 416040: loss 5.6946, time 121.57ms
iter 416050: loss 6.1415, time 124.11ms
iter 416060: loss 5.7287, time 121.90ms
iter 416070: loss 5.7966, time 121.45ms
iter 416080: loss 6.3462, time 120.20ms
iter 416090: loss 5.8840, time 122.73ms
iter 416100: loss 5.6992, time 122.96ms
iter 416110: loss 6.3091, time 122.58ms
iter 416120: loss 5.8249, time 121.45ms
iter 416130: loss 5.8900, time 124.50ms
iter 416140: loss 6.2217, time 121.92ms
iter 416150: loss 6.0690, time 121.53ms
iter 416160: loss 6.2471, time 121.26ms
iter 416170: loss 5.1452, time 121.26ms
iter 416180: loss 5.9232, time 121.17ms
iter 416190: loss 5.4059, time 121.53ms
iter 416200: loss 6.3855, time 122.89ms
iter 416210: loss 6.6093, time 122.84ms
iter 416220: loss 5.8568, time 121.96ms
iter 416230: loss 6.7439, time 121.88ms
iter 416240: loss 5.6742, time 121.68ms
step 416250: train loss 5.5353, val loss 5.6135
saving checkpoint to out-shakespeare-char
iter 416250: loss 6.4401, time 2906.00ms
iter 416260: loss 6.0887, time 121.91ms
iter 416270: loss 6.0779, time 123.07ms
iter 416280: loss 5.5851, time 122.03ms
iter 416290: loss 6.3189, time 122.11ms
iter 416300: loss 6.0386, time 121.75ms
iter 416310: loss 5.8051, time 122.68ms
iter 416320: loss 5.6230, time 122.09ms
iter 416330: loss 6.2263, time 121.92ms
iter 416340: loss 6.5440, time 123.03ms
iter 416350: loss 6.5128, time 121.50ms
iter 416360: loss 6.1751, time 121.91ms
iter 416370: loss 6.0073, time 124.66ms
iter 416380: loss 6.0492, time 121.70ms
iter 416390: loss 6.0814, time 123.26ms
iter 416400: loss 5.7931, time 121.89ms
iter 416410: loss 6.7248, time 121.58ms
iter 416420: loss 5.5522, time 121.53ms
iter 416430: loss 5.6804, time 122.74ms
iter 416440: loss 5.8732, time 121.49ms
iter 416450: loss 5.6638, time 121.59ms
iter 416460: loss 5.9425, time 122.39ms
iter 416470: loss 5.8042, time 121.63ms
iter 416480: loss 5.5570, time 121.50ms
iter 416490: loss 5.9937, time 124.00ms
step 416500: train loss 5.5681, val loss 5.6124
saving checkpoint to out-shakespeare-char
iter 416500: loss 5.5485, time 2888.58ms
iter 416510: loss 5.7466, time 120.38ms
iter 416520: loss 6.4048, time 120.50ms
iter 416530: loss 4.8673, time 123.13ms
iter 416540: loss 5.4080, time 120.77ms
iter 416550: loss 5.4985, time 120.52ms
iter 416560: loss 6.3926, time 121.63ms
iter 416570: loss 5.8839, time 120.57ms
iter 416580: loss 6.1930, time 124.77ms
iter 416590: loss 6.2984, time 120.21ms
iter 416600: loss 5.8604, time 120.63ms
iter 416610: loss 5.4957, time 121.48ms
iter 416620: loss 5.9831, time 120.32ms
iter 416630: loss 6.2319, time 123.98ms
iter 416640: loss 5.6700, time 125.28ms
iter 416650: loss 5.8823, time 125.55ms
iter 416660: loss 6.5406, time 126.49ms
iter 416670: loss 5.7164, time 125.02ms
iter 416680: loss 5.6339, time 125.50ms
iter 416690: loss 5.9512, time 126.02ms
iter 416700: loss 6.3593, time 126.13ms
iter 416710: loss 5.8852, time 125.61ms
iter 416720: loss 5.1546, time 121.83ms
iter 416730: loss 6.7005, time 122.18ms
iter 416740: loss 5.6470, time 124.60ms
step 416750: train loss 5.5447, val loss 5.5630
saving checkpoint to out-shakespeare-char
iter 416750: loss 6.6573, time 2901.61ms
iter 416760: loss 6.4314, time 121.83ms
iter 416770: loss 5.7246, time 125.95ms
iter 416780: loss 6.0634, time 125.06ms
iter 416790: loss 5.5225, time 125.85ms
iter 416800: loss 5.9578, time 124.94ms
iter 416810: loss 6.2378, time 125.58ms
iter 416820: loss 5.5572, time 125.33ms
iter 416830: loss 5.2866, time 125.25ms
iter 416840: loss 6.0436, time 125.65ms
iter 416850: loss 6.3296, time 125.84ms
iter 416860: loss 6.7042, time 125.50ms
iter 416870: loss 6.1087, time 125.43ms
iter 416880: loss 5.6822, time 125.25ms
iter 416890: loss 6.3160, time 125.15ms
iter 416900: loss 5.8915, time 125.37ms
iter 416910: loss 6.3631, time 125.27ms
iter 416920: loss 5.4170, time 125.06ms
iter 416930: loss 6.7097, time 123.94ms
iter 416940: loss 6.9019, time 123.78ms
iter 416950: loss 5.6538, time 125.15ms
iter 416960: loss 6.4438, time 125.23ms
iter 416970: loss 5.9764, time 125.20ms
iter 416980: loss 6.5498, time 124.47ms
iter 416990: loss 5.5055, time 125.14ms
step 417000: train loss 5.5489, val loss 5.6056
saving checkpoint to out-shakespeare-char
iter 417000: loss 5.9576, time 2876.60ms
iter 417010: loss 6.3726, time 121.41ms
iter 417020: loss 5.5060, time 122.07ms
iter 417030: loss 5.4635, time 124.61ms
iter 417040: loss 5.2544, time 122.25ms
iter 417050: loss 5.7728, time 122.44ms
iter 417060: loss 6.5118, time 122.02ms
iter 417070: loss 6.0011, time 122.01ms
iter 417080: loss 6.2968, time 122.05ms
iter 417090: loss 5.1888, time 122.04ms
iter 417100: loss 6.1797, time 123.26ms
iter 417110: loss 6.2218, time 122.01ms
iter 417120: loss 6.5300, time 123.09ms
iter 417130: loss 6.3278, time 122.13ms
iter 417140: loss 5.9830, time 122.10ms
iter 417150: loss 5.8898, time 122.58ms
iter 417160: loss 6.5620, time 121.94ms
iter 417170: loss 6.2289, time 122.23ms
iter 417180: loss 6.6193, time 124.73ms
iter 417190: loss 5.9949, time 122.98ms
iter 417200: loss 5.7099, time 122.18ms
iter 417210: loss 6.4628, time 121.93ms
iter 417220: loss 6.0108, time 123.29ms
iter 417230: loss 5.7131, time 122.10ms
iter 417240: loss 6.1562, time 121.60ms
step 417250: train loss 5.5728, val loss 5.5838
saving checkpoint to out-shakespeare-char
iter 417250: loss 6.6051, time 2920.39ms
iter 417260: loss 6.6384, time 125.73ms
iter 417270: loss 5.8089, time 126.92ms
iter 417280: loss 5.5711, time 125.62ms
iter 417290: loss 5.4854, time 125.48ms
iter 417300: loss 6.6397, time 125.11ms
iter 417310: loss 6.2103, time 125.46ms
iter 417320: loss 5.5291, time 125.83ms
iter 417330: loss 5.4716, time 125.89ms
iter 417340: loss 6.7148, time 125.45ms
iter 417350: loss 6.0078, time 125.56ms
iter 417360: loss 5.8435, time 125.64ms
iter 417370: loss 6.5113, time 125.49ms
iter 417380: loss 5.8832, time 127.08ms
iter 417390: loss 5.5154, time 128.23ms
iter 417400: loss 6.0389, time 125.62ms
iter 417410: loss 5.7641, time 128.34ms
iter 417420: loss 5.9988, time 125.86ms
iter 417430: loss 5.9236, time 128.49ms
iter 417440: loss 6.5478, time 125.38ms
iter 417450: loss 5.4394, time 128.20ms
iter 417460: loss 5.7182, time 124.82ms
iter 417470: loss 5.5924, time 126.01ms
iter 417480: loss 5.4975, time 125.74ms
iter 417490: loss 5.4567, time 127.41ms
step 417500: train loss 5.6173, val loss 5.5657
saving checkpoint to out-shakespeare-char
iter 417500: loss 6.4083, time 2896.81ms
iter 417510: loss 6.1448, time 125.40ms
iter 417520: loss 5.8627, time 124.47ms
iter 417530: loss 6.1034, time 125.40ms
iter 417540: loss 5.2323, time 126.79ms
iter 417550: loss 6.3106, time 125.47ms
iter 417560: loss 5.4071, time 124.68ms
iter 417570: loss 6.3652, time 125.37ms
iter 417580: loss 5.9914, time 125.40ms
iter 417590: loss 5.9006, time 126.24ms
iter 417600: loss 5.7326, time 126.04ms
iter 417610: loss 6.1451, time 125.28ms
iter 417620: loss 5.7013, time 125.53ms
iter 417630: loss 5.2791, time 125.36ms
iter 417640: loss 6.2168, time 125.39ms
iter 417650: loss 5.8731, time 119.45ms
iter 417660: loss 6.4691, time 120.82ms
iter 417670: loss 6.5155, time 119.54ms
iter 417680: loss 5.8732, time 119.70ms
iter 417690: loss 6.0732, time 119.71ms
iter 417700: loss 6.5440, time 121.14ms
iter 417710: loss 5.8028, time 119.73ms
iter 417720: loss 6.0208, time 120.89ms
iter 417730: loss 6.2152, time 121.80ms
iter 417740: loss 6.3821, time 120.41ms
step 417750: train loss 5.5293, val loss 5.5597
saving checkpoint to out-shakespeare-char
iter 417750: loss 6.5786, time 2873.37ms
iter 417760: loss 6.8981, time 125.38ms
iter 417770: loss 5.7445, time 125.74ms
iter 417780: loss 5.7412, time 125.43ms
iter 417790: loss 6.3182, time 125.14ms
iter 417800: loss 5.4021, time 124.86ms
iter 417810: loss 6.1662, time 125.82ms
iter 417820: loss 6.2011, time 125.95ms
iter 417830: loss 6.1375, time 125.72ms
iter 417840: loss 5.7898, time 125.62ms
iter 417850: loss 6.3985, time 125.82ms
iter 417860: loss 5.5025, time 126.98ms
iter 417870: loss 5.8281, time 125.70ms
iter 417880: loss 5.9310, time 125.64ms
iter 417890: loss 5.8561, time 125.27ms
iter 417900: loss 6.0152, time 125.79ms
iter 417910: loss 6.3188, time 126.08ms
iter 417920: loss 6.1230, time 126.20ms
iter 417930: loss 6.3134, time 127.60ms
iter 417940: loss 6.2081, time 125.39ms
iter 417950: loss 6.2293, time 125.93ms
iter 417960: loss 5.9428, time 125.23ms
iter 417970: loss 5.5666, time 126.98ms
iter 417980: loss 6.3413, time 125.20ms
iter 417990: loss 5.7582, time 125.53ms
step 418000: train loss 5.5888, val loss 5.5977
saving checkpoint to out-shakespeare-char
iter 418000: loss 5.9699, time 2879.64ms
iter 418010: loss 6.0252, time 128.58ms
iter 418020: loss 5.6978, time 125.74ms
iter 418030: loss 6.2837, time 126.57ms
iter 418040: loss 5.4567, time 125.64ms
iter 418050: loss 6.4916, time 126.52ms
iter 418060: loss 5.5500, time 125.54ms
iter 418070: loss 6.1061, time 125.93ms
iter 418080: loss 5.9762, time 125.27ms
iter 418090: loss 5.2191, time 125.24ms
iter 418100: loss 5.6079, time 125.33ms
iter 418110: loss 6.0174, time 124.95ms
iter 418120: loss 6.0807, time 125.24ms
iter 418130: loss 6.8777, time 125.92ms
iter 418140: loss 6.2839, time 126.83ms
iter 418150: loss 6.4777, time 125.48ms
iter 418160: loss 5.6785, time 125.61ms
iter 418170: loss 6.1755, time 125.81ms
iter 418180: loss 5.7447, time 125.43ms
iter 418190: loss 6.3079, time 125.53ms
iter 418200: loss 6.1510, time 125.86ms
iter 418210: loss 6.1807, time 125.08ms
iter 418220: loss 5.1597, time 125.67ms
iter 418230: loss 6.3079, time 125.33ms
iter 418240: loss 5.7034, time 125.99ms
step 418250: train loss 5.5855, val loss 5.5705
saving checkpoint to out-shakespeare-char
iter 418250: loss 6.0117, time 2878.92ms
iter 418260: loss 6.5249, time 125.57ms
iter 418270: loss 5.8950, time 125.43ms
iter 418280: loss 5.8673, time 125.34ms
iter 418290: loss 5.8112, time 125.79ms
iter 418300: loss 6.2020, time 125.77ms
iter 418310: loss 5.8603, time 125.96ms
iter 418320: loss 5.3031, time 126.85ms
iter 418330: loss 6.3096, time 125.69ms
iter 418340: loss 6.7756, time 125.26ms
iter 418350: loss 5.8406, time 125.33ms
iter 418360: loss 5.9133, time 125.60ms
iter 418370: loss 5.7420, time 125.17ms
iter 418380: loss 6.6340, time 125.09ms
iter 418390: loss 6.0486, time 125.23ms
iter 418400: loss 5.9052, time 125.52ms
iter 418410: loss 7.3810, time 125.48ms
iter 418420: loss 6.1822, time 124.91ms
iter 418430: loss 6.4531, time 125.30ms
iter 418440: loss 6.5648, time 127.70ms
iter 418450: loss 5.7970, time 125.23ms
iter 418460: loss 5.9773, time 127.89ms
iter 418470: loss 5.4377, time 125.23ms
iter 418480: loss 5.2960, time 127.67ms
iter 418490: loss 6.4845, time 125.01ms
step 418500: train loss 5.5734, val loss 5.5975
saving checkpoint to out-shakespeare-char
iter 418500: loss 6.3917, time 2907.85ms
iter 418510: loss 6.5749, time 124.54ms
iter 418520: loss 5.9956, time 125.30ms
iter 418530: loss 5.2353, time 125.28ms
iter 418540: loss 6.3126, time 125.53ms
iter 418550: loss 6.3590, time 124.89ms
iter 418560: loss 6.5735, time 125.48ms
iter 418570: loss 5.8272, time 125.26ms
iter 418580: loss 6.2963, time 125.40ms
iter 418590: loss 5.4405, time 126.26ms
iter 418600: loss 5.8838, time 125.39ms
iter 418610: loss 6.0982, time 125.48ms
iter 418620: loss 6.1841, time 125.32ms
iter 418630: loss 6.6978, time 125.52ms
iter 418640: loss 6.0458, time 126.57ms
iter 418650: loss 6.6689, time 125.71ms
iter 418660: loss 6.2507, time 125.29ms
iter 418670: loss 6.2333, time 125.94ms
iter 418680: loss 5.7287, time 125.39ms
iter 418690: loss 5.7553, time 125.59ms
iter 418700: loss 5.7837, time 126.73ms
iter 418710: loss 5.6633, time 125.82ms
iter 418720: loss 6.4288, time 125.62ms
iter 418730: loss 6.3019, time 125.52ms
iter 418740: loss 6.1911, time 125.92ms
step 418750: train loss 5.5906, val loss 5.6170
saving checkpoint to out-shakespeare-char
iter 418750: loss 5.7556, time 2879.60ms
iter 418760: loss 5.5576, time 126.07ms
iter 418770: loss 5.9672, time 124.69ms
iter 418780: loss 6.2922, time 126.08ms
iter 418790: loss 6.4100, time 125.39ms
iter 418800: loss 5.9443, time 125.30ms
iter 418810: loss 4.7990, time 125.21ms
iter 418820: loss 6.6315, time 124.93ms
iter 418830: loss 6.0222, time 125.66ms
iter 418840: loss 5.3109, time 124.80ms
iter 418850: loss 5.6956, time 125.22ms
iter 418860: loss 5.9393, time 124.78ms
iter 418870: loss 6.2291, time 124.80ms
iter 418880: loss 6.3980, time 124.45ms
iter 418890: loss 5.5803, time 125.14ms
iter 418900: loss 5.8136, time 124.10ms
iter 418910: loss 6.0225, time 123.70ms
iter 418920: loss 6.1565, time 125.09ms
iter 418930: loss 5.0919, time 125.18ms
iter 418940: loss 5.8827, time 124.81ms
iter 418950: loss 5.1657, time 125.61ms
iter 418960: loss 5.9731, time 125.41ms
iter 418970: loss 6.0419, time 125.37ms
iter 418980: loss 5.9054, time 124.78ms
iter 418990: loss 6.9225, time 125.53ms
step 419000: train loss 5.5397, val loss 5.5816
saving checkpoint to out-shakespeare-char
iter 419000: loss 6.3127, time 2884.50ms
iter 419010: loss 6.0051, time 125.88ms
iter 419020: loss 5.7793, time 127.82ms
iter 419030: loss 5.5104, time 125.12ms
iter 419040: loss 6.2505, time 124.91ms
iter 419050: loss 5.7048, time 124.95ms
iter 419060: loss 6.3409, time 124.93ms
iter 419070: loss 6.3735, time 124.87ms
iter 419080: loss 6.3021, time 124.87ms
iter 419090: loss 5.9362, time 125.36ms
iter 419100: loss 6.1517, time 125.46ms
iter 419110: loss 5.8358, time 121.62ms
iter 419120: loss 6.1118, time 121.45ms
iter 419130: loss 5.9604, time 123.44ms
iter 419140: loss 5.8063, time 121.67ms
iter 419150: loss 5.5711, time 122.15ms
iter 419160: loss 5.8123, time 121.76ms
iter 419170: loss 6.1750, time 121.89ms
iter 419180: loss 6.1830, time 121.92ms
iter 419190: loss 6.1750, time 122.03ms
iter 419200: loss 5.8073, time 123.15ms
iter 419210: loss 6.1202, time 122.04ms
iter 419220: loss 6.2126, time 122.16ms
iter 419230: loss 6.3774, time 123.21ms
iter 419240: loss 6.0443, time 122.13ms
step 419250: train loss 5.5933, val loss 5.5992
saving checkpoint to out-shakespeare-char
iter 419250: loss 5.7643, time 2904.12ms
iter 419260: loss 6.3141, time 122.74ms
iter 419270: loss 5.9938, time 122.21ms
iter 419280: loss 6.2199, time 121.77ms
iter 419290: loss 5.7529, time 121.67ms
iter 419300: loss 5.4177, time 123.08ms
iter 419310: loss 6.2101, time 122.41ms
iter 419320: loss 6.0334, time 122.52ms
iter 419330: loss 5.1138, time 123.28ms
iter 419340: loss 6.4489, time 121.93ms
iter 419350: loss 5.8966, time 122.24ms
iter 419360: loss 7.1383, time 123.22ms
iter 419370: loss 6.4318, time 122.18ms
iter 419380: loss 6.4625, time 122.53ms
iter 419390: loss 6.2405, time 124.60ms
iter 419400: loss 5.5859, time 121.66ms
iter 419410: loss 5.6795, time 121.82ms
iter 419420: loss 5.5180, time 122.04ms
iter 419430: loss 6.0545, time 121.97ms
iter 419440: loss 5.9947, time 121.19ms
iter 419450: loss 6.6996, time 121.82ms
iter 419460: loss 5.6767, time 123.35ms
iter 419470: loss 5.9478, time 121.62ms
iter 419480: loss 5.6377, time 121.96ms
iter 419490: loss 6.2597, time 123.37ms
step 419500: train loss 5.5833, val loss 5.5300
saving checkpoint to out-shakespeare-char
iter 419500: loss 6.5743, time 2902.65ms
iter 419510: loss 5.6484, time 121.96ms
iter 419520: loss 6.5232, time 123.41ms
iter 419530: loss 5.3690, time 122.45ms
iter 419540: loss 5.5229, time 121.68ms
iter 419550: loss 5.6632, time 124.44ms
iter 419560: loss 6.4304, time 121.58ms
iter 419570: loss 6.1895, time 122.43ms
iter 419580: loss 4.9166, time 121.85ms
iter 419590: loss 5.9219, time 121.88ms
iter 419600: loss 5.3049, time 121.83ms
iter 419610: loss 6.0526, time 121.62ms
iter 419620: loss 5.8789, time 123.17ms
iter 419630: loss 5.9681, time 121.10ms
iter 419640: loss 6.0692, time 121.17ms
iter 419650: loss 6.0961, time 122.72ms
iter 419660: loss 5.5625, time 122.27ms
iter 419670: loss 5.8855, time 121.72ms
iter 419680: loss 6.2180, time 123.39ms
iter 419690: loss 5.9618, time 121.76ms
iter 419700: loss 5.5671, time 121.63ms
iter 419710: loss 5.2530, time 122.05ms
iter 419720: loss 5.9583, time 122.16ms
iter 419730: loss 5.7533, time 121.37ms
iter 419740: loss 5.9692, time 121.83ms
step 419750: train loss 5.5892, val loss 5.5435
saving checkpoint to out-shakespeare-char
iter 419750: loss 6.2510, time 2892.89ms
iter 419760: loss 5.6074, time 121.18ms
iter 419770: loss 5.2611, time 121.89ms
iter 419780: loss 6.5229, time 120.96ms
iter 419790: loss 5.0495, time 122.31ms
iter 419800: loss 5.7971, time 121.87ms
iter 419810: loss 4.8609, time 121.98ms
iter 419820: loss 6.2299, time 122.83ms
iter 419830: loss 5.5913, time 121.21ms
iter 419840: loss 6.3391, time 121.82ms
iter 419850: loss 5.9020, time 122.92ms
iter 419860: loss 5.9780, time 121.79ms
iter 419870: loss 5.5725, time 120.89ms
iter 419880: loss 5.9434, time 122.10ms
iter 419890: loss 6.4294, time 121.95ms
iter 419900: loss 6.2965, time 122.43ms
iter 419910: loss 6.5230, time 121.85ms
iter 419920: loss 5.9288, time 122.37ms
iter 419930: loss 5.3968, time 121.82ms
iter 419940: loss 5.9545, time 121.58ms
iter 419950: loss 5.4628, time 122.86ms
iter 419960: loss 6.3003, time 121.02ms
iter 419970: loss 5.8954, time 122.00ms
iter 419980: loss 6.3558, time 124.39ms
iter 419990: loss 5.7013, time 122.15ms
step 420000: train loss 5.5975, val loss 5.5825
saving checkpoint to out-shakespeare-char
iter 420000: loss 6.4181, time 2903.73ms
iter 420010: loss 5.7981, time 122.53ms
iter 420020: loss 6.4191, time 122.19ms
iter 420030: loss 6.1521, time 121.81ms
iter 420040: loss 6.0567, time 121.53ms
iter 420050: loss 5.4733, time 123.53ms
iter 420060: loss 5.4147, time 121.35ms
iter 420070: loss 6.5273, time 121.87ms
iter 420080: loss 5.9928, time 123.10ms
iter 420090: loss 6.4676, time 122.07ms
iter 420100: loss 5.9958, time 122.01ms
iter 420110: loss 6.3596, time 125.98ms
iter 420120: loss 6.2950, time 125.70ms
iter 420130: loss 6.4155, time 126.11ms
iter 420140: loss 6.1960, time 125.63ms
iter 420150: loss 5.1625, time 125.83ms
iter 420160: loss 7.1141, time 125.66ms
iter 420170: loss 6.2725, time 125.95ms
iter 420180: loss 5.7176, time 125.65ms
iter 420190: loss 5.1699, time 126.07ms
iter 420200: loss 5.7418, time 124.85ms
iter 420210: loss 6.0001, time 125.74ms
iter 420220: loss 6.0728, time 124.88ms
iter 420230: loss 6.4839, time 125.88ms
iter 420240: loss 6.2191, time 125.68ms
step 420250: train loss 5.6009, val loss 5.6413
saving checkpoint to out-shakespeare-char
iter 420250: loss 5.9822, time 2912.90ms
iter 420260: loss 6.8083, time 125.65ms
iter 420270: loss 5.4423, time 125.77ms
iter 420280: loss 6.1178, time 125.97ms
iter 420290: loss 5.8411, time 125.72ms
iter 420300: loss 5.5278, time 125.65ms
iter 420310: loss 6.0592, time 125.85ms
iter 420320: loss 5.8231, time 125.92ms
iter 420330: loss 5.1476, time 125.81ms
iter 420340: loss 5.8453, time 125.64ms
iter 420350: loss 5.8117, time 127.57ms
iter 420360: loss 5.6742, time 125.51ms
iter 420370: loss 6.3384, time 128.24ms
iter 420380: loss 5.9301, time 125.81ms
iter 420390: loss 6.1496, time 128.10ms
iter 420400: loss 5.6572, time 125.77ms
iter 420410: loss 5.8579, time 128.47ms
iter 420420: loss 5.3261, time 125.57ms
iter 420430: loss 5.4643, time 126.21ms
iter 420440: loss 5.9656, time 125.37ms
iter 420450: loss 5.3667, time 125.81ms
iter 420460: loss 5.6475, time 124.72ms
iter 420470: loss 6.1251, time 125.48ms
iter 420480: loss 6.8845, time 125.16ms
iter 420490: loss 5.7006, time 125.97ms
step 420500: train loss 5.5283, val loss 5.6053
saving checkpoint to out-shakespeare-char
iter 420500: loss 6.2332, time 2884.00ms
iter 420510: loss 5.6225, time 126.02ms
iter 420520: loss 6.4898, time 126.94ms
iter 420530: loss 5.6044, time 125.73ms
iter 420540: loss 6.0718, time 125.63ms
iter 420550: loss 5.7117, time 125.83ms
iter 420560: loss 5.7673, time 125.59ms
iter 420570: loss 5.1527, time 125.73ms
iter 420580: loss 5.8331, time 126.10ms
iter 420590: loss 6.6337, time 125.61ms
iter 420600: loss 5.6467, time 124.49ms
iter 420610: loss 6.0199, time 125.81ms
iter 420620: loss 5.5246, time 125.68ms
iter 420630: loss 5.4852, time 124.78ms
iter 420640: loss 6.4485, time 125.72ms
iter 420650: loss 6.2392, time 125.89ms
iter 420660: loss 6.3023, time 125.63ms
iter 420670: loss 6.3748, time 124.78ms
iter 420680: loss 6.0889, time 125.63ms
iter 420690: loss 6.1029, time 125.44ms
iter 420700: loss 6.6221, time 125.58ms
iter 420710: loss 6.4363, time 124.69ms
iter 420720: loss 6.4717, time 125.61ms
iter 420730: loss 5.7850, time 125.68ms
iter 420740: loss 5.8878, time 125.45ms
step 420750: train loss 5.5887, val loss 5.5276
saving checkpoint to out-shakespeare-char
iter 420750: loss 5.8395, time 2886.85ms
iter 420760: loss 5.3413, time 125.58ms
iter 420770: loss 6.5907, time 125.49ms
iter 420780: loss 6.4787, time 126.12ms
iter 420790: loss 5.8056, time 125.91ms
iter 420800: loss 5.7575, time 125.53ms
iter 420810: loss 5.9046, time 125.37ms
iter 420820: loss 5.9157, time 125.66ms
iter 420830: loss 5.8880, time 125.65ms
iter 420840: loss 6.0504, time 125.64ms
iter 420850: loss 6.1396, time 125.74ms
iter 420860: loss 6.0133, time 125.60ms
iter 420870: loss 5.9021, time 127.12ms
iter 420880: loss 6.0043, time 125.76ms
iter 420890: loss 5.9968, time 127.74ms
iter 420900: loss 6.3836, time 125.80ms
iter 420910: loss 5.8631, time 128.33ms
iter 420920: loss 5.8939, time 125.80ms
iter 420930: loss 6.1407, time 127.13ms
iter 420940: loss 5.9738, time 125.61ms
iter 420950: loss 6.2716, time 128.00ms
iter 420960: loss 5.9433, time 125.71ms
iter 420970: loss 5.9001, time 128.09ms
iter 420980: loss 6.0658, time 126.10ms
iter 420990: loss 6.3279, time 129.07ms
step 421000: train loss 5.6208, val loss 5.5571
saving checkpoint to out-shakespeare-char
iter 421000: loss 5.6495, time 2889.10ms
iter 421010: loss 5.5079, time 125.89ms
iter 421020: loss 5.7705, time 125.69ms
iter 421030: loss 6.5150, time 128.84ms
iter 421040: loss 6.3423, time 125.59ms
iter 421050: loss 5.8905, time 125.57ms
iter 421060: loss 6.1504, time 125.67ms
iter 421070: loss 6.2302, time 125.83ms
iter 421080: loss 6.1441, time 127.10ms
iter 421090: loss 6.4435, time 125.64ms
iter 421100: loss 6.0215, time 125.41ms
iter 421110: loss 5.7821, time 127.66ms
iter 421120: loss 5.9380, time 125.90ms
iter 421130: loss 6.4886, time 126.92ms
iter 421140: loss 6.0014, time 125.67ms
iter 421150: loss 5.9019, time 127.00ms
iter 421160: loss 5.5867, time 125.83ms
iter 421170: loss 5.5511, time 122.43ms
iter 421180: loss 6.0498, time 121.94ms
iter 421190: loss 5.2035, time 121.13ms
iter 421200: loss 5.0961, time 122.11ms
iter 421210: loss 5.7164, time 121.87ms
iter 421220: loss 5.5396, time 121.78ms
iter 421230: loss 6.1700, time 123.31ms
iter 421240: loss 5.5645, time 121.67ms
step 421250: train loss 5.5931, val loss 5.5602
saving checkpoint to out-shakespeare-char
iter 421250: loss 6.2088, time 2917.28ms
iter 421260: loss 5.4185, time 126.08ms
iter 421270: loss 6.4295, time 125.78ms
iter 421280: loss 6.0431, time 125.06ms
iter 421290: loss 5.1756, time 125.96ms
iter 421300: loss 5.3835, time 125.95ms
iter 421310: loss 6.0796, time 126.03ms
iter 421320: loss 6.1807, time 128.47ms
iter 421330: loss 5.6255, time 125.92ms
iter 421340: loss 5.8921, time 127.78ms
iter 421350: loss 5.8754, time 126.22ms
iter 421360: loss 6.5805, time 129.21ms
iter 421370: loss 6.5801, time 125.73ms
iter 421380: loss 6.4475, time 128.27ms
iter 421390: loss 6.9855, time 125.81ms
iter 421400: loss 5.7404, time 127.91ms
iter 421410: loss 5.1889, time 125.55ms
iter 421420: loss 6.2426, time 128.33ms
iter 421430: loss 6.0604, time 126.25ms
iter 421440: loss 6.5435, time 128.36ms
iter 421450: loss 5.5594, time 126.00ms
iter 421460: loss 5.8806, time 128.97ms
iter 421470: loss 6.0755, time 125.63ms
iter 421480: loss 5.7737, time 128.09ms
iter 421490: loss 6.5434, time 125.44ms
step 421500: train loss 5.6137, val loss 5.5881
saving checkpoint to out-shakespeare-char
iter 421500: loss 5.9071, time 2882.65ms
iter 421510: loss 5.2402, time 122.64ms
iter 421520: loss 5.6464, time 121.41ms
iter 421530: loss 5.5697, time 121.73ms
iter 421540: loss 6.3293, time 122.62ms
iter 421550: loss 6.1334, time 121.30ms
iter 421560: loss 7.0579, time 121.45ms
iter 421570: loss 5.9156, time 124.13ms
iter 421580: loss 5.9107, time 121.38ms
iter 421590: loss 6.0034, time 121.59ms
iter 421600: loss 6.3562, time 121.48ms
iter 421610: loss 6.9032, time 121.65ms
iter 421620: loss 6.3239, time 121.51ms
iter 421630: loss 6.4176, time 121.45ms
iter 421640: loss 5.9844, time 122.91ms
iter 421650: loss 6.5192, time 121.64ms
iter 421660: loss 6.0703, time 121.52ms
iter 421670: loss 6.1854, time 122.68ms
iter 421680: loss 5.6307, time 121.06ms
iter 421690: loss 6.0244, time 121.55ms
iter 421700: loss 5.1370, time 124.04ms
iter 421710: loss 5.9459, time 121.26ms
iter 421720: loss 5.5734, time 121.53ms
iter 421730: loss 6.1604, time 121.30ms
iter 421740: loss 5.4542, time 121.26ms
step 421750: train loss 5.5534, val loss 5.5752
saving checkpoint to out-shakespeare-char
iter 421750: loss 5.9538, time 2895.46ms
iter 421760: loss 5.6863, time 125.48ms
iter 421770: loss 5.3458, time 125.26ms
iter 421780: loss 5.4436, time 125.33ms
iter 421790: loss 6.0802, time 124.64ms
iter 421800: loss 6.0702, time 125.71ms
iter 421810: loss 5.8649, time 125.12ms
iter 421820: loss 6.4029, time 124.99ms
iter 421830: loss 5.9733, time 125.45ms
iter 421840: loss 6.0675, time 125.65ms
iter 421850: loss 6.1438, time 125.67ms
iter 421860: loss 6.8086, time 125.74ms
iter 421870: loss 6.7482, time 125.77ms
iter 421880: loss 6.3513, time 125.71ms
iter 421890: loss 5.5529, time 126.15ms
iter 421900: loss 5.9091, time 125.94ms
iter 421910: loss 6.5673, time 126.07ms
iter 421920: loss 6.6285, time 125.73ms
iter 421930: loss 5.8161, time 125.49ms
iter 421940: loss 5.9599, time 125.65ms
iter 421950: loss 6.0078, time 125.91ms
iter 421960: loss 6.0815, time 125.30ms
iter 421970: loss 6.1623, time 125.02ms
iter 421980: loss 6.2902, time 127.08ms
iter 421990: loss 6.1029, time 125.36ms
step 422000: train loss 5.5583, val loss 5.5578
saving checkpoint to out-shakespeare-char
iter 422000: loss 5.8304, time 2897.43ms
iter 422010: loss 6.0478, time 125.50ms
iter 422020: loss 5.7449, time 125.58ms
iter 422030: loss 5.8379, time 125.47ms
iter 422040: loss 5.8592, time 125.21ms
iter 422050: loss 6.1118, time 125.34ms
iter 422060: loss 5.6351, time 125.41ms
iter 422070: loss 6.0337, time 125.18ms
iter 422080: loss 5.6906, time 124.27ms
iter 422090: loss 5.8562, time 125.11ms
iter 422100: loss 6.5379, time 125.17ms
iter 422110: loss 5.8530, time 124.96ms
iter 422120: loss 5.4260, time 125.57ms
iter 422130: loss 5.9705, time 124.00ms
iter 422140: loss 5.9568, time 123.53ms
iter 422150: loss 5.8984, time 125.11ms
iter 422160: loss 5.7105, time 125.05ms
iter 422170: loss 6.3998, time 124.59ms
iter 422180: loss 5.9802, time 125.89ms
iter 422190: loss 6.5284, time 125.92ms
iter 422200: loss 6.2868, time 126.34ms
iter 422210: loss 5.5994, time 125.73ms
iter 422220: loss 6.1584, time 125.55ms
iter 422230: loss 6.3089, time 126.11ms
iter 422240: loss 5.3400, time 125.90ms
step 422250: train loss 5.5694, val loss 5.6019
saving checkpoint to out-shakespeare-char
iter 422250: loss 5.9475, time 2900.85ms
iter 422260: loss 6.3109, time 125.78ms
iter 422270: loss 5.8181, time 125.12ms
iter 422280: loss 7.0925, time 125.49ms
iter 422290: loss 5.9771, time 125.70ms
iter 422300: loss 6.3911, time 124.91ms
iter 422310: loss 6.8743, time 126.20ms
iter 422320: loss 5.7423, time 125.41ms
iter 422330: loss 6.4957, time 125.68ms
iter 422340: loss 5.4324, time 123.88ms
iter 422350: loss 6.9655, time 125.56ms
iter 422360: loss 5.5877, time 125.61ms
iter 422370: loss 6.2007, time 125.58ms
iter 422380: loss 6.2383, time 125.87ms
iter 422390: loss 6.6752, time 125.37ms
iter 422400: loss 5.9738, time 125.47ms
iter 422410: loss 6.3438, time 125.66ms
iter 422420: loss 5.7927, time 125.92ms
iter 422430: loss 5.9811, time 125.63ms
iter 422440: loss 5.3778, time 125.60ms
iter 422450: loss 6.2018, time 125.47ms
iter 422460: loss 6.2540, time 126.36ms
iter 422470: loss 6.1537, time 126.01ms
iter 422480: loss 6.0112, time 126.95ms
iter 422490: loss 5.8980, time 125.67ms
step 422500: train loss 5.5648, val loss 5.5713
saving checkpoint to out-shakespeare-char
iter 422500: loss 5.7896, time 2883.58ms
iter 422510: loss 6.1837, time 125.13ms
iter 422520: loss 5.7410, time 125.12ms
iter 422530: loss 6.2211, time 124.65ms
iter 422540: loss 5.7660, time 124.68ms
iter 422550: loss 5.3804, time 124.87ms
iter 422560: loss 5.8842, time 125.13ms
iter 422570: loss 6.0134, time 125.14ms
iter 422580: loss 6.3736, time 125.79ms
iter 422590: loss 6.1595, time 124.12ms
iter 422600: loss 6.5968, time 125.28ms
iter 422610: loss 5.6419, time 125.19ms
iter 422620: loss 6.3321, time 125.49ms
iter 422630: loss 5.8383, time 125.60ms
iter 422640: loss 5.7261, time 125.21ms
iter 422650: loss 6.7757, time 125.31ms
iter 422660: loss 5.5928, time 126.35ms
iter 422670: loss 5.8812, time 125.68ms
iter 422680: loss 6.0436, time 124.87ms
iter 422690: loss 5.9602, time 125.34ms
iter 422700: loss 5.4041, time 125.48ms
iter 422710: loss 6.2365, time 125.42ms
iter 422720: loss 6.1300, time 125.53ms
iter 422730: loss 6.3460, time 125.32ms
iter 422740: loss 6.1883, time 125.47ms
step 422750: train loss 5.5912, val loss 5.5653
saving checkpoint to out-shakespeare-char
iter 422750: loss 6.0215, time 2913.06ms
iter 422760: loss 5.8333, time 125.25ms
iter 422770: loss 5.4764, time 125.59ms
iter 422780: loss 6.1012, time 125.76ms
iter 422790: loss 6.1124, time 125.34ms
iter 422800: loss 5.6925, time 125.64ms
iter 422810: loss 5.3163, time 125.34ms
iter 422820: loss 5.8146, time 121.15ms
iter 422830: loss 5.9492, time 120.76ms
iter 422840: loss 6.1480, time 119.60ms
iter 422850: loss 5.9394, time 120.65ms
iter 422860: loss 5.4520, time 122.98ms
iter 422870: loss 5.9062, time 119.95ms
iter 422880: loss 5.7926, time 119.75ms
iter 422890: loss 5.7404, time 119.70ms
iter 422900: loss 5.9038, time 120.69ms
iter 422910: loss 6.1451, time 120.55ms
iter 422920: loss 5.9214, time 120.00ms
iter 422930: loss 6.3335, time 120.64ms
iter 422940: loss 5.3312, time 119.44ms
iter 422950: loss 5.8457, time 120.60ms
iter 422960: loss 6.6746, time 124.01ms
iter 422970: loss 6.4500, time 121.60ms
iter 422980: loss 6.1305, time 121.53ms
iter 422990: loss 6.1684, time 121.46ms
step 423000: train loss 5.5647, val loss 5.6197
saving checkpoint to out-shakespeare-char
iter 423000: loss 5.8592, time 2904.79ms
iter 423010: loss 5.5701, time 121.16ms
iter 423020: loss 5.9842, time 124.12ms
iter 423030: loss 5.5647, time 121.60ms
iter 423040: loss 5.4067, time 121.59ms
iter 423050: loss 6.4888, time 121.63ms
iter 423060: loss 5.8870, time 122.25ms
iter 423070: loss 5.9333, time 122.82ms
iter 423080: loss 6.1083, time 121.71ms
iter 423090: loss 5.5588, time 120.66ms
iter 423100: loss 5.6520, time 121.26ms
iter 423110: loss 5.5959, time 123.98ms
iter 423120: loss 5.7131, time 121.45ms
iter 423130: loss 6.4651, time 121.51ms
iter 423140: loss 5.7953, time 121.18ms
iter 423150: loss 6.0135, time 121.49ms
iter 423160: loss 6.1101, time 121.55ms
iter 423170: loss 6.0473, time 121.47ms
iter 423180: loss 6.4410, time 121.39ms
iter 423190: loss 5.6826, time 121.47ms
iter 423200: loss 5.8168, time 121.43ms
iter 423210: loss 6.2048, time 121.92ms
iter 423220: loss 5.7684, time 121.76ms
iter 423230: loss 5.7952, time 124.35ms
iter 423240: loss 6.3871, time 121.38ms
step 423250: train loss 5.5321, val loss 5.6350
saving checkpoint to out-shakespeare-char
iter 423250: loss 6.1508, time 2900.33ms
iter 423260: loss 6.0490, time 121.61ms
iter 423270: loss 6.1484, time 120.48ms
iter 423280: loss 6.0487, time 121.63ms
iter 423290: loss 6.7517, time 124.64ms
iter 423300: loss 6.2542, time 121.66ms
iter 423310: loss 5.5903, time 121.86ms
iter 423320: loss 6.6332, time 121.55ms
iter 423330: loss 5.8094, time 121.48ms
iter 423340: loss 5.8224, time 122.04ms
iter 423350: loss 5.6033, time 121.58ms
iter 423360: loss 5.4447, time 122.62ms
iter 423370: loss 5.0274, time 121.40ms
iter 423380: loss 5.9765, time 121.55ms
iter 423390: loss 6.0663, time 123.39ms
iter 423400: loss 6.4722, time 121.57ms
iter 423410: loss 6.2056, time 124.01ms
iter 423420: loss 6.1822, time 121.38ms
iter 423430: loss 6.0382, time 121.58ms
iter 423440: loss 6.4938, time 121.55ms
iter 423450: loss 5.7576, time 121.84ms
iter 423460: loss 6.4900, time 121.53ms
iter 423470: loss 5.8108, time 121.44ms
iter 423480: loss 6.1923, time 122.96ms
iter 423490: loss 5.8437, time 121.64ms
step 423500: train loss 5.5604, val loss 5.6050
saving checkpoint to out-shakespeare-char
iter 423500: loss 6.3485, time 2888.22ms
iter 423510: loss 5.4408, time 121.48ms
iter 423520: loss 5.8575, time 121.81ms
iter 423530: loss 5.8618, time 121.59ms
iter 423540: loss 5.7667, time 121.65ms
iter 423550: loss 6.0653, time 122.76ms
iter 423560: loss 5.9736, time 121.49ms
iter 423570: loss 6.3010, time 121.57ms
iter 423580: loss 5.8529, time 122.58ms
iter 423590: loss 5.5901, time 121.34ms
iter 423600: loss 6.8971, time 121.37ms
iter 423610: loss 5.4077, time 124.04ms
iter 423620: loss 5.7790, time 120.82ms
iter 423630: loss 6.1539, time 121.39ms
iter 423640: loss 6.0554, time 122.05ms
iter 423650: loss 5.6582, time 121.35ms
iter 423660: loss 6.1784, time 121.58ms
iter 423670: loss 5.9149, time 121.99ms
iter 423680: loss 5.9125, time 122.05ms
iter 423690: loss 5.3764, time 121.58ms
iter 423700: loss 5.7644, time 121.33ms
iter 423710: loss 6.0970, time 122.54ms
iter 423720: loss 5.5057, time 121.76ms
iter 423730: loss 5.2043, time 122.08ms
iter 423740: loss 5.4799, time 124.53ms
step 423750: train loss 5.5828, val loss 5.5629
saving checkpoint to out-shakespeare-char
iter 423750: loss 6.1080, time 2892.28ms
iter 423760: loss 5.8190, time 125.21ms
iter 423770: loss 6.3272, time 125.53ms
iter 423780: loss 5.2423, time 125.23ms
iter 423790: loss 5.6082, time 125.11ms
iter 423800: loss 6.3148, time 125.10ms
iter 423810: loss 5.9751, time 124.94ms
iter 423820: loss 5.1939, time 125.10ms
iter 423830: loss 6.6543, time 125.12ms
iter 423840: loss 6.2398, time 126.41ms
iter 423850: loss 6.2055, time 125.01ms
iter 423860: loss 6.1258, time 124.98ms
iter 423870: loss 6.2732, time 125.25ms
iter 423880: loss 6.3133, time 124.93ms
iter 423890: loss 6.6921, time 125.09ms
iter 423900: loss 6.0984, time 125.05ms
iter 423910: loss 6.2483, time 125.15ms
iter 423920: loss 6.4303, time 124.74ms
iter 423930: loss 6.7777, time 124.96ms
iter 423940: loss 5.4227, time 125.25ms
iter 423950: loss 6.0685, time 125.00ms
iter 423960: loss 5.8847, time 125.11ms
iter 423970: loss 5.8383, time 125.00ms
iter 423980: loss 5.6849, time 125.02ms
iter 423990: loss 6.2059, time 125.17ms
step 424000: train loss 5.6126, val loss 5.5742
saving checkpoint to out-shakespeare-char
iter 424000: loss 6.0039, time 2910.89ms
iter 424010: loss 5.2538, time 127.98ms
iter 424020: loss 5.6625, time 125.49ms
iter 424030: loss 5.8795, time 127.71ms
iter 424040: loss 6.2868, time 125.01ms
iter 424050: loss 6.5733, time 127.90ms
iter 424060: loss 5.9698, time 125.40ms
iter 424070: loss 6.9271, time 127.78ms
iter 424080: loss 5.8012, time 125.32ms
iter 424090: loss 5.2662, time 127.85ms
iter 424100: loss 5.5541, time 125.30ms
iter 424110: loss 5.6712, time 127.94ms
iter 424120: loss 6.3705, time 125.27ms
iter 424130: loss 6.5404, time 127.76ms
iter 424140: loss 5.5211, time 125.13ms
iter 424150: loss 5.7243, time 127.86ms
iter 424160: loss 6.1185, time 125.62ms
iter 424170: loss 5.5111, time 127.53ms
iter 424180: loss 5.3962, time 125.16ms
iter 424190: loss 6.0943, time 127.55ms
iter 424200: loss 5.7990, time 125.19ms
iter 424210: loss 5.9667, time 124.64ms
iter 424220: loss 5.9931, time 125.01ms
iter 424230: loss 5.9839, time 125.01ms
iter 424240: loss 5.7262, time 125.11ms
step 424250: train loss 5.6048, val loss 5.5456
saving checkpoint to out-shakespeare-char
iter 424250: loss 6.5049, time 2884.24ms
iter 424260: loss 5.7572, time 126.22ms
iter 424270: loss 5.6777, time 125.44ms
iter 424280: loss 6.1441, time 125.42ms
iter 424290: loss 5.9522, time 125.43ms
iter 424300: loss 5.3379, time 126.07ms
iter 424310: loss 6.4504, time 128.05ms
iter 424320: loss 5.7341, time 125.85ms
iter 424330: loss 6.3417, time 125.23ms
iter 424340: loss 5.6521, time 126.21ms
iter 424350: loss 5.2892, time 128.45ms
iter 424360: loss 6.2996, time 125.54ms
iter 424370: loss 5.6437, time 127.85ms
iter 424380: loss 5.8001, time 125.15ms
iter 424390: loss 6.2223, time 128.46ms
iter 424400: loss 6.6759, time 125.68ms
iter 424410: loss 6.3351, time 128.05ms
iter 424420: loss 5.4017, time 125.13ms
iter 424430: loss 6.0290, time 127.78ms
iter 424440: loss 5.7789, time 125.17ms
iter 424450: loss 6.3868, time 127.70ms
iter 424460: loss 5.6590, time 125.63ms
iter 424470: loss 5.1446, time 127.80ms
iter 424480: loss 5.9505, time 125.20ms
iter 424490: loss 5.9701, time 127.81ms
step 424500: train loss 5.5970, val loss 5.5986
saving checkpoint to out-shakespeare-char
iter 424500: loss 5.9978, time 2908.67ms
iter 424510: loss 6.0358, time 125.88ms
iter 424520: loss 6.4232, time 125.90ms
iter 424530: loss 6.1058, time 125.32ms
iter 424540: loss 5.5003, time 126.79ms
iter 424550: loss 6.1026, time 125.18ms
iter 424560: loss 5.2702, time 125.75ms
iter 424570: loss 6.2340, time 125.08ms
iter 424580: loss 5.9440, time 125.81ms
iter 424590: loss 5.9733, time 126.11ms
iter 424600: loss 5.8630, time 126.63ms
iter 424610: loss 5.9302, time 125.11ms
iter 424620: loss 5.6883, time 126.49ms
iter 424630: loss 6.1940, time 125.33ms
iter 424640: loss 6.1299, time 125.24ms
iter 424650: loss 5.9829, time 126.28ms
iter 424660: loss 6.0812, time 125.32ms
iter 424670: loss 5.2831, time 126.16ms
iter 424680: loss 6.6944, time 125.55ms
iter 424690: loss 5.9992, time 125.13ms
iter 424700: loss 6.4769, time 125.46ms
iter 424710: loss 5.2186, time 126.57ms
iter 424720: loss 5.8494, time 125.44ms
iter 424730: loss 6.3822, time 125.28ms
iter 424740: loss 5.6916, time 125.29ms
step 424750: train loss 5.5916, val loss 5.5704
saving checkpoint to out-shakespeare-char
iter 424750: loss 5.3742, time 2880.65ms
iter 424760: loss 6.0538, time 126.80ms
iter 424770: loss 5.2664, time 125.56ms
iter 424780: loss 5.6554, time 125.26ms
iter 424790: loss 6.3339, time 125.75ms
iter 424800: loss 6.0171, time 125.27ms
iter 424810: loss 6.7592, time 124.51ms
iter 424820: loss 5.4101, time 124.84ms
iter 424830: loss 5.6399, time 125.18ms
iter 424840: loss 6.2698, time 125.16ms
iter 424850: loss 6.5068, time 125.86ms
iter 424860: loss 6.0450, time 125.43ms
iter 424870: loss 6.3669, time 126.82ms
iter 424880: loss 5.3528, time 125.20ms
iter 424890: loss 6.5822, time 125.31ms
iter 424900: loss 5.5819, time 125.62ms
iter 424910: loss 6.1721, time 125.08ms
iter 424920: loss 6.1134, time 125.44ms
iter 424930: loss 5.5008, time 125.20ms
iter 424940: loss 6.7503, time 125.54ms
iter 424950: loss 6.2769, time 124.31ms
iter 424960: loss 5.3441, time 125.42ms
iter 424970: loss 5.4433, time 125.50ms
iter 424980: loss 6.3445, time 126.51ms
iter 424990: loss 5.7925, time 124.12ms
step 425000: train loss 5.5923, val loss 5.6437
saving checkpoint to out-shakespeare-char
iter 425000: loss 6.1212, time 2902.89ms
iter 425010: loss 5.7581, time 125.85ms
iter 425020: loss 5.5051, time 128.56ms
iter 425030: loss 6.0494, time 125.85ms
iter 425040: loss 5.5847, time 128.30ms
iter 425050: loss 6.0490, time 125.56ms
iter 425060: loss 5.8337, time 128.35ms
iter 425070: loss 5.4804, time 124.87ms
iter 425080: loss 6.4439, time 125.59ms
iter 425090: loss 5.5770, time 128.20ms
iter 425100: loss 6.5777, time 124.83ms
iter 425110: loss 6.4475, time 125.94ms
iter 425120: loss 5.4589, time 125.24ms
iter 425130: loss 5.7262, time 125.45ms
iter 425140: loss 5.3677, time 125.20ms
iter 425150: loss 5.8325, time 125.54ms
iter 425160: loss 6.1631, time 125.20ms
iter 425170: loss 6.0254, time 125.66ms
iter 425180: loss 5.4765, time 125.22ms
iter 425190: loss 6.3544, time 125.02ms
iter 425200: loss 6.3589, time 125.51ms
iter 425210: loss 5.5992, time 124.98ms
iter 425220: loss 5.6325, time 125.22ms
iter 425230: loss 5.6338, time 125.35ms
iter 425240: loss 5.9219, time 126.44ms
step 425250: train loss 5.5906, val loss 5.5631
saving checkpoint to out-shakespeare-char
iter 425250: loss 5.7029, time 2907.77ms
iter 425260: loss 6.3903, time 124.96ms
iter 425270: loss 5.2028, time 123.56ms
iter 425280: loss 6.0139, time 124.95ms
iter 425290: loss 5.7623, time 124.92ms
iter 425300: loss 6.5513, time 124.76ms
iter 425310: loss 5.8586, time 123.97ms
iter 425320: loss 5.8192, time 124.51ms
iter 425330: loss 6.0364, time 124.43ms
iter 425340: loss 5.1561, time 123.41ms
iter 425350: loss 5.4877, time 123.82ms
iter 425360: loss 5.9201, time 124.93ms
iter 425370: loss 6.2492, time 125.58ms
iter 425380: loss 5.6955, time 125.28ms
iter 425390: loss 5.9369, time 124.09ms
iter 425400: loss 5.6777, time 125.49ms
iter 425410: loss 5.2915, time 126.12ms
iter 425420: loss 6.1795, time 125.99ms
iter 425430: loss 6.3719, time 126.05ms
iter 425440: loss 5.5157, time 124.80ms
iter 425450: loss 5.7928, time 127.46ms
iter 425460: loss 5.9278, time 125.83ms
iter 425470: loss 6.5813, time 125.83ms
iter 425480: loss 6.0442, time 124.75ms
iter 425490: loss 5.3098, time 126.22ms
step 425500: train loss 5.5469, val loss 5.6054
saving checkpoint to out-shakespeare-char
iter 425500: loss 5.8686, time 2898.04ms
iter 425510: loss 6.2720, time 125.20ms
iter 425520: loss 5.8985, time 125.37ms
iter 425530: loss 6.2389, time 125.37ms
iter 425540: loss 5.9685, time 124.13ms
iter 425550: loss 5.9702, time 125.35ms
iter 425560: loss 5.9679, time 125.09ms
iter 425570: loss 6.0210, time 125.77ms
iter 425580: loss 6.3819, time 125.24ms
iter 425590: loss 5.5190, time 125.18ms
iter 425600: loss 5.1327, time 125.06ms
iter 425610: loss 6.4034, time 126.26ms
iter 425620: loss 5.7923, time 125.68ms
iter 425630: loss 5.5514, time 125.12ms
iter 425640: loss 6.1510, time 125.25ms
iter 425650: loss 5.9280, time 125.93ms
iter 425660: loss 5.5222, time 125.92ms
iter 425670: loss 5.6454, time 125.52ms
iter 425680: loss 6.1717, time 124.41ms
iter 425690: loss 6.5307, time 125.91ms
iter 425700: loss 6.2059, time 125.20ms
iter 425710: loss 5.7456, time 125.39ms
iter 425720: loss 5.5861, time 125.54ms
iter 425730: loss 6.1370, time 125.65ms
iter 425740: loss 6.1815, time 125.81ms
step 425750: train loss 5.6142, val loss 5.6073
saving checkpoint to out-shakespeare-char
iter 425750: loss 6.0005, time 2898.87ms
iter 425760: loss 6.2135, time 125.61ms
iter 425770: loss 5.9729, time 125.75ms
iter 425780: loss 6.1523, time 125.65ms
iter 425790: loss 6.5956, time 126.10ms
iter 425800: loss 6.0319, time 125.37ms
iter 425810: loss 5.8380, time 125.65ms
iter 425820: loss 6.0276, time 125.52ms
iter 425830: loss 5.8638, time 125.69ms
iter 425840: loss 6.5700, time 125.34ms
iter 425850: loss 6.6080, time 125.67ms
iter 425860: loss 6.0928, time 125.69ms
iter 425870: loss 6.3579, time 125.60ms
iter 425880: loss 6.2121, time 125.78ms
iter 425890: loss 5.7498, time 125.69ms
iter 425900: loss 6.2970, time 125.28ms
iter 425910: loss 6.1219, time 124.93ms
iter 425920: loss 5.9488, time 125.37ms
iter 425930: loss 6.2627, time 125.63ms
iter 425940: loss 6.5458, time 125.32ms
iter 425950: loss 5.8072, time 125.50ms
iter 425960: loss 6.4939, time 125.37ms
iter 425970: loss 6.3345, time 125.46ms
iter 425980: loss 6.0784, time 125.63ms
iter 425990: loss 5.9029, time 125.33ms
step 426000: train loss 5.6086, val loss 5.5792
saving checkpoint to out-shakespeare-char
iter 426000: loss 5.9367, time 2884.23ms
iter 426010: loss 6.5000, time 125.45ms
iter 426020: loss 6.3237, time 125.65ms
iter 426030: loss 5.9296, time 125.47ms
iter 426040: loss 6.1132, time 125.08ms
iter 426050: loss 5.5111, time 125.66ms
iter 426060: loss 5.8989, time 125.22ms
iter 426070: loss 6.1857, time 125.12ms
iter 426080: loss 5.4897, time 125.22ms
iter 426090: loss 5.5572, time 125.31ms
iter 426100: loss 5.6039, time 124.99ms
iter 426110: loss 6.0044, time 125.12ms
iter 426120: loss 6.3358, time 125.30ms
iter 426130: loss 6.6248, time 125.32ms
iter 426140: loss 5.7549, time 125.52ms
iter 426150: loss 6.0759, time 125.24ms
iter 426160: loss 5.9118, time 125.41ms
iter 426170: loss 6.5412, time 125.29ms
iter 426180: loss 6.3027, time 125.66ms
iter 426190: loss 6.3750, time 125.43ms
iter 426200: loss 6.5612, time 125.55ms
iter 426210: loss 5.9013, time 125.36ms
iter 426220: loss 6.5835, time 125.42ms
iter 426230: loss 5.5690, time 126.03ms
iter 426240: loss 5.8368, time 125.97ms
step 426250: train loss 5.5454, val loss 5.6127
saving checkpoint to out-shakespeare-char
iter 426250: loss 6.0285, time 2873.51ms
iter 426260: loss 6.7903, time 124.08ms
iter 426270: loss 5.7191, time 121.61ms
iter 426280: loss 6.1479, time 121.55ms
iter 426290: loss 5.6434, time 121.50ms
iter 426300: loss 5.7287, time 121.35ms
iter 426310: loss 5.8527, time 121.43ms
iter 426320: loss 6.0134, time 121.53ms
iter 426330: loss 6.0169, time 122.41ms
iter 426340: loss 6.1303, time 121.59ms
iter 426350: loss 6.4264, time 121.88ms
iter 426360: loss 6.0130, time 122.55ms
iter 426370: loss 6.2392, time 121.85ms
iter 426380: loss 6.3106, time 121.33ms
iter 426390: loss 5.9815, time 124.10ms
iter 426400: loss 6.0232, time 121.36ms
iter 426410: loss 6.3977, time 121.40ms
iter 426420: loss 5.8363, time 121.79ms
iter 426430: loss 5.8497, time 121.45ms
iter 426440: loss 5.8593, time 121.35ms
iter 426450: loss 6.4121, time 121.21ms
iter 426460: loss 5.6758, time 122.54ms
iter 426470: loss 6.4167, time 121.42ms
iter 426480: loss 5.5021, time 121.48ms
iter 426490: loss 6.3739, time 122.57ms
step 426500: train loss 5.6007, val loss 5.5631
saving checkpoint to out-shakespeare-char
iter 426500: loss 5.7762, time 2909.42ms
iter 426510: loss 6.1192, time 125.74ms
iter 426520: loss 6.2148, time 125.33ms
iter 426530: loss 6.0427, time 125.78ms
iter 426540: loss 6.0139, time 125.67ms
iter 426550: loss 5.9679, time 125.36ms
iter 426560: loss 5.8978, time 125.39ms
iter 426570: loss 5.9681, time 125.29ms
iter 426580: loss 6.2222, time 125.40ms
iter 426590: loss 5.9859, time 125.36ms
iter 426600: loss 6.1029, time 123.49ms
iter 426610: loss 5.9409, time 125.57ms
iter 426620: loss 5.8024, time 125.34ms
iter 426630: loss 5.7290, time 125.49ms
iter 426640: loss 6.2971, time 125.04ms
iter 426650: loss 6.2243, time 124.35ms
iter 426660: loss 5.8209, time 125.45ms
iter 426670: loss 5.7646, time 125.56ms
iter 426680: loss 5.7426, time 124.45ms
iter 426690: loss 5.6784, time 125.29ms
iter 426700: loss 6.1471, time 125.56ms
iter 426710: loss 5.7157, time 127.17ms
iter 426720: loss 6.2268, time 125.58ms
iter 426730: loss 6.4894, time 125.87ms
iter 426740: loss 5.8634, time 125.37ms
step 426750: train loss 5.5472, val loss 5.6145
saving checkpoint to out-shakespeare-char
iter 426750: loss 5.7772, time 2908.51ms
iter 426760: loss 5.6569, time 126.35ms
iter 426770: loss 6.2507, time 125.14ms
iter 426780: loss 5.0926, time 125.15ms
iter 426790: loss 6.2312, time 126.36ms
iter 426800: loss 5.8156, time 125.31ms
iter 426810: loss 5.8599, time 125.49ms
iter 426820: loss 5.8089, time 125.34ms
iter 426830: loss 6.4108, time 125.34ms
iter 426840: loss 6.1805, time 125.40ms
iter 426850: loss 5.7450, time 121.40ms
iter 426860: loss 5.3935, time 121.49ms
iter 426870: loss 6.5849, time 123.99ms
iter 426880: loss 6.2169, time 121.37ms
iter 426890: loss 5.8154, time 121.33ms
iter 426900: loss 5.6182, time 121.42ms
iter 426910: loss 6.4337, time 121.36ms
iter 426920: loss 5.6653, time 122.84ms
iter 426930: loss 5.9113, time 121.68ms
iter 426940: loss 6.2797, time 121.27ms
iter 426950: loss 6.1609, time 122.00ms
iter 426960: loss 6.3553, time 121.59ms
iter 426970: loss 6.2992, time 121.49ms
iter 426980: loss 5.7023, time 123.92ms
iter 426990: loss 6.5200, time 121.22ms
step 427000: train loss 5.5791, val loss 5.5722
saving checkpoint to out-shakespeare-char
iter 427000: loss 6.4743, time 2897.63ms
iter 427010: loss 5.5202, time 121.91ms
iter 427020: loss 6.6851, time 121.44ms
iter 427030: loss 6.2160, time 121.69ms
iter 427040: loss 6.0664, time 120.92ms
iter 427050: loss 6.0386, time 122.63ms
iter 427060: loss 6.1419, time 121.64ms
iter 427070: loss 5.5686, time 121.87ms
iter 427080: loss 6.0783, time 122.09ms
iter 427090: loss 6.4711, time 121.32ms
iter 427100: loss 5.7480, time 121.63ms
iter 427110: loss 6.4414, time 122.58ms
iter 427120: loss 5.3792, time 121.54ms
iter 427130: loss 6.0313, time 121.45ms
iter 427140: loss 5.5777, time 122.61ms
iter 427150: loss 5.9503, time 121.46ms
iter 427160: loss 6.3516, time 121.50ms
iter 427170: loss 5.8448, time 122.56ms
iter 427180: loss 5.6019, time 121.58ms
iter 427190: loss 6.4014, time 121.87ms
iter 427200: loss 5.9748, time 124.00ms
iter 427210: loss 5.8059, time 121.99ms
iter 427220: loss 6.5608, time 121.91ms
iter 427230: loss 5.3010, time 121.50ms
iter 427240: loss 5.7473, time 121.84ms
step 427250: train loss 5.5546, val loss 5.5979
saving checkpoint to out-shakespeare-char
iter 427250: loss 5.7524, time 2896.52ms
iter 427260: loss 5.6399, time 121.42ms
iter 427270: loss 6.6450, time 121.30ms
iter 427280: loss 5.9399, time 121.32ms
iter 427290: loss 6.5339, time 122.52ms
iter 427300: loss 5.4384, time 121.40ms
iter 427310: loss 5.5381, time 121.20ms
iter 427320: loss 6.3422, time 122.54ms
iter 427330: loss 5.9650, time 121.23ms
iter 427340: loss 5.6550, time 121.29ms
iter 427350: loss 6.1876, time 122.51ms
iter 427360: loss 6.2340, time 121.22ms
iter 427370: loss 6.0222, time 121.58ms
iter 427380: loss 6.1074, time 123.90ms
iter 427390: loss 6.2879, time 121.30ms
iter 427400: loss 6.6210, time 121.49ms
iter 427410: loss 6.1304, time 122.76ms
iter 427420: loss 6.3502, time 121.08ms
iter 427430: loss 6.2223, time 122.64ms
iter 427440: loss 5.9329, time 121.29ms
iter 427450: loss 6.3722, time 121.50ms
iter 427460: loss 6.1277, time 123.84ms
iter 427470: loss 5.4477, time 121.53ms
iter 427480: loss 6.6588, time 121.51ms
iter 427490: loss 5.8194, time 121.37ms
step 427500: train loss 5.6119, val loss 5.5657
saving checkpoint to out-shakespeare-char
iter 427500: loss 5.7105, time 2902.08ms
iter 427510: loss 5.8029, time 121.21ms
iter 427520: loss 5.7219, time 124.01ms
iter 427530: loss 5.9539, time 121.39ms
iter 427540: loss 6.8153, time 121.70ms
iter 427550: loss 5.7213, time 121.61ms
iter 427560: loss 5.7073, time 121.26ms
iter 427570: loss 5.7347, time 120.70ms
iter 427580: loss 5.6328, time 120.23ms
iter 427590: loss 5.6534, time 121.69ms
iter 427600: loss 6.2787, time 121.22ms
iter 427610: loss 5.4577, time 121.54ms
iter 427620: loss 6.3734, time 122.58ms
iter 427630: loss 5.3694, time 121.41ms
iter 427640: loss 6.1273, time 121.43ms
iter 427650: loss 5.4156, time 123.44ms
iter 427660: loss 6.5926, time 121.44ms
iter 427670: loss 5.8862, time 121.40ms
iter 427680: loss 5.9995, time 121.74ms
iter 427690: loss 5.1769, time 120.57ms
iter 427700: loss 6.1443, time 120.10ms
iter 427710: loss 6.6397, time 120.73ms
iter 427720: loss 5.3821, time 122.40ms
iter 427730: loss 5.7837, time 121.55ms
iter 427740: loss 5.2132, time 121.45ms
step 427750: train loss 5.5900, val loss 5.5844
saving checkpoint to out-shakespeare-char
iter 427750: loss 6.1807, time 2910.50ms
iter 427760: loss 5.9154, time 121.86ms
iter 427770: loss 6.9682, time 123.12ms
iter 427780: loss 5.8803, time 121.87ms
iter 427790: loss 5.7592, time 121.96ms
iter 427800: loss 6.3891, time 124.39ms
iter 427810: loss 5.7035, time 126.04ms
iter 427820: loss 6.3754, time 125.76ms
iter 427830: loss 5.7926, time 124.77ms
iter 427840: loss 6.2555, time 125.42ms
iter 427850: loss 6.7670, time 124.97ms
iter 427860: loss 5.8376, time 125.65ms
iter 427870: loss 6.4989, time 124.70ms
iter 427880: loss 5.0826, time 125.97ms
iter 427890: loss 6.1367, time 124.97ms
iter 427900: loss 5.8891, time 125.21ms
iter 427910: loss 6.6896, time 125.38ms
iter 427920: loss 5.6272, time 125.29ms
iter 427930: loss 5.7274, time 124.87ms
iter 427940: loss 6.2104, time 125.18ms
iter 427950: loss 5.7500, time 124.18ms
iter 427960: loss 6.1247, time 125.35ms
iter 427970: loss 6.8813, time 125.83ms
iter 427980: loss 6.1039, time 124.99ms
iter 427990: loss 6.0416, time 124.64ms
step 428000: train loss 5.6216, val loss 5.5706
saving checkpoint to out-shakespeare-char
iter 428000: loss 5.6540, time 2902.70ms
iter 428010: loss 6.3490, time 122.10ms
iter 428020: loss 5.9165, time 119.44ms
iter 428030: loss 6.3298, time 119.49ms
iter 428040: loss 6.1345, time 120.65ms
iter 428050: loss 6.0447, time 121.90ms
iter 428060: loss 6.5202, time 120.62ms
iter 428070: loss 6.3357, time 121.33ms
iter 428080: loss 6.2134, time 123.93ms
iter 428090: loss 6.5984, time 121.32ms
iter 428100: loss 5.9316, time 121.17ms
iter 428110: loss 5.9920, time 121.91ms
iter 428120: loss 6.1557, time 121.33ms
iter 428130: loss 6.0673, time 121.34ms
iter 428140: loss 5.9441, time 121.24ms
iter 428150: loss 5.7239, time 122.64ms
iter 428160: loss 5.9404, time 121.34ms
iter 428170: loss 6.2944, time 121.44ms
iter 428180: loss 6.3608, time 121.76ms
iter 428190: loss 6.1029, time 121.20ms
iter 428200: loss 5.9205, time 121.33ms
iter 428210: loss 5.8856, time 123.89ms
iter 428220: loss 5.2591, time 121.16ms
iter 428230: loss 5.6517, time 121.27ms
iter 428240: loss 5.6090, time 121.77ms
step 428250: train loss 5.5656, val loss 5.5401
saving checkpoint to out-shakespeare-char
iter 428250: loss 6.2265, time 2893.29ms
iter 428260: loss 5.6333, time 125.24ms
iter 428270: loss 6.0382, time 125.26ms
iter 428280: loss 5.5543, time 125.32ms
iter 428290: loss 5.2637, time 125.43ms
iter 428300: loss 6.1387, time 125.44ms
iter 428310: loss 5.4846, time 125.43ms
iter 428320: loss 6.3543, time 125.62ms
iter 428330: loss 5.4739, time 125.61ms
iter 428340: loss 5.8554, time 125.22ms
iter 428350: loss 6.1673, time 125.15ms
iter 428360: loss 5.4879, time 125.38ms
iter 428370: loss 6.2220, time 126.09ms
iter 428380: loss 5.8443, time 125.49ms
iter 428390: loss 5.5290, time 126.23ms
iter 428400: loss 6.0187, time 125.73ms
iter 428410: loss 5.5313, time 126.13ms
iter 428420: loss 5.6904, time 125.27ms
iter 428430: loss 5.4449, time 125.51ms
iter 428440: loss 5.6272, time 125.47ms
iter 428450: loss 5.9664, time 125.43ms
iter 428460: loss 6.5605, time 126.00ms
iter 428470: loss 5.3202, time 125.61ms
iter 428480: loss 5.5575, time 125.81ms
iter 428490: loss 5.1635, time 125.84ms
step 428500: train loss 5.5780, val loss 5.5933
saving checkpoint to out-shakespeare-char
iter 428500: loss 6.1178, time 2888.06ms
iter 428510: loss 5.4815, time 128.38ms
iter 428520: loss 6.1767, time 125.50ms
iter 428530: loss 5.9138, time 127.87ms
iter 428540: loss 5.8544, time 125.29ms
iter 428550: loss 5.5458, time 128.20ms
iter 428560: loss 5.1585, time 125.87ms
iter 428570: loss 6.2473, time 128.19ms
iter 428580: loss 6.2991, time 125.36ms
iter 428590: loss 5.8659, time 128.15ms
iter 428600: loss 5.5207, time 125.62ms
iter 428610: loss 6.1464, time 127.87ms
iter 428620: loss 5.4355, time 125.36ms
iter 428630: loss 5.9911, time 127.81ms
iter 428640: loss 5.7948, time 125.71ms
iter 428650: loss 5.6425, time 125.08ms
iter 428660: loss 6.2046, time 125.56ms
iter 428670: loss 5.6466, time 126.05ms
iter 428680: loss 5.5319, time 125.75ms
iter 428690: loss 5.9095, time 124.96ms
iter 428700: loss 5.2871, time 125.84ms
iter 428710: loss 6.3399, time 125.41ms
iter 428720: loss 5.8009, time 125.48ms
iter 428730: loss 6.7396, time 126.10ms
iter 428740: loss 5.5267, time 125.80ms
step 428750: train loss 5.5361, val loss 5.6262
saving checkpoint to out-shakespeare-char
iter 428750: loss 6.0415, time 2890.25ms
iter 428760: loss 5.6119, time 128.11ms
iter 428770: loss 5.9487, time 124.51ms
iter 428780: loss 6.3888, time 127.21ms
iter 428790: loss 5.9017, time 125.15ms
iter 428800: loss 6.7841, time 127.20ms
iter 428810: loss 6.2832, time 124.00ms
iter 428820: loss 5.9532, time 128.01ms
iter 428830: loss 6.2456, time 125.50ms
iter 428840: loss 6.2338, time 128.01ms
iter 428850: loss 5.3441, time 125.34ms
iter 428860: loss 5.9165, time 127.94ms
iter 428870: loss 5.9895, time 125.56ms
iter 428880: loss 5.7458, time 127.78ms
iter 428890: loss 5.8404, time 125.62ms
iter 428900: loss 5.1815, time 127.72ms
iter 428910: loss 5.7799, time 124.43ms
iter 428920: loss 6.0074, time 128.33ms
iter 428930: loss 6.1857, time 124.67ms
iter 428940: loss 5.8983, time 127.46ms
iter 428950: loss 5.4430, time 125.36ms
iter 428960: loss 5.5858, time 125.33ms
iter 428970: loss 6.1998, time 125.08ms
iter 428980: loss 5.6526, time 124.32ms
iter 428990: loss 6.5713, time 125.28ms
step 429000: train loss 5.5602, val loss 5.5172
saving checkpoint to out-shakespeare-char
iter 429000: loss 5.7180, time 2874.55ms
iter 429010: loss 5.5496, time 125.26ms
iter 429020: loss 6.2584, time 125.78ms
iter 429030: loss 6.3066, time 125.77ms
iter 429040: loss 6.0940, time 125.85ms
iter 429050: loss 5.8602, time 125.81ms
iter 429060: loss 6.3893, time 126.06ms
iter 429070: loss 6.1722, time 125.84ms
iter 429080: loss 6.2881, time 125.94ms
iter 429090: loss 5.3300, time 126.00ms
iter 429100: loss 6.7465, time 125.97ms
iter 429110: loss 5.9659, time 126.42ms
iter 429120: loss 6.5717, time 126.09ms
iter 429130: loss 5.7673, time 125.92ms
iter 429140: loss 6.0697, time 125.97ms
iter 429150: loss 6.0576, time 125.86ms
iter 429160: loss 5.9764, time 126.02ms
iter 429170: loss 6.5688, time 125.81ms
iter 429180: loss 6.2894, time 125.97ms
iter 429190: loss 6.0733, time 125.87ms
iter 429200: loss 5.7483, time 126.07ms
iter 429210: loss 5.4847, time 125.95ms
iter 429220: loss 5.9046, time 126.30ms
iter 429230: loss 5.7366, time 125.89ms
iter 429240: loss 6.2451, time 126.44ms
step 429250: train loss 5.5307, val loss 5.5776
saving checkpoint to out-shakespeare-char
iter 429250: loss 5.9073, time 2881.47ms
iter 429260: loss 5.7828, time 126.08ms
iter 429270: loss 5.8411, time 124.95ms
iter 429280: loss 5.9960, time 119.95ms
iter 429290: loss 6.1651, time 121.89ms
iter 429300: loss 5.1886, time 121.02ms
iter 429310: loss 6.9026, time 119.43ms
iter 429320: loss 5.8321, time 119.95ms
iter 429330: loss 5.5225, time 122.52ms
iter 429340: loss 5.3815, time 119.92ms
iter 429350: loss 6.7477, time 119.77ms
iter 429360: loss 6.4475, time 120.68ms
iter 429370: loss 5.8748, time 121.26ms
iter 429380: loss 6.1451, time 119.99ms
iter 429390: loss 5.5567, time 119.90ms
iter 429400: loss 6.1043, time 122.61ms
iter 429410: loss 5.5362, time 120.16ms
iter 429420: loss 5.9086, time 120.00ms
iter 429430: loss 5.9409, time 120.28ms
iter 429440: loss 6.2145, time 120.74ms
iter 429450: loss 6.5343, time 119.86ms
iter 429460: loss 6.0730, time 120.01ms
iter 429470: loss 6.2595, time 121.12ms
iter 429480: loss 6.6151, time 119.88ms
iter 429490: loss 6.3306, time 119.95ms
step 429500: train loss 5.5312, val loss 5.5552
saving checkpoint to out-shakespeare-char
iter 429500: loss 6.3939, time 2909.90ms
iter 429510: loss 6.3537, time 119.50ms
iter 429520: loss 5.1998, time 120.41ms
iter 429530: loss 5.8015, time 122.53ms
iter 429540: loss 6.4938, time 119.91ms
iter 429550: loss 6.5179, time 119.95ms
iter 429560: loss 5.3184, time 119.83ms
iter 429570: loss 5.7155, time 119.84ms
iter 429580: loss 5.7156, time 119.94ms
iter 429590: loss 6.6374, time 120.90ms
iter 429600: loss 6.1569, time 121.00ms
iter 429610: loss 6.0666, time 119.95ms
iter 429620: loss 5.8029, time 121.33ms
iter 429630: loss 6.1332, time 121.34ms
iter 429640: loss 6.3662, time 119.77ms
iter 429650: loss 6.1069, time 119.93ms
iter 429660: loss 6.3786, time 122.48ms
iter 429670: loss 6.2769, time 120.42ms
iter 429680: loss 5.7059, time 120.17ms
iter 429690: loss 5.6614, time 121.97ms
iter 429700: loss 5.9008, time 121.88ms
iter 429710: loss 5.8529, time 121.70ms
iter 429720: loss 5.9969, time 121.93ms
iter 429730: loss 6.1325, time 122.93ms
iter 429740: loss 6.0363, time 120.71ms
step 429750: train loss 5.5864, val loss 5.5909
saving checkpoint to out-shakespeare-char
iter 429750: loss 5.4151, time 2888.59ms
iter 429760: loss 6.3508, time 125.01ms
iter 429770: loss 6.7795, time 124.54ms
iter 429780: loss 6.8529, time 124.83ms
iter 429790: loss 5.9172, time 124.86ms
iter 429800: loss 5.7704, time 124.73ms
iter 429810: loss 6.3596, time 125.87ms
iter 429820: loss 6.1312, time 124.44ms
iter 429830: loss 6.3879, time 125.83ms
iter 429840: loss 5.6739, time 125.26ms
iter 429850: loss 5.5829, time 125.14ms
iter 429860: loss 5.6644, time 125.40ms
iter 429870: loss 5.9533, time 125.39ms
iter 429880: loss 6.7041, time 125.19ms
iter 429890: loss 6.2906, time 125.12ms
iter 429900: loss 5.9801, time 125.37ms
iter 429910: loss 5.9341, time 125.03ms
iter 429920: loss 5.9260, time 125.48ms
iter 429930: loss 5.1513, time 125.33ms
iter 429940: loss 6.1106, time 125.32ms
iter 429950: loss 5.8533, time 125.40ms
iter 429960: loss 5.8462, time 124.41ms
iter 429970: loss 5.2583, time 125.25ms
iter 429980: loss 5.2629, time 125.26ms
iter 429990: loss 6.4274, time 125.32ms
step 430000: train loss 5.5216, val loss 5.5296
saving checkpoint to out-shakespeare-char
iter 430000: loss 6.2967, time 2878.14ms
iter 430010: loss 6.0613, time 125.62ms
iter 430020: loss 5.8889, time 128.01ms
iter 430030: loss 5.6061, time 125.53ms
iter 430040: loss 6.3375, time 128.12ms
iter 430050: loss 5.8949, time 125.53ms
iter 430060: loss 5.8323, time 127.10ms
iter 430070: loss 6.1409, time 125.69ms
iter 430080: loss 5.5919, time 125.59ms
iter 430090: loss 5.0725, time 124.88ms
iter 430100: loss 5.1593, time 125.74ms
iter 430110: loss 5.7406, time 125.72ms
iter 430120: loss 6.0886, time 125.75ms
iter 430130: loss 6.2255, time 125.30ms
iter 430140: loss 6.5698, time 125.85ms
iter 430150: loss 5.6124, time 125.29ms
iter 430160: loss 6.3201, time 125.71ms
iter 430170: loss 5.7819, time 125.28ms
iter 430180: loss 6.6650, time 125.27ms
iter 430190: loss 5.0113, time 125.07ms
iter 430200: loss 6.4379, time 125.48ms
iter 430210: loss 5.7799, time 125.22ms
iter 430220: loss 6.3174, time 125.35ms
iter 430230: loss 5.6558, time 125.52ms
iter 430240: loss 6.1239, time 125.46ms
step 430250: train loss 5.5974, val loss 5.5744
saving checkpoint to out-shakespeare-char
iter 430250: loss 5.6267, time 2896.67ms
iter 430260: loss 5.9370, time 128.14ms
iter 430270: loss 5.6567, time 125.36ms
iter 430280: loss 6.4277, time 127.87ms
iter 430290: loss 5.7741, time 125.89ms
iter 430300: loss 6.1031, time 125.60ms
iter 430310: loss 6.6632, time 125.26ms
iter 430320: loss 5.7364, time 125.21ms
iter 430330: loss 5.4296, time 125.03ms
iter 430340: loss 6.6173, time 125.10ms
iter 430350: loss 5.1354, time 125.28ms
iter 430360: loss 6.3675, time 125.48ms
iter 430370: loss 5.6522, time 125.23ms
iter 430380: loss 6.5604, time 125.11ms
iter 430390: loss 5.8836, time 124.99ms
iter 430400: loss 6.3475, time 125.50ms
iter 430410: loss 6.3822, time 125.50ms
iter 430420: loss 5.1061, time 125.00ms
iter 430430: loss 6.2114, time 125.97ms
iter 430440: loss 5.9734, time 126.26ms
iter 430450: loss 6.2451, time 129.62ms
iter 430460: loss 6.2738, time 126.62ms
iter 430470: loss 5.5721, time 126.17ms
iter 430480: loss 6.2055, time 125.75ms
iter 430490: loss 6.1764, time 125.29ms
step 430500: train loss 5.6129, val loss 5.5690
saving checkpoint to out-shakespeare-char
iter 430500: loss 6.0510, time 2898.93ms
iter 430510: loss 6.0496, time 125.90ms
iter 430520: loss 5.5496, time 125.93ms
iter 430530: loss 6.5382, time 125.15ms
iter 430540: loss 6.1875, time 125.30ms
iter 430550: loss 6.4476, time 125.42ms
iter 430560: loss 6.6303, time 125.53ms
iter 430570: loss 5.2952, time 125.62ms
iter 430580: loss 5.6747, time 125.32ms
iter 430590: loss 5.4195, time 124.55ms
iter 430600: loss 5.9199, time 125.23ms
iter 430610: loss 6.0964, time 125.26ms
iter 430620: loss 5.9615, time 125.24ms
iter 430630: loss 5.9152, time 125.74ms
iter 430640: loss 6.2930, time 126.13ms
iter 430650: loss 6.6636, time 125.64ms
iter 430660: loss 5.7222, time 125.01ms
iter 430670: loss 6.5068, time 125.09ms
iter 430680: loss 6.5252, time 125.13ms
iter 430690: loss 5.6870, time 125.44ms
iter 430700: loss 6.0420, time 125.40ms
iter 430710: loss 5.9130, time 125.53ms
iter 430720: loss 6.0467, time 121.03ms
iter 430730: loss 5.7127, time 121.95ms
iter 430740: loss 6.0384, time 125.59ms
step 430750: train loss 5.5677, val loss 5.5741
saving checkpoint to out-shakespeare-char
iter 430750: loss 6.1017, time 2885.23ms
iter 430760: loss 6.3810, time 125.27ms
iter 430770: loss 6.4062, time 125.54ms
iter 430780: loss 6.4635, time 125.12ms
iter 430790: loss 6.2214, time 125.84ms
iter 430800: loss 5.9160, time 125.76ms
iter 430810: loss 6.2647, time 125.94ms
iter 430820: loss 6.0324, time 125.41ms
iter 430830: loss 6.2244, time 125.31ms
iter 430840: loss 5.9599, time 124.99ms
iter 430850: loss 5.5995, time 124.75ms
iter 430860: loss 5.9151, time 126.06ms
iter 430870: loss 6.4169, time 125.27ms
iter 430880: loss 5.6052, time 125.84ms
iter 430890: loss 6.1272, time 124.97ms
iter 430900: loss 6.7276, time 125.73ms
iter 430910: loss 6.3722, time 124.84ms
iter 430920: loss 6.1004, time 125.54ms
iter 430930: loss 5.9805, time 124.50ms
iter 430940: loss 5.8612, time 125.84ms
iter 430950: loss 5.6970, time 125.57ms
iter 430960: loss 6.4555, time 125.51ms
iter 430970: loss 6.4336, time 125.98ms
iter 430980: loss 5.7352, time 124.12ms
iter 430990: loss 5.9099, time 125.22ms
step 431000: train loss 5.5189, val loss 5.5932
saving checkpoint to out-shakespeare-char
iter 431000: loss 5.2234, time 2920.19ms
iter 431010: loss 5.6083, time 125.75ms
iter 431020: loss 5.7745, time 125.77ms
iter 431030: loss 5.9673, time 125.67ms
iter 431040: loss 5.7888, time 126.16ms
iter 431050: loss 5.8658, time 125.56ms
iter 431060: loss 6.3185, time 125.62ms
iter 431070: loss 6.1355, time 125.92ms
iter 431080: loss 5.9391, time 125.88ms
iter 431090: loss 6.5800, time 125.33ms
iter 431100: loss 5.7187, time 125.70ms
iter 431110: loss 6.5232, time 125.66ms
iter 431120: loss 5.8770, time 126.10ms
iter 431130: loss 6.3783, time 125.40ms
iter 431140: loss 5.7040, time 125.65ms
iter 431150: loss 6.3612, time 125.66ms
iter 431160: loss 6.2956, time 125.91ms
iter 431170: loss 6.0500, time 125.40ms
iter 431180: loss 5.3233, time 125.41ms
iter 431190: loss 5.3951, time 125.50ms
iter 431200: loss 6.1061, time 125.85ms
iter 431210: loss 6.2416, time 125.75ms
iter 431220: loss 6.0867, time 125.95ms
iter 431230: loss 6.3206, time 126.16ms
iter 431240: loss 5.8803, time 126.06ms
step 431250: train loss 5.5872, val loss 5.5552
saving checkpoint to out-shakespeare-char
iter 431250: loss 6.4509, time 2900.44ms
iter 431260: loss 5.8916, time 125.56ms
iter 431270: loss 6.5136, time 128.04ms
iter 431280: loss 6.7479, time 125.95ms
iter 431290: loss 6.0258, time 128.35ms
iter 431300: loss 5.6098, time 124.76ms
iter 431310: loss 5.7074, time 128.53ms
iter 431320: loss 6.0683, time 126.05ms
iter 431330: loss 5.5881, time 128.78ms
iter 431340: loss 5.6345, time 125.76ms
iter 431350: loss 5.9767, time 128.54ms
iter 431360: loss 6.0728, time 126.06ms
iter 431370: loss 6.0232, time 128.46ms
iter 431380: loss 5.8804, time 125.50ms
iter 431390: loss 6.0678, time 128.42ms
iter 431400: loss 5.5876, time 126.01ms
iter 431410: loss 6.3249, time 128.41ms
iter 431420: loss 5.6898, time 125.79ms
iter 431430: loss 5.5699, time 128.37ms
iter 431440: loss 5.6445, time 125.72ms
iter 431450: loss 5.5297, time 128.42ms
iter 431460: loss 6.2864, time 125.90ms
iter 431470: loss 5.9766, time 128.47ms
iter 431480: loss 5.8207, time 125.84ms
iter 431490: loss 5.6316, time 127.89ms
step 431500: train loss 5.5831, val loss 5.5669
saving checkpoint to out-shakespeare-char
iter 431500: loss 6.2266, time 2871.69ms
iter 431510: loss 5.5652, time 125.48ms
iter 431520: loss 5.7769, time 125.12ms
iter 431530: loss 5.8910, time 125.56ms
iter 431540: loss 6.1626, time 125.24ms
iter 431550: loss 5.5130, time 125.82ms
iter 431560: loss 5.7593, time 125.67ms
iter 431570: loss 6.3023, time 125.32ms
iter 431580: loss 5.8982, time 125.90ms
iter 431590: loss 6.2514, time 125.31ms
iter 431600: loss 5.4636, time 126.31ms
iter 431610: loss 5.6169, time 125.49ms
iter 431620: loss 5.4709, time 125.60ms
iter 431630: loss 6.1304, time 125.61ms
iter 431640: loss 5.7354, time 125.68ms
iter 431650: loss 6.0314, time 125.93ms
iter 431660: loss 6.0844, time 125.42ms
iter 431670: loss 5.4628, time 125.72ms
iter 431680: loss 5.5691, time 125.49ms
iter 431690: loss 5.7309, time 125.59ms
iter 431700: loss 5.6462, time 125.86ms
iter 431710: loss 5.9463, time 126.06ms
iter 431720: loss 6.0664, time 125.54ms
iter 431730: loss 6.7886, time 125.46ms
iter 431740: loss 5.2731, time 127.01ms
step 431750: train loss 5.6106, val loss 5.5796
saving checkpoint to out-shakespeare-char
iter 431750: loss 5.9660, time 2883.38ms
iter 431760: loss 6.2364, time 121.07ms
iter 431770: loss 6.0029, time 119.61ms
iter 431780: loss 6.2300, time 119.54ms
iter 431790: loss 6.0006, time 123.90ms
iter 431800: loss 5.4777, time 121.31ms
iter 431810: loss 6.5834, time 121.50ms
iter 431820: loss 6.1855, time 121.45ms
iter 431830: loss 5.3723, time 121.36ms
iter 431840: loss 6.2753, time 121.65ms
iter 431850: loss 6.4442, time 121.54ms
iter 431860: loss 6.2248, time 121.42ms
iter 431870: loss 6.4267, time 121.07ms
iter 431880: loss 6.4978, time 121.16ms
iter 431890: loss 5.7681, time 122.55ms
iter 431900: loss 6.4230, time 120.94ms
iter 431910: loss 6.6194, time 120.32ms
iter 431920: loss 5.9188, time 124.11ms
iter 431930: loss 6.1253, time 121.30ms
iter 431940: loss 5.4280, time 121.33ms
iter 431950: loss 6.6245, time 121.64ms
iter 431960: loss 5.7040, time 121.24ms
iter 431970: loss 5.9507, time 121.22ms
iter 431980: loss 5.9019, time 120.76ms
iter 431990: loss 6.0895, time 121.11ms
step 432000: train loss 5.5577, val loss 5.6056
saving checkpoint to out-shakespeare-char
iter 432000: loss 6.3318, time 2897.51ms
iter 432010: loss 5.3050, time 121.31ms
iter 432020: loss 5.9566, time 122.02ms
iter 432030: loss 6.1400, time 120.64ms
iter 432040: loss 5.7636, time 121.78ms
iter 432050: loss 6.0177, time 123.70ms
iter 432060: loss 6.0990, time 121.49ms
iter 432070: loss 5.8353, time 121.47ms
iter 432080: loss 6.2916, time 121.44ms
iter 432090: loss 6.0115, time 121.97ms
iter 432100: loss 6.0289, time 121.59ms
iter 432110: loss 5.7399, time 121.95ms
iter 432120: loss 5.0230, time 122.72ms
iter 432130: loss 5.3749, time 121.45ms
iter 432140: loss 6.2920, time 122.18ms
iter 432150: loss 6.2934, time 123.10ms
iter 432160: loss 6.0834, time 121.58ms
iter 432170: loss 6.2836, time 121.46ms
iter 432180: loss 5.8036, time 122.69ms
iter 432190: loss 6.2934, time 121.70ms
iter 432200: loss 5.9537, time 121.65ms
iter 432210: loss 5.8488, time 123.84ms
iter 432220: loss 5.8670, time 121.53ms
iter 432230: loss 5.9824, time 121.75ms
iter 432240: loss 6.0272, time 121.54ms
step 432250: train loss 5.5714, val loss 5.5662
saving checkpoint to out-shakespeare-char
iter 432250: loss 5.8138, time 2890.53ms
iter 432260: loss 6.2290, time 121.93ms
iter 432270: loss 5.0816, time 121.47ms
iter 432280: loss 6.6311, time 122.70ms
iter 432290: loss 5.4775, time 120.80ms
iter 432300: loss 6.4437, time 121.36ms
iter 432310: loss 6.2387, time 124.06ms
iter 432320: loss 6.5848, time 121.39ms
iter 432330: loss 6.3664, time 121.28ms
iter 432340: loss 5.2874, time 122.09ms
iter 432350: loss 5.6772, time 121.42ms
iter 432360: loss 5.6186, time 121.37ms
iter 432370: loss 5.4238, time 121.35ms
iter 432380: loss 6.0088, time 122.57ms
iter 432390: loss 5.7243, time 120.67ms
iter 432400: loss 6.2585, time 121.27ms
iter 432410: loss 6.1207, time 122.82ms
iter 432420: loss 5.9734, time 121.37ms
iter 432430: loss 6.2002, time 121.76ms
iter 432440: loss 6.7076, time 123.93ms
iter 432450: loss 5.5747, time 121.32ms
iter 432460: loss 5.5680, time 121.67ms
iter 432470: loss 5.8359, time 121.37ms
iter 432480: loss 5.8077, time 121.27ms
iter 432490: loss 4.8506, time 122.00ms
step 432500: train loss 5.5266, val loss 5.5700
saving checkpoint to out-shakespeare-char
iter 432500: loss 5.5647, time 2907.00ms
iter 432510: loss 6.1252, time 124.55ms
iter 432520: loss 6.1234, time 125.43ms
iter 432530: loss 6.4169, time 124.48ms
iter 432540: loss 5.8381, time 126.02ms
iter 432550: loss 6.3468, time 124.48ms
iter 432560: loss 6.0157, time 125.54ms
iter 432570: loss 6.1212, time 124.84ms
iter 432580: loss 5.7236, time 125.53ms
iter 432590: loss 6.1643, time 125.78ms
iter 432600: loss 5.0787, time 124.19ms
iter 432610: loss 6.2329, time 122.03ms
iter 432620: loss 5.6032, time 122.01ms
iter 432630: loss 6.2616, time 121.24ms
iter 432640: loss 5.8545, time 122.20ms
iter 432650: loss 6.2459, time 121.06ms
iter 432660: loss 5.8443, time 121.94ms
iter 432670: loss 5.9748, time 123.02ms
iter 432680: loss 5.8298, time 121.82ms
iter 432690: loss 5.6701, time 122.19ms
iter 432700: loss 6.1043, time 124.48ms
iter 432710: loss 5.5225, time 121.99ms
iter 432720: loss 6.2410, time 121.93ms
iter 432730: loss 5.3825, time 121.75ms
iter 432740: loss 5.3839, time 121.92ms
step 432750: train loss 5.5770, val loss 5.5981
saving checkpoint to out-shakespeare-char
iter 432750: loss 6.2746, time 2912.72ms
iter 432760: loss 6.5855, time 121.98ms
iter 432770: loss 5.8160, time 124.48ms
iter 432780: loss 5.8741, time 122.06ms
iter 432790: loss 5.2896, time 122.05ms
iter 432800: loss 5.4049, time 121.88ms
iter 432810: loss 6.0922, time 121.96ms
iter 432820: loss 6.1588, time 121.79ms
iter 432830: loss 5.8896, time 121.89ms
iter 432840: loss 6.0058, time 123.28ms
iter 432850: loss 5.3550, time 121.95ms
iter 432860: loss 6.2302, time 121.90ms
iter 432870: loss 5.6853, time 123.77ms
iter 432880: loss 5.4505, time 121.91ms
iter 432890: loss 5.7427, time 121.88ms
iter 432900: loss 5.9682, time 124.38ms
iter 432910: loss 6.1458, time 121.78ms
iter 432920: loss 5.4794, time 122.08ms
iter 432930: loss 5.9772, time 122.25ms
iter 432940: loss 5.2728, time 121.97ms
iter 432950: loss 6.2539, time 122.12ms
iter 432960: loss 6.1490, time 121.89ms
iter 432970: loss 6.3823, time 123.11ms
iter 432980: loss 6.0655, time 121.82ms
iter 432990: loss 6.5011, time 121.14ms
step 433000: train loss 5.5580, val loss 5.5712
saving checkpoint to out-shakespeare-char
iter 433000: loss 6.2225, time 2898.64ms
iter 433010: loss 5.3835, time 119.89ms
iter 433020: loss 6.4676, time 121.74ms
iter 433030: loss 6.5284, time 121.61ms
iter 433040: loss 6.1127, time 122.60ms
iter 433050: loss 5.7887, time 121.83ms
iter 433060: loss 5.5939, time 121.50ms
iter 433070: loss 5.9605, time 124.21ms
iter 433080: loss 6.0951, time 121.56ms
iter 433090: loss 6.2550, time 121.61ms
iter 433100: loss 5.8112, time 121.54ms
iter 433110: loss 5.5531, time 121.46ms
iter 433120: loss 6.0121, time 121.62ms
iter 433130: loss 6.2454, time 122.10ms
iter 433140: loss 6.1620, time 122.58ms
iter 433150: loss 6.3284, time 121.52ms
iter 433160: loss 5.6598, time 121.50ms
iter 433170: loss 6.0090, time 121.78ms
iter 433180: loss 5.9616, time 121.60ms
iter 433190: loss 5.4040, time 120.21ms
iter 433200: loss 5.9453, time 122.92ms
iter 433210: loss 5.2817, time 120.73ms
iter 433220: loss 5.7521, time 121.43ms
iter 433230: loss 5.7844, time 124.26ms
iter 433240: loss 5.9705, time 121.66ms
step 433250: train loss 5.5785, val loss 5.5772
saving checkpoint to out-shakespeare-char
iter 433250: loss 5.2079, time 2902.33ms
iter 433260: loss 5.6773, time 121.90ms
iter 433270: loss 5.7513, time 124.49ms
iter 433280: loss 6.1261, time 121.75ms
iter 433290: loss 5.6370, time 121.82ms
iter 433300: loss 5.3967, time 121.42ms
iter 433310: loss 6.7916, time 120.87ms
iter 433320: loss 6.5433, time 120.97ms
iter 433330: loss 6.0999, time 121.28ms
iter 433340: loss 6.0921, time 122.85ms
iter 433350: loss 5.8828, time 121.88ms
iter 433360: loss 5.5695, time 122.43ms
iter 433370: loss 6.7393, time 122.83ms
iter 433380: loss 6.5896, time 121.16ms
iter 433390: loss 6.5061, time 121.76ms
iter 433400: loss 6.6227, time 124.31ms
iter 433410: loss 5.5156, time 120.75ms
iter 433420: loss 6.0364, time 121.93ms
iter 433430: loss 6.1264, time 121.90ms
iter 433440: loss 6.3598, time 121.78ms
iter 433450: loss 6.5263, time 121.66ms
iter 433460: loss 6.0621, time 121.80ms
iter 433470: loss 5.9441, time 122.72ms
iter 433480: loss 6.3904, time 121.82ms
iter 433490: loss 5.8145, time 121.57ms
step 433500: train loss 5.5526, val loss 5.5717
saving checkpoint to out-shakespeare-char
iter 433500: loss 6.3031, time 2900.68ms
iter 433510: loss 5.7325, time 124.90ms
iter 433520: loss 6.6759, time 125.68ms
iter 433530: loss 5.1353, time 125.91ms
iter 433540: loss 5.8274, time 125.34ms
iter 433550: loss 5.7849, time 124.72ms
iter 433560: loss 5.8063, time 125.68ms
iter 433570: loss 5.1029, time 125.49ms
iter 433580: loss 6.2262, time 125.20ms
iter 433590: loss 6.3486, time 125.87ms
iter 433600: loss 6.3971, time 125.63ms
iter 433610: loss 5.8321, time 125.39ms
iter 433620: loss 5.2615, time 125.24ms
iter 433630: loss 5.6594, time 125.83ms
iter 433640: loss 6.1550, time 127.96ms
iter 433650: loss 5.9097, time 125.77ms
iter 433660: loss 6.0365, time 128.13ms
iter 433670: loss 5.7726, time 125.61ms
iter 433680: loss 5.6144, time 128.12ms
iter 433690: loss 6.3188, time 125.68ms
iter 433700: loss 6.0886, time 128.06ms
iter 433710: loss 6.0507, time 125.41ms
iter 433720: loss 5.6774, time 128.13ms
iter 433730: loss 5.2106, time 126.23ms
iter 433740: loss 6.2505, time 127.81ms
step 433750: train loss 5.6161, val loss 5.5960
saving checkpoint to out-shakespeare-char
iter 433750: loss 5.7503, time 2885.00ms
iter 433760: loss 6.2787, time 124.71ms
iter 433770: loss 5.8262, time 124.73ms
iter 433780: loss 5.9189, time 124.83ms
iter 433790: loss 6.7261, time 125.52ms
iter 433800: loss 6.3459, time 125.13ms
iter 433810: loss 6.3515, time 125.17ms
iter 433820: loss 6.0946, time 125.33ms
iter 433830: loss 6.2527, time 124.21ms
iter 433840: loss 5.8219, time 125.67ms
iter 433850: loss 6.2361, time 124.23ms
iter 433860: loss 4.8373, time 125.60ms
iter 433870: loss 5.2488, time 125.49ms
iter 433880: loss 6.2128, time 125.50ms
iter 433890: loss 5.9366, time 125.28ms
iter 433900: loss 5.2923, time 125.41ms
iter 433910: loss 6.0685, time 124.05ms
iter 433920: loss 6.1300, time 125.00ms
iter 433930: loss 6.1811, time 125.19ms
iter 433940: loss 5.8492, time 125.01ms
iter 433950: loss 6.0743, time 124.97ms
iter 433960: loss 5.8862, time 125.07ms
iter 433970: loss 5.9828, time 124.73ms
iter 433980: loss 5.9663, time 124.74ms
iter 433990: loss 6.2397, time 124.19ms
step 434000: train loss 5.5313, val loss 5.5031
saving checkpoint to out-shakespeare-char
iter 434000: loss 6.0356, time 2888.63ms
iter 434010: loss 5.4346, time 126.06ms
iter 434020: loss 5.8578, time 125.69ms
iter 434030: loss 6.3136, time 126.04ms
iter 434040: loss 6.0352, time 125.92ms
iter 434050: loss 6.2551, time 125.83ms
iter 434060: loss 6.4862, time 125.91ms
iter 434070: loss 5.7514, time 125.65ms
iter 434080: loss 6.1053, time 125.91ms
iter 434090: loss 6.1274, time 125.50ms
iter 434100: loss 6.5015, time 125.33ms
iter 434110: loss 5.4474, time 125.76ms
iter 434120: loss 5.6537, time 125.91ms
iter 434130: loss 5.5211, time 125.62ms
iter 434140: loss 5.6701, time 126.02ms
iter 434150: loss 5.7689, time 125.97ms
iter 434160: loss 6.8564, time 125.87ms
iter 434170: loss 6.0954, time 125.80ms
iter 434180: loss 6.4243, time 126.13ms
iter 434190: loss 5.7777, time 125.67ms
iter 434200: loss 6.2666, time 125.99ms
iter 434210: loss 5.5907, time 126.12ms
iter 434220: loss 5.7369, time 126.07ms
iter 434230: loss 6.0707, time 126.08ms
iter 434240: loss 6.1172, time 126.25ms
step 434250: train loss 5.5576, val loss 5.5587
saving checkpoint to out-shakespeare-char
iter 434250: loss 5.6661, time 2889.34ms
iter 434260: loss 5.8547, time 125.73ms
iter 434270: loss 6.0380, time 125.39ms
iter 434280: loss 5.4903, time 125.29ms
iter 434290: loss 6.0469, time 125.11ms
iter 434300: loss 6.3228, time 125.12ms
iter 434310: loss 6.3712, time 125.44ms
iter 434320: loss 5.6529, time 125.35ms
iter 434330: loss 5.8413, time 125.20ms
iter 434340: loss 6.0966, time 125.35ms
iter 434350: loss 6.4377, time 125.22ms
iter 434360: loss 6.3106, time 125.22ms
iter 434370: loss 5.4884, time 125.77ms
iter 434380: loss 6.7421, time 125.86ms
iter 434390: loss 6.3470, time 125.78ms
iter 434400: loss 5.8934, time 125.73ms
iter 434410: loss 6.3285, time 125.62ms
iter 434420: loss 5.8448, time 125.81ms
iter 434430: loss 5.6595, time 125.70ms
iter 434440: loss 6.0140, time 125.66ms
iter 434450: loss 6.3128, time 125.48ms
iter 434460: loss 6.3137, time 125.29ms
iter 434470: loss 5.5661, time 125.05ms
iter 434480: loss 6.2043, time 125.45ms
iter 434490: loss 6.0934, time 125.20ms
step 434500: train loss 5.5432, val loss 5.6166
saving checkpoint to out-shakespeare-char
iter 434500: loss 5.7661, time 2888.00ms
iter 434510: loss 5.9038, time 125.09ms
iter 434520: loss 6.5041, time 125.19ms
iter 434530: loss 5.9421, time 124.81ms
iter 434540: loss 5.9822, time 124.91ms
iter 434550: loss 5.8113, time 125.99ms
iter 434560: loss 6.1379, time 124.80ms
iter 434570: loss 5.5465, time 124.79ms
iter 434580: loss 6.1649, time 124.24ms
iter 434590: loss 6.3065, time 124.85ms
iter 434600: loss 6.2472, time 125.19ms
iter 434610: loss 5.8104, time 124.92ms
iter 434620: loss 5.7348, time 125.44ms
iter 434630: loss 5.3768, time 124.98ms
iter 434640: loss 6.5089, time 124.89ms
iter 434650: loss 5.8349, time 124.93ms
iter 434660: loss 6.0624, time 125.36ms
iter 434670: loss 5.9266, time 124.84ms
iter 434680: loss 5.8390, time 125.05ms
iter 434690: loss 6.4120, time 124.71ms
iter 434700: loss 6.0057, time 125.92ms
iter 434710: loss 6.6080, time 125.57ms
iter 434720: loss 5.6238, time 125.72ms
iter 434730: loss 5.9320, time 125.90ms
iter 434740: loss 6.6429, time 125.72ms
step 434750: train loss 5.5710, val loss 5.5573
saving checkpoint to out-shakespeare-char
iter 434750: loss 6.2175, time 2902.19ms
iter 434760: loss 6.1720, time 125.59ms
iter 434770: loss 6.1037, time 125.94ms
iter 434780: loss 6.4279, time 125.22ms
iter 434790: loss 6.1064, time 125.12ms
iter 434800: loss 5.7395, time 125.73ms
iter 434810: loss 5.6744, time 124.64ms
iter 434820: loss 5.8466, time 124.98ms
iter 434830: loss 5.6251, time 125.06ms
iter 434840: loss 5.7973, time 124.62ms
iter 434850: loss 5.9109, time 125.30ms
iter 434860: loss 5.1734, time 124.47ms
iter 434870: loss 5.5199, time 124.57ms
iter 434880: loss 6.3543, time 124.23ms
iter 434890: loss 6.1185, time 124.56ms
iter 434900: loss 6.0070, time 125.24ms
iter 434910: loss 6.1587, time 125.58ms
iter 434920: loss 5.7725, time 125.10ms
iter 434930: loss 6.0926, time 125.08ms
iter 434940: loss 6.2476, time 125.07ms
iter 434950: loss 5.6694, time 125.08ms
iter 434960: loss 6.0875, time 125.52ms
iter 434970: loss 6.1637, time 124.07ms
iter 434980: loss 5.5976, time 125.30ms
iter 434990: loss 5.6183, time 124.56ms
step 435000: train loss 5.6061, val loss 5.5503
saving checkpoint to out-shakespeare-char
iter 435000: loss 6.8244, time 2881.05ms
iter 435010: loss 6.0288, time 128.88ms
iter 435020: loss 5.8945, time 125.38ms
iter 435030: loss 5.8503, time 127.87ms
iter 435040: loss 5.3584, time 125.52ms
iter 435050: loss 5.3273, time 128.38ms
iter 435060: loss 5.5512, time 125.56ms
iter 435070: loss 5.4915, time 128.40ms
iter 435080: loss 6.0255, time 125.78ms
iter 435090: loss 6.7710, time 129.35ms
iter 435100: loss 5.9924, time 125.14ms
iter 435110: loss 6.3342, time 125.19ms
iter 435120: loss 6.3026, time 126.30ms
iter 435130: loss 6.5991, time 125.29ms
iter 435140: loss 6.6600, time 125.10ms
iter 435150: loss 6.3422, time 125.75ms
iter 435160: loss 6.2986, time 125.39ms
iter 435170: loss 6.1729, time 126.57ms
iter 435180: loss 6.4659, time 125.36ms
iter 435190: loss 6.3222, time 125.32ms
iter 435200: loss 7.1685, time 125.34ms
iter 435210: loss 6.1334, time 125.23ms
iter 435220: loss 5.6481, time 125.27ms
iter 435230: loss 6.0258, time 125.32ms
iter 435240: loss 6.0727, time 125.27ms
step 435250: train loss 5.6610, val loss 5.6033
saving checkpoint to out-shakespeare-char
iter 435250: loss 5.4040, time 2906.84ms
iter 435260: loss 6.2641, time 125.96ms
iter 435270: loss 5.6951, time 125.25ms
iter 435280: loss 5.8546, time 125.12ms
iter 435290: loss 5.3743, time 125.22ms
iter 435300: loss 6.4340, time 125.38ms
iter 435310: loss 5.5847, time 125.42ms
iter 435320: loss 6.8200, time 125.14ms
iter 435330: loss 5.7843, time 125.40ms
iter 435340: loss 6.0890, time 125.49ms
iter 435350: loss 4.7321, time 125.86ms
iter 435360: loss 6.0242, time 125.34ms
iter 435370: loss 5.6154, time 125.37ms
iter 435380: loss 6.1662, time 125.54ms
iter 435390: loss 6.8592, time 125.37ms
iter 435400: loss 6.2233, time 125.41ms
iter 435410: loss 5.5344, time 125.47ms
iter 435420: loss 5.9542, time 125.37ms
iter 435430: loss 6.0603, time 125.83ms
iter 435440: loss 4.9910, time 125.68ms
iter 435450: loss 5.9341, time 121.66ms
iter 435460: loss 6.1700, time 122.44ms
iter 435470: loss 5.6596, time 120.93ms
iter 435480: loss 5.4372, time 121.32ms
iter 435490: loss 6.5168, time 122.33ms
step 435500: train loss 5.5807, val loss 5.5754
saving checkpoint to out-shakespeare-char
iter 435500: loss 5.8235, time 2878.38ms
iter 435510: loss 6.0117, time 122.51ms
iter 435520: loss 5.5446, time 124.42ms
iter 435530: loss 5.7059, time 121.86ms
iter 435540: loss 5.9683, time 121.96ms
iter 435550: loss 5.9491, time 121.90ms
iter 435560: loss 6.0747, time 121.81ms
iter 435570: loss 5.8566, time 121.71ms
iter 435580: loss 5.6815, time 121.94ms
iter 435590: loss 5.8471, time 122.93ms
iter 435600: loss 5.6940, time 121.79ms
iter 435610: loss 5.7737, time 121.82ms
iter 435620: loss 5.8259, time 122.95ms
iter 435630: loss 5.3855, time 122.07ms
iter 435640: loss 6.3728, time 121.75ms
iter 435650: loss 5.3969, time 124.53ms
iter 435660: loss 5.3329, time 122.15ms
iter 435670: loss 5.6261, time 121.99ms
iter 435680: loss 5.8521, time 121.88ms
iter 435690: loss 5.8608, time 121.88ms
iter 435700: loss 5.7353, time 121.84ms
iter 435710: loss 5.9849, time 121.32ms
iter 435720: loss 5.4052, time 123.06ms
iter 435730: loss 5.8312, time 121.93ms
iter 435740: loss 5.9474, time 121.88ms
step 435750: train loss 5.5397, val loss 5.5487
saving checkpoint to out-shakespeare-char
iter 435750: loss 5.9983, time 2892.86ms
iter 435760: loss 6.2466, time 122.03ms
iter 435770: loss 5.9168, time 121.58ms
iter 435780: loss 6.1246, time 121.34ms
iter 435790: loss 5.8491, time 123.07ms
iter 435800: loss 6.0181, time 121.57ms
iter 435810: loss 5.8158, time 120.82ms
iter 435820: loss 6.0930, time 122.89ms
iter 435830: loss 6.0235, time 121.51ms
iter 435840: loss 5.7728, time 122.02ms
iter 435850: loss 5.5302, time 124.25ms
iter 435860: loss 5.7061, time 121.72ms
iter 435870: loss 5.7977, time 121.59ms
iter 435880: loss 5.8161, time 121.55ms
iter 435890: loss 6.3653, time 121.80ms
iter 435900: loss 6.0328, time 121.57ms
iter 435910: loss 6.8547, time 121.56ms
iter 435920: loss 6.1431, time 123.24ms
iter 435930: loss 5.3366, time 121.48ms
iter 435940: loss 5.8736, time 121.64ms
iter 435950: loss 5.7626, time 122.82ms
iter 435960: loss 5.8963, time 121.54ms
iter 435970: loss 6.0597, time 121.67ms
iter 435980: loss 6.5185, time 124.73ms
iter 435990: loss 6.3123, time 121.48ms
step 436000: train loss 5.5353, val loss 5.5467
saving checkpoint to out-shakespeare-char
iter 436000: loss 6.6387, time 2897.67ms
iter 436010: loss 5.4498, time 125.71ms
iter 436020: loss 5.3367, time 126.90ms
iter 436030: loss 5.8192, time 125.30ms
iter 436040: loss 6.1143, time 125.70ms
iter 436050: loss 6.2901, time 125.72ms
iter 436060: loss 5.9226, time 122.54ms
iter 436070: loss 6.1919, time 127.76ms
iter 436080: loss 5.6183, time 125.64ms
iter 436090: loss 5.6341, time 127.63ms
iter 436100: loss 6.1915, time 125.17ms
iter 436110: loss 5.7708, time 128.05ms
iter 436120: loss 5.9229, time 125.41ms
iter 436130: loss 6.0114, time 128.38ms
iter 436140: loss 6.4705, time 125.25ms
iter 436150: loss 5.9636, time 127.75ms
iter 436160: loss 5.5637, time 125.31ms
iter 436170: loss 5.4633, time 129.04ms
iter 436180: loss 5.4675, time 125.27ms
iter 436190: loss 5.8609, time 127.62ms
iter 436200: loss 5.9273, time 125.22ms
iter 436210: loss 5.2241, time 127.88ms
iter 436220: loss 5.4616, time 125.54ms
iter 436230: loss 5.3578, time 128.07ms
iter 436240: loss 6.1973, time 124.69ms
step 436250: train loss 5.5452, val loss 5.6255
saving checkpoint to out-shakespeare-char
iter 436250: loss 5.7009, time 2903.96ms
iter 436260: loss 6.2320, time 125.61ms
iter 436270: loss 6.2848, time 126.14ms
iter 436280: loss 6.0427, time 125.56ms
iter 436290: loss 5.4197, time 125.24ms
iter 436300: loss 6.1376, time 125.56ms
iter 436310: loss 6.1735, time 125.35ms
iter 436320: loss 5.9318, time 125.69ms
iter 436330: loss 6.4886, time 125.75ms
iter 436340: loss 5.9498, time 126.01ms
iter 436350: loss 6.2982, time 125.40ms
iter 436360: loss 5.6611, time 122.44ms
iter 436370: loss 5.4822, time 121.66ms
iter 436380: loss 6.1248, time 121.78ms
iter 436390: loss 5.6551, time 122.72ms
iter 436400: loss 4.9920, time 121.63ms
iter 436410: loss 5.5164, time 120.77ms
iter 436420: loss 5.2304, time 122.63ms
iter 436430: loss 5.9741, time 121.53ms
iter 436440: loss 5.4126, time 121.68ms
iter 436450: loss 6.3708, time 124.55ms
iter 436460: loss 5.4425, time 122.86ms
iter 436470: loss 5.5454, time 123.12ms
iter 436480: loss 6.6071, time 121.76ms
iter 436490: loss 5.2907, time 121.91ms
step 436500: train loss 5.6136, val loss 5.5765
saving checkpoint to out-shakespeare-char
iter 436500: loss 5.9581, time 2905.64ms
iter 436510: loss 6.0087, time 120.19ms
iter 436520: loss 6.0338, time 122.01ms
iter 436530: loss 6.6889, time 121.71ms
iter 436540: loss 5.9713, time 121.82ms
iter 436550: loss 6.3458, time 124.26ms
iter 436560: loss 5.8540, time 121.81ms
iter 436570: loss 5.6120, time 121.69ms
iter 436580: loss 6.2038, time 121.74ms
iter 436590: loss 5.9754, time 121.70ms
iter 436600: loss 6.1206, time 121.67ms
iter 436610: loss 5.6439, time 121.64ms
iter 436620: loss 6.1046, time 122.45ms
iter 436630: loss 6.0439, time 121.23ms
iter 436640: loss 6.3228, time 121.67ms
iter 436650: loss 6.6529, time 122.89ms
iter 436660: loss 6.0813, time 121.78ms
iter 436670: loss 5.3691, time 121.93ms
iter 436680: loss 6.0553, time 122.69ms
iter 436690: loss 5.4616, time 121.78ms
iter 436700: loss 6.2551, time 121.76ms
iter 436710: loss 5.9531, time 121.84ms
iter 436720: loss 6.1856, time 121.90ms
iter 436730: loss 5.7100, time 121.09ms
iter 436740: loss 6.3967, time 121.68ms
step 436750: train loss 5.5837, val loss 5.5600
saving checkpoint to out-shakespeare-char
iter 436750: loss 6.7015, time 2899.90ms
iter 436760: loss 5.3856, time 124.10ms
iter 436770: loss 5.8854, time 124.78ms
iter 436780: loss 5.9239, time 125.50ms
iter 436790: loss 5.3012, time 125.22ms
iter 436800: loss 5.4020, time 125.11ms
iter 436810: loss 5.7548, time 125.21ms
iter 436820: loss 6.2524, time 125.35ms
iter 436830: loss 6.0947, time 125.53ms
iter 436840: loss 5.9082, time 125.06ms
iter 436850: loss 5.4586, time 125.48ms
iter 436860: loss 6.5571, time 125.05ms
iter 436870: loss 5.6835, time 125.44ms
iter 436880: loss 5.6775, time 125.05ms
iter 436890: loss 6.1103, time 125.13ms
iter 436900: loss 6.0507, time 125.03ms
iter 436910: loss 6.1983, time 125.20ms
iter 436920: loss 6.0051, time 125.65ms
iter 436930: loss 6.0797, time 124.70ms
iter 436940: loss 5.3183, time 125.03ms
iter 436950: loss 6.0860, time 124.27ms
iter 436960: loss 6.1758, time 125.60ms
iter 436970: loss 6.5348, time 125.15ms
iter 436980: loss 5.6040, time 124.88ms
iter 436990: loss 5.2239, time 125.72ms
step 437000: train loss 5.5726, val loss 5.5597
saving checkpoint to out-shakespeare-char
iter 437000: loss 5.8020, time 2893.05ms
iter 437010: loss 6.0955, time 125.20ms
iter 437020: loss 5.9705, time 125.23ms
iter 437030: loss 5.9639, time 125.05ms
iter 437040: loss 5.5393, time 125.10ms
iter 437050: loss 5.9770, time 124.97ms
iter 437060: loss 5.5911, time 125.18ms
iter 437070: loss 5.4669, time 125.21ms
iter 437080: loss 6.0709, time 124.53ms
iter 437090: loss 5.8520, time 125.02ms
iter 437100: loss 6.0600, time 125.27ms
iter 437110: loss 5.3949, time 125.04ms
iter 437120: loss 6.3123, time 125.07ms
iter 437130: loss 5.2431, time 124.97ms
iter 437140: loss 5.7311, time 125.08ms
iter 437150: loss 6.6484, time 125.19ms
iter 437160: loss 6.9803, time 125.30ms
iter 437170: loss 6.3552, time 125.04ms
iter 437180: loss 5.8638, time 124.96ms
iter 437190: loss 6.2114, time 124.52ms
iter 437200: loss 6.3100, time 125.35ms
iter 437210: loss 5.3606, time 125.32ms
iter 437220: loss 5.6266, time 124.95ms
iter 437230: loss 5.3028, time 125.28ms
iter 437240: loss 5.9081, time 125.23ms
step 437250: train loss 5.5622, val loss 5.6158
saving checkpoint to out-shakespeare-char
iter 437250: loss 6.4371, time 2890.75ms
iter 437260: loss 5.8280, time 125.42ms
iter 437270: loss 5.7056, time 125.42ms
iter 437280: loss 5.9230, time 125.01ms
iter 437290: loss 6.6148, time 125.42ms
iter 437300: loss 5.6758, time 125.03ms
iter 437310: loss 5.4793, time 125.03ms
iter 437320: loss 5.9765, time 125.03ms
iter 437330: loss 4.8594, time 125.08ms
iter 437340: loss 5.8234, time 125.25ms
iter 437350: loss 6.1438, time 127.71ms
iter 437360: loss 6.4026, time 124.85ms
iter 437370: loss 5.6934, time 128.05ms
iter 437380: loss 6.0651, time 123.92ms
iter 437390: loss 6.2681, time 127.71ms
iter 437400: loss 6.3587, time 125.25ms
iter 437410: loss 5.8391, time 127.90ms
iter 437420: loss 5.5308, time 125.24ms
iter 437430: loss 6.2193, time 127.78ms
iter 437440: loss 5.8993, time 125.30ms
iter 437450: loss 5.9135, time 128.22ms
iter 437460: loss 6.2550, time 125.09ms
iter 437470: loss 6.3257, time 127.73ms
iter 437480: loss 6.3434, time 124.05ms
iter 437490: loss 5.7827, time 127.62ms
step 437500: train loss 5.5536, val loss 5.6037
saving checkpoint to out-shakespeare-char
iter 437500: loss 6.4261, time 2876.66ms
iter 437510: loss 6.2770, time 125.08ms
iter 437520: loss 5.9289, time 125.28ms
iter 437530: loss 5.7265, time 125.26ms
iter 437540: loss 5.9849, time 125.35ms
iter 437550: loss 5.3363, time 125.09ms
iter 437560: loss 6.3388, time 125.78ms
iter 437570: loss 6.5640, time 125.57ms
iter 437580: loss 5.5104, time 125.54ms
iter 437590: loss 5.9711, time 125.74ms
iter 437600: loss 5.6660, time 125.30ms
iter 437610: loss 5.6162, time 125.81ms
iter 437620: loss 5.7936, time 125.77ms
iter 437630: loss 5.9219, time 125.86ms
iter 437640: loss 5.0736, time 125.78ms
iter 437650: loss 5.7866, time 125.68ms
iter 437660: loss 5.8854, time 125.46ms
iter 437670: loss 5.6636, time 125.77ms
iter 437680: loss 6.2785, time 125.91ms
iter 437690: loss 6.1993, time 125.25ms
iter 437700: loss 6.1138, time 126.01ms
iter 437710: loss 6.0745, time 125.70ms
iter 437720: loss 5.5826, time 125.52ms
iter 437730: loss 6.2084, time 125.52ms
iter 437740: loss 5.9808, time 126.14ms
step 437750: train loss 5.5940, val loss 5.5425
saving checkpoint to out-shakespeare-char
iter 437750: loss 6.1807, time 2908.88ms
iter 437760: loss 5.8783, time 121.78ms
iter 437770: loss 5.4929, time 121.34ms
iter 437780: loss 6.4363, time 123.08ms
iter 437790: loss 5.7603, time 121.41ms
iter 437800: loss 5.5368, time 121.35ms
iter 437810: loss 5.4816, time 123.91ms
iter 437820: loss 5.7559, time 121.63ms
iter 437830: loss 6.7572, time 121.58ms
iter 437840: loss 6.1767, time 121.68ms
iter 437850: loss 5.9482, time 122.17ms
iter 437860: loss 5.1485, time 121.54ms
iter 437870: loss 6.4917, time 121.62ms
iter 437880: loss 6.6387, time 122.77ms
iter 437890: loss 5.4479, time 121.67ms
iter 437900: loss 6.6884, time 121.69ms
iter 437910: loss 5.2282, time 124.73ms
iter 437920: loss 5.8983, time 121.72ms
iter 437930: loss 5.9811, time 121.64ms
iter 437940: loss 5.7535, time 122.05ms
iter 437950: loss 6.4278, time 121.87ms
iter 437960: loss 5.1862, time 121.77ms
iter 437970: loss 5.8309, time 121.63ms
iter 437980: loss 5.2408, time 122.63ms
iter 437990: loss 5.8691, time 121.56ms
step 438000: train loss 5.5459, val loss 5.5778
saving checkpoint to out-shakespeare-char
iter 438000: loss 6.6276, time 2913.82ms
iter 438010: loss 5.6223, time 125.68ms
iter 438020: loss 6.1525, time 124.70ms
iter 438030: loss 6.1230, time 125.16ms
iter 438040: loss 5.8481, time 125.12ms
iter 438050: loss 5.4076, time 125.27ms
iter 438060: loss 5.2181, time 125.29ms
iter 438070: loss 5.8345, time 125.17ms
iter 438080: loss 6.1322, time 125.13ms
iter 438090: loss 6.0338, time 125.42ms
iter 438100: loss 7.0118, time 124.99ms
iter 438110: loss 6.1958, time 125.28ms
iter 438120: loss 6.4240, time 125.10ms
iter 438130: loss 6.2143, time 124.85ms
iter 438140: loss 5.9414, time 125.23ms
iter 438150: loss 5.4741, time 125.22ms
iter 438160: loss 6.7704, time 125.18ms
iter 438170: loss 6.1503, time 125.06ms
iter 438180: loss 6.1793, time 124.95ms
iter 438190: loss 5.4710, time 125.13ms
iter 438200: loss 5.0415, time 125.47ms
iter 438210: loss 5.9826, time 127.66ms
iter 438220: loss 5.8262, time 125.03ms
iter 438230: loss 6.4964, time 127.71ms
iter 438240: loss 5.2579, time 125.51ms
step 438250: train loss 5.5248, val loss 5.5794
saving checkpoint to out-shakespeare-char
iter 438250: loss 5.9438, time 2895.77ms
iter 438260: loss 5.5853, time 125.37ms
iter 438270: loss 5.3074, time 125.36ms
iter 438280: loss 5.8597, time 125.58ms
iter 438290: loss 6.2273, time 125.20ms
iter 438300: loss 5.7878, time 123.74ms
iter 438310: loss 5.7192, time 125.40ms
iter 438320: loss 6.2198, time 125.64ms
iter 438330: loss 6.1651, time 125.98ms
iter 438340: loss 6.6343, time 125.57ms
iter 438350: loss 6.3343, time 124.57ms
iter 438360: loss 6.0258, time 125.53ms
iter 438370: loss 5.6074, time 125.66ms
iter 438380: loss 5.7307, time 125.93ms
iter 438390: loss 5.8973, time 126.19ms
iter 438400: loss 5.6349, time 125.83ms
iter 438410: loss 6.3052, time 125.85ms
iter 438420: loss 5.5190, time 125.94ms
iter 438430: loss 6.0272, time 125.66ms
iter 438440: loss 5.7915, time 125.58ms
iter 438450: loss 5.4795, time 125.64ms
iter 438460: loss 6.2283, time 125.71ms
iter 438470: loss 6.0825, time 125.73ms
iter 438480: loss 5.7872, time 125.70ms
iter 438490: loss 6.7661, time 125.65ms
step 438500: train loss 5.5558, val loss 5.6279
saving checkpoint to out-shakespeare-char
iter 438500: loss 5.4686, time 2895.99ms
iter 438510: loss 6.1763, time 121.74ms
iter 438520: loss 6.0695, time 121.90ms
iter 438530: loss 6.2024, time 124.47ms
iter 438540: loss 5.9479, time 121.79ms
iter 438550: loss 6.5575, time 121.93ms
iter 438560: loss 5.5603, time 121.87ms
iter 438570: loss 6.1722, time 121.90ms
iter 438580: loss 6.1223, time 121.82ms
iter 438590: loss 6.0178, time 121.84ms
iter 438600: loss 5.1114, time 123.63ms
iter 438610: loss 5.7079, time 121.84ms
iter 438620: loss 6.4208, time 122.07ms
iter 438630: loss 5.7285, time 122.95ms
iter 438640: loss 6.1684, time 121.74ms
iter 438650: loss 6.2385, time 121.82ms
iter 438660: loss 5.1677, time 124.30ms
iter 438670: loss 6.2934, time 121.72ms
iter 438680: loss 5.5003, time 121.91ms
iter 438690: loss 6.1738, time 121.80ms
iter 438700: loss 6.2498, time 121.68ms
iter 438710: loss 5.9394, time 121.84ms
iter 438720: loss 5.7841, time 121.93ms
iter 438730: loss 5.9336, time 123.34ms
iter 438740: loss 6.0548, time 121.92ms
step 438750: train loss 5.5688, val loss 5.5562
saving checkpoint to out-shakespeare-char
iter 438750: loss 5.4313, time 2907.07ms
iter 438760: loss 5.4336, time 121.54ms
iter 438770: loss 6.4035, time 121.56ms
iter 438780: loss 5.7676, time 121.43ms
iter 438790: loss 5.8720, time 122.05ms
iter 438800: loss 6.6314, time 121.75ms
iter 438810: loss 5.9298, time 121.53ms
iter 438820: loss 5.8563, time 121.51ms
iter 438830: loss 4.9195, time 122.71ms
iter 438840: loss 5.9844, time 121.43ms
iter 438850: loss 5.9490, time 121.48ms
iter 438860: loss 6.4316, time 122.58ms
iter 438870: loss 5.7613, time 121.41ms
iter 438880: loss 5.8180, time 121.43ms
iter 438890: loss 6.0002, time 123.96ms
iter 438900: loss 6.4072, time 121.61ms
iter 438910: loss 6.2045, time 121.23ms
iter 438920: loss 5.8298, time 121.52ms
iter 438930: loss 6.2192, time 121.54ms
iter 438940: loss 5.2526, time 122.63ms
iter 438950: loss 6.4916, time 121.58ms
iter 438960: loss 6.6619, time 123.08ms
iter 438970: loss 5.3760, time 121.61ms
iter 438980: loss 5.3570, time 121.47ms
iter 438990: loss 6.4693, time 122.62ms
step 439000: train loss 5.5607, val loss 5.5969
saving checkpoint to out-shakespeare-char
iter 439000: loss 6.1846, time 2906.48ms
iter 439010: loss 5.9629, time 122.03ms
iter 439020: loss 5.8393, time 121.52ms
iter 439030: loss 6.2464, time 122.69ms
iter 439040: loss 5.8129, time 121.47ms
iter 439050: loss 6.1272, time 121.36ms
iter 439060: loss 5.7350, time 122.58ms
iter 439070: loss 6.1084, time 121.62ms
iter 439080: loss 6.7553, time 121.55ms
iter 439090: loss 5.9729, time 124.07ms
iter 439100: loss 5.8495, time 121.49ms
iter 439110: loss 6.9446, time 121.58ms
iter 439120: loss 5.7969, time 121.51ms
iter 439130: loss 6.4195, time 121.37ms
iter 439140: loss 5.1717, time 121.50ms
iter 439150: loss 6.0413, time 121.47ms
iter 439160: loss 5.8233, time 123.08ms
iter 439170: loss 6.0619, time 121.61ms
iter 439180: loss 5.6985, time 121.58ms
iter 439190: loss 5.4774, time 122.76ms
iter 439200: loss 6.1747, time 121.58ms
iter 439210: loss 6.3299, time 121.55ms
iter 439220: loss 5.7453, time 124.13ms
iter 439230: loss 6.0657, time 121.53ms
iter 439240: loss 5.4724, time 121.60ms
step 439250: train loss 5.5708, val loss 5.5927
saving checkpoint to out-shakespeare-char
iter 439250: loss 5.9861, time 2888.44ms
iter 439260: loss 6.6843, time 122.56ms
iter 439270: loss 5.6614, time 121.38ms
iter 439280: loss 6.0697, time 121.35ms
iter 439290: loss 5.5872, time 123.92ms
iter 439300: loss 5.7283, time 120.53ms
iter 439310: loss 6.4868, time 121.27ms
iter 439320: loss 6.6388, time 121.19ms
iter 439330: loss 5.9402, time 121.37ms
iter 439340: loss 6.1544, time 121.76ms
iter 439350: loss 5.5064, time 121.18ms
iter 439360: loss 5.8021, time 122.33ms
iter 439370: loss 5.8586, time 121.38ms
iter 439380: loss 6.0765, time 121.10ms
iter 439390: loss 6.0445, time 122.20ms
iter 439400: loss 6.3438, time 120.98ms
iter 439410: loss 6.4412, time 123.99ms
iter 439420: loss 6.3409, time 125.06ms
iter 439430: loss 6.1658, time 124.81ms
iter 439440: loss 5.7751, time 124.70ms
iter 439450: loss 6.1885, time 124.36ms
iter 439460: loss 5.9371, time 125.27ms
iter 439470: loss 6.8502, time 124.93ms
iter 439480: loss 5.3677, time 125.08ms
iter 439490: loss 6.4056, time 124.68ms
step 439500: train loss 5.5339, val loss 5.5612
saving checkpoint to out-shakespeare-char
iter 439500: loss 5.2460, time 2905.07ms
iter 439510: loss 6.1677, time 125.46ms
iter 439520: loss 5.7591, time 125.35ms
iter 439530: loss 6.2867, time 125.69ms
iter 439540: loss 6.0255, time 125.46ms
iter 439550: loss 6.3261, time 125.21ms
iter 439560: loss 6.1652, time 125.41ms
iter 439570: loss 5.7871, time 125.38ms
iter 439580: loss 5.5971, time 125.51ms
iter 439590: loss 5.8687, time 125.33ms
iter 439600: loss 6.0202, time 125.50ms
iter 439610: loss 6.3364, time 125.65ms
iter 439620: loss 5.1578, time 125.50ms
iter 439630: loss 5.7467, time 125.19ms
iter 439640: loss 5.0004, time 125.15ms
iter 439650: loss 6.4621, time 124.43ms
iter 439660: loss 6.1367, time 125.16ms
iter 439670: loss 6.3598, time 125.06ms
iter 439680: loss 5.8363, time 125.11ms
iter 439690: loss 6.1571, time 125.21ms
iter 439700: loss 5.6074, time 125.42ms
iter 439710: loss 6.6178, time 125.23ms
iter 439720: loss 6.3304, time 125.91ms
iter 439730: loss 6.7477, time 125.34ms
iter 439740: loss 5.6956, time 126.18ms
step 439750: train loss 5.5362, val loss 5.5992
saving checkpoint to out-shakespeare-char
iter 439750: loss 6.3108, time 2883.91ms
iter 439760: loss 5.9570, time 121.76ms
iter 439770: loss 6.3390, time 121.95ms
iter 439780: loss 5.5481, time 121.79ms
iter 439790: loss 5.4924, time 121.66ms
iter 439800: loss 6.3723, time 122.73ms
iter 439810: loss 5.9786, time 120.90ms
iter 439820: loss 6.4536, time 121.69ms
iter 439830: loss 6.3168, time 122.45ms
iter 439840: loss 5.9293, time 120.81ms
iter 439850: loss 6.3300, time 120.85ms
iter 439860: loss 5.9301, time 124.42ms
iter 439870: loss 5.6909, time 121.58ms
iter 439880: loss 4.9501, time 121.58ms
iter 439890: loss 5.3223, time 121.73ms
iter 439900: loss 6.3257, time 121.65ms
iter 439910: loss 6.2033, time 120.92ms
iter 439920: loss 5.4214, time 121.92ms
iter 439930: loss 5.6005, time 122.88ms
iter 439940: loss 5.7400, time 121.97ms
iter 439950: loss 6.0784, time 121.46ms
iter 439960: loss 5.6398, time 122.94ms
iter 439970: loss 6.1716, time 121.70ms
iter 439980: loss 5.9373, time 121.67ms
iter 439990: loss 6.2734, time 124.20ms
step 440000: train loss 5.5768, val loss 5.5209
saving checkpoint to out-shakespeare-char
iter 440000: loss 6.0928, time 2901.83ms
iter 440010: loss 6.4077, time 121.99ms
iter 440020: loss 6.1231, time 121.63ms
iter 440030: loss 6.3351, time 124.14ms
iter 440040: loss 6.3595, time 121.73ms
iter 440050: loss 5.7518, time 121.68ms
iter 440060: loss 5.4979, time 122.06ms
iter 440070: loss 5.1893, time 121.48ms
iter 440080: loss 6.1877, time 121.30ms
iter 440090: loss 6.2754, time 121.39ms
iter 440100: loss 6.0645, time 123.08ms
iter 440110: loss 6.1201, time 121.45ms
iter 440120: loss 6.0775, time 121.27ms
iter 440130: loss 6.6122, time 124.48ms
iter 440140: loss 5.3624, time 121.89ms
iter 440150: loss 6.5618, time 122.10ms
iter 440160: loss 5.5805, time 121.95ms
iter 440170: loss 5.9163, time 121.59ms
iter 440180: loss 6.1047, time 121.78ms
iter 440190: loss 6.2535, time 121.55ms
iter 440200: loss 5.9421, time 122.74ms
iter 440210: loss 6.5420, time 121.45ms
iter 440220: loss 5.7579, time 121.65ms
iter 440230: loss 6.5062, time 122.09ms
iter 440240: loss 5.6243, time 121.10ms
step 440250: train loss 5.5662, val loss 5.5761
saving checkpoint to out-shakespeare-char
iter 440250: loss 5.8499, time 2874.69ms
iter 440260: loss 5.9199, time 122.10ms
iter 440270: loss 5.9606, time 121.43ms
iter 440280: loss 5.7903, time 121.59ms
iter 440290: loss 5.8155, time 121.47ms
iter 440300: loss 5.8850, time 124.20ms
iter 440310: loss 5.6807, time 121.48ms
iter 440320: loss 5.6396, time 120.65ms
iter 440330: loss 5.9454, time 121.46ms
iter 440340: loss 6.7696, time 121.46ms
iter 440350: loss 6.3976, time 121.36ms
iter 440360: loss 6.0368, time 121.58ms
iter 440370: loss 5.7252, time 122.85ms
iter 440380: loss 6.1709, time 121.56ms
iter 440390: loss 4.9023, time 121.50ms
iter 440400: loss 6.3257, time 122.78ms
iter 440410: loss 5.7638, time 120.97ms
iter 440420: loss 5.7669, time 121.46ms
iter 440430: loss 5.9315, time 124.15ms
iter 440440: loss 5.6382, time 121.39ms
iter 440450: loss 6.0216, time 121.59ms
iter 440460: loss 6.4403, time 121.72ms
iter 440470: loss 6.4897, time 121.58ms
iter 440480: loss 5.3118, time 121.55ms
iter 440490: loss 5.3956, time 121.57ms
step 440500: train loss 5.5200, val loss 5.5104
saving checkpoint to out-shakespeare-char
iter 440500: loss 5.6588, time 2902.76ms
iter 440510: loss 5.7659, time 125.89ms
iter 440520: loss 6.1280, time 128.41ms
iter 440530: loss 6.3338, time 125.12ms
iter 440540: loss 6.7578, time 127.78ms
iter 440550: loss 6.6153, time 124.27ms
iter 440560: loss 5.5443, time 127.77ms
iter 440570: loss 5.9332, time 125.30ms
iter 440580: loss 5.7394, time 128.41ms
iter 440590: loss 6.0106, time 125.45ms
iter 440600: loss 5.7419, time 127.92ms
iter 440610: loss 6.0230, time 125.95ms
iter 440620: loss 5.6350, time 125.83ms
iter 440630: loss 5.4239, time 126.01ms
iter 440640: loss 5.5671, time 125.44ms
iter 440650: loss 6.0498, time 125.28ms
iter 440660: loss 5.8465, time 125.61ms
iter 440670: loss 6.0754, time 125.79ms
iter 440680: loss 5.5526, time 125.13ms
iter 440690: loss 6.1605, time 125.45ms
iter 440700: loss 6.0787, time 125.55ms
iter 440710: loss 5.8484, time 125.51ms
iter 440720: loss 6.2541, time 125.35ms
iter 440730: loss 6.3137, time 125.31ms
iter 440740: loss 5.7585, time 125.67ms
step 440750: train loss 5.5481, val loss 5.5722
saving checkpoint to out-shakespeare-char
iter 440750: loss 7.0518, time 2894.56ms
iter 440760: loss 6.4406, time 125.51ms
iter 440770: loss 5.5278, time 126.15ms
iter 440780: loss 5.9220, time 125.39ms
iter 440790: loss 6.1118, time 125.43ms
iter 440800: loss 5.8437, time 125.36ms
iter 440810: loss 6.2853, time 126.77ms
iter 440820: loss 5.7625, time 125.19ms
iter 440830: loss 5.6956, time 125.59ms
iter 440840: loss 5.7326, time 125.12ms
iter 440850: loss 5.5731, time 125.22ms
iter 440860: loss 6.0422, time 125.63ms
iter 440870: loss 5.5282, time 125.55ms
iter 440880: loss 5.2278, time 125.33ms
iter 440890: loss 5.8188, time 125.19ms
iter 440900: loss 6.0927, time 125.35ms
iter 440910: loss 5.5738, time 125.13ms
iter 440920: loss 6.5455, time 125.41ms
iter 440930: loss 5.3920, time 125.94ms
iter 440940: loss 6.5008, time 125.62ms
iter 440950: loss 5.9154, time 124.86ms
iter 440960: loss 6.1026, time 124.89ms
iter 440970: loss 5.5547, time 127.20ms
iter 440980: loss 6.2661, time 121.56ms
iter 440990: loss 5.4108, time 120.85ms
step 441000: train loss 5.5918, val loss 5.5558
saving checkpoint to out-shakespeare-char
iter 441000: loss 5.5158, time 2883.87ms
iter 441010: loss 5.7897, time 125.93ms
iter 441020: loss 6.1041, time 126.28ms
iter 441030: loss 5.5200, time 125.01ms
iter 441040: loss 6.2324, time 126.11ms
iter 441050: loss 5.6068, time 125.63ms
iter 441060: loss 6.4657, time 126.47ms
iter 441070: loss 5.9431, time 125.38ms
iter 441080: loss 6.3523, time 120.15ms
iter 441090: loss 5.9449, time 120.62ms
iter 441100: loss 6.1508, time 121.53ms
iter 441110: loss 5.8270, time 121.31ms
iter 441120: loss 6.4705, time 120.07ms
iter 441130: loss 6.0981, time 119.93ms
iter 441140: loss 6.1938, time 120.57ms
iter 441150: loss 5.7831, time 120.07ms
iter 441160: loss 6.5346, time 120.22ms
iter 441170: loss 5.5564, time 121.49ms
iter 441180: loss 5.7034, time 119.66ms
iter 441190: loss 5.8650, time 120.08ms
iter 441200: loss 5.9733, time 121.32ms
iter 441210: loss 5.8205, time 120.02ms
iter 441220: loss 5.1406, time 119.90ms
iter 441230: loss 6.2316, time 122.62ms
iter 441240: loss 6.7765, time 122.05ms
step 441250: train loss 5.5516, val loss 5.5796
saving checkpoint to out-shakespeare-char
iter 441250: loss 6.0485, time 2901.39ms
iter 441260: loss 5.5665, time 123.31ms
iter 441270: loss 6.1499, time 124.43ms
iter 441280: loss 6.5240, time 121.92ms
iter 441290: loss 5.4632, time 122.44ms
iter 441300: loss 5.8040, time 121.90ms
iter 441310: loss 5.9695, time 121.74ms
iter 441320: loss 5.7216, time 120.84ms
iter 441330: loss 5.7273, time 122.02ms
iter 441340: loss 5.6421, time 123.20ms
iter 441350: loss 6.1467, time 121.90ms
iter 441360: loss 5.4059, time 122.06ms
iter 441370: loss 5.3881, time 123.46ms
iter 441380: loss 5.8673, time 122.07ms
iter 441390: loss 5.6461, time 121.87ms
iter 441400: loss 5.8681, time 124.54ms
iter 441410: loss 6.0359, time 121.80ms
iter 441420: loss 5.8268, time 121.92ms
iter 441430: loss 6.0971, time 121.54ms
iter 441440: loss 5.9719, time 122.14ms
iter 441450: loss 6.4310, time 122.07ms
iter 441460: loss 5.6062, time 121.87ms
iter 441470: loss 5.4414, time 122.89ms
iter 441480: loss 6.0951, time 121.85ms
iter 441490: loss 6.0181, time 121.89ms
step 441500: train loss 5.6132, val loss 5.5902
saving checkpoint to out-shakespeare-char
iter 441500: loss 5.9473, time 2928.43ms
iter 441510: loss 5.6082, time 124.47ms
iter 441520: loss 5.6815, time 125.76ms
iter 441530: loss 5.6680, time 125.31ms
iter 441540: loss 6.2986, time 126.20ms
iter 441550: loss 6.2403, time 126.10ms
iter 441560: loss 6.4407, time 125.97ms
iter 441570: loss 5.5644, time 125.98ms
iter 441580: loss 6.1548, time 125.84ms
iter 441590: loss 6.6177, time 126.04ms
iter 441600: loss 5.8184, time 126.10ms
iter 441610: loss 6.0888, time 124.68ms
iter 441620: loss 6.2247, time 125.28ms
iter 441630: loss 6.2251, time 124.95ms
iter 441640: loss 5.9541, time 125.27ms
iter 441650: loss 6.4131, time 125.22ms
iter 441660: loss 6.1372, time 125.31ms
iter 441670: loss 5.6771, time 125.00ms
iter 441680: loss 6.2868, time 125.46ms
iter 441690: loss 5.8767, time 125.42ms
iter 441700: loss 6.7018, time 125.28ms
iter 441710: loss 6.4915, time 125.07ms
iter 441720: loss 5.2241, time 125.30ms
iter 441730: loss 5.5839, time 125.53ms
iter 441740: loss 6.4138, time 124.62ms
step 441750: train loss 5.6150, val loss 5.5552
saving checkpoint to out-shakespeare-char
iter 441750: loss 5.8306, time 2912.60ms
iter 441760: loss 4.9132, time 125.50ms
iter 441770: loss 6.5680, time 125.96ms
iter 441780: loss 5.6114, time 126.22ms
iter 441790: loss 5.4490, time 125.32ms
iter 441800: loss 6.0022, time 127.37ms
iter 441810: loss 5.8760, time 125.26ms
iter 441820: loss 5.7966, time 127.87ms
iter 441830: loss 6.2922, time 125.24ms
iter 441840: loss 6.0317, time 128.29ms
iter 441850: loss 6.2998, time 125.64ms
iter 441860: loss 6.2429, time 125.17ms
iter 441870: loss 6.0986, time 126.09ms
iter 441880: loss 5.6290, time 125.65ms
iter 441890: loss 6.0142, time 125.94ms
iter 441900: loss 5.5544, time 124.82ms
iter 441910: loss 5.8084, time 125.67ms
iter 441920: loss 5.8631, time 128.28ms
iter 441930: loss 5.7616, time 124.39ms
iter 441940: loss 6.0890, time 123.86ms
iter 441950: loss 6.3018, time 125.96ms
iter 441960: loss 5.8034, time 125.78ms
iter 441970: loss 5.8446, time 125.61ms
iter 441980: loss 5.7352, time 125.68ms
iter 441990: loss 5.2808, time 125.68ms
step 442000: train loss 5.6061, val loss 5.5759
saving checkpoint to out-shakespeare-char
iter 442000: loss 5.5898, time 2917.61ms
iter 442010: loss 5.7555, time 126.25ms
iter 442020: loss 5.6718, time 126.29ms
iter 442030: loss 5.8469, time 125.97ms
iter 442040: loss 5.8266, time 125.91ms
iter 442050: loss 5.8121, time 126.22ms
iter 442060: loss 5.9253, time 124.72ms
iter 442070: loss 5.9639, time 125.75ms
iter 442080: loss 6.1521, time 125.79ms
iter 442090: loss 6.2830, time 126.00ms
iter 442100: loss 6.3834, time 125.59ms
iter 442110: loss 5.8887, time 125.67ms
iter 442120: loss 6.0478, time 125.85ms
iter 442130: loss 5.9542, time 126.02ms
iter 442140: loss 6.4349, time 125.89ms
iter 442150: loss 6.0220, time 126.16ms
iter 442160: loss 5.7285, time 125.85ms
iter 442170: loss 6.2225, time 125.76ms
iter 442180: loss 5.1833, time 125.10ms
iter 442190: loss 5.9100, time 125.75ms
iter 442200: loss 5.9516, time 121.59ms
iter 442210: loss 5.7735, time 123.67ms
iter 442220: loss 6.4026, time 122.08ms
iter 442230: loss 5.9061, time 121.39ms
iter 442240: loss 5.6893, time 121.41ms
step 442250: train loss 5.5820, val loss 5.6024
saving checkpoint to out-shakespeare-char
iter 442250: loss 5.9713, time 2903.31ms
iter 442260: loss 6.5131, time 122.20ms
iter 442270: loss 6.2033, time 121.43ms
iter 442280: loss 5.4533, time 121.80ms
iter 442290: loss 5.3970, time 121.87ms
iter 442300: loss 5.8088, time 121.34ms
iter 442310: loss 6.1343, time 122.77ms
iter 442320: loss 6.2291, time 121.48ms
iter 442330: loss 5.7790, time 121.58ms
iter 442340: loss 5.4628, time 122.58ms
iter 442350: loss 5.6589, time 121.52ms
iter 442360: loss 6.2399, time 121.50ms
iter 442370: loss 6.5316, time 123.84ms
iter 442380: loss 5.7876, time 121.41ms
iter 442390: loss 5.6208, time 121.23ms
iter 442400: loss 5.5960, time 121.28ms
iter 442410: loss 5.9467, time 121.50ms
iter 442420: loss 6.0738, time 121.15ms
iter 442430: loss 6.0110, time 121.76ms
iter 442440: loss 6.2940, time 122.31ms
iter 442450: loss 6.0588, time 121.29ms
iter 442460: loss 5.6224, time 121.27ms
iter 442470: loss 5.8549, time 121.76ms
iter 442480: loss 5.1295, time 121.65ms
iter 442490: loss 5.7231, time 121.18ms
step 442500: train loss 5.5953, val loss 5.5766
saving checkpoint to out-shakespeare-char
iter 442500: loss 5.4506, time 2892.54ms
iter 442510: loss 6.6015, time 125.48ms
iter 442520: loss 5.9925, time 124.73ms
iter 442530: loss 6.1687, time 125.35ms
iter 442540: loss 5.9691, time 125.42ms
iter 442550: loss 6.0588, time 125.51ms
iter 442560: loss 5.4529, time 125.28ms
iter 442570: loss 6.3571, time 125.21ms
iter 442580: loss 5.5835, time 125.59ms
iter 442590: loss 5.5580, time 126.18ms
iter 442600: loss 6.5311, time 126.22ms
iter 442610: loss 6.3624, time 125.72ms
iter 442620: loss 6.3407, time 125.29ms
iter 442630: loss 6.6772, time 125.77ms
iter 442640: loss 6.7194, time 128.30ms
iter 442650: loss 6.1714, time 125.32ms
iter 442660: loss 5.8088, time 127.72ms
iter 442670: loss 6.3412, time 124.81ms
iter 442680: loss 6.3193, time 126.78ms
iter 442690: loss 5.9696, time 125.60ms
iter 442700: loss 6.0208, time 124.95ms
iter 442710: loss 5.1836, time 124.17ms
iter 442720: loss 6.1209, time 124.68ms
iter 442730: loss 5.3625, time 125.24ms
iter 442740: loss 6.3887, time 125.58ms
step 442750: train loss 5.5925, val loss 5.5877
saving checkpoint to out-shakespeare-char
iter 442750: loss 5.8806, time 2894.08ms
iter 442760: loss 5.5673, time 124.70ms
iter 442770: loss 6.1683, time 125.20ms
iter 442780: loss 5.6435, time 125.63ms
iter 442790: loss 6.2531, time 124.18ms
iter 442800: loss 6.1172, time 123.67ms
iter 442810: loss 5.9117, time 124.76ms
iter 442820: loss 6.5236, time 123.97ms
iter 442830: loss 6.1416, time 124.59ms
iter 442840: loss 6.0451, time 124.20ms
iter 442850: loss 6.7626, time 124.09ms
iter 442860: loss 5.7054, time 124.68ms
iter 442870: loss 5.4850, time 125.00ms
iter 442880: loss 6.3456, time 124.88ms
iter 442890: loss 5.3542, time 124.84ms
iter 442900: loss 6.8149, time 124.66ms
iter 442910: loss 6.5565, time 124.89ms
iter 442920: loss 6.0391, time 124.27ms
iter 442930: loss 6.5355, time 124.56ms
iter 442940: loss 7.2395, time 124.27ms
iter 442950: loss 5.5150, time 124.50ms
iter 442960: loss 5.8572, time 124.48ms
iter 442970: loss 6.3780, time 124.74ms
iter 442980: loss 5.5735, time 124.70ms
iter 442990: loss 5.7107, time 124.73ms
step 443000: train loss 5.5481, val loss 5.5929
saving checkpoint to out-shakespeare-char
iter 443000: loss 6.0833, time 2874.78ms
iter 443010: loss 5.6213, time 123.95ms
iter 443020: loss 5.2778, time 124.05ms
iter 443030: loss 6.3026, time 123.91ms
iter 443040: loss 5.3154, time 124.72ms
iter 443050: loss 6.4459, time 125.57ms
iter 443060: loss 6.2283, time 125.32ms
iter 443070: loss 6.5879, time 125.42ms
iter 443080: loss 5.6419, time 125.70ms
iter 443090: loss 5.9925, time 124.73ms
iter 443100: loss 5.9573, time 125.29ms
iter 443110: loss 6.4772, time 125.26ms
iter 443120: loss 6.1180, time 125.63ms
iter 443130: loss 6.1772, time 125.37ms
iter 443140: loss 5.7347, time 125.12ms
iter 443150: loss 5.7103, time 125.37ms
iter 443160: loss 6.3247, time 125.21ms
iter 443170: loss 6.7245, time 125.66ms
iter 443180: loss 5.3473, time 125.62ms
iter 443190: loss 6.0405, time 125.31ms
iter 443200: loss 6.3454, time 125.11ms
iter 443210: loss 5.6840, time 125.47ms
iter 443220: loss 5.8790, time 125.37ms
iter 443230: loss 6.2430, time 125.42ms
iter 443240: loss 5.9440, time 124.25ms
step 443250: train loss 5.5443, val loss 5.5627
saving checkpoint to out-shakespeare-char
iter 443250: loss 5.8002, time 2902.47ms
iter 443260: loss 5.7275, time 120.57ms
iter 443270: loss 6.2612, time 122.19ms
iter 443280: loss 6.0643, time 123.90ms
iter 443290: loss 5.1287, time 121.69ms
iter 443300: loss 5.3932, time 121.91ms
iter 443310: loss 6.7435, time 121.95ms
iter 443320: loss 6.0568, time 121.15ms
iter 443330: loss 5.8559, time 122.43ms
iter 443340: loss 6.6166, time 121.86ms
iter 443350: loss 6.3245, time 123.32ms
iter 443360: loss 6.1082, time 121.85ms
iter 443370: loss 6.1987, time 122.06ms
iter 443380: loss 5.8131, time 122.89ms
iter 443390: loss 6.6696, time 121.84ms
iter 443400: loss 6.1960, time 121.90ms
iter 443410: loss 5.4172, time 124.40ms
iter 443420: loss 6.2512, time 122.08ms
iter 443430: loss 5.4768, time 121.69ms
iter 443440: loss 5.9013, time 120.90ms
iter 443450: loss 6.2580, time 121.69ms
iter 443460: loss 5.2681, time 121.91ms
iter 443470: loss 5.8332, time 122.00ms
iter 443480: loss 6.1957, time 122.90ms
iter 443490: loss 4.9995, time 122.43ms
step 443500: train loss 5.5701, val loss 5.5768
saving checkpoint to out-shakespeare-char
iter 443500: loss 6.8965, time 2902.24ms
iter 443510: loss 6.1742, time 122.61ms
iter 443520: loss 5.9063, time 121.82ms
iter 443530: loss 6.0170, time 121.16ms
iter 443540: loss 6.0703, time 123.58ms
iter 443550: loss 5.8956, time 121.25ms
iter 443560: loss 5.8291, time 120.96ms
iter 443570: loss 5.9938, time 121.41ms
iter 443580: loss 5.9418, time 121.47ms
iter 443590: loss 5.9723, time 122.06ms
iter 443600: loss 6.3361, time 121.22ms
iter 443610: loss 6.4753, time 122.57ms
iter 443620: loss 6.3794, time 120.70ms
iter 443630: loss 6.3025, time 121.15ms
iter 443640: loss 6.2385, time 122.41ms
iter 443650: loss 6.4433, time 121.49ms
iter 443660: loss 6.0734, time 121.21ms
iter 443670: loss 7.1794, time 120.77ms
iter 443680: loss 6.1737, time 121.87ms
iter 443690: loss 6.3339, time 121.34ms
iter 443700: loss 6.2588, time 121.11ms
iter 443710: loss 5.9041, time 122.56ms
iter 443720: loss 5.7015, time 120.99ms
iter 443730: loss 6.1780, time 121.29ms
iter 443740: loss 6.6303, time 122.90ms
step 443750: train loss 5.5764, val loss 5.6029
saving checkpoint to out-shakespeare-char
iter 443750: loss 5.7965, time 2893.07ms
iter 443760: loss 5.6015, time 121.08ms
iter 443770: loss 6.9537, time 121.66ms
iter 443780: loss 5.8060, time 122.53ms
iter 443790: loss 5.6589, time 121.19ms
iter 443800: loss 5.5930, time 121.45ms
iter 443810: loss 5.4677, time 121.38ms
iter 443820: loss 6.0072, time 121.61ms
iter 443830: loss 6.1800, time 121.63ms
iter 443840: loss 6.3544, time 121.35ms
iter 443850: loss 6.2200, time 121.32ms
iter 443860: loss 5.5239, time 121.19ms
iter 443870: loss 5.1927, time 121.29ms
iter 443880: loss 5.6691, time 124.04ms
iter 443890: loss 6.3057, time 121.42ms
iter 443900: loss 6.4124, time 120.69ms
iter 443910: loss 6.3801, time 121.32ms
iter 443920: loss 6.0139, time 121.55ms
iter 443930: loss 6.1343, time 121.25ms
iter 443940: loss 5.7750, time 121.79ms
iter 443950: loss 5.4821, time 123.26ms
iter 443960: loss 5.6961, time 121.82ms
iter 443970: loss 6.4579, time 122.00ms
iter 443980: loss 6.0095, time 122.98ms
iter 443990: loss 5.8334, time 121.34ms
step 444000: train loss 5.5376, val loss 5.5753
saving checkpoint to out-shakespeare-char
iter 444000: loss 6.2942, time 2899.35ms
iter 444010: loss 5.7542, time 125.37ms
iter 444020: loss 6.1586, time 125.31ms
iter 444030: loss 6.4976, time 125.31ms
iter 444040: loss 6.1726, time 124.51ms
iter 444050: loss 6.7062, time 125.27ms
iter 444060: loss 5.6436, time 125.46ms
iter 444070: loss 6.0002, time 124.94ms
iter 444080: loss 6.8834, time 124.28ms
iter 444090: loss 6.2734, time 125.10ms
iter 444100: loss 6.2074, time 125.57ms
iter 444110: loss 5.9753, time 125.07ms
iter 444120: loss 5.5973, time 124.80ms
iter 444130: loss 5.5672, time 124.68ms
iter 444140: loss 5.8257, time 125.54ms
iter 444150: loss 5.1753, time 124.98ms
iter 444160: loss 5.5324, time 124.60ms
iter 444170: loss 5.4092, time 125.15ms
iter 444180: loss 5.4166, time 125.33ms
iter 444190: loss 6.0577, time 125.22ms
iter 444200: loss 5.7472, time 125.04ms
iter 444210: loss 5.6491, time 124.81ms
iter 444220: loss 6.0286, time 124.83ms
iter 444230: loss 6.1242, time 124.99ms
iter 444240: loss 6.3281, time 124.67ms
step 444250: train loss 5.5381, val loss 5.5534
saving checkpoint to out-shakespeare-char
iter 444250: loss 5.6335, time 2880.08ms
iter 444260: loss 6.0805, time 125.20ms
iter 444270: loss 6.1893, time 124.99ms
iter 444280: loss 6.3298, time 124.78ms
iter 444290: loss 6.3492, time 124.24ms
iter 444300: loss 5.3920, time 124.93ms
iter 444310: loss 6.2196, time 124.83ms
iter 444320: loss 5.7566, time 124.81ms
iter 444330: loss 6.3240, time 124.25ms
iter 444340: loss 5.5971, time 125.04ms
iter 444350: loss 6.1202, time 124.09ms
iter 444360: loss 5.7274, time 124.89ms
iter 444370: loss 5.5214, time 125.13ms
iter 444380: loss 5.1557, time 124.89ms
iter 444390: loss 5.6233, time 124.82ms
iter 444400: loss 6.3269, time 124.96ms
iter 444410: loss 6.7731, time 124.07ms
iter 444420: loss 5.6590, time 124.89ms
iter 444430: loss 6.3879, time 125.23ms
iter 444440: loss 5.6969, time 124.73ms
iter 444450: loss 5.8251, time 124.53ms
iter 444460: loss 5.5376, time 124.79ms
iter 444470: loss 6.1337, time 124.95ms
iter 444480: loss 5.4099, time 125.27ms
iter 444490: loss 6.3598, time 123.85ms
step 444500: train loss 5.5487, val loss 5.6265
saving checkpoint to out-shakespeare-char
iter 444500: loss 6.6030, time 2878.93ms
iter 444510: loss 6.3398, time 125.24ms
iter 444520: loss 6.0099, time 125.10ms
iter 444530: loss 5.9598, time 124.70ms
iter 444540: loss 6.9806, time 123.92ms
iter 444550: loss 5.4490, time 125.62ms
iter 444560: loss 6.2004, time 124.90ms
iter 444570: loss 5.9720, time 125.56ms
iter 444580: loss 6.2477, time 124.76ms
iter 444590: loss 6.1931, time 125.47ms
iter 444600: loss 6.3412, time 125.37ms
iter 444610: loss 6.1756, time 125.43ms
iter 444620: loss 6.2836, time 125.02ms
iter 444630: loss 6.0908, time 125.57ms
iter 444640: loss 6.5228, time 125.70ms
iter 444650: loss 6.7026, time 122.82ms
iter 444660: loss 6.2295, time 121.22ms
iter 444670: loss 6.1240, time 121.80ms
iter 444680: loss 6.0343, time 124.56ms
iter 444690: loss 5.7088, time 121.94ms
iter 444700: loss 6.3773, time 121.92ms
iter 444710: loss 6.5393, time 121.88ms
iter 444720: loss 5.9608, time 121.48ms
iter 444730: loss 5.4279, time 121.93ms
iter 444740: loss 6.0052, time 121.93ms
step 444750: train loss 5.5760, val loss 5.5057
saving checkpoint to out-shakespeare-char
iter 444750: loss 5.6505, time 2926.11ms
iter 444760: loss 6.2204, time 120.98ms
iter 444770: loss 6.6610, time 121.63ms
iter 444780: loss 5.8460, time 124.47ms
iter 444790: loss 6.4459, time 121.97ms
iter 444800: loss 5.8153, time 121.96ms
iter 444810: loss 5.4553, time 121.89ms
iter 444820: loss 5.8442, time 122.24ms
iter 444830: loss 6.0091, time 122.62ms
iter 444840: loss 5.4166, time 121.56ms
iter 444850: loss 6.3451, time 122.51ms
iter 444860: loss 5.6298, time 121.86ms
iter 444870: loss 5.6148, time 122.07ms
iter 444880: loss 6.3907, time 122.94ms
iter 444890: loss 5.8147, time 121.83ms
iter 444900: loss 6.1349, time 121.73ms
iter 444910: loss 6.2183, time 124.48ms
iter 444920: loss 5.9438, time 121.89ms
iter 444930: loss 6.7971, time 121.76ms
iter 444940: loss 5.6574, time 121.00ms
iter 444950: loss 6.1904, time 122.15ms
iter 444960: loss 5.8047, time 122.14ms
iter 444970: loss 5.6820, time 122.08ms
iter 444980: loss 6.0680, time 122.28ms
iter 444990: loss 6.4637, time 121.84ms
step 445000: train loss 5.5789, val loss 5.5831
saving checkpoint to out-shakespeare-char
iter 445000: loss 5.5700, time 2913.40ms
iter 445010: loss 5.7164, time 121.91ms
iter 445020: loss 6.1653, time 121.93ms
iter 445030: loss 5.9662, time 121.33ms
iter 445040: loss 6.1435, time 121.06ms
iter 445050: loss 6.4094, time 123.25ms
iter 445060: loss 5.8506, time 121.99ms
iter 445070: loss 6.1849, time 121.89ms
iter 445080: loss 6.4807, time 122.84ms
iter 445090: loss 5.6650, time 121.68ms
iter 445100: loss 6.0412, time 121.72ms
iter 445110: loss 5.8184, time 124.02ms
iter 445120: loss 6.7657, time 121.75ms
iter 445130: loss 6.6024, time 120.96ms
iter 445140: loss 5.7539, time 121.85ms
iter 445150: loss 5.7508, time 121.80ms
iter 445160: loss 6.3423, time 121.84ms
iter 445170: loss 5.4840, time 121.88ms
iter 445180: loss 6.3212, time 122.94ms
iter 445190: loss 5.3996, time 122.05ms
iter 445200: loss 6.0955, time 122.20ms
iter 445210: loss 5.9174, time 122.57ms
iter 445220: loss 5.9886, time 121.03ms
iter 445230: loss 5.7032, time 121.94ms
iter 445240: loss 5.8084, time 124.09ms
step 445250: train loss 5.5592, val loss 5.5375
saving checkpoint to out-shakespeare-char
iter 445250: loss 6.2952, time 2904.51ms
iter 445260: loss 5.9670, time 121.87ms
iter 445270: loss 5.3320, time 122.07ms
iter 445280: loss 5.7719, time 121.83ms
iter 445290: loss 6.9865, time 121.90ms
iter 445300: loss 5.8444, time 123.03ms
iter 445310: loss 5.2715, time 121.63ms
iter 445320: loss 5.7185, time 120.98ms
iter 445330: loss 5.4048, time 122.85ms
iter 445340: loss 6.5436, time 120.82ms
iter 445350: loss 5.6429, time 122.07ms
iter 445360: loss 6.2579, time 124.36ms
iter 445370: loss 5.9337, time 121.72ms
iter 445380: loss 5.9811, time 121.65ms
iter 445390: loss 6.7623, time 121.82ms
iter 445400: loss 6.2716, time 122.62ms
iter 445410: loss 6.4920, time 121.27ms
iter 445420: loss 6.2798, time 121.82ms
iter 445430: loss 6.1308, time 122.78ms
iter 445440: loss 6.6851, time 121.86ms
iter 445450: loss 6.8029, time 121.84ms
iter 445460: loss 6.3880, time 124.35ms
iter 445470: loss 6.3654, time 121.87ms
iter 445480: loss 5.7469, time 121.95ms
iter 445490: loss 5.7086, time 121.86ms
step 445500: train loss 5.5495, val loss 5.5911
saving checkpoint to out-shakespeare-char
iter 445500: loss 5.8341, time 2916.33ms
iter 445510: loss 6.6392, time 121.13ms
iter 445520: loss 5.9023, time 121.93ms
iter 445530: loss 5.9578, time 123.04ms
iter 445540: loss 6.2649, time 121.99ms
iter 445550: loss 6.3588, time 121.68ms
iter 445560: loss 6.4966, time 124.37ms
iter 445570: loss 6.0799, time 121.58ms
iter 445580: loss 6.2565, time 121.95ms
iter 445590: loss 5.8037, time 121.45ms
iter 445600: loss 5.8091, time 121.27ms
iter 445610: loss 5.7912, time 121.82ms
iter 445620: loss 5.5592, time 121.55ms
iter 445630: loss 5.4338, time 123.49ms
iter 445640: loss 5.8469, time 121.94ms
iter 445650: loss 5.9575, time 122.24ms
iter 445660: loss 6.2398, time 122.27ms
iter 445670: loss 5.4931, time 121.63ms
iter 445680: loss 5.6243, time 121.67ms
iter 445690: loss 6.1862, time 124.47ms
iter 445700: loss 6.3427, time 121.42ms
iter 445710: loss 5.7552, time 121.73ms
iter 445720: loss 5.6008, time 121.87ms
iter 445730: loss 5.9141, time 121.86ms
iter 445740: loss 5.9237, time 121.57ms
step 445750: train loss 5.5614, val loss 5.5236
saving checkpoint to out-shakespeare-char
iter 445750: loss 6.2602, time 2923.59ms
iter 445760: loss 6.3500, time 122.68ms
iter 445770: loss 6.2260, time 122.03ms
iter 445780: loss 5.9422, time 121.68ms
iter 445790: loss 6.1762, time 124.55ms
iter 445800: loss 6.2428, time 121.73ms
iter 445810: loss 6.2899, time 121.89ms
iter 445820: loss 5.6686, time 122.02ms
iter 445830: loss 6.0021, time 122.00ms
iter 445840: loss 6.5051, time 122.05ms
iter 445850: loss 5.1700, time 121.62ms
iter 445860: loss 6.2171, time 123.26ms
iter 445870: loss 6.0226, time 121.85ms
iter 445880: loss 6.5348, time 121.89ms
iter 445890: loss 6.2853, time 122.75ms
iter 445900: loss 6.4397, time 121.77ms
iter 445910: loss 6.1043, time 121.78ms
iter 445920: loss 5.5060, time 124.21ms
iter 445930: loss 5.6891, time 121.59ms
iter 445940: loss 6.0512, time 121.86ms
iter 445950: loss 5.9053, time 122.24ms
iter 445960: loss 5.9525, time 121.05ms
iter 445970: loss 6.3663, time 121.80ms
iter 445980: loss 6.2383, time 121.97ms
iter 445990: loss 5.7473, time 122.27ms
step 446000: train loss 5.5757, val loss 5.6058
saving checkpoint to out-shakespeare-char
iter 446000: loss 6.3319, time 2904.64ms
iter 446010: loss 6.4277, time 121.96ms
iter 446020: loss 6.2886, time 122.75ms
iter 446030: loss 6.0151, time 122.00ms
iter 446040: loss 6.3830, time 121.70ms
iter 446050: loss 6.3800, time 124.45ms
iter 446060: loss 5.6349, time 122.19ms
iter 446070: loss 6.6947, time 121.62ms
iter 446080: loss 5.8231, time 121.79ms
iter 446090: loss 6.2726, time 121.79ms
iter 446100: loss 5.6601, time 119.71ms
iter 446110: loss 5.9435, time 119.62ms
iter 446120: loss 6.1440, time 121.48ms
iter 446130: loss 6.4822, time 119.64ms
iter 446140: loss 5.9473, time 119.86ms
iter 446150: loss 5.7623, time 121.02ms
iter 446160: loss 5.4053, time 119.69ms
iter 446170: loss 5.5437, time 119.62ms
iter 446180: loss 6.1532, time 122.42ms
iter 446190: loss 5.2837, time 119.66ms
iter 446200: loss 6.1253, time 121.10ms
iter 446210: loss 5.6131, time 119.84ms
iter 446220: loss 5.9581, time 119.49ms
iter 446230: loss 5.6384, time 119.97ms
iter 446240: loss 5.6878, time 119.66ms
step 446250: train loss 5.5854, val loss 5.5793
saving checkpoint to out-shakespeare-char
iter 446250: loss 6.1576, time 2895.78ms
iter 446260: loss 6.7459, time 119.55ms
iter 446270: loss 6.4381, time 120.16ms
iter 446280: loss 5.0857, time 120.05ms
iter 446290: loss 5.8161, time 119.71ms
iter 446300: loss 6.4224, time 120.83ms
iter 446310: loss 6.0206, time 121.54ms
iter 446320: loss 6.6069, time 121.42ms
iter 446330: loss 6.5819, time 119.87ms
iter 446340: loss 5.7729, time 121.44ms
iter 446350: loss 5.3940, time 120.00ms
iter 446360: loss 5.7104, time 119.91ms
iter 446370: loss 5.6757, time 122.99ms
iter 446380: loss 6.0680, time 120.58ms
iter 446390: loss 5.7328, time 120.16ms
iter 446400: loss 6.0409, time 121.12ms
iter 446410: loss 5.5203, time 119.91ms
iter 446420: loss 5.9095, time 120.22ms
iter 446430: loss 5.8516, time 120.24ms
iter 446440: loss 6.7049, time 121.84ms
iter 446450: loss 6.0765, time 119.98ms
iter 446460: loss 5.9096, time 120.95ms
iter 446470: loss 5.6680, time 121.71ms
iter 446480: loss 5.8787, time 120.05ms
iter 446490: loss 5.2778, time 119.97ms
step 446500: train loss 5.5661, val loss 5.5918
saving checkpoint to out-shakespeare-char
iter 446500: loss 5.2847, time 2909.57ms
iter 446510: loss 5.6457, time 125.63ms
iter 446520: loss 5.6381, time 128.55ms
iter 446530: loss 5.7187, time 125.52ms
iter 446540: loss 6.0204, time 127.48ms
iter 446550: loss 6.4736, time 126.46ms
iter 446560: loss 5.9263, time 128.19ms
iter 446570: loss 5.4882, time 125.66ms
iter 446580: loss 5.4498, time 128.30ms
iter 446590: loss 5.7144, time 125.44ms
iter 446600: loss 6.1303, time 127.17ms
iter 446610: loss 5.9429, time 125.68ms
iter 446620: loss 5.8140, time 128.94ms
iter 446630: loss 5.5384, time 125.78ms
iter 446640: loss 5.6482, time 128.39ms
iter 446650: loss 5.3759, time 125.87ms
iter 446660: loss 6.1797, time 128.65ms
iter 446670: loss 5.8913, time 125.87ms
iter 446680: loss 6.0463, time 128.06ms
iter 446690: loss 5.7221, time 128.17ms
iter 446700: loss 6.1860, time 125.78ms
iter 446710: loss 5.6110, time 125.36ms
iter 446720: loss 5.6680, time 126.02ms
iter 446730: loss 5.0760, time 125.77ms
iter 446740: loss 5.4772, time 126.67ms
step 446750: train loss 5.5554, val loss 5.6010
saving checkpoint to out-shakespeare-char
iter 446750: loss 6.2661, time 2906.17ms
iter 446760: loss 6.2052, time 125.18ms
iter 446770: loss 6.5595, time 125.94ms
iter 446780: loss 6.0250, time 125.81ms
iter 446790: loss 5.8353, time 125.14ms
iter 446800: loss 5.3103, time 125.08ms
iter 446810: loss 5.6183, time 125.84ms
iter 446820: loss 6.1865, time 126.12ms
iter 446830: loss 5.7796, time 125.05ms
iter 446840: loss 5.9811, time 125.11ms
iter 446850: loss 5.6533, time 126.42ms
iter 446860: loss 5.5637, time 124.70ms
iter 446870: loss 6.2785, time 124.62ms
iter 446880: loss 6.3035, time 125.00ms
iter 446890: loss 6.3132, time 125.12ms
iter 446900: loss 5.8004, time 124.86ms
iter 446910: loss 5.4342, time 124.91ms
iter 446920: loss 5.6278, time 124.90ms
iter 446930: loss 5.7910, time 124.50ms
iter 446940: loss 5.9321, time 125.08ms
iter 446950: loss 5.8792, time 125.10ms
iter 446960: loss 4.8881, time 125.03ms
iter 446970: loss 5.9124, time 125.09ms
iter 446980: loss 5.7335, time 125.13ms
iter 446990: loss 6.0806, time 124.62ms
step 447000: train loss 5.5763, val loss 5.5895
saving checkpoint to out-shakespeare-char
iter 447000: loss 6.2321, time 2878.62ms
iter 447010: loss 6.2158, time 124.94ms
iter 447020: loss 6.1910, time 126.79ms
iter 447030: loss 5.7830, time 125.63ms
iter 447040: loss 5.9471, time 125.81ms
iter 447050: loss 6.6932, time 125.78ms
iter 447060: loss 5.3010, time 125.81ms
iter 447070: loss 5.8705, time 125.80ms
iter 447080: loss 6.3026, time 125.66ms
iter 447090: loss 6.2990, time 125.81ms
iter 447100: loss 6.3476, time 126.31ms
iter 447110: loss 5.4347, time 128.09ms
iter 447120: loss 6.3339, time 125.71ms
iter 447130: loss 6.5302, time 128.27ms
iter 447140: loss 5.8624, time 125.69ms
iter 447150: loss 6.1098, time 128.85ms
iter 447160: loss 5.0331, time 127.60ms
iter 447170: loss 5.9617, time 125.96ms
iter 447180: loss 5.7975, time 124.37ms
iter 447190: loss 6.5533, time 126.52ms
iter 447200: loss 6.3688, time 125.53ms
iter 447210: loss 6.5181, time 128.34ms
iter 447220: loss 5.6980, time 125.20ms
iter 447230: loss 5.3912, time 128.06ms
iter 447240: loss 6.6039, time 125.35ms
step 447250: train loss 5.5640, val loss 5.5457
saving checkpoint to out-shakespeare-char
iter 447250: loss 5.8209, time 2892.38ms
iter 447260: loss 5.7285, time 126.10ms
iter 447270: loss 6.1600, time 125.23ms
iter 447280: loss 6.1298, time 127.73ms
iter 447290: loss 6.4273, time 125.32ms
iter 447300: loss 5.0204, time 127.95ms
iter 447310: loss 6.2904, time 125.32ms
iter 447320: loss 5.6720, time 128.01ms
iter 447330: loss 5.9697, time 124.74ms
iter 447340: loss 5.7565, time 127.86ms
iter 447350: loss 6.2283, time 125.02ms
iter 447360: loss 7.0423, time 125.91ms
iter 447370: loss 6.1602, time 125.31ms
iter 447380: loss 6.6304, time 125.36ms
iter 447390: loss 5.9023, time 125.13ms
iter 447400: loss 6.6716, time 124.93ms
iter 447410: loss 5.6767, time 124.96ms
iter 447420: loss 6.4114, time 125.25ms
iter 447430: loss 6.0884, time 125.32ms
iter 447440: loss 5.8347, time 125.86ms
iter 447450: loss 5.9210, time 125.73ms
iter 447460: loss 5.6845, time 125.81ms
iter 447470: loss 6.2618, time 126.33ms
iter 447480: loss 5.8420, time 125.72ms
iter 447490: loss 5.8897, time 125.25ms
step 447500: train loss 5.5017, val loss 5.5356
saving checkpoint to out-shakespeare-char
iter 447500: loss 5.6275, time 2906.23ms
iter 447510: loss 6.1412, time 125.57ms
iter 447520: loss 6.3131, time 124.98ms
iter 447530: loss 5.8497, time 125.64ms
iter 447540: loss 6.0430, time 125.70ms
iter 447550: loss 5.3177, time 125.71ms
iter 447560: loss 6.0517, time 125.30ms
iter 447570: loss 6.0801, time 125.27ms
iter 447580: loss 6.1367, time 126.07ms
iter 447590: loss 6.2220, time 125.83ms
iter 447600: loss 6.0463, time 125.44ms
iter 447610: loss 5.8266, time 126.40ms
iter 447620: loss 6.9539, time 126.09ms
iter 447630: loss 6.6468, time 126.64ms
iter 447640: loss 5.6681, time 126.05ms
iter 447650: loss 6.8114, time 128.36ms
iter 447660: loss 6.0630, time 125.60ms
iter 447670: loss 5.9747, time 128.42ms
iter 447680: loss 6.3291, time 126.18ms
iter 447690: loss 5.8481, time 128.23ms
iter 447700: loss 5.8083, time 125.40ms
iter 447710: loss 6.4963, time 127.92ms
iter 447720: loss 5.5200, time 125.78ms
iter 447730: loss 5.6731, time 128.27ms
iter 447740: loss 6.0132, time 125.22ms
step 447750: train loss 5.5579, val loss 5.5097
saving checkpoint to out-shakespeare-char
iter 447750: loss 6.1487, time 2898.94ms
iter 447760: loss 6.1000, time 125.30ms
iter 447770: loss 5.5816, time 125.17ms
iter 447780: loss 6.1744, time 125.05ms
iter 447790: loss 6.5309, time 125.60ms
iter 447800: loss 6.4175, time 125.62ms
iter 447810: loss 6.6425, time 125.85ms
iter 447820: loss 6.7898, time 125.68ms
iter 447830: loss 5.8447, time 125.63ms
iter 447840: loss 6.2184, time 125.69ms
iter 447850: loss 6.6197, time 125.78ms
iter 447860: loss 6.0208, time 124.36ms
iter 447870: loss 5.3549, time 124.33ms
iter 447880: loss 6.0530, time 125.13ms
iter 447890: loss 6.3017, time 125.02ms
iter 447900: loss 5.8555, time 125.33ms
iter 447910: loss 5.9200, time 124.80ms
iter 447920: loss 5.7642, time 125.19ms
iter 447930: loss 6.2473, time 125.79ms
iter 447940: loss 6.5008, time 125.47ms
iter 447950: loss 5.8319, time 124.92ms
iter 447960: loss 6.5579, time 125.32ms
iter 447970: loss 6.1797, time 124.36ms
iter 447980: loss 5.8643, time 127.74ms
iter 447990: loss 6.0862, time 124.33ms
step 448000: train loss 5.6140, val loss 5.5634
saving checkpoint to out-shakespeare-char
iter 448000: loss 6.8662, time 2882.70ms
iter 448010: loss 6.0278, time 125.12ms
iter 448020: loss 6.7745, time 125.30ms
iter 448030: loss 5.3669, time 124.05ms
iter 448040: loss 5.4188, time 124.64ms
iter 448050: loss 5.3049, time 125.28ms
iter 448060: loss 6.3671, time 124.28ms
iter 448070: loss 5.7689, time 125.13ms
iter 448080: loss 5.8551, time 125.08ms
iter 448090: loss 6.0547, time 124.74ms
iter 448100: loss 6.4073, time 125.08ms
iter 448110: loss 6.5725, time 125.07ms
iter 448120: loss 6.4949, time 125.14ms
iter 448130: loss 5.7968, time 125.13ms
iter 448140: loss 6.3628, time 124.87ms
iter 448150: loss 6.1214, time 124.95ms
iter 448160: loss 6.1882, time 125.11ms
iter 448170: loss 5.4435, time 124.68ms
iter 448180: loss 6.1602, time 124.83ms
iter 448190: loss 5.6865, time 125.37ms
iter 448200: loss 5.9248, time 125.16ms
iter 448210: loss 6.0299, time 125.16ms
iter 448220: loss 6.2230, time 124.06ms
iter 448230: loss 6.2321, time 125.15ms
iter 448240: loss 5.6973, time 124.90ms
step 448250: train loss 5.5735, val loss 5.5843
saving checkpoint to out-shakespeare-char
iter 448250: loss 5.5516, time 2900.10ms
iter 448260: loss 5.9940, time 124.66ms
iter 448270: loss 5.5384, time 124.13ms
iter 448280: loss 5.5482, time 125.17ms
iter 448290: loss 5.7700, time 125.35ms
iter 448300: loss 5.7834, time 125.38ms
iter 448310: loss 5.8341, time 125.37ms
iter 448320: loss 5.4540, time 124.95ms
iter 448330: loss 6.1311, time 128.11ms
iter 448340: loss 5.8052, time 124.98ms
iter 448350: loss 6.6610, time 127.65ms
iter 448360: loss 6.6462, time 125.08ms
iter 448370: loss 5.7589, time 127.55ms
iter 448380: loss 5.0818, time 125.23ms
iter 448390: loss 5.8929, time 126.67ms
iter 448400: loss 6.2467, time 124.18ms
iter 448410: loss 6.2995, time 127.85ms
iter 448420: loss 5.6858, time 126.13ms
iter 448430: loss 6.1607, time 127.90ms
iter 448440: loss 5.9181, time 125.74ms
iter 448450: loss 6.0854, time 127.70ms
iter 448460: loss 5.6449, time 125.00ms
iter 448470: loss 6.3393, time 127.95ms
iter 448480: loss 5.8678, time 125.20ms
iter 448490: loss 6.5868, time 127.78ms
step 448500: train loss 5.5903, val loss 5.5730
saving checkpoint to out-shakespeare-char
iter 448500: loss 5.6506, time 2892.94ms
iter 448510: loss 5.8001, time 125.27ms
iter 448520: loss 6.2013, time 125.38ms
iter 448530: loss 5.6832, time 125.36ms
iter 448540: loss 6.2958, time 125.02ms
iter 448550: loss 5.9790, time 125.49ms
iter 448560: loss 6.1643, time 125.95ms
iter 448570: loss 5.9839, time 125.20ms
iter 448580: loss 5.9100, time 125.56ms
iter 448590: loss 6.1674, time 125.82ms
iter 448600: loss 6.2446, time 125.60ms
iter 448610: loss 5.8979, time 125.66ms
iter 448620: loss 6.3801, time 125.54ms
iter 448630: loss 5.7764, time 125.51ms
iter 448640: loss 5.7453, time 125.64ms
iter 448650: loss 6.2197, time 126.22ms
iter 448660: loss 6.0027, time 125.50ms
iter 448670: loss 5.8971, time 125.66ms
iter 448680: loss 6.8226, time 125.62ms
iter 448690: loss 6.2500, time 125.40ms
iter 448700: loss 5.2584, time 125.65ms
iter 448710: loss 5.5139, time 125.76ms
iter 448720: loss 5.9134, time 126.17ms
iter 448730: loss 6.4547, time 125.21ms
iter 448740: loss 6.0560, time 125.18ms
step 448750: train loss 5.5732, val loss 5.5933
saving checkpoint to out-shakespeare-char
iter 448750: loss 6.6461, time 2888.37ms
iter 448760: loss 5.4226, time 125.66ms
iter 448770: loss 6.2644, time 123.70ms
iter 448780: loss 5.7382, time 124.50ms
iter 448790: loss 5.8415, time 125.61ms
iter 448800: loss 5.3503, time 125.73ms
iter 448810: loss 5.1017, time 125.59ms
iter 448820: loss 6.5498, time 126.30ms
iter 448830: loss 5.6329, time 126.13ms
iter 448840: loss 5.8187, time 125.93ms
iter 448850: loss 5.2935, time 125.90ms
iter 448860: loss 6.0408, time 127.37ms
iter 448870: loss 6.3689, time 125.12ms
iter 448880: loss 5.9416, time 126.37ms
iter 448890: loss 5.1866, time 126.11ms
iter 448900: loss 6.2802, time 125.33ms
iter 448910: loss 6.2711, time 126.04ms
iter 448920: loss 5.4693, time 126.35ms
iter 448930: loss 6.1911, time 126.03ms
iter 448940: loss 5.8470, time 126.12ms
iter 448950: loss 5.8226, time 125.59ms
iter 448960: loss 5.7330, time 124.68ms
iter 448970: loss 6.9560, time 125.51ms
iter 448980: loss 5.5888, time 125.54ms
iter 448990: loss 5.9866, time 124.76ms
step 449000: train loss 5.5794, val loss 5.5674
saving checkpoint to out-shakespeare-char
iter 449000: loss 5.9502, time 2884.57ms
iter 449010: loss 5.3176, time 124.36ms
iter 449020: loss 6.4511, time 125.74ms
iter 449030: loss 6.5011, time 125.05ms
iter 449040: loss 6.1684, time 124.84ms
iter 449050: loss 5.8302, time 124.61ms
iter 449060: loss 6.2044, time 125.58ms
iter 449070: loss 5.4286, time 125.51ms
iter 449080: loss 6.6532, time 125.93ms
iter 449090: loss 5.7870, time 125.91ms
iter 449100: loss 6.1458, time 126.58ms
iter 449110: loss 5.4779, time 121.40ms
iter 449120: loss 5.7582, time 120.36ms
iter 449130: loss 5.8731, time 122.26ms
iter 449140: loss 5.7158, time 121.10ms
iter 449150: loss 5.9950, time 121.81ms
iter 449160: loss 6.0580, time 122.65ms
iter 449170: loss 6.3806, time 121.33ms
iter 449180: loss 6.4298, time 120.77ms
iter 449190: loss 5.2688, time 123.96ms
iter 449200: loss 6.2360, time 121.38ms
iter 449210: loss 5.5416, time 121.55ms
iter 449220: loss 5.9968, time 121.33ms
iter 449230: loss 6.1867, time 125.28ms
iter 449240: loss 6.0589, time 125.14ms
step 449250: train loss 5.5458, val loss 5.6246
saving checkpoint to out-shakespeare-char
iter 449250: loss 5.4244, time 2889.69ms
iter 449260: loss 6.0048, time 121.45ms
iter 449270: loss 5.6155, time 121.30ms
iter 449280: loss 6.6952, time 124.07ms
iter 449290: loss 6.4200, time 120.77ms
iter 449300: loss 6.4052, time 121.52ms
iter 449310: loss 6.1914, time 121.46ms
iter 449320: loss 6.1540, time 121.65ms
iter 449330: loss 6.1913, time 121.45ms
iter 449340: loss 5.6342, time 121.52ms
iter 449350: loss 6.2300, time 122.65ms
iter 449360: loss 5.9069, time 121.71ms
iter 449370: loss 5.9615, time 122.20ms
iter 449380: loss 6.0146, time 123.47ms
iter 449390: loss 6.2356, time 122.22ms
iter 449400: loss 6.3817, time 121.44ms
iter 449410: loss 6.1056, time 124.26ms
iter 449420: loss 6.3177, time 121.87ms
iter 449430: loss 6.1245, time 122.00ms
iter 449440: loss 5.8436, time 121.33ms
iter 449450: loss 5.7632, time 121.35ms
iter 449460: loss 5.9900, time 121.34ms
iter 449470: loss 5.8588, time 121.45ms
iter 449480: loss 5.6273, time 122.58ms
iter 449490: loss 6.2647, time 120.83ms
step 449500: train loss 5.5547, val loss 5.5882
saving checkpoint to out-shakespeare-char
iter 449500: loss 6.2398, time 2890.77ms
iter 449510: loss 5.4818, time 121.41ms
iter 449520: loss 4.8939, time 121.15ms
iter 449530: loss 5.8954, time 121.23ms
iter 449540: loss 5.5538, time 122.69ms
iter 449550: loss 6.0925, time 121.51ms
iter 449560: loss 5.8525, time 121.68ms
iter 449570: loss 6.5245, time 122.77ms
iter 449580: loss 5.8396, time 121.52ms
iter 449590: loss 5.8099, time 121.54ms
iter 449600: loss 5.9364, time 123.74ms
iter 449610: loss 6.1877, time 121.52ms
iter 449620: loss 5.7743, time 120.88ms
iter 449630: loss 5.5474, time 121.72ms
iter 449640: loss 5.4979, time 121.63ms
iter 449650: loss 5.7729, time 121.69ms
iter 449660: loss 6.1497, time 121.48ms
iter 449670: loss 5.5294, time 123.09ms
iter 449680: loss 4.5914, time 121.73ms
iter 449690: loss 6.1445, time 121.61ms
iter 449700: loss 5.9076, time 122.05ms
iter 449710: loss 6.5475, time 121.72ms
iter 449720: loss 6.4761, time 121.60ms
iter 449730: loss 5.3544, time 123.37ms
iter 449740: loss 5.7557, time 121.66ms
step 449750: train loss 5.5900, val loss 5.5417
saving checkpoint to out-shakespeare-char
iter 449750: loss 5.7671, time 2909.10ms
iter 449760: loss 6.2493, time 127.83ms
iter 449770: loss 6.3409, time 125.03ms
iter 449780: loss 6.7487, time 127.79ms
iter 449790: loss 6.0038, time 125.42ms
iter 449800: loss 6.4015, time 127.99ms
iter 449810: loss 6.7236, time 125.33ms
iter 449820: loss 5.3201, time 127.86ms
iter 449830: loss 5.9251, time 125.15ms
iter 449840: loss 5.4946, time 127.69ms
iter 449850: loss 6.6599, time 125.01ms
iter 449860: loss 6.2910, time 125.27ms
iter 449870: loss 6.1118, time 125.29ms
iter 449880: loss 5.9509, time 125.00ms
iter 449890: loss 5.7481, time 125.14ms
iter 449900: loss 6.2243, time 125.77ms
iter 449910: loss 6.2826, time 125.02ms
iter 449920: loss 6.2977, time 124.71ms
iter 449930: loss 5.9353, time 126.29ms
iter 449940: loss 6.5564, time 125.17ms
iter 449950: loss 5.5156, time 125.01ms
iter 449960: loss 5.1930, time 125.23ms
iter 449970: loss 6.7109, time 124.31ms
iter 449980: loss 6.1822, time 124.95ms
iter 449990: loss 5.6798, time 125.26ms
step 450000: train loss 5.5773, val loss 5.5724
saving checkpoint to out-shakespeare-char
iter 450000: loss 5.8449, time 2888.46ms
iter 450010: loss 6.0695, time 125.85ms
iter 450020: loss 6.2287, time 124.92ms
iter 450030: loss 6.3259, time 124.96ms
iter 450040: loss 5.4956, time 126.89ms
iter 450050: loss 6.2240, time 124.53ms
iter 450060: loss 6.1695, time 124.59ms
iter 450070: loss 5.2234, time 125.13ms
iter 450080: loss 6.3198, time 124.06ms
iter 450090: loss 5.2527, time 124.46ms
iter 450100: loss 6.2469, time 124.96ms
iter 450110: loss 5.3072, time 125.78ms
iter 450120: loss 5.6690, time 125.33ms
iter 450130: loss 5.6945, time 125.45ms
iter 450140: loss 5.9223, time 125.31ms
iter 450150: loss 6.3835, time 125.46ms
iter 450160: loss 5.7017, time 125.13ms
iter 450170: loss 5.4213, time 125.74ms
iter 450180: loss 6.5873, time 125.36ms
iter 450190: loss 6.0108, time 125.06ms
iter 450200: loss 5.6993, time 125.10ms
iter 450210: loss 6.3649, time 125.51ms
iter 450220: loss 6.4073, time 125.41ms
iter 450230: loss 6.3125, time 125.36ms
iter 450240: loss 5.7137, time 125.71ms
step 450250: train loss 5.5516, val loss 5.5930
saving checkpoint to out-shakespeare-char
iter 450250: loss 5.4130, time 2890.31ms
iter 450260: loss 6.0236, time 125.74ms
iter 450270: loss 6.5571, time 125.67ms
iter 450280: loss 5.4744, time 125.59ms
iter 450290: loss 5.5875, time 125.24ms
iter 450300: loss 6.2096, time 125.47ms
iter 450310: loss 5.5809, time 126.45ms
iter 450320: loss 6.7474, time 125.47ms
iter 450330: loss 5.8316, time 127.68ms
iter 450340: loss 5.5583, time 126.82ms
iter 450350: loss 6.0422, time 125.69ms
iter 450360: loss 5.6748, time 126.82ms
iter 450370: loss 6.5225, time 126.68ms
iter 450380: loss 5.8362, time 125.41ms
iter 450390: loss 5.8060, time 125.48ms
iter 450400: loss 6.1492, time 126.48ms
iter 450410: loss 6.1548, time 125.43ms
iter 450420: loss 6.6011, time 125.86ms
iter 450430: loss 6.0023, time 126.40ms
iter 450440: loss 5.2958, time 125.84ms
iter 450450: loss 6.0280, time 125.90ms
iter 450460: loss 5.8383, time 125.74ms
iter 450470: loss 5.6994, time 125.81ms
iter 450480: loss 6.0963, time 126.19ms
iter 450490: loss 5.8196, time 126.03ms
step 450500: train loss 5.5591, val loss 5.5579
saving checkpoint to out-shakespeare-char
iter 450500: loss 6.3196, time 2911.41ms
iter 450510: loss 5.5468, time 126.78ms
iter 450520: loss 5.6441, time 125.45ms
iter 450530: loss 6.2303, time 125.67ms
iter 450540: loss 6.1067, time 125.45ms
iter 450550: loss 5.1001, time 125.80ms
iter 450560: loss 6.0307, time 125.70ms
iter 450570: loss 6.0131, time 125.65ms
iter 450580: loss 6.4038, time 125.59ms
iter 450590: loss 6.5596, time 125.83ms
iter 450600: loss 5.8574, time 125.27ms
iter 450610: loss 6.0599, time 125.63ms
iter 450620: loss 6.3438, time 125.37ms
iter 450630: loss 6.1907, time 123.66ms
iter 450640: loss 6.2913, time 123.01ms
iter 450650: loss 5.4798, time 121.80ms
iter 450660: loss 5.4161, time 122.03ms
iter 450670: loss 5.9424, time 121.87ms
iter 450680: loss 5.9296, time 122.62ms
iter 450690: loss 6.0276, time 121.47ms
iter 450700: loss 6.0492, time 121.75ms
iter 450710: loss 6.5094, time 122.46ms
iter 450720: loss 5.6014, time 121.78ms
iter 450730: loss 5.6579, time 121.59ms
iter 450740: loss 5.8470, time 123.64ms
step 450750: train loss 5.5947, val loss 5.5662
saving checkpoint to out-shakespeare-char
iter 450750: loss 5.9044, time 2906.29ms
iter 450760: loss 5.6208, time 122.07ms
iter 450770: loss 5.6089, time 121.96ms
iter 450780: loss 6.0082, time 121.97ms
iter 450790: loss 6.0079, time 121.95ms
iter 450800: loss 6.0923, time 121.56ms
iter 450810: loss 5.7611, time 123.18ms
iter 450820: loss 5.6716, time 123.17ms
iter 450830: loss 5.5543, time 121.67ms
iter 450840: loss 6.6419, time 121.84ms
iter 450850: loss 6.2273, time 121.97ms
iter 450860: loss 5.9551, time 121.96ms
iter 450870: loss 5.9420, time 121.96ms
iter 450880: loss 5.8886, time 123.12ms
iter 450890: loss 6.2902, time 121.65ms
iter 450900: loss 5.3728, time 121.88ms
iter 450910: loss 5.6403, time 123.01ms
iter 450920: loss 5.3413, time 123.15ms
iter 450930: loss 6.5663, time 122.67ms
iter 450940: loss 6.2948, time 121.77ms
iter 450950: loss 5.9625, time 121.87ms
iter 450960: loss 6.0345, time 124.63ms
iter 450970: loss 5.8446, time 121.88ms
iter 450980: loss 6.0108, time 122.02ms
iter 450990: loss 6.3297, time 121.89ms
step 451000: train loss 5.5471, val loss 5.5355
saving checkpoint to out-shakespeare-char
iter 451000: loss 6.0005, time 2895.15ms
iter 451010: loss 6.4674, time 122.32ms
iter 451020: loss 5.8910, time 121.43ms
iter 451030: loss 6.0145, time 121.30ms
iter 451040: loss 5.3202, time 121.49ms
iter 451050: loss 6.4599, time 121.33ms
iter 451060: loss 5.3698, time 122.56ms
iter 451070: loss 5.4087, time 121.48ms
iter 451080: loss 5.9961, time 121.68ms
iter 451090: loss 5.6833, time 122.91ms
iter 451100: loss 5.7992, time 120.41ms
iter 451110: loss 5.2798, time 121.75ms
iter 451120: loss 5.9602, time 121.45ms
iter 451130: loss 6.4785, time 121.41ms
iter 451140: loss 5.1744, time 122.06ms
iter 451150: loss 7.2245, time 121.43ms
iter 451160: loss 5.6680, time 122.54ms
iter 451170: loss 6.0633, time 121.66ms
iter 451180: loss 6.2236, time 121.35ms
iter 451190: loss 6.2477, time 123.93ms
iter 451200: loss 5.6022, time 121.39ms
iter 451210: loss 6.5628, time 121.51ms
iter 451220: loss 5.9212, time 121.75ms
iter 451230: loss 5.5749, time 121.45ms
iter 451240: loss 5.9359, time 121.30ms
step 451250: train loss 5.5241, val loss 5.5292
saving checkpoint to out-shakespeare-char
iter 451250: loss 6.5120, time 2897.44ms
iter 451260: loss 6.0993, time 122.43ms
iter 451270: loss 5.2937, time 121.20ms
iter 451280: loss 7.3774, time 121.40ms
iter 451290: loss 6.2932, time 122.24ms
iter 451300: loss 6.3114, time 121.33ms
iter 451310: loss 5.7637, time 121.40ms
iter 451320: loss 6.3424, time 124.01ms
iter 451330: loss 6.3846, time 121.31ms
iter 451340: loss 6.3060, time 121.56ms
iter 451350: loss 6.1964, time 121.40ms
iter 451360: loss 6.1056, time 121.24ms
iter 451370: loss 5.8879, time 121.41ms
iter 451380: loss 6.4171, time 121.51ms
iter 451390: loss 5.7020, time 122.40ms
iter 451400: loss 6.4085, time 121.43ms
iter 451410: loss 5.5027, time 121.58ms
iter 451420: loss 6.1191, time 122.66ms
iter 451430: loss 5.8039, time 121.44ms
iter 451440: loss 6.1007, time 121.01ms
iter 451450: loss 6.4748, time 124.02ms
iter 451460: loss 6.0832, time 121.28ms
iter 451470: loss 5.9382, time 121.36ms
iter 451480: loss 5.9792, time 121.44ms
iter 451490: loss 5.9299, time 121.50ms
step 451500: train loss 5.5555, val loss 5.5780
saving checkpoint to out-shakespeare-char
iter 451500: loss 6.3372, time 2892.41ms
iter 451510: loss 6.0246, time 121.82ms
iter 451520: loss 6.1873, time 124.27ms
iter 451530: loss 5.6355, time 120.64ms
iter 451540: loss 6.2084, time 120.49ms
iter 451550: loss 6.3240, time 121.67ms
iter 451560: loss 5.6649, time 121.79ms
iter 451570: loss 5.7510, time 121.31ms
iter 451580: loss 5.6232, time 120.97ms
iter 451590: loss 6.2183, time 122.75ms
iter 451600: loss 5.5497, time 121.46ms
iter 451610: loss 6.0933, time 121.25ms
iter 451620: loss 5.7757, time 124.26ms
iter 451630: loss 5.5108, time 122.00ms
iter 451640: loss 6.1754, time 121.39ms
iter 451650: loss 5.5249, time 122.79ms
iter 451660: loss 6.0014, time 121.26ms
iter 451670: loss 4.8868, time 121.31ms
iter 451680: loss 6.7583, time 121.57ms
iter 451690: loss 5.8432, time 123.01ms
iter 451700: loss 5.4955, time 121.37ms
iter 451710: loss 5.1138, time 121.31ms
iter 451720: loss 5.6434, time 123.91ms
iter 451730: loss 5.9440, time 120.94ms
iter 451740: loss 6.0717, time 121.40ms
step 451750: train loss 5.5783, val loss 5.5488
saving checkpoint to out-shakespeare-char
iter 451750: loss 5.8400, time 2891.99ms
iter 451760: loss 5.1747, time 123.04ms
iter 451770: loss 5.4325, time 121.08ms
iter 451780: loss 6.3269, time 121.53ms
iter 451790: loss 5.6800, time 124.05ms
iter 451800: loss 5.2806, time 122.15ms
iter 451810: loss 5.9396, time 120.85ms
iter 451820: loss 6.2120, time 121.36ms
iter 451830: loss 5.2183, time 121.77ms
iter 451840: loss 5.3584, time 121.99ms
iter 451850: loss 5.5151, time 121.41ms
iter 451860: loss 6.0627, time 121.16ms
iter 451870: loss 5.5646, time 123.26ms
iter 451880: loss 6.5809, time 121.23ms
iter 451890: loss 5.5974, time 121.53ms
iter 451900: loss 6.4663, time 124.14ms
iter 451910: loss 5.2737, time 122.19ms
iter 451920: loss 5.8885, time 121.71ms
iter 451930: loss 6.3982, time 122.03ms
iter 451940: loss 5.8683, time 123.26ms
iter 451950: loss 6.0353, time 122.90ms
iter 451960: loss 6.3851, time 123.26ms
iter 451970: loss 5.9457, time 121.23ms
iter 451980: loss 5.9902, time 121.60ms
iter 451990: loss 5.8283, time 123.93ms
step 452000: train loss 5.5968, val loss 5.5660
saving checkpoint to out-shakespeare-char
iter 452000: loss 5.8991, time 2892.07ms
iter 452010: loss 6.2424, time 122.68ms
iter 452020: loss 5.6458, time 121.34ms
iter 452030: loss 5.5821, time 121.47ms
iter 452040: loss 5.7000, time 121.46ms
iter 452050: loss 6.3363, time 121.97ms
iter 452060: loss 5.9136, time 121.44ms
iter 452070: loss 6.1368, time 121.30ms
iter 452080: loss 5.9033, time 124.50ms
iter 452090: loss 5.8299, time 121.60ms
iter 452100: loss 5.6032, time 121.47ms
iter 452110: loss 5.6221, time 121.67ms
iter 452120: loss 5.7490, time 121.83ms
iter 452130: loss 6.1912, time 122.83ms
iter 452140: loss 6.1376, time 122.66ms
iter 452150: loss 5.8207, time 121.64ms
iter 452160: loss 5.6407, time 121.61ms
iter 452170: loss 5.1487, time 124.19ms
iter 452180: loss 6.3323, time 122.01ms
iter 452190: loss 5.1605, time 121.66ms
iter 452200: loss 6.0462, time 121.56ms
iter 452210: loss 6.2875, time 121.66ms
iter 452220: loss 6.0116, time 121.74ms
iter 452230: loss 6.3855, time 121.44ms
iter 452240: loss 5.8042, time 122.95ms
step 452250: train loss 5.5571, val loss 5.5766
saving checkpoint to out-shakespeare-char
iter 452250: loss 5.0508, time 2902.43ms
iter 452260: loss 6.9645, time 122.02ms
iter 452270: loss 5.9994, time 122.22ms
iter 452280: loss 5.3261, time 122.94ms
iter 452290: loss 5.3828, time 121.65ms
iter 452300: loss 5.3168, time 122.02ms
iter 452310: loss 6.2316, time 122.73ms
iter 452320: loss 5.9246, time 120.90ms
iter 452330: loss 5.7178, time 119.81ms
iter 452340: loss 5.9878, time 120.66ms
iter 452350: loss 6.0176, time 122.30ms
iter 452360: loss 7.1079, time 121.75ms
iter 452370: loss 5.6832, time 122.95ms
iter 452380: loss 5.6605, time 121.66ms
iter 452390: loss 5.7309, time 121.58ms
iter 452400: loss 5.9780, time 122.81ms
iter 452410: loss 6.0308, time 121.62ms
iter 452420: loss 5.8646, time 121.68ms
iter 452430: loss 6.0431, time 124.00ms
iter 452440: loss 6.0118, time 122.99ms
iter 452450: loss 6.4887, time 123.04ms
iter 452460: loss 6.1377, time 121.48ms
iter 452470: loss 6.3949, time 121.56ms
iter 452480: loss 6.1005, time 124.32ms
iter 452490: loss 6.5594, time 121.51ms
step 452500: train loss 5.6125, val loss 5.5609
saving checkpoint to out-shakespeare-char
iter 452500: loss 4.9909, time 2894.71ms
iter 452510: loss 6.4593, time 121.94ms
iter 452520: loss 7.0700, time 124.45ms
iter 452530: loss 6.1162, time 121.80ms
iter 452540: loss 5.7149, time 121.83ms
iter 452550: loss 6.3686, time 121.64ms
iter 452560: loss 5.2355, time 121.80ms
iter 452570: loss 5.7792, time 121.78ms
iter 452580: loss 5.8004, time 121.69ms
iter 452590: loss 5.9017, time 122.65ms
iter 452600: loss 6.2551, time 121.79ms
iter 452610: loss 6.0218, time 121.73ms
iter 452620: loss 6.0431, time 122.86ms
iter 452630: loss 5.8040, time 122.94ms
iter 452640: loss 5.7680, time 124.92ms
iter 452650: loss 5.6900, time 121.98ms
iter 452660: loss 5.8325, time 122.15ms
iter 452670: loss 5.2832, time 121.74ms
iter 452680: loss 6.0931, time 121.67ms
iter 452690: loss 6.1236, time 122.03ms
iter 452700: loss 5.3678, time 121.62ms
iter 452710: loss 6.7153, time 121.96ms
iter 452720: loss 6.1607, time 121.60ms
iter 452730: loss 6.2756, time 121.89ms
iter 452740: loss 6.3039, time 122.64ms
step 452750: train loss 5.5309, val loss 5.6242
saving checkpoint to out-shakespeare-char
iter 452750: loss 6.0344, time 2902.76ms
iter 452760: loss 6.4825, time 122.70ms
iter 452770: loss 6.3913, time 121.66ms
iter 452780: loss 6.1062, time 121.44ms
iter 452790: loss 5.4531, time 122.72ms
iter 452800: loss 6.4934, time 122.97ms
iter 452810: loss 5.7053, time 122.72ms
iter 452820: loss 5.3973, time 121.60ms
iter 452830: loss 6.0814, time 122.03ms
iter 452840: loss 5.6025, time 124.34ms
iter 452850: loss 5.7154, time 121.79ms
iter 452860: loss 5.2298, time 121.67ms
iter 452870: loss 6.2612, time 121.74ms
iter 452880: loss 6.4388, time 121.74ms
iter 452890: loss 5.6844, time 121.53ms
iter 452900: loss 6.1748, time 121.55ms
iter 452910: loss 5.8011, time 122.71ms
iter 452920: loss 6.0306, time 122.92ms
iter 452930: loss 6.0192, time 122.92ms
iter 452940: loss 5.8878, time 121.49ms
iter 452950: loss 5.7990, time 121.56ms
iter 452960: loss 6.3469, time 124.04ms
iter 452970: loss 6.4887, time 121.60ms
iter 452980: loss 6.2298, time 121.50ms
iter 452990: loss 6.4365, time 121.32ms
step 453000: train loss 5.5720, val loss 5.5505
saving checkpoint to out-shakespeare-char
iter 453000: loss 5.6117, time 2886.90ms
iter 453010: loss 5.9741, time 121.90ms
iter 453020: loss 5.3845, time 121.79ms
iter 453030: loss 5.8284, time 121.69ms
iter 453040: loss 6.2891, time 121.71ms
iter 453050: loss 5.2087, time 121.71ms
iter 453060: loss 6.4886, time 121.87ms
iter 453070: loss 6.8628, time 122.35ms
iter 453080: loss 5.9127, time 121.62ms
iter 453090: loss 5.6649, time 121.78ms
iter 453100: loss 6.0573, time 122.81ms
iter 453110: loss 5.3509, time 122.01ms
iter 453120: loss 5.2179, time 122.00ms
iter 453130: loss 5.7826, time 120.91ms
iter 453140: loss 5.8810, time 121.04ms
iter 453150: loss 6.4969, time 121.77ms
iter 453160: loss 5.7631, time 123.14ms
iter 453170: loss 5.5304, time 121.54ms
iter 453180: loss 6.3224, time 121.39ms
iter 453190: loss 6.2119, time 123.82ms
iter 453200: loss 5.9743, time 123.17ms
iter 453210: loss 5.9800, time 121.95ms
iter 453220: loss 6.4747, time 124.42ms
iter 453230: loss 6.3178, time 121.26ms
iter 453240: loss 6.2901, time 121.16ms
step 453250: train loss 5.6043, val loss 5.5213
saving checkpoint to out-shakespeare-char
iter 453250: loss 6.4621, time 2904.09ms
iter 453260: loss 5.4308, time 120.85ms
iter 453270: loss 6.0090, time 121.60ms
iter 453280: loss 5.6112, time 121.52ms
iter 453290: loss 5.5666, time 121.71ms
iter 453300: loss 5.0882, time 120.10ms
iter 453310: loss 6.2000, time 121.83ms
iter 453320: loss 6.1371, time 119.20ms
iter 453330: loss 6.5630, time 121.84ms
iter 453340: loss 5.9842, time 121.52ms
iter 453350: loss 6.3072, time 121.61ms
iter 453360: loss 6.1500, time 121.38ms
iter 453370: loss 5.7904, time 123.14ms
iter 453380: loss 5.9699, time 121.05ms
iter 453390: loss 5.6079, time 122.01ms
iter 453400: loss 5.6019, time 122.08ms
iter 453410: loss 6.2938, time 122.63ms
iter 453420: loss 5.9632, time 121.55ms
iter 453430: loss 6.2171, time 121.51ms
iter 453440: loss 5.6048, time 123.06ms
iter 453450: loss 5.5622, time 122.97ms
iter 453460: loss 6.6960, time 122.49ms
iter 453470: loss 5.6337, time 121.44ms
iter 453480: loss 6.4017, time 121.53ms
iter 453490: loss 5.6055, time 124.12ms
step 453500: train loss 5.5716, val loss 5.5567
saving checkpoint to out-shakespeare-char
iter 453500: loss 5.8661, time 2895.23ms
iter 453510: loss 5.6198, time 120.30ms
iter 453520: loss 6.3115, time 124.35ms
iter 453530: loss 5.7503, time 121.29ms
iter 453540: loss 6.0818, time 121.44ms
iter 453550: loss 6.3823, time 121.41ms
iter 453560: loss 5.3430, time 121.55ms
iter 453570: loss 6.0914, time 121.80ms
iter 453580: loss 5.3065, time 121.60ms
iter 453590: loss 6.0523, time 121.84ms
iter 453600: loss 6.0344, time 121.61ms
iter 453610: loss 6.7822, time 121.34ms
iter 453620: loss 6.1264, time 124.28ms
iter 453630: loss 5.7833, time 120.06ms
iter 453640: loss 6.2162, time 123.24ms
iter 453650: loss 5.8851, time 122.06ms
iter 453660: loss 6.5212, time 121.60ms
iter 453670: loss 5.7116, time 122.77ms
iter 453680: loss 6.0658, time 121.45ms
iter 453690: loss 6.4907, time 121.63ms
iter 453700: loss 6.3843, time 124.12ms
iter 453710: loss 6.2009, time 121.62ms
iter 453720: loss 5.8844, time 121.65ms
iter 453730: loss 5.8013, time 121.66ms
iter 453740: loss 5.5674, time 121.51ms
step 453750: train loss 5.5381, val loss 5.5540
saving checkpoint to out-shakespeare-char
iter 453750: loss 5.6739, time 2899.89ms
iter 453760: loss 6.1036, time 122.13ms
iter 453770: loss 5.9039, time 121.97ms
iter 453780: loss 6.6580, time 121.43ms
iter 453790: loss 5.2523, time 123.86ms
iter 453800: loss 5.4113, time 123.24ms
iter 453810: loss 5.9667, time 121.69ms
iter 453820: loss 5.6689, time 121.71ms
iter 453830: loss 6.2313, time 122.90ms
iter 453840: loss 6.0972, time 122.17ms
iter 453850: loss 5.2756, time 121.83ms
iter 453860: loss 5.8851, time 125.79ms
iter 453870: loss 5.5365, time 122.25ms
iter 453880: loss 5.4372, time 122.04ms
iter 453890: loss 6.2874, time 121.89ms
iter 453900: loss 5.7159, time 124.06ms
iter 453910: loss 6.1029, time 122.10ms
iter 453920: loss 5.3042, time 123.10ms
iter 453930: loss 6.2732, time 121.40ms
iter 453940: loss 6.5447, time 122.86ms
iter 453950: loss 5.6121, time 121.93ms
iter 453960: loss 6.4243, time 121.89ms
iter 453970: loss 6.1695, time 124.36ms
iter 453980: loss 5.9395, time 125.61ms
iter 453990: loss 6.0140, time 125.64ms
step 454000: train loss 5.5679, val loss 5.5619
saving checkpoint to out-shakespeare-char
iter 454000: loss 5.4807, time 2874.60ms
iter 454010: loss 6.4076, time 121.41ms
iter 454020: loss 5.5396, time 121.60ms
iter 454030: loss 5.6135, time 121.24ms
iter 454040: loss 6.1123, time 122.77ms
iter 454050: loss 5.8248, time 121.35ms
iter 454060: loss 6.7225, time 122.54ms
iter 454070: loss 6.1301, time 121.19ms
iter 454080: loss 6.0431, time 120.06ms
iter 454090: loss 5.6127, time 122.56ms
iter 454100: loss 5.6524, time 122.49ms
iter 454110: loss 6.0190, time 123.94ms
iter 454120: loss 6.0841, time 121.02ms
iter 454130: loss 5.4239, time 121.36ms
iter 454140: loss 6.1406, time 121.84ms
iter 454150: loss 5.7455, time 122.45ms
iter 454160: loss 5.7614, time 121.35ms
iter 454170: loss 6.5145, time 121.60ms
iter 454180: loss 5.8429, time 123.12ms
iter 454190: loss 6.1861, time 121.13ms
iter 454200: loss 6.5433, time 121.36ms
iter 454210: loss 5.8594, time 122.83ms
iter 454220: loss 5.8226, time 122.54ms
iter 454230: loss 5.9599, time 121.30ms
iter 454240: loss 5.9211, time 122.45ms
step 454250: train loss 5.5418, val loss 5.5442
saving checkpoint to out-shakespeare-char
iter 454250: loss 6.2771, time 2880.55ms
iter 454260: loss 6.2107, time 121.30ms
iter 454270: loss 6.6711, time 120.83ms
iter 454280: loss 5.5681, time 122.58ms
iter 454290: loss 5.8139, time 122.90ms
iter 454300: loss 6.3986, time 121.18ms
iter 454310: loss 5.5642, time 121.25ms
iter 454320: loss 6.0110, time 121.17ms
iter 454330: loss 6.2412, time 122.71ms
iter 454340: loss 6.0702, time 121.22ms
iter 454350: loss 5.9865, time 121.18ms
iter 454360: loss 6.3709, time 121.79ms
iter 454370: loss 5.7010, time 121.27ms
iter 454380: loss 5.7486, time 121.33ms
iter 454390: loss 5.0469, time 124.21ms
iter 454400: loss 5.9767, time 121.41ms
iter 454410: loss 5.6177, time 122.63ms
iter 454420: loss 6.9389, time 124.29ms
iter 454430: loss 6.2103, time 121.31ms
iter 454440: loss 6.6359, time 121.01ms
iter 454450: loss 5.6008, time 121.34ms
iter 454460: loss 6.2524, time 121.12ms
iter 454470: loss 6.1791, time 121.23ms
iter 454480: loss 5.0479, time 121.25ms
iter 454490: loss 6.6148, time 122.71ms
step 454500: train loss 5.5547, val loss 5.5815
saving checkpoint to out-shakespeare-char
iter 454500: loss 5.9504, time 2897.01ms
iter 454510: loss 6.3193, time 121.48ms
iter 454520: loss 5.4859, time 121.30ms
iter 454530: loss 5.3152, time 122.48ms
iter 454540: loss 6.3597, time 121.13ms
iter 454550: loss 5.6778, time 121.33ms
iter 454560: loss 5.3674, time 122.69ms
iter 454570: loss 6.7158, time 121.20ms
iter 454580: loss 6.2808, time 121.31ms
iter 454590: loss 6.4814, time 122.12ms
iter 454600: loss 6.7200, time 121.72ms
iter 454610: loss 5.1595, time 122.64ms
iter 454620: loss 6.4871, time 122.75ms
iter 454630: loss 6.0289, time 121.42ms
iter 454640: loss 5.8176, time 121.46ms
iter 454650: loss 5.8231, time 121.15ms
iter 454660: loss 6.1587, time 121.49ms
iter 454670: loss 6.1049, time 121.25ms
iter 454680: loss 6.6794, time 122.57ms
iter 454690: loss 5.5895, time 121.49ms
iter 454700: loss 5.1027, time 120.04ms
iter 454710: loss 6.0964, time 121.14ms
iter 454720: loss 5.7459, time 120.43ms
iter 454730: loss 5.8689, time 120.78ms
iter 454740: loss 6.0493, time 120.13ms
step 454750: train loss 5.5579, val loss 5.5892
saving checkpoint to out-shakespeare-char
iter 454750: loss 6.4213, time 2907.49ms
iter 454760: loss 6.5427, time 121.01ms
iter 454770: loss 5.8741, time 122.11ms
iter 454780: loss 5.9251, time 121.11ms
iter 454790: loss 6.1015, time 121.40ms
iter 454800: loss 5.5514, time 122.35ms
iter 454810: loss 5.8176, time 120.59ms
iter 454820: loss 5.7629, time 125.78ms
iter 454830: loss 5.3382, time 125.95ms
iter 454840: loss 5.4750, time 124.09ms
iter 454850: loss 5.8893, time 126.12ms
iter 454860: loss 5.9883, time 125.96ms
iter 454870: loss 5.9460, time 125.89ms
iter 454880: loss 6.2034, time 127.09ms
iter 454890: loss 5.5992, time 125.62ms
iter 454900: loss 5.3138, time 125.78ms
iter 454910: loss 6.5716, time 125.86ms
iter 454920: loss 5.3928, time 125.71ms
iter 454930: loss 5.6299, time 126.10ms
iter 454940: loss 7.0731, time 125.28ms
iter 454950: loss 5.8492, time 125.42ms
iter 454960: loss 6.3281, time 125.05ms
iter 454970: loss 6.2427, time 124.79ms
iter 454980: loss 5.9082, time 125.16ms
iter 454990: loss 6.3405, time 126.79ms
step 455000: train loss 5.5657, val loss 5.5505
saving checkpoint to out-shakespeare-char
iter 455000: loss 5.5415, time 2877.54ms
iter 455010: loss 5.1563, time 126.74ms
iter 455020: loss 6.5598, time 126.07ms
iter 455030: loss 5.8911, time 125.97ms
iter 455040: loss 5.7899, time 125.68ms
iter 455050: loss 5.8685, time 125.91ms
iter 455060: loss 5.9966, time 125.38ms
iter 455070: loss 6.0020, time 126.20ms
iter 455080: loss 5.9246, time 125.77ms
iter 455090: loss 5.9568, time 125.57ms
iter 455100: loss 5.7938, time 125.76ms
iter 455110: loss 5.7639, time 126.02ms
iter 455120: loss 6.3071, time 126.02ms
iter 455130: loss 5.8101, time 125.63ms
iter 455140: loss 6.2050, time 125.91ms
iter 455150: loss 6.3029, time 125.73ms
iter 455160: loss 5.8696, time 125.20ms
iter 455170: loss 5.4428, time 125.72ms
iter 455180: loss 5.3641, time 125.83ms
iter 455190: loss 5.9463, time 125.66ms
iter 455200: loss 6.3455, time 126.06ms
iter 455210: loss 6.2715, time 125.36ms
iter 455220: loss 6.2476, time 125.77ms
iter 455230: loss 5.1252, time 125.52ms
iter 455240: loss 6.6966, time 125.90ms
step 455250: train loss 5.5867, val loss 5.5195
saving checkpoint to out-shakespeare-char
iter 455250: loss 6.2028, time 2887.86ms
iter 455260: loss 5.7946, time 126.24ms
iter 455270: loss 5.9396, time 125.88ms
iter 455280: loss 5.6788, time 125.90ms
iter 455290: loss 5.6649, time 125.82ms
iter 455300: loss 6.0853, time 126.19ms
iter 455310: loss 6.1328, time 126.02ms
iter 455320: loss 6.1245, time 124.87ms
iter 455330: loss 5.9940, time 125.71ms
iter 455340: loss 5.9087, time 126.53ms
iter 455350: loss 6.8096, time 125.66ms
iter 455360: loss 6.1294, time 125.48ms
iter 455370: loss 6.4416, time 125.60ms
iter 455380: loss 6.2126, time 125.64ms
iter 455390: loss 6.0761, time 125.55ms
iter 455400: loss 6.2430, time 125.56ms
iter 455410: loss 5.4971, time 125.61ms
iter 455420: loss 5.7967, time 125.59ms
iter 455430: loss 6.0753, time 126.24ms
iter 455440: loss 5.6808, time 125.72ms
iter 455450: loss 6.0962, time 125.64ms
iter 455460: loss 5.5477, time 125.50ms
iter 455470: loss 6.3829, time 125.82ms
iter 455480: loss 6.4904, time 125.88ms
iter 455490: loss 5.6129, time 128.61ms
step 455500: train loss 5.5597, val loss 5.5674
saving checkpoint to out-shakespeare-char
iter 455500: loss 5.6711, time 2882.26ms
iter 455510: loss 5.7375, time 125.11ms
iter 455520: loss 6.1915, time 124.61ms
iter 455530: loss 5.7526, time 125.39ms
iter 455540: loss 6.5807, time 125.01ms
iter 455550: loss 5.2639, time 125.43ms
iter 455560: loss 5.6601, time 125.40ms
iter 455570: loss 6.1875, time 125.92ms
iter 455580: loss 5.3159, time 125.51ms
iter 455590: loss 5.6859, time 125.39ms
iter 455600: loss 5.5147, time 125.58ms
iter 455610: loss 5.7546, time 125.89ms
iter 455620: loss 6.8138, time 125.63ms
iter 455630: loss 5.6468, time 125.85ms
iter 455640: loss 6.0541, time 125.74ms
iter 455650: loss 6.1437, time 125.60ms
iter 455660: loss 6.5648, time 125.26ms
iter 455670: loss 6.2448, time 124.15ms
iter 455680: loss 6.2180, time 125.04ms
iter 455690: loss 6.7457, time 125.21ms
iter 455700: loss 5.9818, time 125.29ms
iter 455710: loss 6.1973, time 125.25ms
iter 455720: loss 5.8644, time 125.41ms
iter 455730: loss 5.6677, time 125.24ms
iter 455740: loss 5.7424, time 124.98ms
step 455750: train loss 5.5324, val loss 5.5784
saving checkpoint to out-shakespeare-char
iter 455750: loss 5.5760, time 2885.28ms
iter 455760: loss 6.9389, time 121.92ms
iter 455770: loss 5.6395, time 119.25ms
iter 455780: loss 5.9862, time 121.39ms
iter 455790: loss 5.1534, time 121.39ms
iter 455800: loss 6.2909, time 121.34ms
iter 455810: loss 6.3540, time 121.27ms
iter 455820: loss 6.1348, time 121.24ms
iter 455830: loss 6.5446, time 122.59ms
iter 455840: loss 5.2819, time 121.40ms
iter 455850: loss 6.0751, time 121.41ms
iter 455860: loss 6.3432, time 122.38ms
iter 455870: loss 6.3206, time 121.44ms
iter 455880: loss 6.1013, time 121.31ms
iter 455890: loss 5.8326, time 124.01ms
iter 455900: loss 5.4546, time 121.17ms
iter 455910: loss 5.5735, time 121.44ms
iter 455920: loss 5.8861, time 121.17ms
iter 455930: loss 6.2549, time 121.28ms
iter 455940: loss 5.6361, time 121.42ms
iter 455950: loss 5.8511, time 121.31ms
iter 455960: loss 6.2143, time 122.52ms
iter 455970: loss 6.4099, time 121.42ms
iter 455980: loss 6.4598, time 121.78ms
iter 455990: loss 6.1015, time 122.56ms
step 456000: train loss 5.5472, val loss 5.6137
saving checkpoint to out-shakespeare-char
iter 456000: loss 5.5265, time 2893.33ms
iter 456010: loss 5.4796, time 121.57ms
iter 456020: loss 5.9426, time 121.89ms
iter 456030: loss 6.1692, time 121.57ms
iter 456040: loss 6.7762, time 121.69ms
iter 456050: loss 6.4560, time 120.90ms
iter 456060: loss 5.3772, time 123.08ms
iter 456070: loss 5.9301, time 121.41ms
iter 456080: loss 6.1473, time 121.72ms
iter 456090: loss 5.7286, time 123.43ms
iter 456100: loss 6.4534, time 121.73ms
iter 456110: loss 6.3677, time 121.72ms
iter 456120: loss 5.9129, time 124.18ms
iter 456130: loss 5.7559, time 121.69ms
iter 456140: loss 5.9579, time 121.72ms
iter 456150: loss 5.8484, time 121.59ms
iter 456160: loss 6.7009, time 121.59ms
iter 456170: loss 5.5550, time 122.10ms
iter 456180: loss 5.8911, time 121.49ms
iter 456190: loss 5.5675, time 122.84ms
iter 456200: loss 6.4431, time 120.52ms
iter 456210: loss 5.7342, time 121.42ms
iter 456220: loss 5.9594, time 122.80ms
iter 456230: loss 5.9643, time 121.64ms
iter 456240: loss 6.0285, time 121.55ms
step 456250: train loss 5.5611, val loss 5.5821
saving checkpoint to out-shakespeare-char
iter 456250: loss 5.9487, time 2903.87ms
iter 456260: loss 6.1091, time 121.69ms
iter 456270: loss 5.7557, time 121.53ms
iter 456280: loss 6.0187, time 124.31ms
iter 456290: loss 5.3327, time 121.76ms
iter 456300: loss 5.7856, time 121.48ms
iter 456310: loss 6.2739, time 122.11ms
iter 456320: loss 5.8102, time 121.72ms
iter 456330: loss 6.1861, time 121.57ms
iter 456340: loss 6.5591, time 121.49ms
iter 456350: loss 6.1725, time 122.74ms
iter 456360: loss 6.1675, time 121.64ms
iter 456370: loss 5.4643, time 122.04ms
iter 456380: loss 6.0073, time 122.84ms
iter 456390: loss 6.1066, time 121.35ms
iter 456400: loss 6.4098, time 121.86ms
iter 456410: loss 6.1134, time 123.58ms
iter 456420: loss 6.0991, time 121.39ms
iter 456430: loss 6.7695, time 121.65ms
iter 456440: loss 5.6405, time 121.88ms
iter 456450: loss 5.8688, time 121.18ms
iter 456460: loss 6.3682, time 121.60ms
iter 456470: loss 6.2242, time 121.86ms
iter 456480: loss 6.4634, time 123.17ms
iter 456490: loss 5.7223, time 121.87ms
step 456500: train loss 5.5307, val loss 5.5646
saving checkpoint to out-shakespeare-char
iter 456500: loss 6.2653, time 2905.65ms
iter 456510: loss 5.9397, time 122.10ms
iter 456520: loss 6.6014, time 122.45ms
iter 456530: loss 6.0121, time 121.05ms
iter 456540: loss 6.0857, time 121.80ms
iter 456550: loss 5.7844, time 122.99ms
iter 456560: loss 6.0654, time 121.60ms
iter 456570: loss 6.0577, time 121.64ms
iter 456580: loss 5.8006, time 125.66ms
iter 456590: loss 5.6723, time 122.05ms
iter 456600: loss 6.3043, time 121.50ms
iter 456610: loss 5.5022, time 124.59ms
iter 456620: loss 5.5332, time 121.35ms
iter 456630: loss 5.6830, time 121.33ms
iter 456640: loss 5.5173, time 122.21ms
iter 456650: loss 6.3276, time 121.82ms
iter 456660: loss 5.8774, time 121.71ms
iter 456670: loss 5.7329, time 121.46ms
iter 456680: loss 5.5362, time 122.73ms
iter 456690: loss 5.9987, time 122.29ms
iter 456700: loss 5.5699, time 121.80ms
iter 456710: loss 5.9337, time 122.01ms
iter 456720: loss 6.1915, time 121.61ms
iter 456730: loss 5.9980, time 121.50ms
iter 456740: loss 6.4007, time 124.92ms
step 456750: train loss 5.5349, val loss 5.5339
saving checkpoint to out-shakespeare-char
iter 456750: loss 6.0754, time 2913.59ms
iter 456760: loss 6.6833, time 125.78ms
iter 456770: loss 5.3945, time 126.11ms
iter 456780: loss 6.2455, time 125.08ms
iter 456790: loss 6.3027, time 125.88ms
iter 456800: loss 5.6363, time 126.02ms
iter 456810: loss 5.4845, time 125.86ms
iter 456820: loss 5.2883, time 126.19ms
iter 456830: loss 5.2279, time 125.80ms
iter 456840: loss 6.0499, time 126.00ms
iter 456850: loss 6.5800, time 125.47ms
iter 456860: loss 6.0719, time 126.01ms
iter 456870: loss 6.5218, time 126.12ms
iter 456880: loss 6.7822, time 126.08ms
iter 456890: loss 6.4555, time 125.73ms
iter 456900: loss 5.8607, time 126.27ms
iter 456910: loss 6.6475, time 125.90ms
iter 456920: loss 5.5771, time 126.03ms
iter 456930: loss 6.1818, time 125.37ms
iter 456940: loss 5.7858, time 125.63ms
iter 456950: loss 5.5236, time 125.30ms
iter 456960: loss 5.7117, time 125.32ms
iter 456970: loss 6.0581, time 125.54ms
iter 456980: loss 6.1309, time 125.97ms
iter 456990: loss 5.4955, time 125.44ms
step 457000: train loss 5.5764, val loss 5.5527
saving checkpoint to out-shakespeare-char
iter 457000: loss 5.6362, time 2877.84ms
iter 457010: loss 6.2431, time 125.81ms
iter 457020: loss 6.1038, time 125.68ms
iter 457030: loss 6.4066, time 125.63ms
iter 457040: loss 6.5146, time 125.72ms
iter 457050: loss 5.9842, time 125.95ms
iter 457060: loss 5.6267, time 126.00ms
iter 457070: loss 6.3829, time 125.85ms
iter 457080: loss 5.5869, time 125.61ms
iter 457090: loss 5.8454, time 125.80ms
iter 457100: loss 5.7020, time 126.16ms
iter 457110: loss 6.4770, time 127.15ms
iter 457120: loss 6.0724, time 126.00ms
iter 457130: loss 6.2587, time 126.19ms
iter 457140: loss 6.3966, time 126.26ms
iter 457150: loss 5.6134, time 126.62ms
iter 457160: loss 5.8042, time 126.93ms
iter 457170: loss 6.0076, time 127.09ms
iter 457180: loss 5.8828, time 126.25ms
iter 457190: loss 5.9327, time 126.74ms
iter 457200: loss 5.8724, time 126.44ms
iter 457210: loss 5.1719, time 125.90ms
iter 457220: loss 6.1555, time 125.56ms
iter 457230: loss 5.9948, time 125.96ms
iter 457240: loss 5.9520, time 125.06ms
step 457250: train loss 5.5897, val loss 5.6120
saving checkpoint to out-shakespeare-char
iter 457250: loss 6.5315, time 2885.37ms
iter 457260: loss 5.1289, time 121.55ms
iter 457270: loss 5.6956, time 124.19ms
iter 457280: loss 5.7411, time 121.72ms
iter 457290: loss 5.7658, time 121.57ms
iter 457300: loss 6.2620, time 121.75ms
iter 457310: loss 5.9691, time 121.65ms
iter 457320: loss 5.3257, time 121.23ms
iter 457330: loss 7.1893, time 120.98ms
iter 457340: loss 5.8945, time 122.58ms
iter 457350: loss 5.8662, time 121.37ms
iter 457360: loss 6.3812, time 121.53ms
iter 457370: loss 6.1082, time 122.50ms
iter 457380: loss 6.1905, time 121.44ms
iter 457390: loss 5.7418, time 121.64ms
iter 457400: loss 5.8961, time 124.22ms
iter 457410: loss 5.5535, time 121.57ms
iter 457420: loss 5.7884, time 121.40ms
iter 457430: loss 5.6061, time 121.94ms
iter 457440: loss 5.8093, time 121.62ms
iter 457450: loss 6.2011, time 121.48ms
iter 457460: loss 6.1101, time 121.59ms
iter 457470: loss 6.1164, time 122.66ms
iter 457480: loss 6.2325, time 121.73ms
iter 457490: loss 6.4527, time 121.45ms
step 457500: train loss 5.6095, val loss 5.5539
saving checkpoint to out-shakespeare-char
iter 457500: loss 6.2053, time 2901.21ms
iter 457510: loss 5.4228, time 121.75ms
iter 457520: loss 5.6922, time 121.72ms
iter 457530: loss 5.3742, time 121.47ms
iter 457540: loss 5.8984, time 122.66ms
iter 457550: loss 6.2372, time 121.55ms
iter 457560: loss 6.0758, time 121.54ms
iter 457570: loss 6.1093, time 122.58ms
iter 457580: loss 6.5315, time 121.54ms
iter 457590: loss 5.6195, time 121.38ms
iter 457600: loss 5.7969, time 124.09ms
iter 457610: loss 6.4866, time 121.41ms
iter 457620: loss 6.1308, time 121.67ms
iter 457630: loss 6.4425, time 121.43ms
iter 457640: loss 6.1889, time 121.42ms
iter 457650: loss 5.7245, time 121.63ms
iter 457660: loss 5.9874, time 121.69ms
iter 457670: loss 6.0828, time 122.65ms
iter 457680: loss 5.9176, time 121.25ms
iter 457690: loss 6.0766, time 121.48ms
iter 457700: loss 5.2236, time 122.66ms
iter 457710: loss 5.3322, time 121.21ms
iter 457720: loss 6.9290, time 121.32ms
iter 457730: loss 5.8272, time 124.20ms
iter 457740: loss 6.2621, time 121.52ms
step 457750: train loss 5.5614, val loss 5.5691
saving checkpoint to out-shakespeare-char
iter 457750: loss 5.6820, time 2901.20ms
iter 457760: loss 5.9377, time 125.82ms
iter 457770: loss 5.5553, time 126.16ms
iter 457780: loss 6.5126, time 125.66ms
iter 457790: loss 5.9638, time 125.30ms
iter 457800: loss 5.9846, time 124.35ms
iter 457810: loss 5.2913, time 125.29ms
iter 457820: loss 5.7998, time 125.23ms
iter 457830: loss 6.4035, time 125.16ms
iter 457840: loss 5.9873, time 125.35ms
iter 457850: loss 6.8779, time 125.53ms
iter 457860: loss 5.9472, time 125.55ms
iter 457870: loss 5.9881, time 125.59ms
iter 457880: loss 6.4338, time 125.82ms
iter 457890: loss 5.4060, time 126.00ms
iter 457900: loss 5.8134, time 125.94ms
iter 457910: loss 6.5489, time 125.63ms
iter 457920: loss 6.2711, time 125.65ms
iter 457930: loss 6.1779, time 125.51ms
iter 457940: loss 5.5442, time 126.04ms
iter 457950: loss 6.6627, time 126.33ms
iter 457960: loss 6.2474, time 126.81ms
iter 457970: loss 5.8908, time 125.77ms
iter 457980: loss 5.3794, time 126.15ms
iter 457990: loss 6.0361, time 125.92ms
step 458000: train loss 5.5398, val loss 5.5557
saving checkpoint to out-shakespeare-char
iter 458000: loss 6.2596, time 2901.12ms
iter 458010: loss 5.4526, time 125.64ms
iter 458020: loss 5.9461, time 128.20ms
iter 458030: loss 5.7831, time 125.92ms
iter 458040: loss 5.8918, time 128.30ms
iter 458050: loss 6.0821, time 125.42ms
iter 458060: loss 5.4492, time 128.13ms
iter 458070: loss 6.4520, time 125.88ms
iter 458080: loss 6.5138, time 127.94ms
iter 458090: loss 6.0413, time 125.52ms
iter 458100: loss 5.8565, time 128.24ms
iter 458110: loss 6.4799, time 125.62ms
iter 458120: loss 5.9331, time 128.27ms
iter 458130: loss 6.1395, time 125.47ms
iter 458140: loss 5.7128, time 128.43ms
iter 458150: loss 5.7975, time 125.63ms
iter 458160: loss 5.7806, time 128.09ms
iter 458170: loss 5.5379, time 125.63ms
iter 458180: loss 5.7834, time 128.25ms
iter 458190: loss 5.8042, time 125.81ms
iter 458200: loss 5.3723, time 128.17ms
iter 458210: loss 6.6989, time 125.66ms
iter 458220: loss 5.3324, time 125.69ms
iter 458230: loss 6.2643, time 125.47ms
iter 458240: loss 5.1783, time 126.30ms
step 458250: train loss 5.4850, val loss 5.5476
saving checkpoint to out-shakespeare-char
iter 458250: loss 5.5749, time 2894.80ms
iter 458260: loss 5.8012, time 125.06ms
iter 458270: loss 5.5579, time 125.20ms
iter 458280: loss 5.9855, time 125.21ms
iter 458290: loss 5.8884, time 125.31ms
iter 458300: loss 6.3541, time 125.37ms
iter 458310: loss 5.9137, time 125.52ms
iter 458320: loss 5.5844, time 127.16ms
iter 458330: loss 6.3279, time 124.78ms
iter 458340: loss 5.8633, time 125.40ms
iter 458350: loss 6.2243, time 124.72ms
iter 458360: loss 6.0150, time 125.23ms
iter 458370: loss 5.9757, time 125.19ms
iter 458380: loss 5.7540, time 125.72ms
iter 458390: loss 6.0887, time 125.57ms
iter 458400: loss 5.6811, time 125.56ms
iter 458410: loss 6.4471, time 125.56ms
iter 458420: loss 5.4077, time 128.04ms
iter 458430: loss 6.2301, time 125.66ms
iter 458440: loss 5.9855, time 128.12ms
iter 458450: loss 5.9248, time 125.71ms
iter 458460: loss 5.2322, time 128.87ms
iter 458470: loss 6.1870, time 125.71ms
iter 458480: loss 6.2387, time 128.31ms
iter 458490: loss 6.3154, time 125.41ms
step 458500: train loss 5.5349, val loss 5.5743
saving checkpoint to out-shakespeare-char
iter 458500: loss 5.4460, time 2881.79ms
iter 458510: loss 5.6990, time 125.95ms
iter 458520: loss 5.6958, time 126.69ms
iter 458530: loss 5.3217, time 125.87ms
iter 458540: loss 5.3635, time 124.73ms
iter 458550: loss 5.8927, time 125.61ms
iter 458560: loss 5.9725, time 125.73ms
iter 458570: loss 6.0824, time 126.01ms
iter 458580: loss 5.9851, time 125.86ms
iter 458590: loss 5.6907, time 125.86ms
iter 458600: loss 5.2162, time 125.37ms
iter 458610: loss 5.8597, time 125.07ms
iter 458620: loss 5.2461, time 124.01ms
iter 458630: loss 6.2497, time 125.30ms
iter 458640: loss 6.4499, time 125.21ms
iter 458650: loss 6.1561, time 125.58ms
iter 458660: loss 6.3461, time 125.08ms
iter 458670: loss 6.1260, time 125.36ms
iter 458680: loss 6.6131, time 125.11ms
iter 458690: loss 6.0282, time 125.23ms
iter 458700: loss 6.0798, time 125.24ms
iter 458710: loss 5.7639, time 125.08ms
iter 458720: loss 5.5112, time 125.02ms
iter 458730: loss 6.2009, time 125.04ms
iter 458740: loss 5.5416, time 124.88ms
step 458750: train loss 5.5513, val loss 5.5904
saving checkpoint to out-shakespeare-char
iter 458750: loss 5.5548, time 2908.36ms
iter 458760: loss 5.6577, time 125.10ms
iter 458770: loss 6.1542, time 125.13ms
iter 458780: loss 6.3405, time 125.14ms
iter 458790: loss 5.8317, time 125.35ms
iter 458800: loss 6.1055, time 125.71ms
iter 458810: loss 5.8626, time 121.63ms
iter 458820: loss 5.6453, time 122.94ms
iter 458830: loss 6.2931, time 121.51ms
iter 458840: loss 5.7774, time 120.95ms
iter 458850: loss 5.9718, time 124.20ms
iter 458860: loss 6.5639, time 121.73ms
iter 458870: loss 5.9638, time 121.57ms
iter 458880: loss 5.9227, time 121.62ms
iter 458890: loss 6.0552, time 121.58ms
iter 458900: loss 6.1517, time 121.47ms
iter 458910: loss 6.1628, time 121.27ms
iter 458920: loss 5.5272, time 122.36ms
iter 458930: loss 6.3342, time 121.43ms
iter 458940: loss 5.7464, time 121.52ms
iter 458950: loss 5.9523, time 122.94ms
iter 458960: loss 6.3899, time 121.45ms
iter 458970: loss 6.1118, time 121.59ms
iter 458980: loss 5.8711, time 124.04ms
iter 458990: loss 6.0818, time 121.40ms
step 459000: train loss 5.5338, val loss 5.5712
saving checkpoint to out-shakespeare-char
iter 459000: loss 6.2078, time 2894.85ms
iter 459010: loss 6.0613, time 124.26ms
iter 459020: loss 6.8760, time 121.99ms
iter 459030: loss 6.1379, time 121.68ms
iter 459040: loss 5.9274, time 122.11ms
iter 459050: loss 6.0948, time 122.62ms
iter 459060: loss 5.4982, time 121.52ms
iter 459070: loss 6.0506, time 121.28ms
iter 459080: loss 6.0444, time 123.50ms
iter 459090: loss 6.5631, time 122.07ms
iter 459100: loss 6.4135, time 121.62ms
iter 459110: loss 5.8878, time 123.08ms
iter 459120: loss 6.2879, time 121.58ms
iter 459130: loss 5.8255, time 121.72ms
iter 459140: loss 6.0774, time 122.31ms
iter 459150: loss 5.6953, time 121.22ms
iter 459160: loss 6.1094, time 121.61ms
iter 459170: loss 6.3080, time 121.58ms
iter 459180: loss 5.9265, time 123.33ms
iter 459190: loss 5.7589, time 122.50ms
iter 459200: loss 5.6971, time 121.52ms
iter 459210: loss 5.6939, time 123.14ms
iter 459220: loss 5.6401, time 121.50ms
iter 459230: loss 5.9205, time 121.82ms
iter 459240: loss 5.8305, time 124.55ms
step 459250: train loss 5.5253, val loss 5.6203
saving checkpoint to out-shakespeare-char
iter 459250: loss 5.5867, time 2890.61ms
iter 459260: loss 5.8898, time 121.52ms
iter 459270: loss 5.7600, time 121.55ms
iter 459280: loss 5.9560, time 124.23ms
iter 459290: loss 6.1906, time 121.63ms
iter 459300: loss 6.6022, time 121.60ms
iter 459310: loss 5.8878, time 121.67ms
iter 459320: loss 5.5720, time 121.62ms
iter 459330: loss 5.8006, time 121.99ms
iter 459340: loss 6.5195, time 121.55ms
iter 459350: loss 6.3227, time 122.76ms
iter 459360: loss 6.5174, time 121.60ms
iter 459370: loss 6.3843, time 121.67ms
iter 459380: loss 5.5183, time 122.83ms
iter 459390: loss 5.7621, time 121.53ms
iter 459400: loss 5.9093, time 121.50ms
iter 459410: loss 5.7599, time 124.00ms
iter 459420: loss 5.2638, time 121.62ms
iter 459430: loss 6.0910, time 121.93ms
iter 459440: loss 5.7944, time 121.62ms
iter 459450: loss 6.1903, time 120.60ms
iter 459460: loss 5.8183, time 121.45ms
iter 459470: loss 5.9651, time 121.52ms
iter 459480: loss 6.3393, time 122.64ms
iter 459490: loss 6.0566, time 121.48ms
step 459500: train loss 5.5846, val loss 5.5993
saving checkpoint to out-shakespeare-char
iter 459500: loss 6.2478, time 2894.70ms
iter 459510: loss 5.9149, time 121.69ms
iter 459520: loss 6.2048, time 122.69ms
iter 459530: loss 5.4003, time 121.52ms
iter 459540: loss 5.5052, time 121.52ms
iter 459550: loss 6.3726, time 122.63ms
iter 459560: loss 6.0952, time 121.59ms
iter 459570: loss 6.0549, time 121.24ms
iter 459580: loss 5.9109, time 121.76ms
iter 459590: loss 5.8778, time 121.53ms
iter 459600: loss 5.9133, time 121.61ms
iter 459610: loss 5.9348, time 121.49ms
iter 459620: loss 5.9076, time 123.03ms
iter 459630: loss 6.1727, time 121.66ms
iter 459640: loss 5.5388, time 121.72ms
iter 459650: loss 5.8282, time 122.66ms
iter 459660: loss 5.3215, time 121.60ms
iter 459670: loss 5.7339, time 121.34ms
iter 459680: loss 6.2554, time 124.18ms
iter 459690: loss 6.1993, time 121.76ms
iter 459700: loss 5.4965, time 122.44ms
iter 459710: loss 6.0257, time 121.60ms
iter 459720: loss 5.9062, time 121.86ms
iter 459730: loss 6.6133, time 120.85ms
iter 459740: loss 6.3630, time 121.13ms
step 459750: train loss 5.5895, val loss 5.6096
saving checkpoint to out-shakespeare-char
iter 459750: loss 6.1424, time 2903.69ms
iter 459760: loss 5.7468, time 121.49ms
iter 459770: loss 5.5900, time 121.39ms
iter 459780: loss 5.4909, time 124.14ms
iter 459790: loss 5.5955, time 121.64ms
iter 459800: loss 6.4967, time 121.55ms
iter 459810: loss 6.0864, time 120.49ms
iter 459820: loss 5.8473, time 121.53ms
iter 459830: loss 5.7006, time 121.55ms
iter 459840: loss 6.5621, time 121.57ms
iter 459850: loss 6.0551, time 122.80ms
iter 459860: loss 6.1887, time 121.50ms
iter 459870: loss 6.6766, time 121.59ms
iter 459880: loss 5.9015, time 122.71ms
iter 459890: loss 5.5913, time 121.30ms
iter 459900: loss 5.2633, time 121.50ms
iter 459910: loss 6.5414, time 123.37ms
iter 459920: loss 6.1095, time 121.54ms
iter 459930: loss 5.8996, time 121.59ms
iter 459940: loss 6.7084, time 121.60ms
iter 459950: loss 6.2643, time 121.39ms
iter 459960: loss 5.6866, time 121.79ms
iter 459970: loss 6.0870, time 120.46ms
iter 459980: loss 5.9658, time 122.54ms
iter 459990: loss 6.3343, time 121.34ms
step 460000: train loss 5.5527, val loss 5.5250
saving checkpoint to out-shakespeare-char
iter 460000: loss 4.9473, time 2901.14ms
iter 460010: loss 6.1594, time 124.35ms
iter 460020: loss 5.7271, time 121.21ms
iter 460030: loss 5.6819, time 121.33ms
iter 460040: loss 5.6245, time 121.28ms
iter 460050: loss 5.2442, time 121.73ms
iter 460060: loss 5.8849, time 121.23ms
iter 460070: loss 5.7334, time 121.38ms
iter 460080: loss 6.1717, time 122.84ms
iter 460090: loss 6.4244, time 121.27ms
iter 460100: loss 5.5043, time 121.40ms
iter 460110: loss 6.5660, time 122.45ms
iter 460120: loss 5.8995, time 121.39ms
iter 460130: loss 5.5918, time 121.35ms
iter 460140: loss 5.8531, time 123.79ms
iter 460150: loss 6.3284, time 121.14ms
iter 460160: loss 6.1037, time 120.68ms
iter 460170: loss 5.8595, time 121.30ms
iter 460180: loss 5.6753, time 121.29ms
iter 460190: loss 5.8042, time 121.67ms
iter 460200: loss 5.6242, time 122.52ms
iter 460210: loss 5.6995, time 122.24ms
iter 460220: loss 6.2419, time 121.25ms
iter 460230: loss 6.0980, time 121.18ms
iter 460240: loss 6.1471, time 123.88ms
step 460250: train loss 5.5842, val loss 5.5523
saving checkpoint to out-shakespeare-char
iter 460250: loss 6.3856, time 2896.66ms
iter 460260: loss 5.9976, time 121.60ms
iter 460270: loss 5.5677, time 121.57ms
iter 460280: loss 5.5981, time 124.10ms
iter 460290: loss 6.6179, time 121.22ms
iter 460300: loss 6.2159, time 121.43ms
iter 460310: loss 6.3672, time 121.48ms
iter 460320: loss 5.3514, time 121.61ms
iter 460330: loss 5.2795, time 121.54ms
iter 460340: loss 6.2075, time 121.61ms
iter 460350: loss 5.7813, time 122.65ms
iter 460360: loss 6.1214, time 121.49ms
iter 460370: loss 6.1926, time 121.41ms
iter 460380: loss 6.7673, time 122.51ms
iter 460390: loss 5.9082, time 121.70ms
iter 460400: loss 5.9904, time 121.60ms
iter 460410: loss 5.7502, time 124.26ms
iter 460420: loss 5.6963, time 121.09ms
iter 460430: loss 5.9940, time 121.57ms
iter 460440: loss 6.3332, time 121.77ms
iter 460450: loss 5.9378, time 121.34ms
iter 460460: loss 6.2373, time 121.53ms
iter 460470: loss 6.1371, time 121.90ms
iter 460480: loss 6.1739, time 123.18ms
iter 460490: loss 6.9511, time 121.50ms
step 460500: train loss 5.5522, val loss 5.5487
saving checkpoint to out-shakespeare-char
iter 460500: loss 6.1684, time 2908.53ms
iter 460510: loss 6.8231, time 122.66ms
iter 460520: loss 6.2502, time 121.54ms
iter 460530: loss 5.8405, time 125.45ms
iter 460540: loss 5.6425, time 125.52ms
iter 460550: loss 5.8225, time 125.93ms
iter 460560: loss 5.8823, time 125.58ms
iter 460570: loss 6.1789, time 125.45ms
iter 460580: loss 5.8514, time 126.39ms
iter 460590: loss 5.8967, time 125.39ms
iter 460600: loss 6.1878, time 125.67ms
iter 460610: loss 6.3793, time 125.25ms
iter 460620: loss 5.7284, time 125.54ms
iter 460630: loss 5.7906, time 125.02ms
iter 460640: loss 6.4880, time 125.53ms
iter 460650: loss 6.0453, time 125.35ms
iter 460660: loss 5.8302, time 125.23ms
iter 460670: loss 5.7393, time 125.30ms
iter 460680: loss 5.8328, time 125.28ms
iter 460690: loss 5.9397, time 125.22ms
iter 460700: loss 5.9615, time 125.40ms
iter 460710: loss 6.7684, time 125.43ms
iter 460720: loss 6.3856, time 125.97ms
iter 460730: loss 5.5438, time 125.37ms
iter 460740: loss 5.7457, time 125.25ms
step 460750: train loss 5.6195, val loss 5.5738
saving checkpoint to out-shakespeare-char
iter 460750: loss 6.0724, time 2892.10ms
iter 460760: loss 5.9307, time 125.40ms
iter 460770: loss 5.8905, time 125.28ms
iter 460780: loss 6.1081, time 125.44ms
iter 460790: loss 5.8539, time 124.19ms
iter 460800: loss 5.5722, time 125.48ms
iter 460810: loss 5.5480, time 125.25ms
iter 460820: loss 5.6143, time 124.17ms
iter 460830: loss 6.4605, time 125.29ms
iter 460840: loss 5.6492, time 125.47ms
iter 460850: loss 6.5060, time 125.31ms
iter 460860: loss 5.7811, time 125.39ms
iter 460870: loss 6.1110, time 124.58ms
iter 460880: loss 5.2149, time 125.52ms
iter 460890: loss 5.4315, time 125.32ms
iter 460900: loss 6.2973, time 125.65ms
iter 460910: loss 6.1163, time 125.04ms
iter 460920: loss 6.1365, time 125.48ms
iter 460930: loss 6.4690, time 125.21ms
iter 460940: loss 6.1287, time 125.44ms
iter 460950: loss 6.7518, time 125.39ms
iter 460960: loss 5.5010, time 126.08ms
iter 460970: loss 5.8063, time 125.40ms
iter 460980: loss 6.0845, time 125.56ms
iter 460990: loss 6.1143, time 125.43ms
step 461000: train loss 5.5677, val loss 5.5340
saving checkpoint to out-shakespeare-char
iter 461000: loss 6.1598, time 2908.65ms
iter 461010: loss 6.2200, time 125.39ms
iter 461020: loss 5.5692, time 125.49ms
iter 461030: loss 5.6976, time 125.38ms
iter 461040: loss 6.2267, time 124.76ms
iter 461050: loss 6.0404, time 124.81ms
iter 461060: loss 5.7237, time 125.30ms
iter 461070: loss 5.4783, time 125.18ms
iter 461080: loss 5.2636, time 125.43ms
iter 461090: loss 5.7050, time 125.28ms
iter 461100: loss 5.9276, time 125.44ms
iter 461110: loss 5.8254, time 125.24ms
iter 461120: loss 5.8482, time 125.59ms
iter 461130: loss 5.6920, time 125.31ms
iter 461140: loss 5.7893, time 125.78ms
iter 461150: loss 6.3015, time 125.45ms
iter 461160: loss 5.9625, time 125.39ms
iter 461170: loss 5.6624, time 125.32ms
iter 461180: loss 6.2861, time 125.26ms
iter 461190: loss 6.1806, time 125.11ms
iter 461200: loss 6.7055, time 124.47ms
iter 461210: loss 5.5011, time 125.46ms
iter 461220: loss 5.6543, time 125.29ms
iter 461230: loss 6.4350, time 125.34ms
iter 461240: loss 6.2136, time 125.46ms
step 461250: train loss 5.5549, val loss 5.5683
saving checkpoint to out-shakespeare-char
iter 461250: loss 6.1112, time 2891.31ms
iter 461260: loss 5.8819, time 124.47ms
iter 461270: loss 6.2522, time 125.22ms
iter 461280: loss 5.6757, time 125.13ms
iter 461290: loss 6.3952, time 125.94ms
iter 461300: loss 5.9483, time 125.34ms
iter 461310: loss 5.4740, time 125.13ms
iter 461320: loss 5.9399, time 125.41ms
iter 461330: loss 5.8568, time 126.42ms
iter 461340: loss 6.0436, time 125.57ms
iter 461350: loss 6.0986, time 127.75ms
iter 461360: loss 5.7174, time 125.73ms
iter 461370: loss 5.8203, time 128.36ms
iter 461380: loss 6.2130, time 125.69ms
iter 461390: loss 5.8270, time 125.97ms
iter 461400: loss 5.5569, time 125.63ms
iter 461410: loss 5.9413, time 125.63ms
iter 461420: loss 5.7508, time 125.61ms
iter 461430: loss 5.7501, time 125.63ms
iter 461440: loss 5.9839, time 125.58ms
iter 461450: loss 6.2431, time 125.54ms
iter 461460: loss 5.5458, time 125.23ms
iter 461470: loss 5.8011, time 126.03ms
iter 461480: loss 6.2866, time 126.90ms
iter 461490: loss 5.6402, time 125.69ms
step 461500: train loss 5.5345, val loss 5.5574
saving checkpoint to out-shakespeare-char
iter 461500: loss 6.7747, time 2907.15ms
iter 461510: loss 5.9275, time 125.39ms
iter 461520: loss 5.6182, time 125.68ms
iter 461530: loss 6.3739, time 126.39ms
iter 461540: loss 5.8236, time 125.20ms
iter 461550: loss 6.4485, time 125.46ms
iter 461560: loss 6.0590, time 126.05ms
iter 461570: loss 5.9739, time 125.84ms
iter 461580: loss 6.2406, time 125.84ms
iter 461590: loss 6.3437, time 125.60ms
iter 461600: loss 5.6776, time 125.73ms
iter 461610: loss 5.0734, time 125.65ms
iter 461620: loss 5.4308, time 126.27ms
iter 461630: loss 5.6154, time 125.88ms
iter 461640: loss 5.6431, time 127.46ms
iter 461650: loss 5.9472, time 124.93ms
iter 461660: loss 5.1033, time 125.68ms
iter 461670: loss 6.0265, time 126.28ms
iter 461680: loss 6.1882, time 125.19ms
iter 461690: loss 5.2633, time 126.17ms
iter 461700: loss 6.2408, time 125.80ms
iter 461710: loss 6.6922, time 125.93ms
iter 461720: loss 6.0699, time 125.41ms
iter 461730: loss 6.2161, time 125.31ms
iter 461740: loss 5.4351, time 125.79ms
step 461750: train loss 5.5383, val loss 5.6050
saving checkpoint to out-shakespeare-char
iter 461750: loss 6.1675, time 2912.93ms
iter 461760: loss 5.9112, time 129.34ms
iter 461770: loss 6.3322, time 125.92ms
iter 461780: loss 5.6430, time 128.64ms
iter 461790: loss 5.7968, time 127.17ms
iter 461800: loss 6.2257, time 125.79ms
iter 461810: loss 5.9784, time 126.05ms
iter 461820: loss 6.6711, time 125.55ms
iter 461830: loss 6.0495, time 125.69ms
iter 461840: loss 5.6819, time 126.34ms
iter 461850: loss 5.8281, time 126.14ms
iter 461860: loss 5.5182, time 125.62ms
iter 461870: loss 6.1728, time 125.56ms
iter 461880: loss 6.2462, time 125.76ms
iter 461890: loss 6.4278, time 129.04ms
iter 461900: loss 6.0375, time 127.03ms
iter 461910: loss 6.2189, time 125.88ms
iter 461920: loss 5.7463, time 125.61ms
iter 461930: loss 5.8703, time 125.81ms
iter 461940: loss 6.3819, time 125.15ms
iter 461950: loss 6.2830, time 126.17ms
iter 461960: loss 5.7661, time 125.70ms
iter 461970: loss 5.9463, time 125.65ms
iter 461980: loss 4.9835, time 126.03ms
iter 461990: loss 5.8829, time 125.96ms
step 462000: train loss 5.5948, val loss 5.5891
saving checkpoint to out-shakespeare-char
iter 462000: loss 5.6143, time 2877.69ms
iter 462010: loss 5.9155, time 124.66ms
iter 462020: loss 5.8425, time 122.25ms
iter 462030: loss 5.6632, time 122.32ms
iter 462040: loss 5.8762, time 122.15ms
iter 462050: loss 6.0170, time 123.15ms
iter 462060: loss 6.3962, time 121.76ms
iter 462070: loss 6.5242, time 122.83ms
iter 462080: loss 6.2275, time 124.31ms
iter 462090: loss 5.0725, time 122.67ms
iter 462100: loss 6.3424, time 121.77ms
iter 462110: loss 5.7564, time 121.67ms
iter 462120: loss 5.9232, time 123.22ms
iter 462130: loss 6.5911, time 127.04ms
iter 462140: loss 5.7628, time 129.49ms
iter 462150: loss 7.0213, time 127.39ms
iter 462160: loss 6.2677, time 128.48ms
iter 462170: loss 5.7402, time 125.99ms
iter 462180: loss 5.5127, time 127.57ms
iter 462190: loss 6.2933, time 125.98ms
iter 462200: loss 6.0636, time 125.93ms
iter 462210: loss 5.2234, time 125.75ms
iter 462220: loss 5.6323, time 125.82ms
iter 462230: loss 5.8930, time 125.63ms
iter 462240: loss 5.9323, time 126.03ms
step 462250: train loss 5.5185, val loss 5.5961
saving checkpoint to out-shakespeare-char
iter 462250: loss 5.6889, time 2913.94ms
iter 462260: loss 5.4192, time 125.80ms
iter 462270: loss 6.0219, time 125.56ms
iter 462280: loss 5.7776, time 125.99ms
iter 462290: loss 6.5850, time 125.33ms
iter 462300: loss 6.0919, time 125.72ms
iter 462310: loss 6.4199, time 125.58ms
iter 462320: loss 6.3382, time 125.72ms
iter 462330: loss 5.9944, time 125.25ms
iter 462340: loss 5.5878, time 126.92ms
iter 462350: loss 6.1088, time 127.08ms
iter 462360: loss 5.5494, time 125.10ms
iter 462370: loss 5.4484, time 127.67ms
iter 462380: loss 5.9352, time 125.50ms
iter 462390: loss 6.5833, time 127.75ms
iter 462400: loss 5.4463, time 124.41ms
iter 462410: loss 5.8195, time 127.77ms
iter 462420: loss 6.0511, time 124.33ms
iter 462430: loss 6.2252, time 127.92ms
iter 462440: loss 5.7696, time 125.16ms
iter 462450: loss 5.3351, time 125.45ms
iter 462460: loss 6.2137, time 126.52ms
iter 462470: loss 6.3562, time 126.43ms
iter 462480: loss 6.9130, time 125.76ms
iter 462490: loss 5.7784, time 127.33ms
step 462500: train loss 5.5963, val loss 5.5051
saving checkpoint to out-shakespeare-char
iter 462500: loss 6.7015, time 2904.33ms
iter 462510: loss 6.2023, time 125.63ms
iter 462520: loss 5.3027, time 125.62ms
iter 462530: loss 5.8387, time 125.55ms
iter 462540: loss 6.0649, time 126.10ms
iter 462550: loss 5.7094, time 127.81ms
iter 462560: loss 5.8300, time 125.55ms
iter 462570: loss 6.9371, time 125.65ms
iter 462580: loss 5.9026, time 125.88ms
iter 462590: loss 5.9525, time 125.17ms
iter 462600: loss 5.7140, time 126.20ms
iter 462610: loss 5.9291, time 125.74ms
iter 462620: loss 6.4575, time 125.18ms
iter 462630: loss 6.7651, time 124.90ms
iter 462640: loss 6.1245, time 126.22ms
iter 462650: loss 5.6411, time 125.15ms
iter 462660: loss 5.5915, time 125.49ms
iter 462670: loss 5.4022, time 125.53ms
iter 462680: loss 5.8870, time 124.91ms
iter 462690: loss 6.4463, time 125.48ms
iter 462700: loss 6.4298, time 125.49ms
iter 462710: loss 5.7588, time 120.85ms
iter 462720: loss 6.2957, time 121.12ms
iter 462730: loss 6.1061, time 123.23ms
iter 462740: loss 6.3561, time 119.70ms
step 462750: train loss 5.5347, val loss 5.5707
saving checkpoint to out-shakespeare-char
iter 462750: loss 5.6274, time 2933.28ms
iter 462760: loss 5.9472, time 121.55ms
iter 462770: loss 6.1924, time 123.29ms
iter 462780: loss 5.8427, time 121.50ms
iter 462790: loss 5.4762, time 121.99ms
iter 462800: loss 5.7673, time 122.25ms
iter 462810: loss 6.2135, time 121.58ms
iter 462820: loss 6.1871, time 121.73ms
iter 462830: loss 6.2327, time 122.67ms
iter 462840: loss 6.2985, time 121.55ms
iter 462850: loss 6.0439, time 121.18ms
iter 462860: loss 5.8513, time 123.66ms
iter 462870: loss 5.6684, time 122.66ms
iter 462880: loss 6.3079, time 121.54ms
iter 462890: loss 5.3304, time 122.65ms
iter 462900: loss 5.9138, time 121.17ms
iter 462910: loss 6.2868, time 122.03ms
iter 462920: loss 6.8750, time 121.80ms
iter 462930: loss 6.1683, time 119.59ms
iter 462940: loss 5.8464, time 119.39ms
iter 462950: loss 5.6912, time 120.74ms
iter 462960: loss 6.0166, time 120.73ms
iter 462970: loss 5.7794, time 119.65ms
iter 462980: loss 5.3745, time 119.76ms
iter 462990: loss 6.3040, time 120.73ms
step 463000: train loss 5.5690, val loss 5.5286
saving checkpoint to out-shakespeare-char
iter 463000: loss 6.2266, time 2908.35ms
iter 463010: loss 6.5276, time 125.65ms
iter 463020: loss 5.5128, time 126.91ms
iter 463030: loss 6.3722, time 125.65ms
iter 463040: loss 5.8179, time 125.48ms
iter 463050: loss 6.1416, time 125.67ms
iter 463060: loss 6.7165, time 126.70ms
iter 463070: loss 6.1345, time 125.73ms
iter 463080: loss 6.8568, time 125.35ms
iter 463090: loss 6.1982, time 125.77ms
iter 463100: loss 6.3237, time 125.32ms
iter 463110: loss 5.5905, time 126.31ms
iter 463120: loss 6.0626, time 125.57ms
iter 463130: loss 6.1512, time 125.70ms
iter 463140: loss 6.5435, time 125.72ms
iter 463150: loss 5.7920, time 125.97ms
iter 463160: loss 5.8596, time 125.92ms
iter 463170: loss 5.9844, time 125.79ms
iter 463180: loss 5.5841, time 128.71ms
iter 463190: loss 5.7198, time 125.74ms
iter 463200: loss 6.4356, time 128.72ms
iter 463210: loss 6.0293, time 127.83ms
iter 463220: loss 5.5216, time 128.51ms
iter 463230: loss 5.8306, time 125.51ms
iter 463240: loss 6.2176, time 127.93ms
step 463250: train loss 5.5508, val loss 5.6249
saving checkpoint to out-shakespeare-char
iter 463250: loss 6.3735, time 2908.43ms
iter 463260: loss 6.2120, time 125.61ms
iter 463270: loss 6.3383, time 125.58ms
iter 463280: loss 5.8693, time 125.61ms
iter 463290: loss 6.7861, time 125.65ms
iter 463300: loss 5.8829, time 125.85ms
iter 463310: loss 6.2334, time 125.63ms
iter 463320: loss 5.7796, time 126.11ms
iter 463330: loss 6.2325, time 126.03ms
iter 463340: loss 5.8258, time 125.67ms
iter 463350: loss 6.0407, time 124.80ms
iter 463360: loss 5.5907, time 125.54ms
iter 463370: loss 6.2672, time 126.10ms
iter 463380: loss 6.5564, time 125.13ms
iter 463390: loss 6.1858, time 125.49ms
iter 463400: loss 5.6086, time 126.07ms
iter 463410: loss 6.3845, time 126.05ms
iter 463420: loss 5.8702, time 124.92ms
iter 463430: loss 5.5959, time 124.71ms
iter 463440: loss 5.1713, time 125.51ms
iter 463450: loss 5.8128, time 126.11ms
iter 463460: loss 5.1026, time 125.76ms
iter 463470: loss 6.2711, time 125.61ms
iter 463480: loss 5.6981, time 125.53ms
iter 463490: loss 6.6421, time 126.05ms
step 463500: train loss 5.6068, val loss 5.5355
saving checkpoint to out-shakespeare-char
iter 463500: loss 6.1992, time 2909.82ms
iter 463510: loss 4.9887, time 125.11ms
iter 463520: loss 6.0697, time 125.12ms
iter 463530: loss 5.4629, time 125.17ms
iter 463540: loss 5.3846, time 125.17ms
iter 463550: loss 6.1548, time 124.66ms
iter 463560: loss 5.4209, time 125.36ms
iter 463570: loss 5.7365, time 126.10ms
iter 463580: loss 5.8193, time 125.19ms
iter 463590: loss 6.5651, time 125.28ms
iter 463600: loss 5.9279, time 125.37ms
iter 463610: loss 6.7071, time 124.50ms
iter 463620: loss 6.0052, time 125.89ms
iter 463630: loss 5.5601, time 125.50ms
iter 463640: loss 5.7172, time 125.77ms
iter 463650: loss 6.3854, time 125.85ms
iter 463660: loss 5.5416, time 125.28ms
iter 463670: loss 6.3255, time 127.06ms
iter 463680: loss 6.3621, time 126.40ms
iter 463690: loss 6.1454, time 127.34ms
iter 463700: loss 5.6030, time 125.87ms
iter 463710: loss 6.1398, time 125.97ms
iter 463720: loss 6.1799, time 125.83ms
iter 463730: loss 5.5799, time 126.00ms
iter 463740: loss 6.0204, time 126.17ms
step 463750: train loss 5.5555, val loss 5.5938
saving checkpoint to out-shakespeare-char
iter 463750: loss 5.7634, time 2896.65ms
iter 463760: loss 5.8831, time 128.05ms
iter 463770: loss 6.3162, time 125.46ms
iter 463780: loss 5.8167, time 127.95ms
iter 463790: loss 6.0861, time 125.38ms
iter 463800: loss 5.4327, time 128.35ms
iter 463810: loss 5.8337, time 125.46ms
iter 463820: loss 5.9561, time 127.59ms
iter 463830: loss 5.8817, time 126.38ms
iter 463840: loss 5.3237, time 128.84ms
iter 463850: loss 6.7301, time 128.02ms
iter 463860: loss 6.1138, time 126.11ms
iter 463870: loss 6.5608, time 125.43ms
iter 463880: loss 5.7288, time 125.84ms
iter 463890: loss 6.5268, time 126.95ms
iter 463900: loss 5.9028, time 125.84ms
iter 463910: loss 5.5479, time 125.78ms
iter 463920: loss 5.9648, time 125.34ms
iter 463930: loss 6.0418, time 125.55ms
iter 463940: loss 5.5823, time 124.60ms
iter 463950: loss 5.6235, time 126.03ms
iter 463960: loss 6.0247, time 127.24ms
iter 463970: loss 5.7793, time 128.27ms
iter 463980: loss 5.6692, time 125.61ms
iter 463990: loss 6.2652, time 128.73ms
step 464000: train loss 5.5550, val loss 5.5848
saving checkpoint to out-shakespeare-char
iter 464000: loss 6.1338, time 2901.82ms
iter 464010: loss 5.8619, time 127.69ms
iter 464020: loss 5.3660, time 126.04ms
iter 464030: loss 6.0409, time 125.16ms
iter 464040: loss 6.4052, time 125.88ms
iter 464050: loss 5.5011, time 126.10ms
iter 464060: loss 5.4822, time 125.84ms
iter 464070: loss 6.4172, time 125.76ms
iter 464080: loss 5.7954, time 125.46ms
iter 464090: loss 6.2024, time 126.10ms
iter 464100: loss 5.7251, time 125.18ms
iter 464110: loss 5.5810, time 124.40ms
iter 464120: loss 6.8339, time 126.38ms
iter 464130: loss 6.1439, time 125.06ms
iter 464140: loss 6.1981, time 125.61ms
iter 464150: loss 6.3898, time 125.73ms
iter 464160: loss 6.0098, time 125.77ms
iter 464170: loss 6.4149, time 125.77ms
iter 464180: loss 5.9365, time 125.66ms
iter 464190: loss 6.1798, time 125.77ms
iter 464200: loss 6.3598, time 125.84ms
iter 464210: loss 6.2952, time 125.99ms
iter 464220: loss 5.2400, time 125.30ms
iter 464230: loss 5.8581, time 124.85ms
iter 464240: loss 5.7682, time 125.62ms
step 464250: train loss 5.6043, val loss 5.5577
saving checkpoint to out-shakespeare-char
iter 464250: loss 5.7550, time 2888.02ms
iter 464260: loss 5.5538, time 127.66ms
iter 464270: loss 6.2987, time 125.79ms
iter 464280: loss 5.9524, time 125.60ms
iter 464290: loss 6.0784, time 125.51ms
iter 464300: loss 5.5538, time 125.40ms
iter 464310: loss 5.5777, time 125.48ms
iter 464320: loss 6.1304, time 125.32ms
iter 464330: loss 6.2888, time 125.42ms
iter 464340: loss 5.8477, time 125.26ms
iter 464350: loss 6.6774, time 124.51ms
iter 464360: loss 6.1357, time 125.81ms
iter 464370: loss 6.2688, time 125.56ms
iter 464380: loss 6.0320, time 127.02ms
iter 464390: loss 6.8629, time 124.35ms
iter 464400: loss 6.0776, time 125.42ms
iter 464410: loss 6.7458, time 125.89ms
iter 464420: loss 6.5883, time 125.67ms
iter 464430: loss 6.0004, time 125.73ms
iter 464440: loss 6.0736, time 125.95ms
iter 464450: loss 6.1607, time 125.79ms
iter 464460: loss 5.7331, time 125.96ms
iter 464470: loss 5.6185, time 126.06ms
iter 464480: loss 5.8291, time 125.52ms
iter 464490: loss 6.4229, time 125.59ms
step 464500: train loss 5.4890, val loss 5.5879
saving checkpoint to out-shakespeare-char
iter 464500: loss 6.0064, time 2883.17ms
iter 464510: loss 5.5778, time 125.70ms
iter 464520: loss 5.7555, time 126.70ms
iter 464530: loss 6.2681, time 124.50ms
iter 464540: loss 6.5192, time 125.96ms
iter 464550: loss 6.0438, time 125.53ms
iter 464560: loss 6.4427, time 126.14ms
iter 464570: loss 5.6092, time 125.42ms
iter 464580: loss 6.1520, time 125.45ms
iter 464590: loss 6.1527, time 125.15ms
iter 464600: loss 5.4872, time 125.80ms
iter 464610: loss 6.5583, time 125.43ms
iter 464620: loss 5.7273, time 125.57ms
iter 464630: loss 5.8092, time 125.56ms
iter 464640: loss 5.7652, time 127.11ms
iter 464650: loss 5.2130, time 128.05ms
iter 464660: loss 5.9380, time 125.58ms
iter 464670: loss 5.5288, time 128.06ms
iter 464680: loss 6.6488, time 125.28ms
iter 464690: loss 5.8169, time 128.62ms
iter 464700: loss 5.5754, time 125.64ms
iter 464710: loss 5.9436, time 127.90ms
iter 464720: loss 5.7016, time 125.44ms
iter 464730: loss 6.0777, time 128.10ms
iter 464740: loss 5.8487, time 126.95ms
step 464750: train loss 5.5585, val loss 5.5854
saving checkpoint to out-shakespeare-char
iter 464750: loss 6.4191, time 2908.48ms
iter 464760: loss 6.3662, time 125.78ms
iter 464770: loss 5.9590, time 125.71ms
iter 464780: loss 6.5051, time 124.60ms
iter 464790: loss 5.6772, time 125.54ms
iter 464800: loss 5.8966, time 126.65ms
iter 464810: loss 6.5330, time 125.41ms
iter 464820: loss 6.1267, time 125.68ms
iter 464830: loss 5.4945, time 125.69ms
iter 464840: loss 5.4383, time 125.59ms
iter 464850: loss 6.4003, time 125.56ms
iter 464860: loss 5.7228, time 125.27ms
iter 464870: loss 5.9379, time 125.67ms
iter 464880: loss 5.7044, time 125.66ms
iter 464890: loss 5.5931, time 125.89ms
iter 464900: loss 5.7328, time 125.60ms
iter 464910: loss 6.7657, time 126.82ms
iter 464920: loss 6.0681, time 125.28ms
iter 464930: loss 6.0010, time 125.40ms
iter 464940: loss 6.3083, time 125.72ms
iter 464950: loss 5.9442, time 124.41ms
iter 464960: loss 5.9242, time 125.65ms
iter 464970: loss 6.0957, time 124.99ms
iter 464980: loss 6.5014, time 125.28ms
iter 464990: loss 6.4404, time 125.70ms
step 465000: train loss 5.6000, val loss 5.5536
saving checkpoint to out-shakespeare-char
iter 465000: loss 5.8226, time 2911.28ms
iter 465010: loss 6.1163, time 125.66ms
iter 465020: loss 6.2517, time 128.63ms
iter 465030: loss 6.4880, time 125.54ms
iter 465040: loss 6.0231, time 127.76ms
iter 465050: loss 5.8165, time 125.67ms
iter 465060: loss 5.4554, time 127.66ms
iter 465070: loss 6.2271, time 126.72ms
iter 465080: loss 6.6620, time 125.87ms
iter 465090: loss 5.3692, time 127.54ms
iter 465100: loss 5.9738, time 125.67ms
iter 465110: loss 6.6109, time 125.54ms
iter 465120: loss 5.9592, time 124.89ms
iter 465130: loss 6.3245, time 125.58ms
iter 465140: loss 6.5633, time 124.42ms
iter 465150: loss 6.5093, time 125.40ms
iter 465160: loss 5.6563, time 125.51ms
iter 465170: loss 5.4290, time 125.56ms
iter 465180: loss 5.9458, time 125.69ms
iter 465190: loss 6.3985, time 125.89ms
iter 465200: loss 5.6837, time 125.83ms
iter 465210: loss 6.0706, time 125.64ms
iter 465220: loss 6.1269, time 125.21ms
iter 465230: loss 5.7619, time 125.71ms
iter 465240: loss 6.2475, time 125.85ms
step 465250: train loss 5.4974, val loss 5.5408
saving checkpoint to out-shakespeare-char
iter 465250: loss 5.6414, time 2903.11ms
iter 465260: loss 5.7743, time 126.37ms
iter 465270: loss 5.6352, time 125.92ms
iter 465280: loss 5.4563, time 125.72ms
iter 465290: loss 6.1595, time 125.86ms
iter 465300: loss 6.0864, time 126.07ms
iter 465310: loss 5.5830, time 125.77ms
iter 465320: loss 5.4288, time 128.71ms
iter 465330: loss 5.8172, time 127.14ms
iter 465340: loss 6.0502, time 125.62ms
iter 465350: loss 6.1167, time 125.74ms
iter 465360: loss 5.5891, time 125.23ms
iter 465370: loss 5.4147, time 126.00ms
iter 465380: loss 5.6166, time 125.47ms
iter 465390: loss 6.7953, time 125.95ms
iter 465400: loss 7.0568, time 125.51ms
iter 465410: loss 5.8351, time 125.70ms
iter 465420: loss 6.2071, time 125.92ms
iter 465430: loss 6.2151, time 125.64ms
iter 465440: loss 5.8255, time 127.43ms
iter 465450: loss 5.4333, time 126.08ms
iter 465460: loss 6.4284, time 126.05ms
iter 465470: loss 6.0515, time 125.33ms
iter 465480: loss 5.0293, time 125.59ms
iter 465490: loss 5.8167, time 126.06ms
step 465500: train loss 5.5300, val loss 5.5518
saving checkpoint to out-shakespeare-char
iter 465500: loss 5.7798, time 2900.86ms
iter 465510: loss 5.5275, time 125.77ms
iter 465520: loss 5.5283, time 125.24ms
iter 465530: loss 6.1438, time 126.13ms
iter 465540: loss 5.8169, time 126.17ms
iter 465550: loss 5.5158, time 126.16ms
iter 465560: loss 6.0961, time 127.93ms
iter 465570: loss 5.7357, time 125.18ms
iter 465580: loss 6.0565, time 128.16ms
iter 465590: loss 5.1775, time 126.15ms
iter 465600: loss 5.7428, time 125.55ms
iter 465610: loss 5.8649, time 125.38ms
iter 465620: loss 5.8646, time 125.48ms
iter 465630: loss 5.8792, time 126.18ms
iter 465640: loss 5.5101, time 125.71ms
iter 465650: loss 5.5428, time 125.76ms
iter 465660: loss 5.6594, time 121.51ms
iter 465670: loss 5.6618, time 121.50ms
iter 465680: loss 5.5001, time 121.58ms
iter 465690: loss 5.7971, time 121.66ms
iter 465700: loss 5.5453, time 121.58ms
iter 465710: loss 5.4500, time 122.72ms
iter 465720: loss 6.2449, time 121.88ms
iter 465730: loss 6.8584, time 125.19ms
iter 465740: loss 5.8763, time 124.82ms
step 465750: train loss 5.5774, val loss 5.5737
saving checkpoint to out-shakespeare-char
iter 465750: loss 5.5740, time 2885.91ms
iter 465760: loss 5.3314, time 124.26ms
iter 465770: loss 6.1805, time 124.25ms
iter 465780: loss 5.8278, time 124.66ms
iter 465790: loss 5.9118, time 124.83ms
iter 465800: loss 6.0733, time 124.92ms
iter 465810: loss 6.4202, time 124.69ms
iter 465820: loss 5.8810, time 125.28ms
iter 465830: loss 6.2020, time 126.13ms
iter 465840: loss 5.8999, time 125.12ms
iter 465850: loss 6.4635, time 125.44ms
iter 465860: loss 6.5315, time 125.93ms
iter 465870: loss 5.7006, time 125.84ms
iter 465880: loss 5.8503, time 125.73ms
iter 465890: loss 6.0172, time 125.17ms
iter 465900: loss 5.4912, time 124.89ms
iter 465910: loss 5.5048, time 125.25ms
iter 465920: loss 6.3264, time 125.01ms
iter 465930: loss 5.9302, time 125.00ms
iter 465940: loss 6.1725, time 124.78ms
iter 465950: loss 5.0231, time 125.67ms
iter 465960: loss 6.4086, time 125.51ms
iter 465970: loss 5.5337, time 126.93ms
iter 465980: loss 6.6168, time 125.86ms
iter 465990: loss 6.0943, time 125.32ms
step 466000: train loss 5.5563, val loss 5.5501
saving checkpoint to out-shakespeare-char
iter 466000: loss 5.8550, time 2885.68ms
iter 466010: loss 6.0337, time 125.59ms
iter 466020: loss 5.4707, time 126.83ms
iter 466030: loss 5.8648, time 125.31ms
iter 466040: loss 6.0583, time 126.23ms
iter 466050: loss 6.2312, time 125.60ms
iter 466060: loss 5.6526, time 125.62ms
iter 466070: loss 5.5829, time 125.80ms
iter 466080: loss 5.5667, time 125.69ms
iter 466090: loss 5.9294, time 127.91ms
iter 466100: loss 6.3092, time 126.02ms
iter 466110: loss 6.5021, time 125.34ms
iter 466120: loss 6.2742, time 125.25ms
iter 466130: loss 5.1323, time 126.85ms
iter 466140: loss 5.7275, time 125.41ms
iter 466150: loss 6.1296, time 127.67ms
iter 466160: loss 6.2877, time 125.17ms
iter 466170: loss 6.2112, time 128.05ms
iter 466180: loss 6.1383, time 125.57ms
iter 466190: loss 6.3685, time 127.87ms
iter 466200: loss 6.1641, time 125.48ms
iter 466210: loss 5.9358, time 128.63ms
iter 466220: loss 5.5991, time 125.36ms
iter 466230: loss 6.3531, time 128.17ms
iter 466240: loss 6.1440, time 125.57ms
step 466250: train loss 5.6355, val loss 5.5613
saving checkpoint to out-shakespeare-char
iter 466250: loss 6.0580, time 2895.90ms
iter 466260: loss 6.6828, time 125.17ms
iter 466270: loss 5.7057, time 130.01ms
iter 466280: loss 5.7836, time 125.87ms
iter 466290: loss 6.2513, time 125.96ms
iter 466300: loss 6.7172, time 125.55ms
iter 466310: loss 6.0217, time 125.60ms
iter 466320: loss 6.0817, time 126.01ms
iter 466330: loss 6.2465, time 126.91ms
iter 466340: loss 6.0604, time 125.49ms
iter 466350: loss 6.0634, time 125.52ms
iter 466360: loss 5.9045, time 125.41ms
iter 466370: loss 6.3190, time 125.62ms
iter 466380: loss 4.7361, time 125.69ms
iter 466390: loss 6.0175, time 126.85ms
iter 466400: loss 6.3425, time 125.40ms
iter 466410: loss 5.4095, time 124.56ms
iter 466420: loss 5.7154, time 125.93ms
iter 466430: loss 5.9692, time 126.15ms
iter 466440: loss 5.8756, time 126.10ms
iter 466450: loss 5.8268, time 126.15ms
iter 466460: loss 5.9723, time 127.71ms
iter 466470: loss 5.7070, time 126.53ms
iter 466480: loss 5.9682, time 125.25ms
iter 466490: loss 6.5925, time 124.88ms
step 466500: train loss 5.5918, val loss 5.5713
saving checkpoint to out-shakespeare-char
iter 466500: loss 5.2722, time 2911.52ms
iter 466510: loss 6.4685, time 123.93ms
iter 466520: loss 6.7736, time 125.75ms
iter 466530: loss 5.6632, time 125.55ms
iter 466540: loss 5.6648, time 125.38ms
iter 466550: loss 5.9026, time 126.56ms
iter 466560: loss 6.0741, time 124.93ms
iter 466570: loss 6.0484, time 124.52ms
iter 466580: loss 6.1711, time 125.18ms
iter 466590: loss 6.6194, time 125.89ms
iter 466600: loss 5.7815, time 125.86ms
iter 466610: loss 5.8144, time 126.03ms
iter 466620: loss 5.6535, time 125.76ms
iter 466630: loss 5.5274, time 125.27ms
iter 466640: loss 5.7523, time 125.86ms
iter 466650: loss 5.7216, time 125.18ms
iter 466660: loss 6.5828, time 125.84ms
iter 466670: loss 5.4110, time 125.80ms
iter 466680: loss 5.9601, time 125.73ms
iter 466690: loss 5.7684, time 126.17ms
iter 466700: loss 5.5763, time 124.36ms
iter 466710: loss 6.0453, time 125.18ms
iter 466720: loss 6.0581, time 125.66ms
iter 466730: loss 6.7902, time 125.68ms
iter 466740: loss 6.2754, time 125.85ms
step 466750: train loss 5.5475, val loss 5.5813
saving checkpoint to out-shakespeare-char
iter 466750: loss 5.7657, time 2889.21ms
iter 466760: loss 6.5365, time 125.69ms
iter 466770: loss 6.3069, time 122.08ms
iter 466780: loss 5.9978, time 125.14ms
iter 466790: loss 5.6568, time 121.84ms
iter 466800: loss 6.1456, time 122.42ms
iter 466810: loss 5.7253, time 119.27ms
iter 466820: loss 5.8323, time 122.90ms
iter 466830: loss 5.6559, time 125.76ms
iter 466840: loss 5.7944, time 122.26ms
iter 466850: loss 6.0021, time 124.91ms
iter 466860: loss 6.1409, time 124.95ms
iter 466870: loss 5.8981, time 124.90ms
iter 466880: loss 6.5115, time 124.96ms
iter 466890: loss 6.3601, time 124.87ms
iter 466900: loss 6.5611, time 124.82ms
iter 466910: loss 5.6550, time 124.02ms
iter 466920: loss 5.7843, time 124.32ms
iter 466930: loss 5.8178, time 125.13ms
iter 466940: loss 6.6065, time 124.81ms
iter 466950: loss 5.9206, time 124.17ms
iter 466960: loss 6.6522, time 124.99ms
iter 466970: loss 6.0263, time 124.86ms
iter 466980: loss 5.9070, time 124.92ms
iter 466990: loss 5.9916, time 128.25ms
step 467000: train loss 5.5784, val loss 5.5863
saving checkpoint to out-shakespeare-char
iter 467000: loss 5.4413, time 2898.34ms
iter 467010: loss 5.5262, time 125.06ms
iter 467020: loss 6.2196, time 125.40ms
iter 467030: loss 5.6274, time 127.21ms
iter 467040: loss 5.9177, time 125.04ms
iter 467050: loss 6.3671, time 127.62ms
iter 467060: loss 6.1804, time 125.22ms
iter 467070: loss 6.1807, time 127.60ms
iter 467080: loss 6.2104, time 125.28ms
iter 467090: loss 5.5819, time 126.97ms
iter 467100: loss 5.5002, time 125.98ms
iter 467110: loss 6.0434, time 126.04ms
iter 467120: loss 6.0955, time 126.13ms
iter 467130: loss 5.6378, time 124.98ms
iter 467140: loss 6.1358, time 125.33ms
iter 467150: loss 5.6057, time 126.11ms
iter 467160: loss 6.6118, time 125.56ms
iter 467170: loss 5.7826, time 126.17ms
iter 467180: loss 6.0907, time 125.15ms
iter 467190: loss 5.7695, time 125.30ms
iter 467200: loss 6.2289, time 128.25ms
iter 467210: loss 6.0923, time 124.45ms
iter 467220: loss 5.7212, time 124.98ms
iter 467230: loss 6.2047, time 125.36ms
iter 467240: loss 6.1882, time 125.64ms
step 467250: train loss 5.6010, val loss 5.5610
saving checkpoint to out-shakespeare-char
iter 467250: loss 5.8661, time 2912.77ms
iter 467260: loss 5.7888, time 125.75ms
iter 467270: loss 6.2031, time 126.58ms
iter 467280: loss 6.4554, time 125.12ms
iter 467290: loss 6.0770, time 125.10ms
iter 467300: loss 5.8067, time 125.52ms
iter 467310: loss 6.0806, time 125.47ms
iter 467320: loss 6.2699, time 125.15ms
iter 467330: loss 6.3699, time 125.08ms
iter 467340: loss 6.2729, time 125.23ms
iter 467350: loss 6.0455, time 125.18ms
iter 467360: loss 5.8310, time 125.39ms
iter 467370: loss 6.3032, time 124.98ms
iter 467380: loss 6.0623, time 125.33ms
iter 467390: loss 5.9915, time 121.23ms
iter 467400: loss 5.7351, time 123.54ms
iter 467410: loss 5.9604, time 121.67ms
iter 467420: loss 5.9808, time 121.90ms
iter 467430: loss 5.3516, time 124.69ms
iter 467440: loss 5.8702, time 122.00ms
iter 467450: loss 5.5836, time 121.92ms
iter 467460: loss 5.3178, time 121.46ms
iter 467470: loss 6.4004, time 123.03ms
iter 467480: loss 5.6809, time 122.03ms
iter 467490: loss 6.1911, time 122.96ms
step 467500: train loss 5.5355, val loss 5.5842
saving checkpoint to out-shakespeare-char
iter 467500: loss 5.6810, time 2887.93ms
iter 467510: loss 5.9103, time 125.91ms
iter 467520: loss 5.6369, time 124.93ms
iter 467530: loss 6.3393, time 125.28ms
iter 467540: loss 5.7674, time 125.82ms
iter 467550: loss 6.1822, time 124.75ms
iter 467560: loss 5.8168, time 125.46ms
iter 467570: loss 6.4415, time 125.11ms
iter 467580: loss 6.1052, time 124.90ms
iter 467590: loss 5.3878, time 125.36ms
iter 467600: loss 6.2713, time 125.05ms
iter 467610: loss 6.1362, time 125.17ms
iter 467620: loss 6.0799, time 124.96ms
iter 467630: loss 6.0693, time 125.01ms
iter 467640: loss 5.6917, time 125.25ms
iter 467650: loss 5.9708, time 125.46ms
iter 467660: loss 5.2509, time 125.31ms
iter 467670: loss 6.1284, time 125.09ms
iter 467680: loss 5.7408, time 124.86ms
iter 467690: loss 5.8071, time 124.28ms
iter 467700: loss 6.0237, time 124.94ms
iter 467710: loss 5.3132, time 125.23ms
iter 467720: loss 6.3163, time 124.98ms
iter 467730: loss 5.8296, time 125.15ms
iter 467740: loss 5.7313, time 127.90ms
step 467750: train loss 5.5224, val loss 5.5632
saving checkpoint to out-shakespeare-char
iter 467750: loss 5.2675, time 2872.55ms
iter 467760: loss 5.4516, time 125.65ms
iter 467770: loss 6.7196, time 124.77ms
iter 467780: loss 5.5544, time 124.59ms
iter 467790: loss 5.8492, time 126.04ms
iter 467800: loss 5.9124, time 124.85ms
iter 467810: loss 6.0856, time 124.53ms
iter 467820: loss 6.1265, time 124.86ms
iter 467830: loss 6.1511, time 124.00ms
iter 467840: loss 6.5058, time 124.88ms
iter 467850: loss 5.6443, time 124.74ms
iter 467860: loss 5.8932, time 125.79ms
iter 467870: loss 6.1058, time 124.44ms
iter 467880: loss 6.1953, time 125.15ms
iter 467890: loss 6.1684, time 124.63ms
iter 467900: loss 6.4692, time 125.00ms
iter 467910: loss 6.1064, time 125.86ms
iter 467920: loss 5.5007, time 124.80ms
iter 467930: loss 5.9196, time 124.16ms
iter 467940: loss 6.5257, time 125.03ms
iter 467950: loss 6.2907, time 124.77ms
iter 467960: loss 5.7065, time 125.22ms
iter 467970: loss 6.2356, time 124.86ms
iter 467980: loss 6.5605, time 124.86ms
iter 467990: loss 6.2496, time 124.88ms
step 468000: train loss 5.5178, val loss 5.5378
saving checkpoint to out-shakespeare-char
iter 468000: loss 5.2348, time 2885.28ms
iter 468010: loss 5.8055, time 126.46ms
iter 468020: loss 5.9089, time 125.16ms
iter 468030: loss 6.3017, time 124.97ms
iter 468040: loss 6.3453, time 125.20ms
iter 468050: loss 6.1971, time 127.46ms
iter 468060: loss 6.7268, time 125.14ms
iter 468070: loss 5.6620, time 124.59ms
iter 468080: loss 5.6769, time 125.10ms
iter 468090: loss 5.8607, time 125.19ms
iter 468100: loss 5.7500, time 125.00ms
iter 468110: loss 5.9351, time 126.14ms
iter 468120: loss 5.6779, time 124.99ms
iter 468130: loss 5.7979, time 125.40ms
iter 468140: loss 6.1485, time 125.26ms
iter 468150: loss 6.3288, time 125.21ms
iter 468160: loss 6.1161, time 127.34ms
iter 468170: loss 5.5990, time 125.74ms
iter 468180: loss 6.2805, time 125.41ms
iter 468190: loss 5.9395, time 125.83ms
iter 468200: loss 6.3946, time 125.62ms
iter 468210: loss 6.2066, time 125.47ms
iter 468220: loss 5.9076, time 125.52ms
iter 468230: loss 5.1582, time 127.37ms
iter 468240: loss 6.1544, time 124.75ms
step 468250: train loss 5.5447, val loss 5.5207
saving checkpoint to out-shakespeare-char
iter 468250: loss 6.4234, time 2893.28ms
iter 468260: loss 6.5667, time 125.18ms
iter 468270: loss 6.7141, time 125.29ms
iter 468280: loss 5.7658, time 130.67ms
iter 468290: loss 6.8022, time 125.83ms
iter 468300: loss 5.8643, time 125.49ms
iter 468310: loss 6.0044, time 125.75ms
iter 468320: loss 5.5870, time 125.74ms
iter 468330: loss 5.8276, time 126.27ms
iter 468340: loss 5.6177, time 128.47ms
iter 468350: loss 6.0620, time 126.11ms
iter 468360: loss 5.6597, time 129.14ms
iter 468370: loss 5.7878, time 125.60ms
iter 468380: loss 6.4632, time 128.24ms
iter 468390: loss 6.2212, time 125.66ms
iter 468400: loss 5.9847, time 127.86ms
iter 468410: loss 5.8570, time 125.47ms
iter 468420: loss 6.5367, time 128.25ms
iter 468430: loss 5.9480, time 127.16ms
iter 468440: loss 6.0621, time 125.82ms
iter 468450: loss 6.0386, time 125.77ms
iter 468460: loss 5.6595, time 125.82ms
iter 468470: loss 5.4500, time 125.74ms
iter 468480: loss 5.9269, time 126.26ms
iter 468490: loss 5.8893, time 125.63ms
step 468500: train loss 5.5696, val loss 5.5258
saving checkpoint to out-shakespeare-char
iter 468500: loss 5.5308, time 2872.27ms
iter 468510: loss 5.6494, time 125.68ms
iter 468520: loss 5.7322, time 126.33ms
iter 468530: loss 5.6435, time 126.15ms
iter 468540: loss 6.3187, time 125.73ms
iter 468550: loss 6.3115, time 125.82ms
iter 468560: loss 6.3244, time 125.54ms
iter 468570: loss 6.0426, time 125.59ms
iter 468580: loss 6.0397, time 125.38ms
iter 468590: loss 6.0994, time 127.19ms
iter 468600: loss 5.9441, time 124.75ms
iter 468610: loss 6.0520, time 125.56ms
iter 468620: loss 5.6456, time 124.87ms
iter 468630: loss 5.8256, time 126.30ms
iter 468640: loss 5.8185, time 126.06ms
iter 468650: loss 5.9878, time 126.04ms
iter 468660: loss 6.2406, time 125.56ms
iter 468670: loss 6.6907, time 125.39ms
iter 468680: loss 6.2613, time 125.71ms
iter 468690: loss 6.8272, time 125.81ms
iter 468700: loss 6.2439, time 127.28ms
iter 468710: loss 5.8407, time 128.39ms
iter 468720: loss 5.7337, time 126.14ms
iter 468730: loss 5.6680, time 128.18ms
iter 468740: loss 6.0124, time 125.64ms
step 468750: train loss 5.5429, val loss 5.6031
saving checkpoint to out-shakespeare-char
iter 468750: loss 5.9556, time 2912.70ms
iter 468760: loss 5.4563, time 123.14ms
iter 468770: loss 5.7261, time 122.95ms
iter 468780: loss 6.0379, time 121.90ms
iter 468790: loss 5.7811, time 122.05ms
iter 468800: loss 5.3680, time 124.89ms
iter 468810: loss 7.2468, time 122.03ms
iter 468820: loss 5.8343, time 122.00ms
iter 468830: loss 5.9507, time 121.62ms
iter 468840: loss 6.0823, time 121.84ms
iter 468850: loss 5.9172, time 121.87ms
iter 468860: loss 6.5950, time 122.18ms
iter 468870: loss 5.6170, time 123.17ms
iter 468880: loss 6.5234, time 122.87ms
iter 468890: loss 5.7485, time 121.55ms
iter 468900: loss 5.6545, time 121.00ms
iter 468910: loss 6.1730, time 121.70ms
iter 468920: loss 6.5267, time 122.97ms
iter 468930: loss 5.8497, time 121.84ms
iter 468940: loss 5.9121, time 122.07ms
iter 468950: loss 6.4292, time 122.92ms
iter 468960: loss 6.6989, time 121.48ms
iter 468970: loss 6.0247, time 121.33ms
iter 468980: loss 6.4249, time 124.40ms
iter 468990: loss 7.0859, time 120.29ms
step 469000: train loss 5.5092, val loss 5.5479
saving checkpoint to out-shakespeare-char
iter 469000: loss 6.2555, time 2907.83ms
iter 469010: loss 6.1785, time 122.97ms
iter 469020: loss 6.2024, time 121.27ms
iter 469030: loss 6.5924, time 121.29ms
iter 469040: loss 6.0012, time 122.43ms
iter 469050: loss 5.7583, time 122.59ms
iter 469060: loss 5.9655, time 122.82ms
iter 469070: loss 5.4634, time 121.36ms
iter 469080: loss 5.5870, time 121.30ms
iter 469090: loss 5.9526, time 121.28ms
iter 469100: loss 5.8712, time 123.11ms
iter 469110: loss 5.7078, time 121.52ms
iter 469120: loss 5.5393, time 121.49ms
iter 469130: loss 6.1799, time 122.70ms
iter 469140: loss 5.8869, time 121.33ms
iter 469150: loss 6.5240, time 121.62ms
iter 469160: loss 5.5403, time 123.99ms
iter 469170: loss 6.3330, time 123.28ms
iter 469180: loss 5.9596, time 123.21ms
iter 469190: loss 6.1427, time 121.52ms
iter 469200: loss 6.4845, time 121.47ms
iter 469210: loss 5.8191, time 122.14ms
iter 469220: loss 5.6259, time 122.54ms
iter 469230: loss 6.2756, time 121.43ms
iter 469240: loss 5.7363, time 122.27ms
step 469250: train loss 5.5380, val loss 5.5424
saving checkpoint to out-shakespeare-char
iter 469250: loss 5.3993, time 2906.06ms
iter 469260: loss 6.5809, time 121.82ms
iter 469270: loss 6.2279, time 122.06ms
iter 469280: loss 5.8615, time 122.07ms
iter 469290: loss 6.3028, time 121.07ms
iter 469300: loss 5.8106, time 121.16ms
iter 469310: loss 5.7244, time 121.54ms
iter 469320: loss 5.9995, time 123.25ms
iter 469330: loss 5.7724, time 121.01ms
iter 469340: loss 5.8064, time 122.12ms
iter 469350: loss 6.4595, time 123.04ms
iter 469360: loss 6.2110, time 123.15ms
iter 469370: loss 6.2206, time 121.10ms
iter 469380: loss 5.4955, time 120.11ms
iter 469390: loss 6.3456, time 123.82ms
iter 469400: loss 6.4703, time 120.63ms
iter 469410: loss 6.2315, time 121.53ms
iter 469420: loss 5.6406, time 120.98ms
iter 469430: loss 6.3287, time 121.57ms
iter 469440: loss 5.2427, time 120.49ms
iter 469450: loss 5.4619, time 120.48ms
iter 469460: loss 6.0663, time 122.09ms
iter 469470: loss 5.6804, time 122.80ms
iter 469480: loss 5.5112, time 121.56ms
iter 469490: loss 5.5362, time 121.36ms
step 469500: train loss 5.5555, val loss 5.5161
saving checkpoint to out-shakespeare-char
iter 469500: loss 5.4179, time 2899.20ms
iter 469510: loss 6.2480, time 121.59ms
iter 469520: loss 5.9790, time 121.75ms
iter 469530: loss 5.9890, time 122.91ms
iter 469540: loss 6.1371, time 120.55ms
iter 469550: loss 5.8488, time 122.60ms
iter 469560: loss 5.5752, time 121.42ms
iter 469570: loss 6.0612, time 120.94ms
iter 469580: loss 6.3644, time 122.92ms
iter 469590: loss 5.4677, time 121.70ms
iter 469600: loss 5.8627, time 121.60ms
iter 469610: loss 6.0134, time 124.18ms
iter 469620: loss 6.0420, time 120.18ms
iter 469630: loss 6.1545, time 121.34ms
iter 469640: loss 6.0976, time 120.92ms
iter 469650: loss 6.5220, time 122.77ms
iter 469660: loss 6.1751, time 121.39ms
iter 469670: loss 5.5688, time 122.65ms
iter 469680: loss 5.1477, time 121.37ms
iter 469690: loss 6.0457, time 121.60ms
iter 469700: loss 5.6437, time 122.98ms
iter 469710: loss 5.7118, time 121.50ms
iter 469720: loss 6.4277, time 121.53ms
iter 469730: loss 6.2123, time 121.36ms
iter 469740: loss 5.8184, time 121.51ms
step 469750: train loss 5.6219, val loss 5.5859
saving checkpoint to out-shakespeare-char
iter 469750: loss 5.9480, time 2900.20ms
iter 469760: loss 6.4280, time 121.51ms
iter 469770: loss 6.2523, time 122.96ms
iter 469780: loss 5.8204, time 121.23ms
iter 469790: loss 6.3837, time 121.51ms
iter 469800: loss 5.9418, time 121.54ms
iter 469810: loss 6.3636, time 120.81ms
iter 469820: loss 6.0551, time 122.82ms
iter 469830: loss 5.8077, time 124.16ms
iter 469840: loss 6.3944, time 122.02ms
iter 469850: loss 5.6535, time 121.54ms
iter 469860: loss 5.5889, time 120.19ms
iter 469870: loss 6.1925, time 122.73ms
iter 469880: loss 6.1908, time 121.26ms
iter 469890: loss 6.7331, time 121.48ms
iter 469900: loss 6.2372, time 121.61ms
iter 469910: loss 5.4202, time 121.66ms
iter 469920: loss 5.8428, time 121.72ms
iter 469930: loss 6.4326, time 122.16ms
iter 469940: loss 5.0326, time 121.91ms
iter 469950: loss 6.0121, time 121.52ms
iter 469960: loss 6.1636, time 123.01ms
iter 469970: loss 5.7380, time 121.23ms
iter 469980: loss 5.9172, time 121.47ms
iter 469990: loss 5.7011, time 124.21ms
step 470000: train loss 5.5631, val loss 5.5728
saving checkpoint to out-shakespeare-char
iter 470000: loss 6.1560, time 2901.48ms
iter 470010: loss 6.1620, time 122.01ms
iter 470020: loss 4.9663, time 121.04ms
iter 470030: loss 5.5879, time 122.30ms
iter 470040: loss 6.4331, time 123.62ms
iter 470050: loss 6.0261, time 121.44ms
iter 470060: loss 6.2641, time 121.73ms
iter 470070: loss 5.9529, time 124.51ms
iter 470080: loss 5.4688, time 121.79ms
iter 470090: loss 6.0023, time 122.57ms
iter 470100: loss 5.3989, time 123.83ms
iter 470110: loss 6.2686, time 121.99ms
iter 470120: loss 5.2122, time 121.95ms
iter 470130: loss 5.9301, time 122.12ms
iter 470140: loss 6.1014, time 122.00ms
iter 470150: loss 5.7881, time 122.25ms
iter 470160: loss 5.9778, time 122.35ms
iter 470170: loss 5.6573, time 121.97ms
iter 470180: loss 6.2325, time 122.04ms
iter 470190: loss 5.8654, time 123.25ms
iter 470200: loss 5.9050, time 122.10ms
iter 470210: loss 5.2768, time 121.96ms
iter 470220: loss 5.8339, time 122.92ms
iter 470230: loss 5.2033, time 121.45ms
iter 470240: loss 6.0224, time 122.20ms
step 470250: train loss 5.5596, val loss 5.5774
saving checkpoint to out-shakespeare-char
iter 470250: loss 5.5816, time 2898.30ms
iter 470260: loss 6.0038, time 121.54ms
iter 470270: loss 5.7131, time 122.63ms
iter 470280: loss 5.8530, time 122.69ms
iter 470290: loss 5.8499, time 123.05ms
iter 470300: loss 6.5614, time 121.48ms
iter 470310: loss 6.2330, time 121.63ms
iter 470320: loss 5.4700, time 122.46ms
iter 470330: loss 6.7486, time 121.90ms
iter 470340: loss 5.9315, time 121.57ms
iter 470350: loss 6.5105, time 122.42ms
iter 470360: loss 5.7111, time 121.53ms
iter 470370: loss 6.0674, time 121.51ms
iter 470380: loss 7.2095, time 123.97ms
iter 470390: loss 6.5971, time 121.67ms
iter 470400: loss 6.0532, time 122.22ms
iter 470410: loss 6.0143, time 123.27ms
iter 470420: loss 6.1470, time 121.63ms
iter 470430: loss 5.6068, time 120.66ms
iter 470440: loss 5.4772, time 123.02ms
iter 470450: loss 6.1777, time 122.61ms
iter 470460: loss 6.5483, time 121.98ms
iter 470470: loss 5.5055, time 123.14ms
iter 470480: loss 5.8606, time 121.55ms
iter 470490: loss 5.7639, time 121.62ms
step 470500: train loss 5.5573, val loss 5.5781
saving checkpoint to out-shakespeare-char
iter 470500: loss 5.7671, time 2926.65ms
iter 470510: loss 6.5529, time 121.60ms
iter 470520: loss 6.0130, time 118.53ms
iter 470530: loss 5.9618, time 121.67ms
iter 470540: loss 5.8320, time 121.98ms
iter 470550: loss 5.3509, time 121.40ms
iter 470560: loss 5.5261, time 123.10ms
iter 470570: loss 5.8372, time 121.21ms
iter 470580: loss 6.0611, time 131.77ms
iter 470590: loss 5.9709, time 127.52ms
iter 470600: loss 6.4142, time 125.62ms
iter 470610: loss 6.2546, time 126.14ms
iter 470620: loss 5.7616, time 125.74ms
iter 470630: loss 6.2857, time 125.39ms
iter 470640: loss 6.1516, time 125.81ms
iter 470650: loss 5.7496, time 125.97ms
iter 470660: loss 6.1831, time 125.12ms
iter 470670: loss 5.9572, time 126.35ms
iter 470680: loss 6.4447, time 125.12ms
iter 470690: loss 7.1214, time 126.75ms
iter 470700: loss 5.8126, time 125.38ms
iter 470710: loss 5.6961, time 125.27ms
iter 470720: loss 5.5854, time 126.42ms
iter 470730: loss 6.0032, time 125.68ms
iter 470740: loss 6.0485, time 125.63ms
step 470750: train loss 5.5712, val loss 5.5211
saving checkpoint to out-shakespeare-char
iter 470750: loss 6.0239, time 2888.53ms
iter 470760: loss 5.8694, time 122.32ms
iter 470770: loss 5.7530, time 120.88ms
iter 470780: loss 5.7269, time 121.47ms
iter 470790: loss 6.0489, time 121.45ms
iter 470800: loss 6.1376, time 121.11ms
iter 470810: loss 6.4726, time 121.74ms
iter 470820: loss 6.3223, time 121.76ms
iter 470830: loss 5.8532, time 122.83ms
iter 470840: loss 5.5717, time 121.54ms
iter 470850: loss 5.8968, time 121.64ms
iter 470860: loss 5.6100, time 122.36ms
iter 470870: loss 5.9932, time 121.85ms
iter 470880: loss 5.8087, time 121.82ms
iter 470890: loss 6.8587, time 122.23ms
iter 470900: loss 6.3271, time 121.50ms
iter 470910: loss 5.7333, time 121.62ms
iter 470920: loss 6.2112, time 122.42ms
iter 470930: loss 5.2348, time 121.55ms
iter 470940: loss 5.6560, time 121.89ms
iter 470950: loss 6.3496, time 122.27ms
iter 470960: loss 5.9360, time 121.25ms
iter 470970: loss 5.9489, time 125.60ms
iter 470980: loss 6.0183, time 126.24ms
iter 470990: loss 5.8265, time 123.63ms
step 471000: train loss 5.5961, val loss 5.5580
saving checkpoint to out-shakespeare-char
iter 471000: loss 6.4746, time 2890.61ms
iter 471010: loss 5.7728, time 125.55ms
iter 471020: loss 6.3180, time 124.21ms
iter 471030: loss 5.9943, time 127.24ms
iter 471040: loss 6.4352, time 125.32ms
iter 471050: loss 5.4305, time 128.25ms
iter 471060: loss 7.0406, time 125.43ms
iter 471070: loss 6.4497, time 127.97ms
iter 471080: loss 5.3359, time 125.87ms
iter 471090: loss 6.8179, time 128.36ms
iter 471100: loss 6.8839, time 125.62ms
iter 471110: loss 5.3932, time 128.53ms
iter 471120: loss 5.6342, time 125.39ms
iter 471130: loss 5.9547, time 127.90ms
iter 471140: loss 6.7074, time 126.90ms
iter 471150: loss 5.9555, time 125.50ms
iter 471160: loss 5.8657, time 125.46ms
iter 471170: loss 5.9347, time 125.88ms
iter 471180: loss 6.0295, time 124.89ms
iter 471190: loss 5.7770, time 125.31ms
iter 471200: loss 5.0226, time 126.01ms
iter 471210: loss 6.1811, time 125.43ms
iter 471220: loss 6.3713, time 125.92ms
iter 471230: loss 6.2315, time 125.31ms
iter 471240: loss 6.1791, time 125.41ms
step 471250: train loss 5.5987, val loss 5.5192
saving checkpoint to out-shakespeare-char
iter 471250: loss 5.4079, time 2897.84ms
iter 471260: loss 5.2304, time 121.86ms
iter 471270: loss 5.3888, time 122.69ms
iter 471280: loss 6.8552, time 122.08ms
iter 471290: loss 6.3788, time 121.67ms
iter 471300: loss 5.5590, time 122.96ms
iter 471310: loss 6.6007, time 120.68ms
iter 471320: loss 5.0958, time 122.54ms
iter 471330: loss 6.6244, time 121.59ms
iter 471340: loss 5.9224, time 121.59ms
iter 471350: loss 6.1884, time 124.06ms
iter 471360: loss 5.6218, time 121.46ms
iter 471370: loss 5.4309, time 121.43ms
iter 471380: loss 6.3868, time 121.51ms
iter 471390: loss 6.7833, time 121.61ms
iter 471400: loss 5.4354, time 121.78ms
iter 471410: loss 5.7779, time 121.53ms
iter 471420: loss 5.9355, time 122.54ms
iter 471430: loss 6.5221, time 121.72ms
iter 471440: loss 5.4028, time 121.84ms
iter 471450: loss 6.2988, time 122.75ms
iter 471460: loss 6.1386, time 121.60ms
iter 471470: loss 6.4825, time 121.02ms
iter 471480: loss 5.3310, time 123.49ms
iter 471490: loss 5.7835, time 121.55ms
step 471500: train loss 5.5791, val loss 5.5668
saving checkpoint to out-shakespeare-char
iter 471500: loss 5.8160, time 2890.48ms
iter 471510: loss 6.5601, time 121.43ms
iter 471520: loss 5.8849, time 122.96ms
iter 471530: loss 5.9292, time 121.55ms
iter 471540: loss 6.2173, time 121.56ms
iter 471550: loss 6.0800, time 124.00ms
iter 471560: loss 4.9845, time 121.48ms
iter 471570: loss 6.2341, time 121.82ms
iter 471580: loss 6.5537, time 121.79ms
iter 471590: loss 5.7558, time 121.23ms
iter 471600: loss 6.0003, time 122.89ms
iter 471610: loss 5.9778, time 122.23ms
iter 471620: loss 6.1352, time 121.89ms
iter 471630: loss 5.9099, time 121.49ms
iter 471640: loss 6.3207, time 121.48ms
iter 471650: loss 6.0586, time 121.81ms
iter 471660: loss 5.9143, time 121.84ms
iter 471670: loss 5.8986, time 122.61ms
iter 471680: loss 5.5225, time 121.42ms
iter 471690: loss 6.6576, time 121.46ms
iter 471700: loss 6.4194, time 122.51ms
iter 471710: loss 5.8072, time 121.06ms
iter 471720: loss 6.1199, time 122.64ms
iter 471730: loss 5.3686, time 121.61ms
iter 471740: loss 5.4210, time 121.56ms
step 471750: train loss 5.5261, val loss 5.5745
saving checkpoint to out-shakespeare-char
iter 471750: loss 6.5692, time 2898.76ms
iter 471760: loss 6.0354, time 121.56ms
iter 471770: loss 5.2635, time 122.67ms
iter 471780: loss 6.1331, time 121.46ms
iter 471790: loss 6.2769, time 121.42ms
iter 471800: loss 5.5214, time 121.66ms
iter 471810: loss 5.6029, time 122.55ms
iter 471820: loss 5.8279, time 121.58ms
iter 471830: loss 6.6260, time 121.48ms
iter 471840: loss 5.9380, time 122.43ms
iter 471850: loss 5.8311, time 121.56ms
iter 471860: loss 6.0041, time 121.46ms
iter 471870: loss 6.3692, time 124.06ms
iter 471880: loss 5.8210, time 122.23ms
iter 471890: loss 6.1046, time 122.80ms
iter 471900: loss 5.7331, time 122.23ms
iter 471910: loss 5.9685, time 121.50ms
iter 471920: loss 6.6144, time 121.43ms
iter 471930: loss 6.1283, time 121.36ms
iter 471940: loss 5.6775, time 121.48ms
iter 471950: loss 5.9976, time 121.50ms
iter 471960: loss 5.6614, time 122.66ms
iter 471970: loss 5.8930, time 121.43ms
iter 471980: loss 5.9108, time 121.43ms
iter 471990: loss 6.1391, time 123.99ms
step 472000: train loss 5.5449, val loss 5.5776
saving checkpoint to out-shakespeare-char
iter 472000: loss 5.3355, time 2889.14ms
iter 472010: loss 5.5294, time 121.56ms
iter 472020: loss 5.5597, time 121.54ms
iter 472030: loss 5.6523, time 124.30ms
iter 472040: loss 5.4494, time 121.60ms
iter 472050: loss 5.5377, time 124.67ms
iter 472060: loss 6.1198, time 121.58ms
iter 472070: loss 6.5549, time 121.44ms
iter 472080: loss 5.9744, time 122.70ms
iter 472090: loss 6.0465, time 122.53ms
iter 472100: loss 5.3950, time 121.51ms
iter 472110: loss 5.3965, time 121.38ms
iter 472120: loss 6.0716, time 124.06ms
iter 472130: loss 5.9613, time 121.56ms
iter 472140: loss 5.4679, time 121.46ms
iter 472150: loss 5.5405, time 121.56ms
iter 472160: loss 5.5854, time 121.63ms
iter 472170: loss 6.1639, time 121.52ms
iter 472180: loss 6.6509, time 121.55ms
iter 472190: loss 5.5205, time 122.72ms
iter 472200: loss 5.7415, time 122.71ms
iter 472210: loss 5.6574, time 123.29ms
iter 472220: loss 6.5770, time 121.56ms
iter 472230: loss 5.8576, time 121.56ms
iter 472240: loss 5.8592, time 124.16ms
step 472250: train loss 5.5891, val loss 5.5879
saving checkpoint to out-shakespeare-char
iter 472250: loss 5.6735, time 2896.85ms
iter 472260: loss 6.0374, time 120.67ms
iter 472270: loss 5.4336, time 122.72ms
iter 472280: loss 6.6229, time 122.62ms
iter 472290: loss 5.9486, time 120.73ms
iter 472300: loss 6.0083, time 122.61ms
iter 472310: loss 6.1502, time 121.27ms
iter 472320: loss 5.8160, time 121.44ms
iter 472330: loss 5.9523, time 124.10ms
iter 472340: loss 6.2936, time 121.48ms
iter 472350: loss 6.1189, time 121.55ms
iter 472360: loss 6.5175, time 121.76ms
iter 472370: loss 5.4376, time 121.29ms
iter 472380: loss 5.8887, time 121.45ms
iter 472390: loss 5.7927, time 122.68ms
iter 472400: loss 5.7299, time 121.71ms
iter 472410: loss 5.7708, time 121.64ms
iter 472420: loss 5.8869, time 122.71ms
iter 472430: loss 6.1220, time 121.35ms
iter 472440: loss 5.4250, time 121.47ms
iter 472450: loss 6.2140, time 124.74ms
iter 472460: loss 6.6355, time 121.99ms
iter 472470: loss 6.2897, time 120.22ms
iter 472480: loss 5.6360, time 121.62ms
iter 472490: loss 5.5810, time 121.34ms
step 472500: train loss 5.5612, val loss 5.5342
saving checkpoint to out-shakespeare-char
iter 472500: loss 6.1195, time 2892.29ms
iter 472510: loss 5.2706, time 121.20ms
iter 472520: loss 6.3847, time 121.32ms
iter 472530: loss 5.6298, time 121.24ms
iter 472540: loss 6.3524, time 121.29ms
iter 472550: loss 5.4171, time 124.18ms
iter 472560: loss 6.5092, time 121.24ms
iter 472570: loss 5.9135, time 122.93ms
iter 472580: loss 5.4839, time 121.31ms
iter 472590: loss 6.2676, time 121.12ms
iter 472600: loss 5.4717, time 121.84ms
iter 472610: loss 5.9305, time 122.40ms
iter 472620: loss 5.9136, time 121.37ms
iter 472630: loss 5.8791, time 121.42ms
iter 472640: loss 6.0586, time 123.61ms
iter 472650: loss 5.9519, time 121.36ms
iter 472660: loss 5.4144, time 121.96ms
iter 472670: loss 5.4638, time 124.26ms
iter 472680: loss 6.2773, time 121.68ms
iter 472690: loss 6.5606, time 122.92ms
iter 472700: loss 6.6514, time 121.45ms
iter 472710: loss 5.9742, time 121.19ms
iter 472720: loss 6.1850, time 121.51ms
iter 472730: loss 5.9625, time 123.93ms
iter 472740: loss 6.1591, time 122.29ms
step 472750: train loss 5.5920, val loss 5.5617
saving checkpoint to out-shakespeare-char
iter 472750: loss 5.9415, time 2907.69ms
iter 472760: loss 6.0946, time 121.63ms
iter 472770: loss 6.3877, time 122.05ms
iter 472780: loss 7.0353, time 121.91ms
iter 472790: loss 6.6342, time 121.90ms
iter 472800: loss 5.4487, time 121.70ms
iter 472810: loss 6.1941, time 121.55ms
iter 472820: loss 5.6282, time 122.75ms
iter 472830: loss 5.7874, time 121.50ms
iter 472840: loss 5.6135, time 121.57ms
iter 472850: loss 5.9725, time 121.57ms
iter 472860: loss 5.9661, time 121.26ms
iter 472870: loss 5.3881, time 122.65ms
iter 472880: loss 5.9283, time 121.36ms
iter 472890: loss 5.7247, time 121.67ms
iter 472900: loss 5.9185, time 122.56ms
iter 472910: loss 5.9209, time 121.80ms
iter 472920: loss 6.5011, time 121.47ms
iter 472930: loss 5.4628, time 122.80ms
iter 472940: loss 6.0276, time 121.49ms
iter 472950: loss 6.2471, time 120.42ms
iter 472960: loss 6.1855, time 123.98ms
iter 472970: loss 5.4198, time 118.53ms
iter 472980: loss 5.7394, time 121.52ms
iter 472990: loss 6.0417, time 122.81ms
step 473000: train loss 5.5602, val loss 5.5511
saving checkpoint to out-shakespeare-char
iter 473000: loss 5.9118, time 2879.95ms
iter 473010: loss 6.0298, time 121.26ms
iter 473020: loss 6.0826, time 121.54ms
iter 473030: loss 5.9968, time 122.68ms
iter 473040: loss 5.7036, time 121.66ms
iter 473050: loss 6.3060, time 122.45ms
iter 473060: loss 6.3972, time 123.96ms
iter 473070: loss 6.0824, time 121.83ms
iter 473080: loss 6.0737, time 120.76ms
iter 473090: loss 5.7216, time 121.94ms
iter 473100: loss 5.2264, time 121.40ms
iter 473110: loss 6.5354, time 121.66ms
iter 473120: loss 5.3548, time 121.65ms
iter 473130: loss 5.8086, time 126.07ms
iter 473140: loss 5.7385, time 121.44ms
iter 473150: loss 6.2692, time 121.58ms
iter 473160: loss 6.1232, time 124.32ms
iter 473170: loss 5.8520, time 122.70ms
iter 473180: loss 5.4707, time 121.31ms
iter 473190: loss 5.5478, time 123.08ms
iter 473200: loss 6.0990, time 121.44ms
iter 473210: loss 5.6603, time 121.62ms
iter 473220: loss 5.6781, time 122.59ms
iter 473230: loss 6.2649, time 120.35ms
iter 473240: loss 6.5343, time 121.61ms
step 473250: train loss 5.6051, val loss 5.6109
saving checkpoint to out-shakespeare-char
iter 473250: loss 6.1556, time 2902.63ms
iter 473260: loss 5.2949, time 122.56ms
iter 473270: loss 6.4988, time 122.26ms
iter 473280: loss 6.4046, time 121.50ms
iter 473290: loss 6.2004, time 122.55ms
iter 473300: loss 6.2883, time 121.46ms
iter 473310: loss 6.1266, time 122.38ms
iter 473320: loss 5.8238, time 124.25ms
iter 473330: loss 5.1946, time 121.78ms
iter 473340: loss 6.5884, time 121.42ms
iter 473350: loss 5.5766, time 122.16ms
iter 473360: loss 5.4136, time 121.63ms
iter 473370: loss 5.8931, time 121.53ms
iter 473380: loss 6.2173, time 122.70ms
iter 473390: loss 6.5977, time 122.15ms
iter 473400: loss 5.8792, time 121.59ms
iter 473410: loss 5.7965, time 124.08ms
iter 473420: loss 5.9559, time 121.54ms
iter 473430: loss 5.7014, time 121.89ms
iter 473440: loss 5.5271, time 121.46ms
iter 473450: loss 6.2784, time 121.48ms
iter 473460: loss 5.7782, time 123.88ms
iter 473470: loss 5.9632, time 122.89ms
iter 473480: loss 5.7103, time 121.68ms
iter 473490: loss 6.1810, time 121.82ms
step 473500: train loss 5.5548, val loss 5.6114
saving checkpoint to out-shakespeare-char
iter 473500: loss 5.7484, time 2900.19ms
iter 473510: loss 5.5058, time 121.64ms
iter 473520: loss 5.9126, time 122.69ms
iter 473530: loss 6.2345, time 122.53ms
iter 473540: loss 6.0933, time 122.09ms
iter 473550: loss 6.0343, time 121.53ms
iter 473560: loss 6.0186, time 121.43ms
iter 473570: loss 6.0416, time 121.77ms
iter 473580: loss 5.2511, time 122.37ms
iter 473590: loss 6.3887, time 122.20ms
iter 473600: loss 5.8835, time 121.50ms
iter 473610: loss 5.9722, time 121.63ms
iter 473620: loss 5.7074, time 122.61ms
iter 473630: loss 5.5888, time 122.00ms
iter 473640: loss 5.8029, time 122.84ms
iter 473650: loss 5.9743, time 122.50ms
iter 473660: loss 5.4910, time 121.79ms
iter 473670: loss 5.2466, time 121.86ms
iter 473680: loss 5.9558, time 122.83ms
iter 473690: loss 5.3474, time 121.47ms
iter 473700: loss 5.9555, time 121.82ms
iter 473710: loss 5.3980, time 124.19ms
iter 473720: loss 6.3398, time 121.70ms
iter 473730: loss 6.3420, time 121.80ms
iter 473740: loss 5.4605, time 122.04ms
step 473750: train loss 5.5828, val loss 5.5467
saving checkpoint to out-shakespeare-char
iter 473750: loss 5.7678, time 2910.08ms
iter 473760: loss 5.6606, time 119.49ms
iter 473770: loss 5.8024, time 119.68ms
iter 473780: loss 6.1383, time 121.05ms
iter 473790: loss 5.9281, time 120.49ms
iter 473800: loss 6.1562, time 119.58ms
iter 473810: loss 5.1801, time 124.72ms
iter 473820: loss 6.1758, time 126.72ms
iter 473830: loss 6.1470, time 125.66ms
iter 473840: loss 6.6990, time 125.54ms
iter 473850: loss 6.3774, time 125.81ms
iter 473860: loss 5.1846, time 125.47ms
iter 473870: loss 6.5550, time 125.65ms
iter 473880: loss 5.8723, time 125.25ms
iter 473890: loss 6.0052, time 124.90ms
iter 473900: loss 6.4292, time 124.87ms
iter 473910: loss 5.8409, time 125.23ms
iter 473920: loss 6.6173, time 124.47ms
iter 473930: loss 5.5430, time 125.27ms
iter 473940: loss 5.4427, time 125.66ms
iter 473950: loss 6.2479, time 125.21ms
iter 473960: loss 5.4471, time 125.28ms
iter 473970: loss 6.3250, time 125.22ms
iter 473980: loss 6.3237, time 125.46ms
iter 473990: loss 6.0313, time 125.13ms
step 474000: train loss 5.6210, val loss 5.5696
saving checkpoint to out-shakespeare-char
iter 474000: loss 6.0651, time 2890.59ms
iter 474010: loss 6.1748, time 125.80ms
iter 474020: loss 5.9581, time 128.07ms
iter 474030: loss 6.1341, time 125.14ms
iter 474040: loss 5.7282, time 128.20ms
iter 474050: loss 6.2168, time 124.95ms
iter 474060: loss 6.6468, time 127.95ms
iter 474070: loss 5.5429, time 125.08ms
iter 474080: loss 4.9522, time 126.84ms
iter 474090: loss 6.1080, time 124.37ms
iter 474100: loss 5.8125, time 125.10ms
iter 474110: loss 5.8074, time 125.17ms
iter 474120: loss 6.3325, time 125.60ms
iter 474130: loss 6.4021, time 126.57ms
iter 474140: loss 5.0111, time 125.86ms
iter 474150: loss 5.7095, time 125.69ms
iter 474160: loss 5.9607, time 125.66ms
iter 474170: loss 5.5416, time 125.46ms
iter 474180: loss 5.6059, time 125.33ms
iter 474190: loss 5.9162, time 125.53ms
iter 474200: loss 6.0082, time 126.91ms
iter 474210: loss 6.1970, time 125.63ms
iter 474220: loss 5.9252, time 125.78ms
iter 474230: loss 5.8117, time 125.76ms
iter 474240: loss 5.4821, time 126.09ms
step 474250: train loss 5.5944, val loss 5.5664
saving checkpoint to out-shakespeare-char
iter 474250: loss 5.9843, time 2895.91ms
iter 474260: loss 6.5687, time 128.27ms
iter 474270: loss 5.4539, time 125.72ms
iter 474280: loss 6.2478, time 128.26ms
iter 474290: loss 6.4777, time 125.40ms
iter 474300: loss 5.7159, time 127.71ms
iter 474310: loss 5.9505, time 126.14ms
iter 474320: loss 6.1170, time 125.32ms
iter 474330: loss 6.0346, time 125.26ms
iter 474340: loss 5.8846, time 124.49ms
iter 474350: loss 5.7102, time 124.70ms
iter 474360: loss 6.5816, time 126.66ms
iter 474370: loss 6.3987, time 129.14ms
iter 474380: loss 5.9145, time 125.60ms
iter 474390: loss 6.2755, time 127.52ms
iter 474400: loss 5.5343, time 125.93ms
iter 474410: loss 5.5922, time 128.56ms
iter 474420: loss 6.1097, time 125.32ms
iter 474430: loss 5.6393, time 127.35ms
iter 474440: loss 6.0186, time 125.79ms
iter 474450: loss 6.8522, time 128.40ms
iter 474460: loss 6.1443, time 125.92ms
iter 474470: loss 6.1699, time 127.27ms
iter 474480: loss 5.6233, time 125.77ms
iter 474490: loss 5.8586, time 126.51ms
step 474500: train loss 5.5625, val loss 5.5247
saving checkpoint to out-shakespeare-char
iter 474500: loss 5.8055, time 2892.15ms
iter 474510: loss 5.9593, time 124.92ms
iter 474520: loss 5.1057, time 123.14ms
iter 474530: loss 5.0996, time 124.88ms
iter 474540: loss 6.7097, time 124.34ms
iter 474550: loss 5.7762, time 124.33ms
iter 474560: loss 6.2365, time 124.96ms
iter 474570: loss 6.3562, time 124.59ms
iter 474580: loss 6.1156, time 124.23ms
iter 474590: loss 6.5530, time 124.48ms
iter 474600: loss 6.4400, time 124.63ms
iter 474610: loss 6.6363, time 124.07ms
iter 474620: loss 6.2540, time 125.00ms
iter 474630: loss 5.6035, time 126.81ms
iter 474640: loss 6.5097, time 125.78ms
iter 474650: loss 5.9588, time 125.14ms
iter 474660: loss 6.0414, time 125.80ms
iter 474670: loss 5.9470, time 125.56ms
iter 474680: loss 5.5455, time 125.12ms
iter 474690: loss 5.9517, time 125.36ms
iter 474700: loss 5.9752, time 125.68ms
iter 474710: loss 5.9461, time 125.32ms
iter 474720: loss 5.5957, time 125.89ms
iter 474730: loss 5.2181, time 125.43ms
iter 474740: loss 5.5507, time 125.76ms
step 474750: train loss 5.6096, val loss 5.5405
saving checkpoint to out-shakespeare-char
iter 474750: loss 5.5411, time 2890.63ms
iter 474760: loss 5.9523, time 125.74ms
iter 474770: loss 5.7794, time 125.61ms
iter 474780: loss 5.9025, time 126.01ms
iter 474790: loss 5.7859, time 125.49ms
iter 474800: loss 6.4160, time 125.51ms
iter 474810: loss 5.5844, time 124.84ms
iter 474820: loss 5.9295, time 125.58ms
iter 474830: loss 5.2009, time 125.93ms
iter 474840: loss 6.0020, time 125.60ms
iter 474850: loss 6.2016, time 125.52ms
iter 474860: loss 6.1755, time 125.63ms
iter 474870: loss 5.1888, time 125.40ms
iter 474880: loss 6.2305, time 125.43ms
iter 474890: loss 6.3821, time 125.41ms
iter 474900: loss 5.8548, time 125.89ms
iter 474910: loss 6.2619, time 125.53ms
iter 474920: loss 5.8785, time 125.96ms
iter 474930: loss 5.5483, time 125.52ms
iter 474940: loss 6.1260, time 125.99ms
iter 474950: loss 6.0307, time 125.74ms
iter 474960: loss 6.1536, time 125.69ms
iter 474970: loss 5.7188, time 125.76ms
iter 474980: loss 6.2198, time 125.58ms
iter 474990: loss 6.5450, time 125.75ms
step 475000: train loss 5.5885, val loss 5.5337
saving checkpoint to out-shakespeare-char
iter 475000: loss 6.1305, time 2887.46ms
iter 475010: loss 6.4403, time 124.12ms
iter 475020: loss 6.9341, time 127.96ms
iter 475030: loss 6.1769, time 124.79ms
iter 475040: loss 5.5526, time 127.26ms
iter 475050: loss 6.3357, time 125.06ms
iter 475060: loss 6.4549, time 127.27ms
iter 475070: loss 5.9327, time 124.83ms
iter 475080: loss 5.7307, time 124.86ms
iter 475090: loss 5.6466, time 125.11ms
iter 475100: loss 6.1502, time 124.58ms
iter 475110: loss 5.6088, time 124.60ms
iter 475120: loss 6.1651, time 124.95ms
iter 475130: loss 7.2104, time 124.08ms
iter 475140: loss 6.2424, time 124.18ms
iter 475150: loss 5.6417, time 124.85ms
iter 475160: loss 6.4816, time 124.59ms
iter 475170: loss 5.3260, time 124.37ms
iter 475180: loss 6.3488, time 125.67ms
iter 475190: loss 5.7671, time 125.39ms
iter 475200: loss 6.2050, time 125.26ms
iter 475210: loss 5.1550, time 125.24ms
iter 475220: loss 5.6311, time 125.23ms
iter 475230: loss 5.3804, time 124.99ms
iter 475240: loss 5.4114, time 125.14ms
step 475250: train loss 5.5638, val loss 5.5006
saving checkpoint to out-shakespeare-char
iter 475250: loss 5.5848, time 2918.62ms
iter 475260: loss 6.4800, time 127.55ms
iter 475270: loss 6.4899, time 124.84ms
iter 475280: loss 6.2897, time 127.87ms
iter 475290: loss 5.8295, time 121.53ms
iter 475300: loss 6.0493, time 121.63ms
iter 475310: loss 6.0174, time 122.99ms
iter 475320: loss 5.8661, time 121.47ms
iter 475330: loss 5.6125, time 121.66ms
iter 475340: loss 6.1912, time 124.34ms
iter 475350: loss 6.9964, time 121.83ms
iter 475360: loss 6.0656, time 121.66ms
iter 475370: loss 6.0464, time 121.63ms
iter 475380: loss 5.5132, time 121.87ms
iter 475390: loss 6.4038, time 121.78ms
iter 475400: loss 6.9317, time 121.75ms
iter 475410: loss 6.1171, time 121.90ms
iter 475420: loss 6.0449, time 121.73ms
iter 475430: loss 6.3877, time 121.34ms
iter 475440: loss 6.7782, time 122.66ms
iter 475450: loss 5.5177, time 121.32ms
iter 475460: loss 5.8029, time 121.65ms
iter 475470: loss 5.6938, time 124.08ms
iter 475480: loss 6.2802, time 121.49ms
iter 475490: loss 6.1964, time 121.58ms
step 475500: train loss 5.5589, val loss 5.5476
saving checkpoint to out-shakespeare-char
iter 475500: loss 6.3010, time 2912.54ms
iter 475510: loss 6.2319, time 121.65ms
iter 475520: loss 5.9023, time 121.23ms
iter 475530: loss 6.5522, time 121.10ms
iter 475540: loss 5.6027, time 121.44ms
iter 475550: loss 5.7279, time 121.60ms
iter 475560: loss 6.0281, time 120.69ms
iter 475570: loss 5.8155, time 123.07ms
iter 475580: loss 5.6915, time 120.90ms
iter 475590: loss 5.5868, time 121.92ms
iter 475600: loss 5.6583, time 122.70ms
iter 475610: loss 5.5377, time 120.90ms
iter 475620: loss 6.2215, time 121.66ms
iter 475630: loss 5.6509, time 124.41ms
iter 475640: loss 6.6885, time 121.88ms
iter 475650: loss 5.9002, time 121.74ms
iter 475660: loss 6.2706, time 121.48ms
iter 475670: loss 5.2181, time 121.70ms
iter 475680: loss 6.4323, time 121.69ms
iter 475690: loss 5.7486, time 121.48ms
iter 475700: loss 5.4459, time 123.47ms
iter 475710: loss 5.5378, time 121.40ms
iter 475720: loss 6.1744, time 121.53ms
iter 475730: loss 5.8002, time 122.75ms
iter 475740: loss 5.9579, time 121.42ms
step 475750: train loss 5.5612, val loss 5.5827
saving checkpoint to out-shakespeare-char
iter 475750: loss 6.1172, time 2910.13ms
iter 475760: loss 5.8749, time 121.44ms
iter 475770: loss 6.2645, time 122.84ms
iter 475780: loss 5.5447, time 121.47ms
iter 475790: loss 5.6789, time 120.30ms
iter 475800: loss 5.6819, time 121.08ms
iter 475810: loss 6.4346, time 121.80ms
iter 475820: loss 5.7028, time 121.56ms
iter 475830: loss 6.2936, time 124.34ms
iter 475840: loss 5.4906, time 121.88ms
iter 475850: loss 5.6684, time 121.48ms
iter 475860: loss 5.9511, time 121.24ms
iter 475870: loss 5.7832, time 121.25ms
iter 475880: loss 5.8739, time 120.61ms
iter 475890: loss 6.0552, time 121.14ms
iter 475900: loss 5.6029, time 123.14ms
iter 475910: loss 6.2876, time 121.57ms
iter 475920: loss 5.8464, time 121.17ms
iter 475930: loss 6.7032, time 124.60ms
iter 475940: loss 6.3764, time 121.71ms
iter 475950: loss 5.7488, time 119.97ms
iter 475960: loss 5.7243, time 121.81ms
iter 475970: loss 6.2018, time 122.31ms
iter 475980: loss 6.3210, time 121.72ms
iter 475990: loss 5.6958, time 121.52ms
step 476000: train loss 5.5783, val loss 5.5729
saving checkpoint to out-shakespeare-char
iter 476000: loss 5.9875, time 2905.71ms
iter 476010: loss 6.0720, time 125.80ms
iter 476020: loss 6.1226, time 125.81ms
iter 476030: loss 5.6476, time 125.69ms
iter 476040: loss 5.9960, time 125.74ms
iter 476050: loss 5.9465, time 125.74ms
iter 476060: loss 6.0223, time 125.54ms
iter 476070: loss 5.9312, time 125.96ms
iter 476080: loss 6.2707, time 125.85ms
iter 476090: loss 5.5819, time 126.11ms
iter 476100: loss 5.8523, time 125.23ms
iter 476110: loss 5.7568, time 125.98ms
iter 476120: loss 6.1318, time 126.28ms
iter 476130: loss 6.1977, time 125.79ms
iter 476140: loss 5.3649, time 125.44ms
iter 476150: loss 5.5088, time 125.55ms
iter 476160: loss 6.2724, time 125.74ms
iter 476170: loss 6.0863, time 125.71ms
iter 476180: loss 6.1856, time 125.80ms
iter 476190: loss 5.5511, time 125.70ms
iter 476200: loss 6.4423, time 125.57ms
iter 476210: loss 5.8651, time 125.65ms
iter 476220: loss 6.0053, time 126.31ms
iter 476230: loss 5.8751, time 125.56ms
iter 476240: loss 6.6654, time 126.20ms
step 476250: train loss 5.4909, val loss 5.5561
saving checkpoint to out-shakespeare-char
iter 476250: loss 5.4786, time 2910.02ms
iter 476260: loss 6.0808, time 126.25ms
iter 476270: loss 5.5316, time 125.66ms
iter 476280: loss 6.7942, time 125.10ms
iter 476290: loss 5.5799, time 124.68ms
iter 476300: loss 5.0854, time 125.77ms
iter 476310: loss 5.4180, time 125.70ms
iter 476320: loss 6.0310, time 126.33ms
iter 476330: loss 5.5626, time 126.50ms
iter 476340: loss 6.3908, time 126.31ms
iter 476350: loss 6.5721, time 124.29ms
iter 476360: loss 5.6371, time 125.04ms
iter 476370: loss 6.1789, time 124.99ms
iter 476380: loss 6.4413, time 125.17ms
iter 476390: loss 6.0030, time 125.36ms
iter 476400: loss 6.3731, time 126.39ms
iter 476410: loss 6.5855, time 125.06ms
iter 476420: loss 6.3704, time 127.58ms
iter 476430: loss 5.9523, time 126.20ms
iter 476440: loss 5.8788, time 124.89ms
iter 476450: loss 6.4131, time 127.60ms
iter 476460: loss 6.6853, time 124.87ms
iter 476470: loss 6.5822, time 124.94ms
iter 476480: loss 5.9526, time 124.87ms
iter 476490: loss 5.8820, time 119.59ms
step 476500: train loss 5.5853, val loss 5.5183
saving checkpoint to out-shakespeare-char
iter 476500: loss 5.8459, time 2893.32ms
iter 476510: loss 6.1589, time 124.83ms
iter 476520: loss 5.6696, time 122.98ms
iter 476530: loss 5.9819, time 125.08ms
iter 476540: loss 6.0100, time 124.21ms
iter 476550: loss 6.5268, time 124.60ms
iter 476560: loss 6.1312, time 124.08ms
iter 476570: loss 5.3378, time 124.96ms
iter 476580: loss 5.3374, time 126.00ms
iter 476590: loss 5.9018, time 125.47ms
iter 476600: loss 5.9105, time 125.47ms
iter 476610: loss 5.7633, time 126.99ms
iter 476620: loss 5.4216, time 125.84ms
iter 476630: loss 6.2644, time 125.58ms
iter 476640: loss 6.9748, time 125.34ms
iter 476650: loss 6.1385, time 125.45ms
iter 476660: loss 5.9223, time 124.43ms
iter 476670: loss 5.9628, time 125.81ms
iter 476680: loss 6.5407, time 125.18ms
iter 476690: loss 6.7983, time 125.42ms
iter 476700: loss 5.4771, time 125.47ms
iter 476710: loss 6.1690, time 125.72ms
iter 476720: loss 5.8731, time 127.13ms
iter 476730: loss 6.0673, time 125.71ms
iter 476740: loss 5.6033, time 125.60ms
step 476750: train loss 5.5405, val loss 5.5148
saving checkpoint to out-shakespeare-char
iter 476750: loss 6.2843, time 2893.43ms
iter 476760: loss 6.3696, time 125.83ms
iter 476770: loss 5.4449, time 126.06ms
iter 476780: loss 6.7785, time 124.28ms
iter 476790: loss 5.3395, time 125.28ms
iter 476800: loss 6.3187, time 124.98ms
iter 476810: loss 5.6273, time 125.20ms
iter 476820: loss 5.8534, time 125.82ms
iter 476830: loss 6.2383, time 125.47ms
iter 476840: loss 5.6400, time 125.96ms
iter 476850: loss 6.5233, time 125.78ms
iter 476860: loss 5.7499, time 126.33ms
iter 476870: loss 6.4733, time 126.11ms
iter 476880: loss 6.0500, time 126.15ms
iter 476890: loss 6.3815, time 128.66ms
iter 476900: loss 5.4432, time 125.69ms
iter 476910: loss 6.0721, time 128.24ms
iter 476920: loss 6.3342, time 125.55ms
iter 476930: loss 5.7767, time 127.61ms
iter 476940: loss 6.6165, time 131.09ms
iter 476950: loss 5.3707, time 125.59ms
iter 476960: loss 5.5343, time 125.53ms
iter 476970: loss 6.6553, time 128.13ms
iter 476980: loss 5.7039, time 125.53ms
iter 476990: loss 5.7082, time 124.83ms
step 477000: train loss 5.5286, val loss 5.5921
saving checkpoint to out-shakespeare-char
iter 477000: loss 5.3251, time 2888.44ms
iter 477010: loss 6.1011, time 125.83ms
iter 477020: loss 6.0598, time 125.48ms
iter 477030: loss 5.6729, time 125.77ms
iter 477040: loss 5.9778, time 127.21ms
iter 477050: loss 6.0938, time 125.58ms
iter 477060: loss 6.3870, time 125.67ms
iter 477070: loss 6.0113, time 125.60ms
iter 477080: loss 5.9843, time 125.78ms
iter 477090: loss 5.6722, time 125.53ms
iter 477100: loss 6.1411, time 125.77ms
iter 477110: loss 5.8987, time 125.71ms
iter 477120: loss 5.5517, time 125.95ms
iter 477130: loss 5.6770, time 125.71ms
iter 477140: loss 6.1727, time 126.00ms
iter 477150: loss 6.4167, time 125.50ms
iter 477160: loss 6.3790, time 125.77ms
iter 477170: loss 6.7763, time 125.59ms
iter 477180: loss 6.3579, time 125.99ms
iter 477190: loss 5.6658, time 125.75ms
iter 477200: loss 5.8468, time 125.88ms
iter 477210: loss 6.2528, time 125.54ms
iter 477220: loss 6.3104, time 125.82ms
iter 477230: loss 5.9415, time 125.74ms
iter 477240: loss 5.2238, time 125.98ms
step 477250: train loss 5.5665, val loss 5.5796
saving checkpoint to out-shakespeare-char
iter 477250: loss 5.2219, time 2880.85ms
iter 477260: loss 6.0806, time 126.30ms
iter 477270: loss 6.1144, time 125.64ms
iter 477280: loss 6.1746, time 126.08ms
iter 477290: loss 6.0645, time 125.82ms
iter 477300: loss 6.8130, time 126.85ms
iter 477310: loss 5.4296, time 125.66ms
iter 477320: loss 6.1859, time 125.80ms
iter 477330: loss 5.7238, time 125.85ms
iter 477340: loss 6.3503, time 125.54ms
iter 477350: loss 6.0236, time 125.98ms
iter 477360: loss 5.7655, time 125.29ms
iter 477370: loss 5.8782, time 124.75ms
iter 477380: loss 5.6939, time 125.59ms
iter 477390: loss 6.0917, time 125.88ms
iter 477400: loss 6.1461, time 125.39ms
iter 477410: loss 6.1054, time 124.92ms
iter 477420: loss 5.7979, time 125.78ms
iter 477430: loss 5.9120, time 126.19ms
iter 477440: loss 5.6209, time 125.81ms
iter 477450: loss 6.1596, time 126.74ms
iter 477460: loss 5.7843, time 125.85ms
iter 477470: loss 5.2343, time 125.77ms
iter 477480: loss 6.3081, time 125.51ms
iter 477490: loss 6.5976, time 126.07ms
step 477500: train loss 5.5374, val loss 5.5611
saving checkpoint to out-shakespeare-char
iter 477500: loss 6.1010, time 2896.82ms
iter 477510: loss 5.4883, time 129.45ms
iter 477520: loss 6.0153, time 126.06ms
iter 477530: loss 5.8601, time 128.48ms
iter 477540: loss 5.8061, time 125.64ms
iter 477550: loss 5.5592, time 128.19ms
iter 477560: loss 6.4560, time 125.26ms
iter 477570: loss 6.4404, time 128.10ms
iter 477580: loss 5.7193, time 126.49ms
iter 477590: loss 5.6511, time 125.70ms
iter 477600: loss 6.9710, time 125.84ms
iter 477610: loss 5.6432, time 125.66ms
iter 477620: loss 5.1637, time 125.19ms
iter 477630: loss 6.6787, time 126.47ms
iter 477640: loss 6.0082, time 125.31ms
iter 477650: loss 5.9914, time 125.10ms
iter 477660: loss 6.5177, time 124.89ms
iter 477670: loss 6.3537, time 125.05ms
iter 477680: loss 5.9602, time 126.37ms
iter 477690: loss 5.2949, time 124.90ms
iter 477700: loss 6.3146, time 125.08ms
iter 477710: loss 6.1177, time 124.97ms
iter 477720: loss 5.4757, time 125.28ms
iter 477730: loss 5.6358, time 124.83ms
iter 477740: loss 5.9870, time 125.00ms
step 477750: train loss 5.5562, val loss 5.5839
saving checkpoint to out-shakespeare-char
iter 477750: loss 7.1096, time 2878.36ms
iter 477760: loss 5.7237, time 125.70ms
iter 477770: loss 5.2414, time 124.72ms
iter 477780: loss 5.8817, time 125.09ms
iter 477790: loss 5.7127, time 125.18ms
iter 477800: loss 5.9711, time 127.18ms
iter 477810: loss 5.9594, time 125.25ms
iter 477820: loss 6.3928, time 125.64ms
iter 477830: loss 5.8508, time 125.88ms
iter 477840: loss 5.8771, time 124.91ms
iter 477850: loss 5.6794, time 125.36ms
iter 477860: loss 5.2660, time 126.20ms
iter 477870: loss 5.6820, time 125.26ms
iter 477880: loss 5.7355, time 125.29ms
iter 477890: loss 5.6948, time 124.91ms
iter 477900: loss 6.7126, time 125.19ms
iter 477910: loss 5.9401, time 125.13ms
iter 477920: loss 5.7894, time 125.37ms
iter 477930: loss 5.5710, time 125.19ms
iter 477940: loss 6.1976, time 125.32ms
iter 477950: loss 5.7541, time 125.34ms
iter 477960: loss 6.6670, time 125.65ms
iter 477970: loss 6.0957, time 125.44ms
iter 477980: loss 5.9563, time 124.67ms
iter 477990: loss 6.0695, time 125.07ms
step 478000: train loss 5.5193, val loss 5.5885
saving checkpoint to out-shakespeare-char
iter 478000: loss 5.8399, time 2888.25ms
iter 478010: loss 6.0839, time 124.14ms
iter 478020: loss 5.1483, time 125.30ms
iter 478030: loss 5.8574, time 125.16ms
iter 478040: loss 5.8686, time 125.65ms
iter 478050: loss 5.8736, time 125.26ms
iter 478060: loss 6.1666, time 125.12ms
iter 478070: loss 5.8612, time 125.36ms
iter 478080: loss 5.4772, time 125.47ms
iter 478090: loss 5.6870, time 125.27ms
iter 478100: loss 6.1641, time 125.53ms
iter 478110: loss 6.2649, time 125.61ms
iter 478120: loss 5.9153, time 125.52ms
iter 478130: loss 6.1000, time 125.38ms
iter 478140: loss 6.1342, time 125.28ms
iter 478150: loss 5.7096, time 125.35ms
iter 478160: loss 5.6768, time 124.53ms
iter 478170: loss 5.6673, time 125.16ms
iter 478180: loss 6.5312, time 124.70ms
iter 478190: loss 5.4090, time 125.49ms
iter 478200: loss 5.6695, time 124.83ms
iter 478210: loss 6.1896, time 125.14ms
iter 478220: loss 5.5607, time 124.71ms
iter 478230: loss 5.6586, time 126.96ms
iter 478240: loss 5.9591, time 124.53ms
step 478250: train loss 5.5666, val loss 5.5773
saving checkpoint to out-shakespeare-char
iter 478250: loss 6.5614, time 2896.88ms
iter 478260: loss 5.9809, time 125.90ms
iter 478270: loss 5.9498, time 125.65ms
iter 478280: loss 5.7156, time 127.50ms
iter 478290: loss 6.2230, time 125.61ms
iter 478300: loss 6.2097, time 125.35ms
iter 478310: loss 6.1772, time 125.21ms
iter 478320: loss 6.6588, time 125.09ms
iter 478330: loss 5.9912, time 124.98ms
iter 478340: loss 6.1576, time 125.37ms
iter 478350: loss 6.6949, time 125.26ms
iter 478360: loss 5.9415, time 125.08ms
iter 478370: loss 5.4401, time 125.07ms
iter 478380: loss 5.4428, time 125.25ms
iter 478390: loss 6.0154, time 126.50ms
iter 478400: loss 5.7497, time 125.45ms
iter 478410: loss 5.0960, time 125.37ms
iter 478420: loss 5.3592, time 125.25ms
iter 478430: loss 5.8318, time 125.63ms
iter 478440: loss 6.7290, time 125.63ms
iter 478450: loss 5.7047, time 125.41ms
iter 478460: loss 5.4812, time 125.30ms
iter 478470: loss 5.1287, time 125.42ms
iter 478480: loss 5.7249, time 124.38ms
iter 478490: loss 6.0047, time 124.70ms
step 478500: train loss 5.5293, val loss 5.6035
saving checkpoint to out-shakespeare-char
iter 478500: loss 5.4135, time 2856.00ms
iter 478510: loss 6.0765, time 125.53ms
iter 478520: loss 5.2053, time 125.58ms
iter 478530: loss 5.4893, time 125.61ms
iter 478540: loss 5.3390, time 125.86ms
iter 478550: loss 6.0340, time 127.14ms
iter 478560: loss 6.0003, time 125.91ms
iter 478570: loss 6.2770, time 125.98ms
iter 478580: loss 6.2813, time 125.48ms
iter 478590: loss 5.8777, time 126.53ms
iter 478600: loss 5.6077, time 125.47ms
iter 478610: loss 6.5617, time 125.39ms
iter 478620: loss 5.9852, time 125.74ms
iter 478630: loss 5.4102, time 125.81ms
iter 478640: loss 5.7450, time 125.52ms
iter 478650: loss 5.7519, time 125.83ms
iter 478660: loss 6.1036, time 127.17ms
iter 478670: loss 5.5128, time 125.82ms
iter 478680: loss 5.6843, time 125.81ms
iter 478690: loss 6.1343, time 125.64ms
iter 478700: loss 6.7027, time 124.89ms
iter 478710: loss 6.1744, time 125.64ms
iter 478720: loss 5.6855, time 125.17ms
iter 478730: loss 6.3242, time 125.17ms
iter 478740: loss 6.2636, time 126.70ms
step 478750: train loss 5.5723, val loss 5.5998
saving checkpoint to out-shakespeare-char
iter 478750: loss 6.0267, time 2906.24ms
iter 478760: loss 5.7860, time 125.33ms
iter 478770: loss 5.9339, time 125.28ms
iter 478780: loss 5.5607, time 125.68ms
iter 478790: loss 5.5733, time 124.84ms
iter 478800: loss 6.2781, time 125.02ms
iter 478810: loss 6.8713, time 125.11ms
iter 478820: loss 6.1801, time 124.85ms
iter 478830: loss 6.1864, time 125.36ms
iter 478840: loss 5.8036, time 125.23ms
iter 478850: loss 6.2829, time 125.34ms
iter 478860: loss 5.8298, time 125.26ms
iter 478870: loss 6.3791, time 125.45ms
iter 478880: loss 6.0123, time 125.42ms
iter 478890: loss 6.1656, time 125.33ms
iter 478900: loss 5.7876, time 125.28ms
iter 478910: loss 6.6283, time 125.08ms
iter 478920: loss 6.1951, time 126.56ms
iter 478930: loss 6.3635, time 125.09ms
iter 478940: loss 6.1061, time 125.41ms
iter 478950: loss 5.8423, time 125.21ms
iter 478960: loss 5.6941, time 124.98ms
iter 478970: loss 6.9981, time 130.36ms
iter 478980: loss 6.1767, time 125.53ms
iter 478990: loss 5.3265, time 127.81ms
step 479000: train loss 5.5273, val loss 5.5463
saving checkpoint to out-shakespeare-char
iter 479000: loss 6.0201, time 2909.69ms
iter 479010: loss 5.8734, time 125.23ms
iter 479020: loss 6.4165, time 124.92ms
iter 479030: loss 6.2260, time 125.28ms
iter 479040: loss 6.1923, time 125.24ms
iter 479050: loss 5.8705, time 121.45ms
iter 479060: loss 6.2329, time 121.56ms
iter 479070: loss 5.7758, time 121.10ms
iter 479080: loss 5.6966, time 121.54ms
iter 479090: loss 6.8062, time 121.35ms
iter 479100: loss 6.0453, time 123.00ms
iter 479110: loss 5.7597, time 121.30ms
iter 479120: loss 6.4115, time 123.75ms
iter 479130: loss 5.8366, time 121.35ms
iter 479140: loss 6.3428, time 121.39ms
iter 479150: loss 5.8394, time 121.39ms
iter 479160: loss 5.2019, time 122.51ms
iter 479170: loss 6.2794, time 121.87ms
iter 479180: loss 6.1729, time 121.94ms
iter 479190: loss 5.7958, time 122.82ms
iter 479200: loss 5.3715, time 121.35ms
iter 479210: loss 6.3229, time 120.99ms
iter 479220: loss 6.1515, time 121.60ms
iter 479230: loss 5.3586, time 120.39ms
iter 479240: loss 7.1093, time 121.18ms
step 479250: train loss 5.5724, val loss 5.5739
saving checkpoint to out-shakespeare-char
iter 479250: loss 6.0601, time 2896.13ms
iter 479260: loss 5.4908, time 121.38ms
iter 479270: loss 6.0624, time 122.05ms
iter 479280: loss 5.8780, time 123.07ms
iter 479290: loss 5.5792, time 122.39ms
iter 479300: loss 5.7322, time 121.88ms
iter 479310: loss 5.5573, time 121.74ms
iter 479320: loss 6.1420, time 122.97ms
iter 479330: loss 5.8880, time 121.93ms
iter 479340: loss 6.0192, time 121.85ms
iter 479350: loss 5.6652, time 122.95ms
iter 479360: loss 5.8386, time 121.80ms
iter 479370: loss 5.5549, time 122.02ms
iter 479380: loss 5.9995, time 124.46ms
iter 479390: loss 6.6024, time 121.09ms
iter 479400: loss 5.9790, time 122.92ms
iter 479410: loss 5.7941, time 122.84ms
iter 479420: loss 5.9493, time 121.79ms
iter 479430: loss 5.9327, time 125.59ms
iter 479440: loss 5.8088, time 126.01ms
iter 479450: loss 5.3744, time 125.64ms
iter 479460: loss 5.7089, time 125.72ms
iter 479470: loss 6.5973, time 125.62ms
iter 479480: loss 6.0020, time 125.54ms
iter 479490: loss 6.3269, time 125.67ms
step 479500: train loss 5.5906, val loss 5.5807
saving checkpoint to out-shakespeare-char
iter 479500: loss 5.8969, time 2915.50ms
iter 479510: loss 5.4907, time 125.34ms
iter 479520: loss 5.5767, time 124.95ms
iter 479530: loss 5.8797, time 125.19ms
iter 479540: loss 6.1511, time 125.15ms
iter 479550: loss 5.7832, time 125.23ms
iter 479560: loss 6.0762, time 126.93ms
iter 479570: loss 5.8649, time 125.75ms
iter 479580: loss 5.8077, time 125.75ms
iter 479590: loss 6.6859, time 123.26ms
iter 479600: loss 6.2361, time 121.81ms
iter 479610: loss 5.7563, time 122.26ms
iter 479620: loss 6.8111, time 124.25ms
iter 479630: loss 5.7558, time 121.08ms
iter 479640: loss 5.4754, time 121.66ms
iter 479650: loss 5.7838, time 121.55ms
iter 479660: loss 6.0670, time 122.47ms
iter 479670: loss 5.5121, time 120.61ms
iter 479680: loss 5.6744, time 122.77ms
iter 479690: loss 5.6710, time 121.62ms
iter 479700: loss 6.4817, time 121.44ms
iter 479710: loss 5.9786, time 121.62ms
iter 479720: loss 5.5913, time 121.47ms
iter 479730: loss 5.3796, time 121.10ms
iter 479740: loss 6.2420, time 121.63ms
step 479750: train loss 5.5455, val loss 5.5483
saving checkpoint to out-shakespeare-char
iter 479750: loss 5.6606, time 2883.54ms
iter 479760: loss 6.3092, time 121.43ms
iter 479770: loss 5.5888, time 121.93ms
iter 479780: loss 6.4445, time 123.33ms
iter 479790: loss 6.2705, time 121.88ms
iter 479800: loss 6.1799, time 122.39ms
iter 479810: loss 5.7589, time 124.19ms
iter 479820: loss 6.0015, time 123.70ms
iter 479830: loss 6.1471, time 122.19ms
iter 479840: loss 6.2587, time 121.55ms
iter 479850: loss 6.4928, time 122.90ms
iter 479860: loss 6.0133, time 123.11ms
iter 479870: loss 6.2799, time 121.63ms
iter 479880: loss 5.4885, time 121.51ms
iter 479890: loss 6.1257, time 121.77ms
iter 479900: loss 6.2241, time 122.97ms
iter 479910: loss 6.1925, time 121.84ms
iter 479920: loss 5.3367, time 121.90ms
iter 479930: loss 6.0262, time 122.87ms
iter 479940: loss 6.0950, time 121.86ms
iter 479950: loss 5.8758, time 121.95ms
iter 479960: loss 5.3716, time 124.35ms
iter 479970: loss 6.0567, time 121.33ms
iter 479980: loss 5.5680, time 123.01ms
iter 479990: loss 5.8022, time 124.76ms
step 480000: train loss 5.5745, val loss 5.5876
saving checkpoint to out-shakespeare-char
iter 480000: loss 5.5919, time 2896.66ms
iter 480010: loss 6.0495, time 121.55ms
iter 480020: loss 6.4208, time 121.45ms
iter 480030: loss 5.5542, time 122.75ms
iter 480040: loss 5.6754, time 121.81ms
iter 480050: loss 5.8374, time 121.35ms
iter 480060: loss 5.8546, time 122.41ms
iter 480070: loss 5.9995, time 122.49ms
iter 480080: loss 5.6613, time 121.74ms
iter 480090: loss 6.0692, time 121.52ms
iter 480100: loss 6.5910, time 122.54ms
iter 480110: loss 5.9999, time 121.21ms
iter 480120: loss 5.5389, time 121.22ms
iter 480130: loss 5.6584, time 121.45ms
iter 480140: loss 6.3066, time 122.54ms
iter 480150: loss 4.8414, time 121.95ms
iter 480160: loss 5.3849, time 123.81ms
iter 480170: loss 6.6050, time 121.27ms
iter 480180: loss 5.3795, time 121.37ms
iter 480190: loss 5.9073, time 122.04ms
iter 480200: loss 6.1922, time 122.15ms
iter 480210: loss 5.7326, time 121.12ms
iter 480220: loss 5.9862, time 121.20ms
iter 480230: loss 5.9446, time 123.83ms
iter 480240: loss 6.1386, time 121.23ms
step 480250: train loss 5.5339, val loss 5.6195
saving checkpoint to out-shakespeare-char
iter 480250: loss 6.1300, time 2877.30ms
iter 480260: loss 6.2045, time 125.70ms
iter 480270: loss 5.8185, time 125.59ms
iter 480280: loss 6.5870, time 125.46ms
iter 480290: loss 6.3491, time 125.34ms
iter 480300: loss 5.1237, time 124.68ms
iter 480310: loss 6.1986, time 124.41ms
iter 480320: loss 6.3686, time 125.35ms
iter 480330: loss 6.3862, time 125.02ms
iter 480340: loss 5.6389, time 125.41ms
iter 480350: loss 6.6614, time 125.09ms
iter 480360: loss 5.6936, time 124.85ms
iter 480370: loss 5.8352, time 123.67ms
iter 480380: loss 6.1189, time 125.18ms
iter 480390: loss 5.7885, time 125.09ms
iter 480400: loss 5.9334, time 125.17ms
iter 480410: loss 6.0135, time 124.89ms
iter 480420: loss 4.8086, time 126.50ms
iter 480430: loss 5.9088, time 124.99ms
iter 480440: loss 5.3998, time 125.45ms
iter 480450: loss 6.3377, time 125.50ms
iter 480460: loss 5.5620, time 125.47ms
iter 480470: loss 5.7783, time 125.44ms
iter 480480: loss 5.8297, time 125.43ms
iter 480490: loss 5.9313, time 130.05ms
step 480500: train loss 5.5511, val loss 5.5849
saving checkpoint to out-shakespeare-char
iter 480500: loss 5.7201, time 2887.98ms
iter 480510: loss 6.9409, time 121.70ms
iter 480520: loss 6.1251, time 122.36ms
iter 480530: loss 5.8369, time 121.51ms
iter 480540: loss 5.2473, time 121.58ms
iter 480550: loss 6.4357, time 121.32ms
iter 480560: loss 5.2258, time 122.59ms
iter 480570: loss 5.8016, time 121.46ms
iter 480580: loss 5.9916, time 121.68ms
iter 480590: loss 6.3239, time 122.19ms
iter 480600: loss 6.5842, time 121.29ms
iter 480610: loss 5.3872, time 122.46ms
iter 480620: loss 5.7918, time 121.24ms
iter 480630: loss 5.7135, time 121.21ms
iter 480640: loss 6.1213, time 123.90ms
iter 480650: loss 5.9005, time 122.15ms
iter 480660: loss 5.6207, time 121.50ms
iter 480670: loss 5.9093, time 121.81ms
iter 480680: loss 5.8704, time 121.53ms
iter 480690: loss 6.2817, time 121.15ms
iter 480700: loss 6.0560, time 121.57ms
iter 480710: loss 5.8132, time 121.40ms
iter 480720: loss 6.2405, time 121.93ms
iter 480730: loss 5.9164, time 122.63ms
iter 480740: loss 6.2197, time 123.76ms
step 480750: train loss 5.5894, val loss 5.5362
saving checkpoint to out-shakespeare-char
iter 480750: loss 5.7579, time 2882.93ms
iter 480760: loss 6.1090, time 121.51ms
iter 480770: loss 6.1173, time 122.09ms
iter 480780: loss 6.0431, time 123.62ms
iter 480790: loss 6.1587, time 122.61ms
iter 480800: loss 5.8462, time 121.92ms
iter 480810: loss 5.4784, time 121.57ms
iter 480820: loss 6.1898, time 121.64ms
iter 480830: loss 5.5751, time 124.20ms
iter 480840: loss 5.9061, time 121.56ms
iter 480850: loss 5.8509, time 120.87ms
iter 480860: loss 6.3889, time 121.25ms
iter 480870: loss 6.3991, time 122.82ms
iter 480880: loss 5.6382, time 121.44ms
iter 480890: loss 5.9397, time 121.36ms
iter 480900: loss 5.9068, time 124.06ms
iter 480910: loss 5.1724, time 122.71ms
iter 480920: loss 6.0999, time 124.29ms
iter 480930: loss 6.0604, time 121.45ms
iter 480940: loss 5.9813, time 121.44ms
iter 480950: loss 6.4536, time 121.11ms
iter 480960: loss 5.7182, time 120.72ms
iter 480970: loss 5.6525, time 121.24ms
iter 480980: loss 5.7851, time 121.96ms
iter 480990: loss 5.9875, time 124.11ms
step 481000: train loss 5.5495, val loss 5.5541
saving checkpoint to out-shakespeare-char
iter 481000: loss 5.7518, time 2914.86ms
iter 481010: loss 6.1722, time 125.43ms
iter 481020: loss 6.3490, time 125.38ms
iter 481030: loss 6.4397, time 125.85ms
iter 481040: loss 5.8498, time 125.36ms
iter 481050: loss 6.3184, time 124.95ms
iter 481060: loss 6.0560, time 125.69ms
iter 481070: loss 5.9127, time 125.81ms
iter 481080: loss 5.9572, time 126.82ms
iter 481090: loss 6.0278, time 128.86ms
iter 481100: loss 6.1137, time 125.66ms
iter 481110: loss 5.8059, time 126.23ms
iter 481120: loss 5.9800, time 126.08ms
iter 481130: loss 6.1039, time 125.43ms
iter 481140: loss 5.3157, time 125.51ms
iter 481150: loss 5.9071, time 125.19ms
iter 481160: loss 6.0902, time 125.66ms
iter 481170: loss 6.0993, time 125.00ms
iter 481180: loss 5.8657, time 126.91ms
iter 481190: loss 5.6584, time 128.33ms
iter 481200: loss 6.2354, time 125.11ms
iter 481210: loss 5.6613, time 130.65ms
iter 481220: loss 5.9573, time 125.73ms
iter 481230: loss 6.3395, time 125.62ms
iter 481240: loss 6.0522, time 124.83ms
step 481250: train loss 5.6198, val loss 5.5491
saving checkpoint to out-shakespeare-char
iter 481250: loss 5.5816, time 2889.38ms
iter 481260: loss 5.5666, time 122.76ms
iter 481270: loss 5.7242, time 122.15ms
iter 481280: loss 5.8942, time 122.06ms
iter 481290: loss 5.5401, time 121.40ms
iter 481300: loss 5.5710, time 122.10ms
iter 481310: loss 5.5421, time 120.19ms
iter 481320: loss 6.8801, time 119.95ms
iter 481330: loss 6.2211, time 121.48ms
iter 481340: loss 5.4200, time 123.03ms
iter 481350: loss 5.4881, time 123.13ms
iter 481360: loss 5.5617, time 121.49ms
iter 481370: loss 5.7063, time 121.82ms
iter 481380: loss 6.1081, time 124.17ms
iter 481390: loss 6.2677, time 121.56ms
iter 481400: loss 5.7888, time 125.22ms
iter 481410: loss 5.7635, time 125.77ms
iter 481420: loss 5.2400, time 125.73ms
iter 481430: loss 6.3945, time 126.76ms
iter 481440: loss 5.7291, time 125.78ms
iter 481450: loss 6.4006, time 124.91ms
iter 481460: loss 6.3749, time 127.07ms
iter 481470: loss 6.0767, time 125.63ms
iter 481480: loss 6.3885, time 125.43ms
iter 481490: loss 5.5956, time 125.66ms
step 481500: train loss 5.5931, val loss 5.6149
saving checkpoint to out-shakespeare-char
iter 481500: loss 6.0994, time 2883.28ms
iter 481510: loss 5.9293, time 123.21ms
iter 481520: loss 5.6209, time 121.70ms
iter 481530: loss 6.0199, time 124.38ms
iter 481540: loss 6.0194, time 121.64ms
iter 481550: loss 5.7713, time 121.69ms
iter 481560: loss 6.5276, time 121.63ms
iter 481570: loss 6.2543, time 121.64ms
iter 481580: loss 4.9100, time 121.85ms
iter 481590: loss 6.1057, time 121.92ms
iter 481600: loss 5.0014, time 123.07ms
iter 481610: loss 5.9250, time 121.98ms
iter 481620: loss 6.2085, time 125.44ms
iter 481630: loss 5.5452, time 125.47ms
iter 481640: loss 5.8132, time 125.47ms
iter 481650: loss 5.3374, time 125.70ms
iter 481660: loss 6.2828, time 124.74ms
iter 481670: loss 5.9001, time 125.82ms
iter 481680: loss 5.9488, time 124.54ms
iter 481690: loss 5.7925, time 125.82ms
iter 481700: loss 5.1773, time 125.80ms
iter 481710: loss 5.8146, time 126.03ms
iter 481720: loss 5.6592, time 125.01ms
iter 481730: loss 6.0422, time 127.49ms
iter 481740: loss 6.0231, time 125.73ms
step 481750: train loss 5.5265, val loss 5.5327
saving checkpoint to out-shakespeare-char
iter 481750: loss 6.1684, time 2895.47ms
iter 481760: loss 6.4441, time 127.74ms
iter 481770: loss 5.8936, time 127.41ms
iter 481780: loss 6.3964, time 125.89ms
iter 481790: loss 5.3321, time 125.83ms
iter 481800: loss 5.6748, time 126.09ms
iter 481810: loss 5.6046, time 126.14ms
iter 481820: loss 6.2421, time 125.79ms
iter 481830: loss 5.8100, time 125.89ms
iter 481840: loss 6.4328, time 125.88ms
iter 481850: loss 6.4894, time 125.88ms
iter 481860: loss 5.9541, time 125.87ms
iter 481870: loss 6.3095, time 125.69ms
iter 481880: loss 6.0155, time 126.21ms
iter 481890: loss 6.3013, time 126.41ms
iter 481900: loss 5.9060, time 125.01ms
iter 481910: loss 6.2394, time 125.78ms
iter 481920: loss 6.5226, time 125.84ms
iter 481930: loss 6.2849, time 125.74ms
iter 481940: loss 6.6564, time 125.71ms
iter 481950: loss 5.8215, time 125.19ms
iter 481960: loss 5.8553, time 126.21ms
iter 481970: loss 5.3856, time 126.14ms
iter 481980: loss 5.7154, time 126.82ms
iter 481990: loss 6.6066, time 125.58ms
step 482000: train loss 5.5325, val loss 5.5648
saving checkpoint to out-shakespeare-char
iter 482000: loss 5.3570, time 2882.72ms
iter 482010: loss 6.3187, time 125.79ms
iter 482020: loss 5.8937, time 125.86ms
iter 482030: loss 6.3157, time 127.05ms
iter 482040: loss 5.2535, time 125.74ms
iter 482050: loss 6.0702, time 125.73ms
iter 482060: loss 5.5440, time 124.13ms
iter 482070: loss 5.7223, time 125.79ms
iter 482080: loss 6.1546, time 125.77ms
iter 482090: loss 5.9252, time 125.59ms
iter 482100: loss 5.9426, time 125.22ms
iter 482110: loss 5.4865, time 125.38ms
iter 482120: loss 5.8271, time 125.62ms
iter 482130: loss 5.7086, time 127.20ms
iter 482140: loss 5.9473, time 125.59ms
iter 482150: loss 5.8887, time 126.02ms
iter 482160: loss 6.3779, time 125.51ms
iter 482170: loss 6.5026, time 125.50ms
iter 482180: loss 6.3285, time 125.71ms
iter 482190: loss 6.1232, time 125.47ms
iter 482200: loss 5.2959, time 125.65ms
iter 482210: loss 6.5263, time 125.91ms
iter 482220: loss 6.0037, time 125.66ms
iter 482230: loss 5.9388, time 125.58ms
iter 482240: loss 6.0739, time 122.45ms
step 482250: train loss 5.5492, val loss 5.5263
saving checkpoint to out-shakespeare-char
iter 482250: loss 5.7280, time 2893.92ms
iter 482260: loss 6.2098, time 129.19ms
iter 482270: loss 5.3429, time 125.85ms
iter 482280: loss 6.0131, time 128.20ms
iter 482290: loss 5.5827, time 125.59ms
iter 482300: loss 6.4731, time 125.74ms
iter 482310: loss 5.8825, time 125.73ms
iter 482320: loss 6.4069, time 125.89ms
iter 482330: loss 5.9941, time 125.68ms
iter 482340: loss 5.6496, time 125.52ms
iter 482350: loss 6.0504, time 125.05ms
iter 482360: loss 6.2845, time 125.93ms
iter 482370: loss 6.3249, time 125.41ms
iter 482380: loss 5.5766, time 125.72ms
iter 482390: loss 6.0320, time 125.49ms
iter 482400: loss 5.6852, time 125.59ms
iter 482410: loss 5.7302, time 125.60ms
iter 482420: loss 5.6347, time 125.59ms
iter 482430: loss 6.0622, time 125.99ms
iter 482440: loss 6.3457, time 125.45ms
iter 482450: loss 5.9580, time 125.57ms
iter 482460: loss 5.9356, time 125.50ms
iter 482470: loss 5.7848, time 125.53ms
iter 482480: loss 6.0626, time 125.62ms
iter 482490: loss 5.8522, time 125.61ms
step 482500: train loss 5.6059, val loss 5.5636
saving checkpoint to out-shakespeare-char
iter 482500: loss 5.1241, time 2878.80ms
iter 482510: loss 6.3395, time 125.84ms
iter 482520: loss 5.9815, time 125.63ms
iter 482530: loss 6.4666, time 125.64ms
iter 482540: loss 5.6741, time 125.74ms
iter 482550: loss 6.3769, time 125.20ms
iter 482560: loss 6.2240, time 125.29ms
iter 482570: loss 6.2782, time 125.48ms
iter 482580: loss 6.5990, time 125.37ms
iter 482590: loss 6.7990, time 125.62ms
iter 482600: loss 5.7965, time 125.42ms
iter 482610: loss 6.1715, time 125.67ms
iter 482620: loss 5.8791, time 125.58ms
iter 482630: loss 5.8574, time 125.52ms
iter 482640: loss 5.6855, time 125.25ms
iter 482650: loss 6.0103, time 125.73ms
iter 482660: loss 6.9431, time 125.21ms
iter 482670: loss 6.7944, time 125.32ms
iter 482680: loss 5.2452, time 126.53ms
iter 482690: loss 6.1768, time 125.54ms
iter 482700: loss 5.9511, time 125.20ms
iter 482710: loss 6.4705, time 124.02ms
iter 482720: loss 5.9894, time 125.97ms
iter 482730: loss 6.4562, time 125.59ms
iter 482740: loss 5.9251, time 125.28ms
step 482750: train loss 5.5895, val loss 5.5417
saving checkpoint to out-shakespeare-char
iter 482750: loss 5.9059, time 2896.85ms
iter 482760: loss 6.1442, time 121.51ms
iter 482770: loss 6.1958, time 120.88ms
iter 482780: loss 5.8466, time 122.10ms
iter 482790: loss 6.0308, time 121.45ms
iter 482800: loss 6.0207, time 121.61ms
iter 482810: loss 5.8624, time 122.79ms
iter 482820: loss 5.6353, time 121.59ms
iter 482830: loss 6.1774, time 122.04ms
iter 482840: loss 5.8065, time 124.04ms
iter 482850: loss 5.9343, time 121.57ms
iter 482860: loss 5.9242, time 121.54ms
iter 482870: loss 5.7825, time 120.59ms
iter 482880: loss 5.8362, time 122.62ms
iter 482890: loss 6.4616, time 121.36ms
iter 482900: loss 6.2010, time 121.64ms
iter 482910: loss 6.2256, time 123.22ms
iter 482920: loss 6.0363, time 121.55ms
iter 482930: loss 6.3302, time 121.76ms
iter 482940: loss 5.8839, time 124.20ms
iter 482950: loss 5.5031, time 122.96ms
iter 482960: loss 5.4225, time 121.07ms
iter 482970: loss 6.8309, time 121.72ms
iter 482980: loss 6.1109, time 121.55ms
iter 482990: loss 6.7189, time 121.59ms
step 483000: train loss 5.5769, val loss 5.5556
saving checkpoint to out-shakespeare-char
iter 483000: loss 6.2048, time 2900.79ms
iter 483010: loss 6.1673, time 125.75ms
iter 483020: loss 6.1405, time 124.58ms
iter 483030: loss 6.0566, time 125.42ms
iter 483040: loss 6.3744, time 125.61ms
iter 483050: loss 6.8431, time 125.48ms
iter 483060: loss 6.0890, time 125.52ms
iter 483070: loss 6.2759, time 125.48ms
iter 483080: loss 5.7122, time 125.52ms
iter 483090: loss 5.3334, time 125.51ms
iter 483100: loss 5.8125, time 125.31ms
iter 483110: loss 5.8218, time 124.62ms
iter 483120: loss 6.2133, time 124.32ms
iter 483130: loss 5.6272, time 124.85ms
iter 483140: loss 5.3103, time 124.88ms
iter 483150: loss 5.4492, time 124.88ms
iter 483160: loss 6.0736, time 125.11ms
iter 483170: loss 5.9587, time 124.83ms
iter 483180: loss 6.2433, time 124.07ms
iter 483190: loss 6.2882, time 124.91ms
iter 483200: loss 6.3738, time 124.75ms
iter 483210: loss 6.2265, time 125.04ms
iter 483220: loss 5.6507, time 124.71ms
iter 483230: loss 5.8061, time 125.91ms
iter 483240: loss 6.4027, time 125.71ms
step 483250: train loss 5.5978, val loss 5.5684
saving checkpoint to out-shakespeare-char
iter 483250: loss 6.4135, time 2903.05ms
iter 483260: loss 6.2094, time 125.37ms
iter 483270: loss 6.5961, time 128.52ms
iter 483280: loss 5.5934, time 125.12ms
iter 483290: loss 5.8340, time 128.34ms
iter 483300: loss 5.7340, time 125.77ms
iter 483310: loss 6.3152, time 128.36ms
iter 483320: loss 5.9375, time 125.54ms
iter 483330: loss 6.1376, time 128.03ms
iter 483340: loss 6.1874, time 124.90ms
iter 483350: loss 6.1230, time 127.80ms
iter 483360: loss 5.9469, time 125.11ms
iter 483370: loss 5.7199, time 127.84ms
iter 483380: loss 6.1087, time 125.27ms
iter 483390: loss 5.9959, time 128.03ms
iter 483400: loss 6.2252, time 125.76ms
iter 483410: loss 5.8495, time 127.92ms
iter 483420: loss 5.6345, time 125.21ms
iter 483430: loss 6.2785, time 127.73ms
iter 483440: loss 5.0112, time 125.45ms
iter 483450: loss 5.9812, time 127.64ms
iter 483460: loss 5.8092, time 125.26ms
iter 483470: loss 5.7957, time 127.66ms
iter 483480: loss 5.2282, time 125.75ms
iter 483490: loss 6.5464, time 128.70ms
step 483500: train loss 5.5037, val loss 5.5503
saving checkpoint to out-shakespeare-char
iter 483500: loss 5.8230, time 2872.87ms
iter 483510: loss 5.6477, time 121.94ms
iter 483520: loss 6.2229, time 122.95ms
iter 483530: loss 5.8497, time 122.09ms
iter 483540: loss 6.5371, time 121.90ms
iter 483550: loss 5.4459, time 124.65ms
iter 483560: loss 6.1828, time 122.06ms
iter 483570: loss 6.4614, time 121.77ms
iter 483580: loss 5.6086, time 121.83ms
iter 483590: loss 5.4846, time 121.71ms
iter 483600: loss 5.0431, time 121.76ms
iter 483610: loss 6.2308, time 122.38ms
iter 483620: loss 5.7466, time 121.89ms
iter 483630: loss 6.6137, time 121.76ms
iter 483640: loss 5.9988, time 122.07ms
iter 483650: loss 5.6268, time 122.69ms
iter 483660: loss 6.8174, time 121.81ms
iter 483670: loss 5.3266, time 121.98ms
iter 483680: loss 6.5948, time 124.31ms
iter 483690: loss 6.0256, time 121.81ms
iter 483700: loss 5.9243, time 121.73ms
iter 483710: loss 5.8800, time 121.95ms
iter 483720: loss 6.2413, time 123.10ms
iter 483730: loss 6.0409, time 121.72ms
iter 483740: loss 5.8825, time 121.77ms
step 483750: train loss 5.5633, val loss 5.5748
saving checkpoint to out-shakespeare-char
iter 483750: loss 6.0408, time 2901.88ms
iter 483760: loss 6.1758, time 122.81ms
iter 483770: loss 5.3322, time 121.85ms
iter 483780: loss 6.1415, time 123.19ms
iter 483790: loss 5.5321, time 121.64ms
iter 483800: loss 5.7878, time 121.80ms
iter 483810: loss 5.5861, time 124.43ms
iter 483820: loss 6.1803, time 121.71ms
iter 483830: loss 6.2418, time 121.74ms
iter 483840: loss 6.1037, time 120.74ms
iter 483850: loss 5.6564, time 121.78ms
iter 483860: loss 5.8151, time 121.99ms
iter 483870: loss 5.9752, time 120.95ms
iter 483880: loss 6.1031, time 122.97ms
iter 483890: loss 6.4254, time 120.69ms
iter 483900: loss 6.2570, time 121.76ms
iter 483910: loss 6.3741, time 122.76ms
iter 483920: loss 5.4650, time 121.67ms
iter 483930: loss 6.1626, time 121.96ms
iter 483940: loss 6.0075, time 124.41ms
iter 483950: loss 5.5004, time 121.84ms
iter 483960: loss 5.7578, time 121.68ms
iter 483970: loss 6.3566, time 121.69ms
iter 483980: loss 5.6899, time 121.76ms
iter 483990: loss 5.9211, time 121.95ms
step 484000: train loss 5.5669, val loss 5.5880
saving checkpoint to out-shakespeare-char
iter 484000: loss 5.6787, time 2902.41ms
iter 484010: loss 5.5148, time 125.73ms
iter 484020: loss 6.4834, time 124.36ms
iter 484030: loss 5.9720, time 125.16ms
iter 484040: loss 5.7683, time 125.35ms
iter 484050: loss 5.7699, time 125.52ms
iter 484060: loss 6.2574, time 124.84ms
iter 484070: loss 5.8635, time 127.77ms
iter 484080: loss 5.7807, time 125.34ms
iter 484090: loss 5.8302, time 125.41ms
iter 484100: loss 5.7269, time 124.41ms
iter 484110: loss 6.0131, time 124.65ms
iter 484120: loss 5.6102, time 125.26ms
iter 484130: loss 5.5506, time 125.15ms
iter 484140: loss 6.1681, time 124.52ms
iter 484150: loss 5.8490, time 125.86ms
iter 484160: loss 6.4082, time 124.95ms
iter 484170: loss 5.9904, time 125.40ms
iter 484180: loss 6.5589, time 125.73ms
iter 484190: loss 5.2115, time 124.66ms
iter 484200: loss 5.9172, time 125.87ms
iter 484210: loss 5.9327, time 125.24ms
iter 484220: loss 6.2072, time 125.95ms
iter 484230: loss 5.8012, time 125.80ms
iter 484240: loss 6.5513, time 125.43ms
step 484250: train loss 5.5306, val loss 5.5630
saving checkpoint to out-shakespeare-char
iter 484250: loss 5.5005, time 2891.01ms
iter 484260: loss 4.9128, time 125.60ms
iter 484270: loss 5.8044, time 126.63ms
iter 484280: loss 5.7755, time 125.80ms
iter 484290: loss 6.1069, time 126.29ms
iter 484300: loss 6.4602, time 125.27ms
iter 484310: loss 6.1943, time 125.48ms
iter 484320: loss 6.8773, time 125.14ms
iter 484330: loss 6.1496, time 125.57ms
iter 484340: loss 5.8310, time 125.16ms
iter 484350: loss 6.3294, time 125.41ms
iter 484360: loss 6.7932, time 125.10ms
iter 484370: loss 6.7074, time 125.30ms
iter 484380: loss 6.3028, time 125.14ms
iter 484390: loss 5.5668, time 125.24ms
iter 484400: loss 6.7465, time 125.29ms
iter 484410: loss 5.5581, time 125.40ms
iter 484420: loss 5.8431, time 125.24ms
iter 484430: loss 5.9641, time 125.53ms
iter 484440: loss 5.6087, time 125.52ms
iter 484450: loss 6.3239, time 125.37ms
iter 484460: loss 6.3809, time 125.16ms
iter 484470: loss 7.1348, time 125.51ms
iter 484480: loss 6.4155, time 125.18ms
iter 484490: loss 6.4082, time 125.38ms
step 484500: train loss 5.5661, val loss 5.5516
saving checkpoint to out-shakespeare-char
iter 484500: loss 6.6275, time 2905.92ms
iter 484510: loss 6.1555, time 125.40ms
iter 484520: loss 5.8791, time 125.23ms
iter 484530: loss 6.8698, time 125.60ms
iter 484540: loss 6.3013, time 125.14ms
iter 484550: loss 5.5978, time 125.16ms
iter 484560: loss 6.0363, time 125.36ms
iter 484570: loss 5.3509, time 125.14ms
iter 484580: loss 5.6341, time 125.62ms
iter 484590: loss 6.1882, time 125.41ms
iter 484600: loss 6.2926, time 126.23ms
iter 484610: loss 6.2345, time 125.34ms
iter 484620: loss 6.3534, time 125.26ms
iter 484630: loss 6.3617, time 125.07ms
iter 484640: loss 5.1750, time 125.42ms
iter 484650: loss 6.2861, time 125.18ms
iter 484660: loss 5.2678, time 125.32ms
iter 484670: loss 5.1917, time 125.37ms
iter 484680: loss 6.5200, time 125.52ms
iter 484690: loss 5.1363, time 125.32ms
iter 484700: loss 6.5301, time 125.19ms
iter 484710: loss 5.8425, time 125.14ms
iter 484720: loss 5.3521, time 124.20ms
iter 484730: loss 5.5890, time 125.32ms
iter 484740: loss 6.8709, time 126.46ms
step 484750: train loss 5.5354, val loss 5.5195
saving checkpoint to out-shakespeare-char
iter 484750: loss 6.3371, time 2886.48ms
iter 484760: loss 5.7572, time 126.59ms
iter 484770: loss 5.6897, time 125.97ms
iter 484780: loss 6.2365, time 122.65ms
iter 484790: loss 6.1895, time 121.68ms
iter 484800: loss 6.3076, time 121.42ms
iter 484810: loss 5.8790, time 124.27ms
iter 484820: loss 6.4019, time 121.49ms
iter 484830: loss 5.9717, time 121.74ms
iter 484840: loss 5.9511, time 121.72ms
iter 484850: loss 5.8296, time 121.43ms
iter 484860: loss 5.8269, time 121.70ms
iter 484870: loss 5.6894, time 121.91ms
iter 484880: loss 5.8797, time 123.39ms
iter 484890: loss 6.5482, time 121.91ms
iter 484900: loss 5.4000, time 121.84ms
iter 484910: loss 5.4888, time 123.01ms
iter 484920: loss 6.5208, time 121.89ms
iter 484930: loss 5.9359, time 122.82ms
iter 484940: loss 6.2889, time 123.38ms
iter 484950: loss 5.9644, time 121.66ms
iter 484960: loss 5.8056, time 122.42ms
iter 484970: loss 6.2911, time 121.64ms
iter 484980: loss 5.1293, time 121.53ms
iter 484990: loss 5.8313, time 121.95ms
step 485000: train loss 5.5391, val loss 5.5459
saving checkpoint to out-shakespeare-char
iter 485000: loss 5.0207, time 2900.32ms
iter 485010: loss 5.9474, time 121.47ms
iter 485020: loss 6.6497, time 121.74ms
iter 485030: loss 5.8907, time 122.30ms
iter 485040: loss 6.5469, time 121.93ms
iter 485050: loss 6.3945, time 121.82ms
iter 485060: loss 6.0149, time 121.64ms
iter 485070: loss 6.6032, time 122.17ms
iter 485080: loss 5.9404, time 122.01ms
iter 485090: loss 6.5145, time 121.71ms
iter 485100: loss 6.1939, time 122.96ms
iter 485110: loss 6.2568, time 121.69ms
iter 485120: loss 6.1617, time 121.87ms
iter 485130: loss 5.2295, time 124.31ms
iter 485140: loss 5.3999, time 121.66ms
iter 485150: loss 5.3852, time 121.40ms
iter 485160: loss 6.7786, time 121.91ms
iter 485170: loss 6.5196, time 121.85ms
iter 485180: loss 5.8209, time 122.01ms
iter 485190: loss 5.7248, time 121.84ms
iter 485200: loss 5.9351, time 122.84ms
iter 485210: loss 6.0854, time 121.75ms
iter 485220: loss 6.0230, time 121.88ms
iter 485230: loss 6.6794, time 122.02ms
iter 485240: loss 6.0627, time 121.89ms
step 485250: train loss 5.5486, val loss 5.5931
saving checkpoint to out-shakespeare-char
iter 485250: loss 5.7164, time 2901.39ms
iter 485260: loss 5.1050, time 121.92ms
iter 485270: loss 5.5107, time 121.72ms
iter 485280: loss 6.4576, time 121.81ms
iter 485290: loss 5.6031, time 122.01ms
iter 485300: loss 5.7326, time 123.22ms
iter 485310: loss 5.9904, time 121.63ms
iter 485320: loss 6.2952, time 121.69ms
iter 485330: loss 6.2086, time 124.41ms
iter 485340: loss 6.4619, time 120.43ms
iter 485350: loss 5.7969, time 121.88ms
iter 485360: loss 6.0894, time 121.91ms
iter 485370: loss 5.9177, time 121.82ms
iter 485380: loss 5.8447, time 121.86ms
iter 485390: loss 4.8322, time 121.91ms
iter 485400: loss 5.9208, time 122.98ms
iter 485410: loss 6.1514, time 121.78ms
iter 485420: loss 6.2260, time 121.68ms
iter 485430: loss 5.5380, time 122.91ms
iter 485440: loss 6.2920, time 121.82ms
iter 485450: loss 6.5643, time 121.74ms
iter 485460: loss 6.5698, time 124.39ms
iter 485470: loss 6.0251, time 121.82ms
iter 485480: loss 5.9128, time 121.88ms
iter 485490: loss 6.0865, time 122.84ms
step 485500: train loss 5.5504, val loss 5.5738
saving checkpoint to out-shakespeare-char
iter 485500: loss 5.3142, time 2904.55ms
iter 485510: loss 5.7954, time 121.70ms
iter 485520: loss 6.6078, time 122.82ms
iter 485530: loss 5.6928, time 121.82ms
iter 485540: loss 5.0740, time 121.84ms
iter 485550: loss 5.9541, time 124.51ms
iter 485560: loss 6.4297, time 121.83ms
iter 485570: loss 6.5256, time 122.01ms
iter 485580: loss 6.6262, time 121.88ms
iter 485590: loss 5.9248, time 121.78ms
iter 485600: loss 6.6278, time 121.77ms
iter 485610: loss 5.4622, time 121.77ms
iter 485620: loss 5.9190, time 123.14ms
iter 485630: loss 6.8373, time 122.48ms
iter 485640: loss 6.5947, time 122.28ms
iter 485650: loss 5.2703, time 124.97ms
iter 485660: loss 5.3756, time 122.56ms
iter 485670: loss 6.3117, time 122.27ms
iter 485680: loss 6.4245, time 121.45ms
iter 485690: loss 5.9765, time 122.12ms
iter 485700: loss 5.3808, time 122.09ms
iter 485710: loss 6.4022, time 122.46ms
iter 485720: loss 6.0588, time 123.57ms
iter 485730: loss 5.5367, time 122.41ms
iter 485740: loss 5.8425, time 121.98ms
step 485750: train loss 5.5567, val loss 5.5709
saving checkpoint to out-shakespeare-char
iter 485750: loss 5.5240, time 2911.47ms
iter 485760: loss 6.0340, time 122.22ms
iter 485770: loss 5.9658, time 122.17ms
iter 485780: loss 6.1882, time 122.49ms
iter 485790: loss 6.2819, time 122.37ms
iter 485800: loss 6.7958, time 122.18ms
iter 485810: loss 5.6928, time 122.31ms
iter 485820: loss 5.6582, time 123.45ms
iter 485830: loss 6.2096, time 123.44ms
iter 485840: loss 6.2978, time 122.34ms
iter 485850: loss 5.6379, time 122.00ms
iter 485860: loss 5.7910, time 123.23ms
iter 485870: loss 5.6000, time 121.98ms
iter 485880: loss 6.6909, time 121.78ms
iter 485890: loss 5.7083, time 121.74ms
iter 485900: loss 6.5052, time 123.33ms
iter 485910: loss 5.8075, time 121.66ms
iter 485920: loss 5.4500, time 121.11ms
iter 485930: loss 5.9820, time 123.86ms
iter 485940: loss 6.3280, time 122.56ms
iter 485950: loss 6.4833, time 121.19ms
iter 485960: loss 6.6201, time 122.87ms
iter 485970: loss 6.5897, time 131.82ms
iter 485980: loss 5.7747, time 122.05ms
iter 485990: loss 5.8079, time 121.92ms
step 486000: train loss 5.5597, val loss 5.5842
saving checkpoint to out-shakespeare-char
iter 486000: loss 5.6189, time 2891.54ms
iter 486010: loss 6.6552, time 122.92ms
iter 486020: loss 5.7689, time 121.81ms
iter 486030: loss 6.7171, time 121.81ms
iter 486040: loss 5.8059, time 122.11ms
iter 486050: loss 6.6483, time 121.94ms
iter 486060: loss 6.5717, time 121.73ms
iter 486070: loss 5.5647, time 124.43ms
iter 486080: loss 5.4107, time 122.47ms
iter 486090: loss 6.2247, time 121.83ms
iter 486100: loss 5.5678, time 120.69ms
iter 486110: loss 6.1778, time 124.10ms
iter 486120: loss 5.2049, time 121.31ms
iter 486130: loss 6.2192, time 121.85ms
iter 486140: loss 6.0765, time 122.37ms
iter 486150: loss 5.4210, time 122.96ms
iter 486160: loss 6.5024, time 122.77ms
iter 486170: loss 6.5429, time 122.91ms
iter 486180: loss 6.3269, time 121.66ms
iter 486190: loss 5.8515, time 124.07ms
iter 486200: loss 5.7397, time 122.27ms
iter 486210: loss 5.5167, time 121.98ms
iter 486220: loss 6.1617, time 122.18ms
iter 486230: loss 6.4485, time 121.82ms
iter 486240: loss 5.8579, time 122.46ms
step 486250: train loss 5.5348, val loss 5.6135
saving checkpoint to out-shakespeare-char
iter 486250: loss 6.1366, time 2905.56ms
iter 486260: loss 6.3866, time 121.95ms
iter 486270: loss 6.0716, time 121.80ms
iter 486280: loss 6.3944, time 121.94ms
iter 486290: loss 5.6864, time 123.27ms
iter 486300: loss 5.5069, time 122.07ms
iter 486310: loss 5.9577, time 122.01ms
iter 486320: loss 5.9139, time 122.92ms
iter 486330: loss 5.9641, time 121.59ms
iter 486340: loss 6.4816, time 122.94ms
iter 486350: loss 6.2066, time 121.87ms
iter 486360: loss 6.0088, time 123.32ms
iter 486370: loss 5.2589, time 121.34ms
iter 486380: loss 5.4285, time 121.76ms
iter 486390: loss 6.0266, time 124.10ms
iter 486400: loss 6.5194, time 121.81ms
iter 486410: loss 5.1447, time 121.86ms
iter 486420: loss 6.0836, time 122.32ms
iter 486430: loss 5.8236, time 121.95ms
iter 486440: loss 6.0126, time 121.94ms
iter 486450: loss 6.3473, time 121.80ms
iter 486460: loss 5.5556, time 122.57ms
iter 486470: loss 6.3411, time 121.96ms
iter 486480: loss 6.2372, time 122.24ms
iter 486490: loss 5.8543, time 122.68ms
step 486500: train loss 5.5326, val loss 5.5088
saving checkpoint to out-shakespeare-char
iter 486500: loss 6.4357, time 2897.12ms
iter 486510: loss 6.5999, time 121.94ms
iter 486520: loss 6.0708, time 121.69ms
iter 486530: loss 6.5223, time 121.73ms
iter 486540: loss 5.8250, time 121.64ms
iter 486550: loss 6.2121, time 121.63ms
iter 486560: loss 5.2891, time 123.32ms
iter 486570: loss 5.8675, time 122.21ms
iter 486580: loss 6.2136, time 122.07ms
iter 486590: loss 5.8894, time 123.21ms
iter 486600: loss 5.4975, time 121.30ms
iter 486610: loss 6.3589, time 121.88ms
iter 486620: loss 5.8460, time 124.39ms
iter 486630: loss 6.0582, time 122.10ms
iter 486640: loss 5.7671, time 121.84ms
iter 486650: loss 5.8070, time 121.87ms
iter 486660: loss 6.1568, time 121.82ms
iter 486670: loss 5.0351, time 122.04ms
iter 486680: loss 5.5815, time 121.90ms
iter 486690: loss 5.8170, time 123.11ms
iter 486700: loss 5.8930, time 122.19ms
iter 486710: loss 5.9173, time 121.87ms
iter 486720: loss 6.0805, time 123.35ms
iter 486730: loss 6.3563, time 121.01ms
iter 486740: loss 5.6766, time 121.89ms
step 486750: train loss 5.5384, val loss 5.5600
saving checkpoint to out-shakespeare-char
iter 486750: loss 6.0641, time 2899.32ms
iter 486760: loss 6.1837, time 126.34ms
iter 486770: loss 5.5877, time 125.98ms
iter 486780: loss 6.4970, time 126.89ms
iter 486790: loss 5.9120, time 125.19ms
iter 486800: loss 6.3608, time 125.89ms
iter 486810: loss 5.8845, time 127.88ms
iter 486820: loss 5.8061, time 126.14ms
iter 486830: loss 5.8569, time 128.69ms
iter 486840: loss 6.3343, time 125.02ms
iter 486850: loss 5.6203, time 128.13ms
iter 486860: loss 6.0017, time 125.36ms
iter 486870: loss 5.8146, time 125.87ms
iter 486880: loss 6.5545, time 125.61ms
iter 486890: loss 5.8995, time 126.46ms
iter 486900: loss 5.2877, time 125.95ms
iter 486910: loss 6.2007, time 125.95ms
iter 486920: loss 5.3250, time 125.92ms
iter 486930: loss 4.9387, time 125.23ms
iter 486940: loss 5.9553, time 125.66ms
iter 486950: loss 5.7494, time 126.07ms
iter 486960: loss 6.4299, time 125.39ms
iter 486970: loss 5.9276, time 125.78ms
iter 486980: loss 6.2739, time 125.66ms
iter 486990: loss 5.9186, time 126.15ms
step 487000: train loss 5.5399, val loss 5.5694
saving checkpoint to out-shakespeare-char
iter 487000: loss 5.6372, time 2887.01ms
iter 487010: loss 5.9368, time 125.92ms
iter 487020: loss 5.9507, time 126.28ms
iter 487030: loss 6.6932, time 125.92ms
iter 487040: loss 5.8628, time 126.21ms
iter 487050: loss 4.6601, time 125.76ms
iter 487060: loss 5.9883, time 125.96ms
iter 487070: loss 5.6654, time 125.69ms
iter 487080: loss 5.9812, time 126.09ms
iter 487090: loss 5.6674, time 125.84ms
iter 487100: loss 5.8511, time 126.47ms
iter 487110: loss 6.4876, time 125.59ms
iter 487120: loss 6.3526, time 125.59ms
iter 487130: loss 5.6829, time 125.63ms
iter 487140: loss 6.1214, time 125.90ms
iter 487150: loss 5.7762, time 133.88ms
iter 487160: loss 6.2065, time 124.71ms
iter 487170: loss 5.9614, time 125.45ms
iter 487180: loss 5.9333, time 125.58ms
iter 487190: loss 6.0628, time 125.87ms
iter 487200: loss 5.6158, time 125.00ms
iter 487210: loss 5.9188, time 125.49ms
iter 487220: loss 5.9615, time 125.87ms
iter 487230: loss 5.8288, time 125.61ms
iter 487240: loss 6.3939, time 125.56ms
step 487250: train loss 5.5808, val loss 5.5495
saving checkpoint to out-shakespeare-char
iter 487250: loss 5.9406, time 2913.87ms
iter 487260: loss 5.4176, time 125.19ms
iter 487270: loss 5.9510, time 125.32ms
iter 487280: loss 5.3684, time 125.08ms
iter 487290: loss 6.3312, time 125.23ms
iter 487300: loss 6.8019, time 124.53ms
iter 487310: loss 5.8953, time 125.24ms
iter 487320: loss 5.6957, time 125.58ms
iter 487330: loss 5.9700, time 125.49ms
iter 487340: loss 5.6782, time 125.25ms
iter 487350: loss 6.0533, time 125.30ms
iter 487360: loss 5.7990, time 125.19ms
iter 487370: loss 5.6997, time 125.03ms
iter 487380: loss 6.1318, time 124.82ms
iter 487390: loss 5.5667, time 124.95ms
iter 487400: loss 5.7369, time 124.30ms
iter 487410: loss 6.4918, time 125.32ms
iter 487420: loss 5.5705, time 125.02ms
iter 487430: loss 5.7304, time 124.16ms
iter 487440: loss 6.7066, time 125.15ms
iter 487450: loss 5.7707, time 125.54ms
iter 487460: loss 6.0576, time 125.52ms
iter 487470: loss 5.5016, time 124.88ms
iter 487480: loss 5.5718, time 124.98ms
iter 487490: loss 5.9237, time 125.46ms
step 487500: train loss 5.5704, val loss 5.5229
saving checkpoint to out-shakespeare-char
iter 487500: loss 6.0030, time 2907.78ms
iter 487510: loss 5.8491, time 127.33ms
iter 487520: loss 6.2496, time 125.28ms
iter 487530: loss 6.7506, time 127.29ms
iter 487540: loss 6.0298, time 124.53ms
iter 487550: loss 6.0419, time 125.58ms
iter 487560: loss 5.9793, time 124.50ms
iter 487570: loss 6.1445, time 124.84ms
iter 487580: loss 5.2443, time 125.62ms
iter 487590: loss 6.3563, time 125.51ms
iter 487600: loss 5.8155, time 126.16ms
iter 487610: loss 5.9633, time 125.48ms
iter 487620: loss 6.2541, time 125.32ms
iter 487630: loss 5.8310, time 125.22ms
iter 487640: loss 5.3892, time 126.58ms
iter 487650: loss 5.8341, time 125.64ms
iter 487660: loss 6.3838, time 125.58ms
iter 487670: loss 5.6999, time 125.48ms
iter 487680: loss 5.5234, time 125.55ms
iter 487690: loss 5.1568, time 125.37ms
iter 487700: loss 6.9534, time 125.57ms
iter 487710: loss 6.0305, time 125.25ms
iter 487720: loss 6.0480, time 125.42ms
iter 487730: loss 5.5765, time 125.76ms
iter 487740: loss 5.2479, time 125.55ms
step 487750: train loss 5.5751, val loss 5.6173
saving checkpoint to out-shakespeare-char
iter 487750: loss 6.0931, time 2888.48ms
iter 487760: loss 5.9853, time 125.80ms
iter 487770: loss 5.8979, time 125.89ms
iter 487780: loss 6.8520, time 125.52ms
iter 487790: loss 6.4548, time 121.80ms
iter 487800: loss 6.2604, time 121.61ms
iter 487810: loss 5.6233, time 124.25ms
iter 487820: loss 6.0635, time 121.62ms
iter 487830: loss 5.9916, time 121.33ms
iter 487840: loss 6.3015, time 121.64ms
iter 487850: loss 5.5490, time 120.13ms
iter 487860: loss 6.3048, time 122.75ms
iter 487870: loss 6.2247, time 121.62ms
iter 487880: loss 6.2016, time 123.29ms
iter 487890: loss 5.7926, time 121.57ms
iter 487900: loss 5.7580, time 122.06ms
iter 487910: loss 5.9208, time 124.13ms
iter 487920: loss 6.3099, time 121.56ms
iter 487930: loss 6.2859, time 122.13ms
iter 487940: loss 5.8045, time 119.93ms
iter 487950: loss 6.0479, time 122.10ms
iter 487960: loss 5.7154, time 121.50ms
iter 487970: loss 6.8293, time 121.57ms
iter 487980: loss 6.1917, time 122.66ms
iter 487990: loss 6.4282, time 121.54ms
step 488000: train loss 5.5497, val loss 5.5484
saving checkpoint to out-shakespeare-char
iter 488000: loss 6.5126, time 2897.12ms
iter 488010: loss 6.3004, time 123.07ms
iter 488020: loss 6.4612, time 121.59ms
iter 488030: loss 6.6739, time 121.73ms
iter 488040: loss 5.4309, time 121.84ms
iter 488050: loss 5.6275, time 122.33ms
iter 488060: loss 6.9231, time 121.77ms
iter 488070: loss 6.6749, time 124.61ms
iter 488080: loss 6.1150, time 123.20ms
iter 488090: loss 5.8368, time 123.09ms
iter 488100: loss 5.8365, time 121.85ms
iter 488110: loss 6.2359, time 121.82ms
iter 488120: loss 5.4887, time 121.90ms
iter 488130: loss 6.2386, time 120.36ms
iter 488140: loss 6.1335, time 122.18ms
iter 488150: loss 6.0299, time 121.83ms
iter 488160: loss 5.9427, time 123.14ms
iter 488170: loss 5.6826, time 121.97ms
iter 488180: loss 6.1357, time 121.97ms
iter 488190: loss 5.8542, time 122.83ms
iter 488200: loss 6.2282, time 121.66ms
iter 488210: loss 6.0957, time 121.80ms
iter 488220: loss 5.9468, time 123.01ms
iter 488230: loss 6.7093, time 122.10ms
iter 488240: loss 6.2021, time 121.75ms
step 488250: train loss 5.5578, val loss 5.5004
saving checkpoint to out-shakespeare-char
iter 488250: loss 5.2351, time 2905.35ms
iter 488260: loss 5.8863, time 121.72ms
iter 488270: loss 5.4896, time 121.60ms
iter 488280: loss 5.8140, time 121.68ms
iter 488290: loss 6.0681, time 123.20ms
iter 488300: loss 6.0949, time 121.62ms
iter 488310: loss 6.2668, time 122.27ms
iter 488320: loss 5.4906, time 123.90ms
iter 488330: loss 5.9866, time 121.99ms
iter 488340: loss 6.0745, time 121.93ms
iter 488350: loss 5.7627, time 121.72ms
iter 488360: loss 6.7802, time 121.57ms
iter 488370: loss 5.5067, time 121.76ms
iter 488380: loss 6.1839, time 121.66ms
iter 488390: loss 5.7785, time 122.80ms
iter 488400: loss 5.6366, time 121.63ms
iter 488410: loss 6.1239, time 121.81ms
iter 488420: loss 6.1896, time 124.10ms
iter 488430: loss 6.3420, time 121.57ms
iter 488440: loss 5.4479, time 121.53ms
iter 488450: loss 6.1183, time 121.65ms
iter 488460: loss 5.6082, time 123.36ms
iter 488470: loss 6.9595, time 121.67ms
iter 488480: loss 6.2358, time 121.73ms
iter 488490: loss 6.0601, time 124.87ms
step 488500: train loss 5.5724, val loss 5.5311
saving checkpoint to out-shakespeare-char
iter 488500: loss 5.9234, time 2906.57ms
iter 488510: loss 5.9075, time 121.81ms
iter 488520: loss 5.6978, time 122.24ms
iter 488530: loss 6.1531, time 122.07ms
iter 488540: loss 5.7485, time 122.34ms
iter 488550: loss 5.9718, time 121.60ms
iter 488560: loss 5.6338, time 122.48ms
iter 488570: loss 6.9327, time 121.70ms
iter 488580: loss 6.1101, time 121.81ms
iter 488590: loss 6.3459, time 123.91ms
iter 488600: loss 5.2902, time 121.63ms
iter 488610: loss 5.9334, time 122.20ms
iter 488620: loss 5.6447, time 124.37ms
iter 488630: loss 6.2182, time 122.04ms
iter 488640: loss 5.9632, time 122.74ms
iter 488650: loss 6.0053, time 121.88ms
iter 488660: loss 6.0094, time 122.14ms
iter 488670: loss 5.6297, time 121.59ms
iter 488680: loss 5.5685, time 121.95ms
iter 488690: loss 6.0870, time 121.87ms
iter 488700: loss 6.7294, time 121.49ms
iter 488710: loss 5.8914, time 121.94ms
iter 488720: loss 5.1245, time 124.48ms
iter 488730: loss 6.1051, time 121.56ms
iter 488740: loss 5.6770, time 122.01ms
step 488750: train loss 5.5118, val loss 5.5617
saving checkpoint to out-shakespeare-char
iter 488750: loss 7.2200, time 2917.78ms
iter 488760: loss 5.9711, time 121.94ms
iter 488770: loss 6.4917, time 121.87ms
iter 488780: loss 5.6579, time 122.07ms
iter 488790: loss 6.5217, time 123.26ms
iter 488800: loss 6.4801, time 121.83ms
iter 488810: loss 6.4744, time 121.97ms
iter 488820: loss 5.7557, time 122.12ms
iter 488830: loss 6.2877, time 121.33ms
iter 488840: loss 6.1953, time 121.98ms
iter 488850: loss 6.2267, time 124.31ms
iter 488860: loss 5.8596, time 121.70ms
iter 488870: loss 6.3496, time 121.93ms
iter 488880: loss 6.1002, time 121.92ms
iter 488890: loss 6.2579, time 122.01ms
iter 488900: loss 5.6605, time 121.85ms
iter 488910: loss 5.8813, time 122.01ms
iter 488920: loss 6.7827, time 123.07ms
iter 488930: loss 4.9052, time 122.02ms
iter 488940: loss 6.6911, time 121.83ms
iter 488950: loss 5.0624, time 122.80ms
iter 488960: loss 5.7294, time 121.84ms
iter 488970: loss 5.8738, time 121.32ms
iter 488980: loss 6.2620, time 124.50ms
iter 488990: loss 5.6331, time 121.87ms
step 489000: train loss 5.5484, val loss 5.5699
saving checkpoint to out-shakespeare-char
iter 489000: loss 5.9015, time 2889.53ms
iter 489010: loss 5.7524, time 121.81ms
iter 489020: loss 6.0044, time 123.12ms
iter 489030: loss 5.6145, time 121.60ms
iter 489040: loss 5.1288, time 121.87ms
iter 489050: loss 5.6835, time 121.96ms
iter 489060: loss 5.0819, time 121.55ms
iter 489070: loss 5.6599, time 121.52ms
iter 489080: loss 5.9028, time 121.41ms
iter 489090: loss 6.4378, time 122.64ms
iter 489100: loss 6.5644, time 121.44ms
iter 489110: loss 6.3190, time 125.91ms
iter 489120: loss 6.0170, time 124.72ms
iter 489130: loss 5.9053, time 125.35ms
iter 489140: loss 5.9375, time 125.01ms
iter 489150: loss 5.2458, time 125.94ms
iter 489160: loss 5.4383, time 125.32ms
iter 489170: loss 6.3932, time 125.48ms
iter 489180: loss 6.0783, time 125.43ms
iter 489190: loss 5.4374, time 127.37ms
iter 489200: loss 5.7998, time 125.39ms
iter 489210: loss 6.4946, time 127.20ms
iter 489220: loss 5.4208, time 125.21ms
iter 489230: loss 5.6830, time 126.84ms
iter 489240: loss 5.6376, time 125.77ms
step 489250: train loss 5.5731, val loss 5.6034
saving checkpoint to out-shakespeare-char
iter 489250: loss 5.9288, time 2880.11ms
iter 489260: loss 5.8082, time 126.05ms
iter 489270: loss 5.9625, time 125.86ms
iter 489280: loss 6.2043, time 124.82ms
iter 489290: loss 6.2809, time 124.91ms
iter 489300: loss 6.3765, time 125.25ms
iter 489310: loss 5.3197, time 124.82ms
iter 489320: loss 5.9076, time 126.03ms
iter 489330: loss 5.3051, time 125.70ms
iter 489340: loss 5.8173, time 125.84ms
iter 489350: loss 6.1085, time 125.67ms
iter 489360: loss 5.6509, time 126.03ms
iter 489370: loss 6.1279, time 125.95ms
iter 489380: loss 6.7748, time 125.84ms
iter 489390: loss 6.1670, time 126.24ms
iter 489400: loss 5.7388, time 125.31ms
iter 489410: loss 6.4971, time 123.49ms
iter 489420: loss 5.8476, time 125.38ms
iter 489430: loss 6.6251, time 125.70ms
iter 489440: loss 6.0021, time 124.79ms
iter 489450: loss 5.8828, time 125.64ms
iter 489460: loss 5.1522, time 125.54ms
iter 489470: loss 5.2205, time 125.81ms
iter 489480: loss 5.8044, time 125.69ms
iter 489490: loss 6.2660, time 125.52ms
step 489500: train loss 5.5336, val loss 5.5801
saving checkpoint to out-shakespeare-char
iter 489500: loss 6.2924, time 2878.35ms
iter 489510: loss 5.9172, time 125.67ms
iter 489520: loss 6.5564, time 125.31ms
iter 489530: loss 6.3885, time 125.43ms
iter 489540: loss 5.4845, time 126.23ms
iter 489550: loss 6.9499, time 126.02ms
iter 489560: loss 5.4558, time 125.62ms
iter 489570: loss 6.4400, time 125.43ms
iter 489580: loss 5.8306, time 125.40ms
iter 489590: loss 4.7573, time 125.24ms
iter 489600: loss 5.9969, time 125.43ms
iter 489610: loss 6.3455, time 125.14ms
iter 489620: loss 5.2502, time 125.12ms
iter 489630: loss 6.0762, time 125.18ms
iter 489640: loss 5.0493, time 124.98ms
iter 489650: loss 6.6667, time 125.34ms
iter 489660: loss 5.0337, time 125.18ms
iter 489670: loss 6.5706, time 124.37ms
iter 489680: loss 6.1280, time 125.48ms
iter 489690: loss 6.1755, time 124.46ms
iter 489700: loss 5.6513, time 125.18ms
iter 489710: loss 5.8719, time 124.18ms
iter 489720: loss 6.0976, time 124.65ms
iter 489730: loss 5.5858, time 124.82ms
iter 489740: loss 5.2674, time 125.09ms
step 489750: train loss 5.5393, val loss 5.5496
saving checkpoint to out-shakespeare-char
iter 489750: loss 5.7847, time 2865.96ms
iter 489760: loss 5.8262, time 124.94ms
iter 489770: loss 5.8918, time 124.74ms
iter 489780: loss 6.3717, time 125.30ms
iter 489790: loss 6.0146, time 126.51ms
iter 489800: loss 6.3837, time 124.43ms
iter 489810: loss 6.3182, time 124.52ms
iter 489820: loss 6.2163, time 124.64ms
iter 489830: loss 6.2131, time 124.66ms
iter 489840: loss 7.2553, time 124.48ms
iter 489850: loss 6.5321, time 124.70ms
iter 489860: loss 5.6911, time 123.44ms
iter 489870: loss 6.2109, time 125.75ms
iter 489880: loss 5.9423, time 125.66ms
iter 489890: loss 5.8142, time 125.00ms
iter 489900: loss 6.0353, time 125.55ms
iter 489910: loss 5.2723, time 126.01ms
iter 489920: loss 6.2072, time 125.76ms
iter 489930: loss 6.6346, time 125.85ms
iter 489940: loss 6.6810, time 125.48ms
iter 489950: loss 6.1454, time 125.92ms
iter 489960: loss 5.8684, time 125.53ms
iter 489970: loss 6.0845, time 127.13ms
iter 489980: loss 5.5510, time 125.99ms
iter 489990: loss 6.2872, time 125.47ms
step 490000: train loss 5.5293, val loss 5.5826
saving checkpoint to out-shakespeare-char
iter 490000: loss 5.4968, time 2877.84ms
iter 490010: loss 5.4326, time 124.78ms
iter 490020: loss 6.2136, time 128.38ms
iter 490030: loss 6.5587, time 125.70ms
iter 490040: loss 6.0160, time 125.98ms
iter 490050: loss 6.4423, time 125.70ms
iter 490060: loss 6.3559, time 125.59ms
iter 490070: loss 6.3756, time 126.17ms
iter 490080: loss 5.8570, time 125.72ms
iter 490090: loss 5.7391, time 127.99ms
iter 490100: loss 5.7895, time 125.74ms
iter 490110: loss 6.2648, time 127.38ms
iter 490120: loss 6.2219, time 124.91ms
iter 490130: loss 6.0845, time 127.61ms
iter 490140: loss 5.6900, time 125.44ms
iter 490150: loss 5.4795, time 127.22ms
iter 490160: loss 6.4277, time 124.78ms
iter 490170: loss 6.1333, time 124.70ms
iter 490180: loss 6.2345, time 125.04ms
iter 490190: loss 5.2948, time 125.41ms
iter 490200: loss 6.3582, time 125.20ms
iter 490210: loss 6.1622, time 124.81ms
iter 490220: loss 5.2872, time 124.61ms
iter 490230: loss 5.9182, time 124.96ms
iter 490240: loss 6.5396, time 125.11ms
step 490250: train loss 5.5605, val loss 5.5566
saving checkpoint to out-shakespeare-char
iter 490250: loss 5.4808, time 2882.88ms
iter 490260: loss 6.2931, time 125.88ms
iter 490270: loss 5.6820, time 125.83ms
iter 490280: loss 6.2013, time 126.03ms
iter 490290: loss 6.4289, time 126.02ms
iter 490300: loss 5.6380, time 126.00ms
iter 490310: loss 5.4752, time 125.55ms
iter 490320: loss 5.6708, time 124.90ms
iter 490330: loss 5.6036, time 125.24ms
iter 490340: loss 6.2540, time 125.33ms
iter 490350: loss 5.5330, time 125.64ms
iter 490360: loss 6.5010, time 125.55ms
iter 490370: loss 6.1521, time 125.55ms
iter 490380: loss 5.4677, time 125.73ms
iter 490390: loss 6.4496, time 125.49ms
iter 490400: loss 5.7599, time 125.69ms
iter 490410: loss 6.2666, time 125.87ms
iter 490420: loss 6.6237, time 125.47ms
iter 490430: loss 6.2553, time 124.93ms
iter 490440: loss 6.2587, time 125.36ms
iter 490450: loss 5.8881, time 126.18ms
iter 490460: loss 5.3595, time 125.52ms
iter 490470: loss 6.0174, time 125.54ms
iter 490480: loss 5.7411, time 125.95ms
iter 490490: loss 5.9076, time 125.27ms
step 490500: train loss 5.4891, val loss 5.5177
saving checkpoint to out-shakespeare-char
iter 490500: loss 5.9564, time 2881.11ms
iter 490510: loss 6.6121, time 125.51ms
iter 490520: loss 5.9206, time 125.09ms
iter 490530: loss 5.1535, time 125.62ms
iter 490540: loss 6.2051, time 124.74ms
iter 490550: loss 6.0192, time 125.42ms
iter 490560: loss 6.0272, time 125.23ms
iter 490570: loss 6.4316, time 125.50ms
iter 490580: loss 5.5059, time 125.85ms
iter 490590: loss 5.9565, time 125.48ms
iter 490600: loss 6.7838, time 125.40ms
iter 490610: loss 5.8757, time 124.96ms
iter 490620: loss 6.0007, time 125.64ms
iter 490630: loss 5.8624, time 125.51ms
iter 490640: loss 5.7852, time 125.39ms
iter 490650: loss 5.8241, time 125.55ms
iter 490660: loss 7.2409, time 125.65ms
iter 490670: loss 6.1900, time 127.88ms
iter 490680: loss 5.9385, time 124.43ms
iter 490690: loss 6.4139, time 128.10ms
iter 490700: loss 6.0554, time 125.72ms
iter 490710: loss 5.8337, time 128.17ms
iter 490720: loss 6.1739, time 125.27ms
iter 490730: loss 5.5203, time 128.06ms
iter 490740: loss 6.3168, time 125.96ms
step 490750: train loss 5.5290, val loss 5.5303
saving checkpoint to out-shakespeare-char
iter 490750: loss 6.0418, time 2917.92ms
iter 490760: loss 6.2685, time 125.48ms
iter 490770: loss 6.4924, time 124.37ms
iter 490780: loss 5.5962, time 125.44ms
iter 490790: loss 5.7477, time 125.86ms
iter 490800: loss 5.2440, time 125.56ms
iter 490810: loss 6.3243, time 125.53ms
iter 490820: loss 6.1989, time 125.86ms
iter 490830: loss 6.4835, time 126.01ms
iter 490840: loss 6.2035, time 125.93ms
iter 490850: loss 5.5787, time 124.47ms
iter 490860: loss 6.2010, time 125.30ms
iter 490870: loss 6.2386, time 125.80ms
iter 490880: loss 5.9080, time 125.50ms
iter 490890: loss 6.3470, time 124.88ms
iter 490900: loss 5.5482, time 124.93ms
iter 490910: loss 5.9732, time 125.75ms
iter 490920: loss 6.0057, time 125.56ms
iter 490930: loss 6.3173, time 124.80ms
iter 490940: loss 5.7515, time 125.82ms
iter 490950: loss 5.6570, time 125.84ms
iter 490960: loss 5.9067, time 125.20ms
iter 490970: loss 6.0897, time 125.44ms
iter 490980: loss 6.1740, time 125.73ms
iter 490990: loss 6.2128, time 125.79ms
step 491000: train loss 5.5582, val loss 5.6044
saving checkpoint to out-shakespeare-char
iter 491000: loss 5.9395, time 2872.26ms
iter 491010: loss 6.0589, time 125.83ms
iter 491020: loss 5.9658, time 125.68ms
iter 491030: loss 6.2943, time 125.85ms
iter 491040: loss 5.8679, time 125.62ms
iter 491050: loss 6.6263, time 126.54ms
iter 491060: loss 6.1065, time 125.63ms
iter 491070: loss 5.1439, time 125.54ms
iter 491080: loss 6.0152, time 125.62ms
iter 491090: loss 6.5759, time 125.58ms
iter 491100: loss 6.3403, time 125.92ms
iter 491110: loss 5.9624, time 125.28ms
iter 491120: loss 5.9000, time 125.88ms
iter 491130: loss 6.3824, time 125.79ms
iter 491140: loss 6.1568, time 125.86ms
iter 491150: loss 7.0789, time 125.55ms
iter 491160: loss 6.1134, time 125.05ms
iter 491170: loss 6.3563, time 125.65ms
iter 491180: loss 6.2541, time 125.95ms
iter 491190: loss 6.3763, time 125.76ms
iter 491200: loss 5.9860, time 128.39ms
iter 491210: loss 6.2805, time 124.90ms
iter 491220: loss 5.5850, time 128.48ms
iter 491230: loss 5.3602, time 125.88ms
iter 491240: loss 5.8485, time 128.50ms
step 491250: train loss 5.5376, val loss 5.5199
saving checkpoint to out-shakespeare-char
iter 491250: loss 5.8202, time 2916.86ms
iter 491260: loss 5.7294, time 125.92ms
iter 491270: loss 5.3780, time 125.22ms
iter 491280: loss 5.9954, time 125.81ms
iter 491290: loss 6.5594, time 125.32ms
iter 491300: loss 6.1612, time 124.58ms
iter 491310: loss 5.9692, time 125.70ms
iter 491320: loss 5.6913, time 124.94ms
iter 491330: loss 6.3043, time 125.66ms
iter 491340: loss 5.9341, time 125.84ms
iter 491350: loss 6.4411, time 125.95ms
iter 491360: loss 6.1408, time 125.00ms
iter 491370: loss 5.7577, time 125.74ms
iter 491380: loss 5.7894, time 125.60ms
iter 491390: loss 5.8700, time 125.34ms
iter 491400: loss 5.6341, time 124.35ms
iter 491410: loss 5.6900, time 126.50ms
iter 491420: loss 6.0152, time 125.70ms
iter 491430: loss 5.8891, time 125.96ms
iter 491440: loss 5.7721, time 125.45ms
iter 491450: loss 5.6643, time 125.30ms
iter 491460: loss 6.4005, time 126.76ms
iter 491470: loss 5.4622, time 125.48ms
iter 491480: loss 7.1404, time 124.90ms
iter 491490: loss 5.9846, time 125.87ms
step 491500: train loss 5.5655, val loss 5.5600
saving checkpoint to out-shakespeare-char
iter 491500: loss 5.6754, time 2891.11ms
iter 491510: loss 5.8906, time 126.08ms
iter 491520: loss 6.8596, time 124.86ms
iter 491530: loss 5.9741, time 125.61ms
iter 491540: loss 6.3000, time 125.26ms
iter 491550: loss 6.6816, time 126.33ms
iter 491560: loss 5.9310, time 125.67ms
iter 491570: loss 6.1161, time 125.67ms
iter 491580: loss 5.3579, time 126.13ms
iter 491590: loss 6.0409, time 126.37ms
iter 491600: loss 5.9500, time 125.94ms
iter 491610: loss 5.1548, time 126.60ms
iter 491620: loss 5.9790, time 125.48ms
iter 491630: loss 6.1064, time 125.95ms
iter 491640: loss 6.3391, time 125.59ms
iter 491650: loss 6.4031, time 125.53ms
iter 491660: loss 5.9690, time 125.97ms
iter 491670: loss 5.6972, time 125.79ms
iter 491680: loss 5.6791, time 125.36ms
iter 491690: loss 5.1074, time 126.17ms
iter 491700: loss 5.6722, time 125.63ms
iter 491710: loss 5.9178, time 125.85ms
iter 491720: loss 5.9832, time 125.45ms
iter 491730: loss 6.0865, time 125.38ms
iter 491740: loss 6.0419, time 125.17ms
step 491750: train loss 5.5405, val loss 5.5241
saving checkpoint to out-shakespeare-char
iter 491750: loss 6.1425, time 2865.88ms
iter 491760: loss 5.9527, time 125.22ms
iter 491770: loss 5.7503, time 125.45ms
iter 491780: loss 5.8744, time 124.27ms
iter 491790: loss 5.4533, time 125.16ms
iter 491800: loss 6.3120, time 125.37ms
iter 491810: loss 6.4560, time 125.35ms
iter 491820: loss 6.2986, time 125.21ms
iter 491830: loss 6.0426, time 125.58ms
iter 491840: loss 5.9718, time 124.85ms
iter 491850: loss 5.7816, time 125.59ms
iter 491860: loss 5.6284, time 125.59ms
iter 491870: loss 6.1230, time 127.53ms
iter 491880: loss 5.9505, time 126.50ms
iter 491890: loss 5.6439, time 127.51ms
iter 491900: loss 5.0325, time 124.48ms
iter 491910: loss 5.7822, time 127.66ms
iter 491920: loss 5.5558, time 125.48ms
iter 491930: loss 5.9848, time 125.38ms
iter 491940: loss 5.7216, time 125.16ms
iter 491950: loss 6.2313, time 125.08ms
iter 491960: loss 4.9700, time 125.30ms
iter 491970: loss 6.1085, time 124.42ms
iter 491980: loss 6.0120, time 124.03ms
iter 491990: loss 5.5814, time 125.81ms
step 492000: train loss 5.4872, val loss 5.5592
saving checkpoint to out-shakespeare-char
iter 492000: loss 5.9078, time 2858.15ms
iter 492010: loss 5.7623, time 121.71ms
iter 492020: loss 6.1030, time 121.51ms
iter 492030: loss 5.2032, time 120.69ms
iter 492040: loss 6.7192, time 123.05ms
iter 492050: loss 5.9950, time 121.65ms
iter 492060: loss 5.6921, time 121.81ms
iter 492070: loss 5.6494, time 122.67ms
iter 492080: loss 5.1911, time 121.66ms
iter 492090: loss 5.9329, time 121.71ms
iter 492100: loss 5.6020, time 124.22ms
iter 492110: loss 5.8881, time 121.51ms
iter 492120: loss 5.9068, time 121.39ms
iter 492130: loss 5.3908, time 121.50ms
iter 492140: loss 5.7999, time 123.67ms
iter 492150: loss 6.0677, time 121.93ms
iter 492160: loss 6.1234, time 121.57ms
iter 492170: loss 5.6317, time 124.36ms
iter 492180: loss 5.8630, time 121.61ms
iter 492190: loss 6.7821, time 121.52ms
iter 492200: loss 6.1600, time 121.70ms
iter 492210: loss 5.5496, time 121.42ms
iter 492220: loss 6.0037, time 121.50ms
iter 492230: loss 6.4163, time 121.00ms
iter 492240: loss 5.9074, time 122.50ms
step 492250: train loss 5.5170, val loss 5.5741
saving checkpoint to out-shakespeare-char
iter 492250: loss 5.4959, time 2907.45ms
iter 492260: loss 6.2641, time 125.42ms
iter 492270: loss 5.5901, time 125.24ms
iter 492280: loss 6.5595, time 124.81ms
iter 492290: loss 5.2376, time 124.86ms
iter 492300: loss 6.4880, time 124.95ms
iter 492310: loss 6.3200, time 124.68ms
iter 492320: loss 5.5831, time 124.89ms
iter 492330: loss 5.8584, time 124.91ms
iter 492340: loss 6.6827, time 125.13ms
iter 492350: loss 5.6362, time 125.75ms
iter 492360: loss 6.0283, time 124.70ms
iter 492370: loss 5.6322, time 125.29ms
iter 492380: loss 6.4082, time 123.80ms
iter 492390: loss 5.8041, time 124.12ms
iter 492400: loss 5.7605, time 124.80ms
iter 492410: loss 5.4833, time 124.72ms
iter 492420: loss 5.6250, time 124.70ms
iter 492430: loss 6.1735, time 125.46ms
iter 492440: loss 5.9924, time 124.66ms
iter 492450: loss 5.2083, time 124.86ms
iter 492460: loss 5.6927, time 124.86ms
iter 492470: loss 5.8720, time 125.77ms
iter 492480: loss 5.9560, time 125.21ms
iter 492490: loss 5.7489, time 126.02ms
step 492500: train loss 5.5161, val loss 5.5568
saving checkpoint to out-shakespeare-char
iter 492500: loss 5.9480, time 2909.05ms
iter 492510: loss 6.0165, time 125.74ms
iter 492520: loss 5.9071, time 125.28ms
iter 492530: loss 5.3981, time 125.35ms
iter 492540: loss 5.9034, time 125.28ms
iter 492550: loss 6.1428, time 125.36ms
iter 492560: loss 6.3784, time 125.40ms
iter 492570: loss 5.6147, time 125.72ms
iter 492580: loss 5.7227, time 125.58ms
iter 492590: loss 6.7619, time 126.97ms
iter 492600: loss 5.7669, time 125.78ms
iter 492610: loss 6.0866, time 128.19ms
iter 492620: loss 5.6252, time 124.75ms
iter 492630: loss 6.2510, time 128.03ms
iter 492640: loss 5.8296, time 125.78ms
iter 492650: loss 5.8246, time 128.89ms
iter 492660: loss 6.0127, time 125.58ms
iter 492670: loss 6.1420, time 127.84ms
iter 492680: loss 5.4416, time 125.68ms
iter 492690: loss 5.4009, time 125.50ms
iter 492700: loss 5.3938, time 125.22ms
iter 492710: loss 5.6201, time 125.85ms
iter 492720: loss 5.8102, time 125.68ms
iter 492730: loss 5.7076, time 125.24ms
iter 492740: loss 5.5737, time 125.57ms
step 492750: train loss 5.5238, val loss 5.5826
saving checkpoint to out-shakespeare-char
iter 492750: loss 5.2392, time 2902.67ms
iter 492760: loss 5.6009, time 126.32ms
iter 492770: loss 5.5895, time 125.95ms
iter 492780: loss 6.3304, time 124.70ms
iter 492790: loss 6.0057, time 124.79ms
iter 492800: loss 5.9810, time 125.14ms
iter 492810: loss 5.6442, time 124.99ms
iter 492820: loss 6.4843, time 125.76ms
iter 492830: loss 5.4495, time 128.24ms
iter 492840: loss 6.0874, time 125.33ms
iter 492850: loss 6.3946, time 121.75ms
iter 492860: loss 5.8751, time 122.90ms
iter 492870: loss 6.3647, time 121.33ms
iter 492880: loss 5.9290, time 121.67ms
iter 492890: loss 6.8422, time 122.89ms
iter 492900: loss 5.8899, time 121.62ms
iter 492910: loss 6.1434, time 121.47ms
iter 492920: loss 6.1706, time 124.43ms
iter 492930: loss 6.0748, time 121.68ms
iter 492940: loss 6.4589, time 121.21ms
iter 492950: loss 6.0266, time 121.56ms
iter 492960: loss 6.7672, time 121.38ms
iter 492970: loss 5.4469, time 121.64ms
iter 492980: loss 5.8613, time 121.92ms
iter 492990: loss 6.3117, time 122.76ms
step 493000: train loss 5.5365, val loss 5.5425
saving checkpoint to out-shakespeare-char
iter 493000: loss 6.1324, time 2892.72ms
iter 493010: loss 6.1734, time 121.56ms
iter 493020: loss 6.3574, time 122.57ms
iter 493030: loss 5.5437, time 122.69ms
iter 493040: loss 5.8639, time 121.65ms
iter 493050: loss 6.0713, time 121.63ms
iter 493060: loss 5.6587, time 124.31ms
iter 493070: loss 6.5666, time 121.47ms
iter 493080: loss 5.9674, time 121.73ms
iter 493090: loss 6.0068, time 121.59ms
iter 493100: loss 6.1721, time 122.70ms
iter 493110: loss 5.9302, time 121.72ms
iter 493120: loss 5.7554, time 121.87ms
iter 493130: loss 5.7071, time 124.16ms
iter 493140: loss 6.2144, time 120.79ms
iter 493150: loss 6.0892, time 121.61ms
iter 493160: loss 5.8418, time 121.59ms
iter 493170: loss 6.4901, time 121.91ms
iter 493180: loss 5.3740, time 121.43ms
iter 493190: loss 6.0697, time 121.85ms
iter 493200: loss 5.3816, time 122.46ms
iter 493210: loss 5.6090, time 121.23ms
iter 493220: loss 5.4188, time 121.86ms
iter 493230: loss 5.9456, time 124.15ms
iter 493240: loss 5.8951, time 121.93ms
step 493250: train loss 5.5678, val loss 5.6081
saving checkpoint to out-shakespeare-char
iter 493250: loss 6.1836, time 2883.05ms
iter 493260: loss 6.0717, time 122.00ms
iter 493270: loss 6.0550, time 124.38ms
iter 493280: loss 6.0565, time 122.14ms
iter 493290: loss 6.1604, time 121.77ms
iter 493300: loss 5.6303, time 121.72ms
iter 493310: loss 5.9914, time 120.62ms
iter 493320: loss 6.5275, time 121.25ms
iter 493330: loss 6.6650, time 122.08ms
iter 493340: loss 6.2391, time 122.96ms
iter 493350: loss 6.0472, time 121.62ms
iter 493360: loss 5.8918, time 123.44ms
iter 493370: loss 5.1968, time 124.26ms
iter 493380: loss 5.4654, time 121.76ms
iter 493390: loss 5.7966, time 121.57ms
iter 493400: loss 5.5844, time 120.95ms
iter 493410: loss 5.7458, time 121.91ms
iter 493420: loss 6.3934, time 121.91ms
iter 493430: loss 5.8676, time 121.82ms
iter 493440: loss 6.4202, time 122.73ms
iter 493450: loss 6.5611, time 122.56ms
iter 493460: loss 5.9559, time 121.79ms
iter 493470: loss 5.6921, time 122.95ms
iter 493480: loss 6.4732, time 122.09ms
iter 493490: loss 5.8413, time 121.56ms
step 493500: train loss 5.5318, val loss 5.5912
saving checkpoint to out-shakespeare-char
iter 493500: loss 6.2514, time 2919.84ms
iter 493510: loss 5.7531, time 121.25ms
iter 493520: loss 6.7746, time 121.86ms
iter 493530: loss 5.7244, time 124.32ms
iter 493540: loss 6.1616, time 121.82ms
iter 493550: loss 5.9414, time 121.75ms
iter 493560: loss 6.4229, time 121.72ms
iter 493570: loss 6.1643, time 121.99ms
iter 493580: loss 6.1023, time 121.93ms
iter 493590: loss 5.9758, time 121.37ms
iter 493600: loss 5.9142, time 122.60ms
iter 493610: loss 6.4844, time 121.74ms
iter 493620: loss 5.6715, time 121.78ms
iter 493630: loss 6.0847, time 123.06ms
iter 493640: loss 6.2738, time 121.58ms
iter 493650: loss 6.5047, time 121.85ms
iter 493660: loss 5.6340, time 122.42ms
iter 493670: loss 6.3919, time 121.84ms
iter 493680: loss 6.5894, time 121.70ms
iter 493690: loss 5.9430, time 121.11ms
iter 493700: loss 6.0408, time 122.10ms
iter 493710: loss 6.0731, time 122.68ms
iter 493720: loss 5.9166, time 121.55ms
iter 493730: loss 5.7674, time 124.43ms
iter 493740: loss 6.1038, time 121.80ms
step 493750: train loss 5.5515, val loss 5.5206
saving checkpoint to out-shakespeare-char
iter 493750: loss 6.1082, time 2906.63ms
iter 493760: loss 6.3034, time 121.51ms
iter 493770: loss 6.0231, time 122.08ms
iter 493780: loss 5.3123, time 121.39ms
iter 493790: loss 5.1667, time 121.44ms
iter 493800: loss 6.3445, time 121.59ms
iter 493810: loss 6.3224, time 122.36ms
iter 493820: loss 6.0019, time 121.06ms
iter 493830: loss 6.1371, time 121.85ms
iter 493840: loss 5.7852, time 124.04ms
iter 493850: loss 6.5078, time 121.57ms
iter 493860: loss 5.7664, time 121.74ms
iter 493870: loss 6.3318, time 121.44ms
iter 493880: loss 5.8122, time 121.15ms
iter 493890: loss 6.4587, time 122.03ms
iter 493900: loss 6.1850, time 121.17ms
iter 493910: loss 5.9705, time 121.56ms
iter 493920: loss 6.5962, time 120.22ms
iter 493930: loss 6.1605, time 120.92ms
iter 493940: loss 5.8268, time 121.99ms
iter 493950: loss 6.9938, time 123.83ms
iter 493960: loss 6.0317, time 121.35ms
iter 493970: loss 5.7941, time 121.19ms
iter 493980: loss 5.4486, time 122.48ms
iter 493990: loss 5.6635, time 123.84ms
step 494000: train loss 5.5927, val loss 5.5435
saving checkpoint to out-shakespeare-char
iter 494000: loss 5.6519, time 2878.56ms
iter 494010: loss 6.1756, time 125.56ms
iter 494020: loss 5.1445, time 125.23ms
iter 494030: loss 6.1563, time 124.31ms
iter 494040: loss 5.7025, time 125.34ms
iter 494050: loss 5.6278, time 124.45ms
iter 494060: loss 6.1649, time 125.11ms
iter 494070: loss 5.6746, time 125.10ms
iter 494080: loss 5.1486, time 125.31ms
iter 494090: loss 5.0449, time 124.34ms
iter 494100: loss 6.1234, time 124.44ms
iter 494110: loss 5.6912, time 124.47ms
iter 494120: loss 5.8112, time 125.43ms
iter 494130: loss 6.4879, time 125.09ms
iter 494140: loss 6.1857, time 125.00ms
iter 494150: loss 6.2286, time 125.43ms
iter 494160: loss 5.7156, time 125.45ms
iter 494170: loss 5.9484, time 124.55ms
iter 494180: loss 5.8165, time 125.36ms
iter 494190: loss 6.4353, time 124.14ms
iter 494200: loss 5.5827, time 124.90ms
iter 494210: loss 5.9221, time 124.56ms
iter 494220: loss 6.3645, time 124.43ms
iter 494230: loss 5.8302, time 124.90ms
iter 494240: loss 6.5707, time 124.81ms
step 494250: train loss 5.5863, val loss 5.5579
saving checkpoint to out-shakespeare-char
iter 494250: loss 6.5880, time 2910.01ms
iter 494260: loss 6.4494, time 125.17ms
iter 494270: loss 5.4157, time 125.89ms
iter 494280: loss 6.3155, time 126.16ms
iter 494290: loss 5.7312, time 125.62ms
iter 494300: loss 4.9504, time 125.48ms
iter 494310: loss 6.0662, time 126.14ms
iter 494320: loss 5.8518, time 126.32ms
iter 494330: loss 5.0455, time 125.10ms
iter 494340: loss 6.3576, time 126.02ms
iter 494350: loss 5.8289, time 126.40ms
iter 494360: loss 5.9667, time 125.27ms
iter 494370: loss 6.3505, time 124.03ms
iter 494380: loss 6.4349, time 126.11ms
iter 494390: loss 6.5893, time 125.83ms
iter 494400: loss 5.6100, time 127.53ms
iter 494410: loss 6.1233, time 124.96ms
iter 494420: loss 5.7514, time 125.50ms
iter 494430: loss 5.6117, time 125.26ms
iter 494440: loss 5.9608, time 125.49ms
iter 494450: loss 6.3133, time 125.54ms
iter 494460: loss 5.7496, time 125.57ms
iter 494470: loss 6.1715, time 125.06ms
iter 494480: loss 5.4471, time 125.18ms
iter 494490: loss 5.6637, time 125.32ms
step 494500: train loss 5.5711, val loss 5.5391
saving checkpoint to out-shakespeare-char
iter 494500: loss 5.4678, time 2912.95ms
iter 494510: loss 5.8364, time 124.98ms
iter 494520: loss 6.6165, time 125.44ms
iter 494530: loss 6.2263, time 125.23ms
iter 494540: loss 6.3379, time 125.51ms
iter 494550: loss 6.2786, time 125.00ms
iter 494560: loss 5.8407, time 125.28ms
iter 494570: loss 6.2442, time 125.42ms
iter 494580: loss 5.5114, time 125.12ms
iter 494590: loss 5.6339, time 124.85ms
iter 494600: loss 5.2436, time 127.86ms
iter 494610: loss 5.6767, time 125.60ms
iter 494620: loss 6.2602, time 125.51ms
iter 494630: loss 6.2026, time 125.64ms
iter 494640: loss 5.7109, time 125.91ms
iter 494650: loss 5.3187, time 125.64ms
iter 494660: loss 5.4771, time 125.62ms
iter 494670: loss 5.8543, time 125.55ms
iter 494680: loss 5.4597, time 126.84ms
iter 494690: loss 6.1481, time 125.71ms
iter 494700: loss 6.0304, time 126.11ms
iter 494710: loss 5.9921, time 126.34ms
iter 494720: loss 6.2370, time 125.39ms
iter 494730: loss 5.6252, time 125.72ms
iter 494740: loss 5.3914, time 126.18ms
step 494750: train loss 5.5955, val loss 5.5596
saving checkpoint to out-shakespeare-char
iter 494750: loss 5.1982, time 2893.36ms
iter 494760: loss 6.2254, time 122.13ms
iter 494770: loss 6.9517, time 121.38ms
iter 494780: loss 5.1224, time 124.36ms
iter 494790: loss 5.9742, time 121.65ms
iter 494800: loss 5.4782, time 121.33ms
iter 494810: loss 5.8052, time 121.60ms
iter 494820: loss 6.4555, time 121.68ms
iter 494830: loss 5.9523, time 121.48ms
iter 494840: loss 5.9711, time 121.75ms
iter 494850: loss 5.5788, time 122.67ms
iter 494860: loss 5.2959, time 121.48ms
iter 494870: loss 5.3790, time 121.65ms
iter 494880: loss 6.4958, time 122.57ms
iter 494890: loss 5.7400, time 122.57ms
iter 494900: loss 5.9333, time 121.37ms
iter 494910: loss 5.9016, time 124.36ms
iter 494920: loss 6.4791, time 121.55ms
iter 494930: loss 5.3349, time 121.49ms
iter 494940: loss 6.1160, time 121.29ms
iter 494950: loss 6.1998, time 121.60ms
iter 494960: loss 6.2788, time 121.78ms
iter 494970: loss 5.2806, time 121.33ms
iter 494980: loss 5.6697, time 122.94ms
iter 494990: loss 5.5405, time 121.56ms
step 495000: train loss 5.5752, val loss 5.5599
saving checkpoint to out-shakespeare-char
iter 495000: loss 6.1261, time 2902.07ms
iter 495010: loss 5.5292, time 121.51ms
iter 495020: loss 5.4117, time 123.21ms
iter 495030: loss 6.2740, time 121.49ms
iter 495040: loss 6.0544, time 121.74ms
iter 495050: loss 5.8501, time 122.94ms
iter 495060: loss 6.5305, time 121.38ms
iter 495070: loss 6.3356, time 121.64ms
iter 495080: loss 5.9353, time 124.25ms
iter 495090: loss 5.4449, time 121.74ms
iter 495100: loss 5.6748, time 121.46ms
iter 495110: loss 5.7263, time 121.24ms
iter 495120: loss 5.8608, time 122.13ms
iter 495130: loss 5.7514, time 121.62ms
iter 495140: loss 6.4524, time 121.49ms
iter 495150: loss 5.8951, time 122.94ms
iter 495160: loss 5.8740, time 121.31ms
iter 495170: loss 6.3643, time 121.25ms
iter 495180: loss 6.2342, time 122.79ms
iter 495190: loss 5.9223, time 121.29ms
iter 495200: loss 5.6145, time 121.34ms
iter 495210: loss 5.8854, time 124.49ms
iter 495220: loss 6.9217, time 121.49ms
iter 495230: loss 6.0996, time 121.60ms
iter 495240: loss 5.3452, time 121.72ms
step 495250: train loss 5.5896, val loss 5.5323
saving checkpoint to out-shakespeare-char
iter 495250: loss 5.9702, time 2909.39ms
iter 495260: loss 5.9808, time 121.41ms
iter 495270: loss 6.1191, time 121.95ms
iter 495280: loss 6.0379, time 121.50ms
iter 495290: loss 6.3560, time 120.48ms
iter 495300: loss 6.0011, time 121.64ms
iter 495310: loss 5.7440, time 122.81ms
iter 495320: loss 5.7875, time 121.26ms
iter 495330: loss 5.4968, time 121.19ms
iter 495340: loss 6.6301, time 122.45ms
iter 495350: loss 6.1198, time 121.14ms
iter 495360: loss 5.8952, time 121.47ms
iter 495370: loss 5.8630, time 124.20ms
iter 495380: loss 5.8927, time 121.39ms
iter 495390: loss 5.6950, time 121.47ms
iter 495400: loss 5.7805, time 121.75ms
iter 495410: loss 6.6032, time 121.56ms
iter 495420: loss 6.3228, time 121.17ms
iter 495430: loss 6.0665, time 121.18ms
iter 495440: loss 6.1350, time 122.47ms
iter 495450: loss 5.6611, time 121.68ms
iter 495460: loss 6.0430, time 121.08ms
iter 495470: loss 6.1990, time 122.46ms
iter 495480: loss 5.5315, time 121.65ms
iter 495490: loss 6.0267, time 121.68ms
step 495500: train loss 5.5410, val loss 5.5667
saving checkpoint to out-shakespeare-char
iter 495500: loss 6.1088, time 2907.94ms
iter 495510: loss 5.9156, time 125.77ms
iter 495520: loss 5.9225, time 125.93ms
iter 495530: loss 5.9826, time 125.89ms
iter 495540: loss 5.9905, time 125.36ms
iter 495550: loss 5.7912, time 125.89ms
iter 495560: loss 6.5214, time 125.21ms
iter 495570: loss 6.5884, time 125.47ms
iter 495580: loss 5.4471, time 125.50ms
iter 495590: loss 5.7813, time 125.54ms
iter 495600: loss 6.1447, time 125.82ms
iter 495610: loss 5.9873, time 125.98ms
iter 495620: loss 5.8148, time 125.40ms
iter 495630: loss 6.3081, time 125.66ms
iter 495640: loss 6.5803, time 125.66ms
iter 495650: loss 6.3283, time 126.00ms
iter 495660: loss 5.4113, time 125.38ms
iter 495670: loss 6.4497, time 125.49ms
iter 495680: loss 5.8645, time 125.84ms
iter 495690: loss 6.4023, time 125.45ms
iter 495700: loss 5.3641, time 125.80ms
iter 495710: loss 5.5585, time 125.60ms
iter 495720: loss 6.2100, time 125.44ms
iter 495730: loss 6.1250, time 125.30ms
iter 495740: loss 5.8223, time 125.75ms
step 495750: train loss 5.5457, val loss 5.5926
saving checkpoint to out-shakespeare-char
iter 495750: loss 6.2309, time 2896.09ms
iter 495760: loss 6.5011, time 125.40ms
iter 495770: loss 5.5908, time 125.71ms
iter 495780: loss 5.6058, time 124.97ms
iter 495790: loss 5.8611, time 125.35ms
iter 495800: loss 5.7817, time 124.99ms
iter 495810: loss 5.6306, time 125.31ms
iter 495820: loss 6.4681, time 125.13ms
iter 495830: loss 5.6207, time 125.07ms
iter 495840: loss 6.0095, time 124.99ms
iter 495850: loss 5.7519, time 126.43ms
iter 495860: loss 6.1141, time 125.69ms
iter 495870: loss 6.4490, time 125.51ms
iter 495880: loss 5.1006, time 126.13ms
iter 495890: loss 6.7430, time 124.92ms
iter 495900: loss 6.2059, time 125.66ms
iter 495910: loss 6.2534, time 125.80ms
iter 495920: loss 5.2820, time 125.79ms
iter 495930: loss 6.4076, time 125.62ms
iter 495940: loss 5.6123, time 126.00ms
iter 495950: loss 6.5014, time 125.87ms
iter 495960: loss 5.9442, time 125.61ms
iter 495970: loss 6.0481, time 125.47ms
iter 495980: loss 6.2150, time 125.63ms
iter 495990: loss 5.8170, time 125.54ms
step 496000: train loss 5.5501, val loss 5.5731
saving checkpoint to out-shakespeare-char
iter 496000: loss 5.7477, time 2876.99ms
iter 496010: loss 6.7216, time 126.14ms
iter 496020: loss 5.7212, time 128.59ms
iter 496030: loss 5.6397, time 126.27ms
iter 496040: loss 6.2281, time 128.39ms
iter 496050: loss 5.2551, time 125.70ms
iter 496060: loss 5.9140, time 128.40ms
iter 496070: loss 5.4007, time 126.50ms
iter 496080: loss 5.8024, time 128.33ms
iter 496090: loss 5.8963, time 125.86ms
iter 496100: loss 6.8531, time 128.33ms
iter 496110: loss 6.3786, time 125.55ms
iter 496120: loss 5.2848, time 128.18ms
iter 496130: loss 5.8475, time 125.56ms
iter 496140: loss 5.9063, time 128.20ms
iter 496150: loss 6.2576, time 125.82ms
iter 496160: loss 5.9145, time 128.73ms
iter 496170: loss 5.8964, time 125.50ms
iter 496180: loss 6.3509, time 128.27ms
iter 496190: loss 5.5401, time 125.80ms
iter 496200: loss 5.4621, time 125.96ms
iter 496210: loss 6.3377, time 125.57ms
iter 496220: loss 6.4656, time 125.68ms
iter 496230: loss 6.2466, time 125.71ms
iter 496240: loss 5.9651, time 128.38ms
step 496250: train loss 5.5721, val loss 5.5923
saving checkpoint to out-shakespeare-char
iter 496250: loss 5.9354, time 2881.80ms
iter 496260: loss 5.6240, time 121.30ms
iter 496270: loss 5.9248, time 122.83ms
iter 496280: loss 6.5411, time 121.19ms
iter 496290: loss 6.3622, time 121.45ms
iter 496300: loss 5.7189, time 124.29ms
iter 496310: loss 5.6322, time 121.53ms
iter 496320: loss 5.3099, time 121.51ms
iter 496330: loss 5.9742, time 121.45ms
iter 496340: loss 6.0782, time 121.66ms
iter 496350: loss 6.3349, time 120.87ms
iter 496360: loss 6.0743, time 121.78ms
iter 496370: loss 6.0738, time 122.91ms
iter 496380: loss 5.1286, time 121.73ms
iter 496390: loss 5.9232, time 120.66ms
iter 496400: loss 5.9843, time 122.76ms
iter 496410: loss 5.6252, time 120.74ms
iter 496420: loss 6.2169, time 121.68ms
iter 496430: loss 5.8479, time 124.20ms
iter 496440: loss 6.0929, time 121.59ms
iter 496450: loss 6.0810, time 121.64ms
iter 496460: loss 5.7966, time 122.16ms
iter 496470: loss 6.5058, time 121.59ms
iter 496480: loss 6.8946, time 121.71ms
iter 496490: loss 5.8987, time 121.66ms
step 496500: train loss 5.5560, val loss 5.5935
saving checkpoint to out-shakespeare-char
iter 496500: loss 6.5785, time 2909.63ms
iter 496510: loss 5.7604, time 120.76ms
iter 496520: loss 5.5972, time 121.49ms
iter 496530: loss 6.0772, time 124.31ms
iter 496540: loss 5.5549, time 121.67ms
iter 496550: loss 6.0201, time 121.32ms
iter 496560: loss 5.7230, time 122.15ms
iter 496570: loss 5.9830, time 121.58ms
iter 496580: loss 5.9165, time 121.54ms
iter 496590: loss 6.0531, time 121.73ms
iter 496600: loss 6.1195, time 122.48ms
iter 496610: loss 5.6882, time 121.68ms
iter 496620: loss 6.8397, time 121.43ms
iter 496630: loss 5.7324, time 122.74ms
iter 496640: loss 6.2908, time 121.43ms
iter 496650: loss 5.6704, time 121.89ms
iter 496660: loss 6.1774, time 124.16ms
iter 496670: loss 6.5790, time 120.83ms
iter 496680: loss 6.0494, time 121.35ms
iter 496690: loss 5.9284, time 121.73ms
iter 496700: loss 6.6566, time 121.63ms
iter 496710: loss 6.0763, time 121.49ms
iter 496720: loss 6.2629, time 121.71ms
iter 496730: loss 5.6805, time 122.76ms
iter 496740: loss 6.6649, time 121.63ms
step 496750: train loss 5.5476, val loss 5.5580
saving checkpoint to out-shakespeare-char
iter 496750: loss 6.3800, time 2909.55ms
iter 496760: loss 5.9664, time 122.76ms
iter 496770: loss 5.4530, time 121.94ms
iter 496780: loss 5.4855, time 121.47ms
iter 496790: loss 6.2207, time 124.21ms
iter 496800: loss 5.8445, time 121.61ms
iter 496810: loss 5.6357, time 121.41ms
iter 496820: loss 6.2543, time 121.50ms
iter 496830: loss 5.9454, time 122.04ms
iter 496840: loss 5.7036, time 121.49ms
iter 496850: loss 4.9657, time 121.81ms
iter 496860: loss 6.0979, time 122.68ms
iter 496870: loss 5.7337, time 121.54ms
iter 496880: loss 6.1120, time 121.35ms
iter 496890: loss 5.3676, time 122.80ms
iter 496900: loss 5.7705, time 121.42ms
iter 496910: loss 6.4853, time 121.49ms
iter 496920: loss 6.0089, time 124.25ms
iter 496930: loss 5.7413, time 121.70ms
iter 496940: loss 6.0825, time 121.37ms
iter 496950: loss 5.4366, time 121.64ms
iter 496960: loss 6.5716, time 121.72ms
iter 496970: loss 6.6954, time 121.34ms
iter 496980: loss 6.1320, time 121.54ms
iter 496990: loss 6.0239, time 123.04ms
step 497000: train loss 5.5171, val loss 5.5820
saving checkpoint to out-shakespeare-char
iter 497000: loss 5.5054, time 2906.85ms
iter 497010: loss 6.0359, time 121.24ms
iter 497020: loss 5.8921, time 122.59ms
iter 497030: loss 5.0563, time 121.72ms
iter 497040: loss 6.0972, time 121.43ms
iter 497050: loss 6.0273, time 121.60ms
iter 497060: loss 5.6936, time 122.00ms
iter 497070: loss 6.4706, time 121.31ms
iter 497080: loss 5.4548, time 121.44ms
iter 497090: loss 6.0114, time 122.63ms
iter 497100: loss 5.9644, time 121.24ms
iter 497110: loss 6.2971, time 121.39ms
iter 497120: loss 6.0589, time 122.70ms
iter 497130: loss 5.7636, time 121.34ms
iter 497140: loss 6.1465, time 121.61ms
iter 497150: loss 5.4338, time 124.17ms
iter 497160: loss 5.9157, time 121.57ms
iter 497170: loss 6.0073, time 121.28ms
iter 497180: loss 6.4223, time 121.58ms
iter 497190: loss 6.1635, time 121.41ms
iter 497200: loss 5.9429, time 121.52ms
iter 497210: loss 6.1905, time 121.79ms
iter 497220: loss 6.0268, time 122.57ms
iter 497230: loss 5.1081, time 121.61ms
iter 497240: loss 5.8972, time 122.00ms
step 497250: train loss 5.5785, val loss 5.5562
saving checkpoint to out-shakespeare-char
iter 497250: loss 5.9533, time 2894.85ms
iter 497260: loss 6.1328, time 123.06ms
iter 497270: loss 6.1812, time 121.58ms
iter 497280: loss 5.6510, time 121.62ms
iter 497290: loss 5.6979, time 122.53ms
iter 497300: loss 7.0957, time 121.40ms
iter 497310: loss 5.9857, time 121.45ms
iter 497320: loss 5.0439, time 123.98ms
iter 497330: loss 5.9500, time 121.69ms
iter 497340: loss 6.7733, time 121.25ms
iter 497350: loss 5.3838, time 121.39ms
iter 497360: loss 5.3609, time 121.79ms
iter 497370: loss 6.4284, time 122.26ms
iter 497380: loss 5.6954, time 121.92ms
iter 497390: loss 6.8818, time 122.54ms
iter 497400: loss 6.1615, time 121.33ms
iter 497410: loss 5.8304, time 121.43ms
iter 497420: loss 6.2706, time 123.64ms
iter 497430: loss 6.4608, time 119.69ms
iter 497440: loss 5.6605, time 120.91ms
iter 497450: loss 6.2509, time 121.38ms
iter 497460: loss 5.0416, time 121.13ms
iter 497470: loss 5.9482, time 121.49ms
iter 497480: loss 5.9839, time 121.26ms
iter 497490: loss 6.0464, time 122.27ms
step 497500: train loss 5.5373, val loss 5.5456
saving checkpoint to out-shakespeare-char
iter 497500: loss 5.9474, time 2908.70ms
iter 497510: loss 5.0793, time 121.65ms
iter 497520: loss 6.1152, time 121.51ms
iter 497530: loss 5.6233, time 122.68ms
iter 497540: loss 6.3116, time 121.47ms
iter 497550: loss 6.6070, time 121.26ms
iter 497560: loss 6.0940, time 124.69ms
iter 497570: loss 6.1299, time 121.40ms
iter 497580: loss 5.8259, time 121.32ms
iter 497590: loss 5.9782, time 121.24ms
iter 497600: loss 5.4408, time 121.96ms
iter 497610: loss 5.8471, time 121.48ms
iter 497620: loss 5.7014, time 120.57ms
iter 497630: loss 6.6233, time 123.05ms
iter 497640: loss 6.3086, time 121.69ms
iter 497650: loss 5.8460, time 122.27ms
iter 497660: loss 6.5495, time 122.98ms
iter 497670: loss 5.5599, time 122.53ms
iter 497680: loss 5.5557, time 121.69ms
iter 497690: loss 5.7901, time 124.12ms
iter 497700: loss 6.3221, time 121.46ms
iter 497710: loss 5.3984, time 121.75ms
iter 497720: loss 5.9358, time 121.61ms
iter 497730: loss 5.5278, time 121.53ms
iter 497740: loss 5.9576, time 121.53ms
step 497750: train loss 5.5664, val loss 5.5660
saving checkpoint to out-shakespeare-char
iter 497750: loss 6.3815, time 2901.93ms
iter 497760: loss 6.0717, time 123.57ms
iter 497770: loss 6.0833, time 121.50ms
iter 497780: loss 6.8764, time 121.27ms
iter 497790: loss 5.7746, time 121.47ms
iter 497800: loss 6.0179, time 121.53ms
iter 497810: loss 5.8815, time 121.17ms
iter 497820: loss 6.0224, time 121.74ms
iter 497830: loss 6.0810, time 122.88ms
iter 497840: loss 5.7263, time 121.87ms
iter 497850: loss 5.3956, time 121.73ms
iter 497860: loss 6.0194, time 122.56ms
iter 497870: loss 5.8032, time 121.56ms
iter 497880: loss 7.0359, time 121.42ms
iter 497890: loss 4.8969, time 121.56ms
iter 497900: loss 5.7411, time 120.94ms
iter 497910: loss 5.5042, time 121.52ms
iter 497920: loss 5.9487, time 121.93ms
iter 497930: loss 6.6036, time 121.57ms
iter 497940: loss 5.2893, time 121.54ms
iter 497950: loss 5.0335, time 121.42ms
iter 497960: loss 6.4892, time 122.53ms
iter 497970: loss 6.2000, time 121.31ms
iter 497980: loss 6.3578, time 121.56ms
iter 497990: loss 5.2957, time 124.07ms
step 498000: train loss 5.5002, val loss 5.5952
saving checkpoint to out-shakespeare-char
iter 498000: loss 6.2644, time 2894.77ms
iter 498010: loss 6.0756, time 125.39ms
iter 498020: loss 6.4826, time 125.27ms
iter 498030: loss 6.0141, time 124.14ms
iter 498040: loss 6.1272, time 125.09ms
iter 498050: loss 6.2535, time 125.27ms
iter 498060: loss 6.2238, time 125.55ms
iter 498070: loss 6.0464, time 125.32ms
iter 498080: loss 5.7691, time 125.08ms
iter 498090: loss 6.1276, time 124.96ms
iter 498100: loss 6.3647, time 124.88ms
iter 498110: loss 5.6370, time 125.17ms
iter 498120: loss 6.0908, time 125.41ms
iter 498130: loss 6.3766, time 125.20ms
iter 498140: loss 6.0933, time 125.14ms
iter 498150: loss 5.4667, time 125.48ms
iter 498160: loss 6.2324, time 125.04ms
iter 498170: loss 5.9285, time 125.16ms
iter 498180: loss 5.9143, time 125.29ms
iter 498190: loss 6.2064, time 125.18ms
iter 498200: loss 6.3270, time 125.49ms
iter 498210: loss 6.4388, time 125.21ms
iter 498220: loss 5.7617, time 125.58ms
iter 498230: loss 5.8360, time 126.12ms
iter 498240: loss 5.2794, time 125.29ms
step 498250: train loss 5.5272, val loss 5.5866
saving checkpoint to out-shakespeare-char
iter 498250: loss 5.6015, time 2896.34ms
iter 498260: loss 5.7255, time 125.63ms
iter 498270: loss 5.2671, time 125.51ms
iter 498280: loss 5.1800, time 126.36ms
iter 498290: loss 5.5287, time 125.99ms
iter 498300: loss 5.9465, time 124.81ms
iter 498310: loss 5.9099, time 125.29ms
iter 498320: loss 5.9279, time 125.87ms
iter 498330: loss 6.6493, time 125.53ms
iter 498340: loss 5.8883, time 125.04ms
iter 498350: loss 5.6233, time 125.53ms
iter 498360: loss 6.1931, time 124.61ms
iter 498370: loss 5.9679, time 125.59ms
iter 498380: loss 5.8136, time 125.24ms
iter 498390: loss 6.0656, time 125.42ms
iter 498400: loss 5.4299, time 124.81ms
iter 498410: loss 5.6823, time 125.78ms
iter 498420: loss 6.8373, time 127.41ms
iter 498430: loss 6.0858, time 125.54ms
iter 498440: loss 6.6974, time 125.16ms
iter 498450: loss 6.7691, time 126.03ms
iter 498460: loss 5.6426, time 124.70ms
iter 498470: loss 5.7925, time 125.77ms
iter 498480: loss 5.3864, time 124.70ms
iter 498490: loss 5.7827, time 125.64ms
step 498500: train loss 5.5628, val loss 5.5680
saving checkpoint to out-shakespeare-char
iter 498500: loss 6.2592, time 2898.11ms
iter 498510: loss 5.5626, time 119.77ms
iter 498520: loss 5.7029, time 119.72ms
iter 498530: loss 6.0366, time 126.94ms
iter 498540: loss 6.2330, time 125.89ms
iter 498550: loss 5.6530, time 125.73ms
iter 498560: loss 5.5198, time 126.60ms
iter 498570: loss 6.1021, time 121.73ms
iter 498580: loss 6.1093, time 124.56ms
iter 498590: loss 6.0759, time 122.30ms
iter 498600: loss 5.6757, time 121.62ms
iter 498610: loss 5.4326, time 124.12ms
iter 498620: loss 5.4310, time 124.76ms
iter 498630: loss 5.9005, time 125.64ms
iter 498640: loss 6.1354, time 125.14ms
iter 498650: loss 6.9162, time 129.33ms
iter 498660: loss 6.3657, time 130.04ms
iter 498670: loss 5.3623, time 128.02ms
iter 498680: loss 5.5442, time 126.17ms
iter 498690: loss 5.5968, time 125.93ms
iter 498700: loss 5.4957, time 125.13ms
iter 498710: loss 6.0393, time 124.78ms
iter 498720: loss 6.0575, time 125.85ms
iter 498730: loss 6.5784, time 126.75ms
iter 498740: loss 6.1975, time 126.10ms
step 498750: train loss 5.5370, val loss 5.5505
saving checkpoint to out-shakespeare-char
iter 498750: loss 5.2482, time 2893.82ms
iter 498760: loss 5.9858, time 125.09ms
iter 498770: loss 5.7805, time 125.76ms
iter 498780: loss 6.2934, time 125.96ms
iter 498790: loss 6.3815, time 124.50ms
iter 498800: loss 5.9585, time 125.46ms
iter 498810: loss 6.3326, time 125.08ms
iter 498820: loss 5.9120, time 125.29ms
iter 498830: loss 5.4945, time 125.42ms
iter 498840: loss 5.5507, time 125.57ms
iter 498850: loss 6.1651, time 121.60ms
iter 498860: loss 5.8807, time 121.72ms
iter 498870: loss 5.6302, time 121.45ms
iter 498880: loss 6.5711, time 121.50ms
iter 498890: loss 5.9045, time 122.49ms
iter 498900: loss 5.9556, time 123.01ms
iter 498910: loss 6.0515, time 121.70ms
iter 498920: loss 5.7322, time 121.65ms
iter 498930: loss 5.2304, time 122.80ms
iter 498940: loss 6.3869, time 121.88ms
iter 498950: loss 5.6609, time 121.62ms
iter 498960: loss 5.2567, time 124.08ms
iter 498970: loss 6.3029, time 122.01ms
iter 498980: loss 5.7078, time 121.75ms
iter 498990: loss 5.2010, time 121.91ms
step 499000: train loss 5.5861, val loss 5.6016
saving checkpoint to out-shakespeare-char
iter 499000: loss 6.2583, time 2925.50ms
iter 499010: loss 6.2995, time 121.37ms
iter 499020: loss 5.8135, time 121.48ms
iter 499030: loss 6.7876, time 124.86ms
iter 499040: loss 6.2636, time 121.76ms
iter 499050: loss 6.2868, time 121.88ms
iter 499060: loss 5.9511, time 121.51ms
iter 499070: loss 5.5963, time 121.99ms
iter 499080: loss 5.4189, time 121.57ms
iter 499090: loss 5.9004, time 122.01ms
iter 499100: loss 6.0711, time 123.14ms
iter 499110: loss 6.0420, time 121.45ms
iter 499120: loss 6.0117, time 121.42ms
iter 499130: loss 6.1182, time 122.76ms
iter 499140: loss 5.9881, time 121.76ms
iter 499150: loss 5.9837, time 121.58ms
iter 499160: loss 5.6654, time 123.82ms
iter 499170: loss 5.6856, time 121.71ms
iter 499180: loss 6.2730, time 121.36ms
iter 499190: loss 5.1284, time 121.64ms
iter 499200: loss 4.9396, time 121.66ms
iter 499210: loss 6.3026, time 121.77ms
iter 499220: loss 5.9459, time 121.57ms
iter 499230: loss 5.9362, time 123.42ms
iter 499240: loss 5.3998, time 121.34ms
step 499250: train loss 5.5762, val loss 5.5416
saving checkpoint to out-shakespeare-char
iter 499250: loss 6.2457, time 2878.94ms
iter 499260: loss 5.7693, time 121.55ms
iter 499270: loss 5.8742, time 122.02ms
iter 499280: loss 6.6795, time 121.56ms
iter 499290: loss 6.3510, time 121.44ms
iter 499300: loss 5.2913, time 123.78ms
iter 499310: loss 6.0126, time 121.51ms
iter 499320: loss 5.4054, time 120.63ms
iter 499330: loss 5.9850, time 121.64ms
iter 499340: loss 6.3823, time 121.44ms
iter 499350: loss 6.3062, time 121.25ms
iter 499360: loss 6.2081, time 121.73ms
iter 499370: loss 6.8279, time 122.82ms
iter 499380: loss 6.0945, time 120.67ms
iter 499390: loss 5.7350, time 121.27ms
iter 499400: loss 6.2126, time 121.63ms
iter 499410: loss 6.8182, time 121.64ms
iter 499420: loss 6.2863, time 121.45ms
iter 499430: loss 5.4054, time 120.61ms
iter 499440: loss 5.1186, time 122.66ms
iter 499450: loss 6.5014, time 121.43ms
iter 499460: loss 6.2690, time 121.44ms
iter 499470: loss 6.3703, time 123.08ms
iter 499480: loss 5.8820, time 121.23ms
iter 499490: loss 5.8966, time 121.07ms
step 499500: train loss 5.5256, val loss 5.5688
saving checkpoint to out-shakespeare-char
iter 499500: loss 5.6771, time 2900.48ms
iter 499510: loss 5.0935, time 121.82ms
iter 499520: loss 5.8872, time 121.78ms
iter 499530: loss 5.8895, time 121.40ms
iter 499540: loss 6.5990, time 123.00ms
iter 499550: loss 5.9594, time 121.55ms
iter 499560: loss 5.4300, time 121.44ms
iter 499570: loss 6.3979, time 122.99ms
iter 499580: loss 6.1188, time 121.90ms
iter 499590: loss 5.6862, time 121.79ms
iter 499600: loss 5.2471, time 124.38ms
iter 499610: loss 5.2756, time 121.54ms
iter 499620: loss 4.9581, time 121.21ms
iter 499630: loss 6.0302, time 121.70ms
iter 499640: loss 5.7179, time 123.27ms
iter 499650: loss 5.1702, time 121.54ms
iter 499660: loss 6.0604, time 121.89ms
iter 499670: loss 5.6579, time 122.65ms
iter 499680: loss 5.4578, time 121.33ms
iter 499690: loss 5.9056, time 122.86ms
iter 499700: loss 6.1313, time 124.15ms
iter 499710: loss 5.9269, time 121.55ms
iter 499720: loss 5.9482, time 121.50ms
iter 499730: loss 5.6215, time 121.92ms
iter 499740: loss 5.9261, time 121.57ms
step 499750: train loss 5.5457, val loss 5.5890
saving checkpoint to out-shakespeare-char
iter 499750: loss 6.0037, time 2908.35ms
iter 499760: loss 5.9672, time 121.39ms
iter 499770: loss 5.2993, time 121.69ms
iter 499780: loss 5.7183, time 120.71ms
iter 499790: loss 5.7095, time 121.67ms
iter 499800: loss 6.4118, time 121.95ms
iter 499810: loss 6.1562, time 123.59ms
iter 499820: loss 5.8102, time 120.02ms
iter 499830: loss 6.5944, time 121.72ms
iter 499840: loss 5.3618, time 123.76ms
iter 499850: loss 5.9926, time 121.13ms
iter 499860: loss 5.6379, time 121.35ms
iter 499870: loss 5.0956, time 121.67ms
iter 499880: loss 5.8710, time 122.92ms
iter 499890: loss 6.1348, time 121.68ms
iter 499900: loss 5.6899, time 121.65ms
iter 499910: loss 5.6770, time 122.54ms
iter 499920: loss 7.1149, time 121.80ms
iter 499930: loss 6.0503, time 121.51ms
iter 499940: loss 5.7632, time 122.94ms
iter 499950: loss 5.3172, time 121.00ms
iter 499960: loss 5.8945, time 121.67ms
iter 499970: loss 6.2582, time 124.29ms
iter 499980: loss 6.0444, time 122.50ms
iter 499990: loss 5.8127, time 120.81ms
step 500000: train loss 5.5207, val loss 5.5577
saving checkpoint to out-shakespeare-char
iter 500000: loss 6.1752, time 2919.23ms
iter 500010: loss 6.0124, time 125.62ms
iter 500020: loss 6.2735, time 126.79ms
iter 500030: loss 5.1690, time 127.21ms
iter 500040: loss 5.7470, time 128.46ms
iter 500050: loss 6.2768, time 125.08ms
iter 500060: loss 6.1124, time 128.15ms
iter 500070: loss 6.5469, time 125.67ms
iter 500080: loss 6.3779, time 128.06ms
iter 500090: loss 6.8330, time 125.42ms
iter 500100: loss 6.1304, time 127.41ms
iter 500110: loss 5.8617, time 124.75ms
iter 500120: loss 5.6414, time 125.89ms
iter 500130: loss 5.9862, time 125.57ms
iter 500140: loss 6.0064, time 125.62ms
iter 500150: loss 6.0105, time 124.80ms
iter 500160: loss 5.7650, time 125.67ms
iter 500170: loss 5.4112, time 125.79ms
iter 500180: loss 6.1816, time 126.01ms
iter 500190: loss 6.3263, time 125.65ms
iter 500200: loss 5.8996, time 125.73ms
iter 500210: loss 6.3776, time 124.91ms
iter 500220: loss 6.0054, time 125.17ms
iter 500230: loss 6.1350, time 124.85ms
iter 500240: loss 5.2925, time 125.71ms
step 500250: train loss 5.5419, val loss 5.5483
saving checkpoint to out-shakespeare-char
iter 500250: loss 6.2867, time 2904.94ms
iter 500260: loss 5.9779, time 125.16ms
iter 500270: loss 5.8701, time 124.75ms
iter 500280: loss 6.4396, time 125.43ms
iter 500290: loss 5.7795, time 125.15ms
iter 500300: loss 5.8886, time 126.03ms
iter 500310: loss 6.3325, time 125.49ms
iter 500320: loss 5.6146, time 125.60ms
iter 500330: loss 6.3034, time 125.38ms
iter 500340: loss 6.1545, time 125.48ms
iter 500350: loss 5.6352, time 125.26ms
iter 500360: loss 5.4111, time 126.13ms
iter 500370: loss 5.3881, time 125.54ms
iter 500380: loss 6.3746, time 125.43ms
iter 500390: loss 5.9521, time 125.32ms
iter 500400: loss 5.4053, time 125.39ms
iter 500410: loss 5.9264, time 125.47ms
iter 500420: loss 5.9457, time 126.02ms
iter 500430: loss 6.3283, time 125.77ms
iter 500440: loss 5.8395, time 124.29ms
iter 500450: loss 6.2433, time 124.34ms
iter 500460: loss 5.7184, time 125.67ms
iter 500470: loss 5.9446, time 125.29ms
iter 500480: loss 5.9723, time 125.15ms
iter 500490: loss 6.2087, time 125.11ms
step 500500: train loss 5.5393, val loss 5.5526
saving checkpoint to out-shakespeare-char
iter 500500: loss 5.6704, time 2900.82ms
iter 500510: loss 6.0080, time 125.55ms
iter 500520: loss 5.8342, time 126.87ms
iter 500530: loss 5.8455, time 125.51ms
iter 500540: loss 5.9979, time 125.48ms
iter 500550: loss 5.6349, time 125.49ms
iter 500560: loss 5.7339, time 125.63ms
iter 500570: loss 6.8702, time 125.65ms
iter 500580: loss 6.1730, time 124.96ms
iter 500590: loss 5.7764, time 125.33ms
iter 500600: loss 5.9004, time 125.21ms
iter 500610: loss 5.2815, time 124.99ms
iter 500620: loss 6.5281, time 124.97ms
iter 500630: loss 5.5178, time 125.22ms
iter 500640: loss 6.0168, time 124.97ms
iter 500650: loss 6.1106, time 125.12ms
iter 500660: loss 6.0122, time 125.00ms
iter 500670: loss 5.7620, time 125.03ms
iter 500680: loss 5.9878, time 125.12ms
iter 500690: loss 6.1112, time 125.09ms
iter 500700: loss 5.8554, time 124.33ms
iter 500710: loss 5.9218, time 125.22ms
iter 500720: loss 6.0624, time 125.10ms
iter 500730: loss 5.5842, time 125.29ms
iter 500740: loss 6.0826, time 125.16ms
step 500750: train loss 5.5759, val loss 5.5306
saving checkpoint to out-shakespeare-char
iter 500750: loss 5.8297, time 2879.47ms
iter 500760: loss 5.1146, time 121.93ms
iter 500770: loss 6.1438, time 121.37ms
iter 500780: loss 5.6903, time 121.62ms
iter 500790: loss 6.3852, time 122.55ms
iter 500800: loss 5.4406, time 121.48ms
iter 500810: loss 6.4267, time 121.20ms
iter 500820: loss 5.9520, time 120.49ms
iter 500830: loss 5.6839, time 121.50ms
iter 500840: loss 6.1650, time 120.50ms
iter 500850: loss 6.3412, time 121.09ms
iter 500860: loss 5.7649, time 122.02ms
iter 500870: loss 6.0779, time 121.48ms
iter 500880: loss 6.2833, time 121.63ms
iter 500890: loss 6.0777, time 124.09ms
iter 500900: loss 5.5611, time 121.43ms
iter 500910: loss 5.4794, time 120.61ms
iter 500920: loss 5.9225, time 121.64ms
iter 500930: loss 6.0261, time 121.57ms
iter 500940: loss 6.3627, time 120.83ms
iter 500950: loss 5.6040, time 121.39ms
iter 500960: loss 5.7252, time 123.81ms
iter 500970: loss 6.1835, time 122.15ms
iter 500980: loss 6.4023, time 121.33ms
iter 500990: loss 5.8907, time 122.67ms
step 501000: train loss 5.5642, val loss 5.5880
saving checkpoint to out-shakespeare-char
iter 501000: loss 5.6367, time 2891.06ms
iter 501010: loss 5.8339, time 121.90ms
iter 501020: loss 6.3047, time 121.92ms
iter 501030: loss 5.8948, time 124.42ms
iter 501040: loss 6.2052, time 121.97ms
iter 501050: loss 5.8838, time 121.42ms
iter 501060: loss 5.4659, time 121.50ms
iter 501070: loss 5.8081, time 121.92ms
iter 501080: loss 6.4219, time 122.39ms
iter 501090: loss 6.3342, time 122.15ms
iter 501100: loss 6.2421, time 123.43ms
iter 501110: loss 6.0815, time 121.96ms
iter 501120: loss 5.4971, time 122.06ms
iter 501130: loss 6.4831, time 122.45ms
iter 501140: loss 5.4253, time 121.71ms
iter 501150: loss 6.1431, time 121.87ms
iter 501160: loss 6.7941, time 124.26ms
iter 501170: loss 5.6130, time 121.79ms
iter 501180: loss 5.5672, time 121.75ms
iter 501190: loss 5.8747, time 121.92ms
iter 501200: loss 6.5795, time 121.83ms
iter 501210: loss 6.8155, time 121.74ms
iter 501220: loss 6.1522, time 120.89ms
iter 501230: loss 5.7196, time 122.85ms
iter 501240: loss 5.5580, time 122.19ms
step 501250: train loss 5.5397, val loss 5.5514
saving checkpoint to out-shakespeare-char
iter 501250: loss 6.1253, time 2895.44ms
iter 501260: loss 5.4404, time 121.05ms
iter 501270: loss 5.2341, time 121.50ms
iter 501280: loss 6.7308, time 120.96ms
iter 501290: loss 5.7197, time 120.84ms
iter 501300: loss 6.2524, time 122.91ms
iter 501310: loss 6.5342, time 121.53ms
iter 501320: loss 6.0840, time 121.05ms
iter 501330: loss 6.1533, time 122.68ms
iter 501340: loss 6.1862, time 122.49ms
iter 501350: loss 6.0310, time 121.67ms
iter 501360: loss 6.1027, time 121.76ms
iter 501370: loss 5.4683, time 122.65ms
iter 501380: loss 5.6107, time 121.31ms
iter 501390: loss 5.7062, time 120.66ms
iter 501400: loss 6.1381, time 124.15ms
iter 501410: loss 5.7162, time 121.39ms
iter 501420: loss 6.0729, time 121.39ms
iter 501430: loss 5.6628, time 121.50ms
iter 501440: loss 5.9381, time 121.99ms
iter 501450: loss 5.5620, time 121.13ms
iter 501460: loss 5.9279, time 120.09ms
iter 501470: loss 5.0500, time 123.46ms
iter 501480: loss 6.3602, time 121.46ms
iter 501490: loss 6.1035, time 121.51ms
step 501500: train loss 5.5997, val loss 5.5396
saving checkpoint to out-shakespeare-char
iter 501500: loss 6.3830, time 2899.95ms
iter 501510: loss 5.5674, time 124.01ms
iter 501520: loss 6.0154, time 121.88ms
iter 501530: loss 5.8992, time 121.44ms
iter 501540: loss 6.5058, time 121.49ms
iter 501550: loss 5.9794, time 121.38ms
iter 501560: loss 6.3370, time 121.19ms
iter 501570: loss 6.6334, time 121.57ms
iter 501580: loss 5.8755, time 122.92ms
iter 501590: loss 6.0971, time 121.46ms
iter 501600: loss 5.8084, time 121.50ms
iter 501610: loss 6.0055, time 123.38ms
iter 501620: loss 5.7418, time 121.55ms
iter 501630: loss 6.0921, time 121.06ms
iter 501640: loss 6.4472, time 120.71ms
iter 501650: loss 5.5614, time 122.68ms
iter 501660: loss 6.4077, time 121.02ms
iter 501670: loss 5.8726, time 121.23ms
iter 501680: loss 6.2941, time 122.54ms
iter 501690: loss 5.8765, time 121.64ms
iter 501700: loss 6.2595, time 120.41ms
iter 501710: loss 5.6567, time 123.78ms
iter 501720: loss 5.8943, time 121.46ms
iter 501730: loss 5.8642, time 121.65ms
iter 501740: loss 5.8847, time 121.66ms
step 501750: train loss 5.5566, val loss 5.5356
saving checkpoint to out-shakespeare-char
iter 501750: loss 6.5704, time 2892.79ms
iter 501760: loss 5.0231, time 121.81ms
iter 501770: loss 6.4250, time 120.87ms
iter 501780: loss 5.0120, time 122.82ms
iter 501790: loss 6.0357, time 122.95ms
iter 501800: loss 6.0514, time 121.22ms
iter 501810: loss 5.8003, time 121.69ms
iter 501820: loss 5.9765, time 122.28ms
iter 501830: loss 6.5164, time 121.52ms
iter 501840: loss 5.6881, time 121.58ms
iter 501850: loss 5.5932, time 121.35ms
iter 501860: loss 6.3167, time 121.44ms
iter 501870: loss 5.5135, time 121.68ms
iter 501880: loss 5.6999, time 121.68ms
iter 501890: loss 6.9301, time 123.93ms
iter 501900: loss 5.4258, time 121.23ms
iter 501910: loss 5.9575, time 121.28ms
iter 501920: loss 5.8669, time 120.58ms
iter 501930: loss 6.3692, time 121.91ms
iter 501940: loss 6.4792, time 120.31ms
iter 501950: loss 6.1632, time 121.16ms
iter 501960: loss 7.2117, time 120.97ms
iter 501970: loss 5.9044, time 122.61ms
iter 501980: loss 6.0711, time 121.46ms
iter 501990: loss 6.1955, time 121.51ms
step 502000: train loss 5.5413, val loss 5.5133
saving checkpoint to out-shakespeare-char
iter 502000: loss 6.5048, time 2889.52ms
iter 502010: loss 6.1941, time 120.77ms
iter 502020: loss 6.6242, time 122.56ms
iter 502030: loss 5.8547, time 121.54ms
iter 502040: loss 5.9663, time 121.75ms
iter 502050: loss 5.8962, time 124.09ms
iter 502060: loss 5.8172, time 122.71ms
iter 502070: loss 6.4600, time 121.55ms
iter 502080: loss 5.0807, time 121.32ms
iter 502090: loss 5.2996, time 123.78ms
iter 502100: loss 5.5094, time 121.57ms
iter 502110: loss 6.0172, time 121.58ms
iter 502120: loss 5.9694, time 123.02ms
iter 502130: loss 5.7589, time 121.49ms
iter 502140: loss 5.8380, time 123.00ms
iter 502150: loss 6.2473, time 121.45ms
iter 502160: loss 6.4743, time 122.54ms
iter 502170: loss 5.9929, time 121.63ms
iter 502180: loss 6.0895, time 120.64ms
iter 502190: loss 5.7979, time 122.80ms
iter 502200: loss 5.4918, time 121.71ms
iter 502210: loss 5.5744, time 121.66ms
iter 502220: loss 6.1952, time 124.45ms
iter 502230: loss 6.2930, time 121.70ms
iter 502240: loss 5.9094, time 121.60ms
step 502250: train loss 5.5362, val loss 5.5251
saving checkpoint to out-shakespeare-char
iter 502250: loss 5.2397, time 2902.25ms
iter 502260: loss 5.9801, time 122.96ms
iter 502270: loss 5.3346, time 121.76ms
iter 502280: loss 6.2907, time 121.84ms
iter 502290: loss 6.2446, time 124.31ms
iter 502300: loss 6.7156, time 121.48ms
iter 502310: loss 5.8796, time 121.52ms
iter 502320: loss 5.7439, time 121.45ms
iter 502330: loss 5.8219, time 122.67ms
iter 502340: loss 6.0102, time 121.51ms
iter 502350: loss 5.7842, time 121.51ms
iter 502360: loss 5.5795, time 122.62ms
iter 502370: loss 5.5433, time 121.65ms
iter 502380: loss 5.8677, time 121.61ms
iter 502390: loss 5.7888, time 124.07ms
iter 502400: loss 5.6342, time 121.64ms
iter 502410: loss 5.5171, time 121.63ms
iter 502420: loss 6.2496, time 121.64ms
iter 502430: loss 5.8044, time 121.35ms
iter 502440: loss 6.1674, time 121.32ms
iter 502450: loss 6.0256, time 121.52ms
iter 502460: loss 6.3355, time 121.71ms
iter 502470: loss 5.3467, time 121.43ms
iter 502480: loss 5.7009, time 121.51ms
iter 502490: loss 6.0465, time 124.21ms
step 502500: train loss 5.5214, val loss 5.5668
saving checkpoint to out-shakespeare-char
iter 502500: loss 6.7154, time 2902.14ms
iter 502510: loss 5.8756, time 125.50ms
iter 502520: loss 6.5560, time 125.51ms
iter 502530: loss 5.8828, time 125.76ms
iter 502540: loss 6.3055, time 125.28ms
iter 502550: loss 5.6157, time 125.13ms
iter 502560: loss 6.6859, time 125.20ms
iter 502570: loss 5.5337, time 125.13ms
iter 502580: loss 6.2291, time 125.87ms
iter 502590: loss 5.7158, time 125.10ms
iter 502600: loss 6.0025, time 125.16ms
iter 502610: loss 6.2141, time 124.90ms
iter 502620: loss 6.0109, time 124.97ms
iter 502630: loss 5.7740, time 125.00ms
iter 502640: loss 5.7682, time 125.60ms
iter 502650: loss 5.7160, time 125.28ms
iter 502660: loss 6.0240, time 126.07ms
iter 502670: loss 5.5604, time 125.42ms
iter 502680: loss 6.4213, time 125.44ms
iter 502690: loss 6.4648, time 125.14ms
iter 502700: loss 5.5511, time 125.47ms
iter 502710: loss 5.5645, time 125.51ms
iter 502720: loss 6.2311, time 125.21ms
iter 502730: loss 5.9191, time 125.49ms
iter 502740: loss 5.9228, time 126.02ms
step 502750: train loss 5.5335, val loss 5.5922
saving checkpoint to out-shakespeare-char
iter 502750: loss 6.1285, time 2924.35ms
iter 502760: loss 5.7358, time 125.09ms
iter 502770: loss 5.9455, time 126.08ms
iter 502780: loss 6.4853, time 125.60ms
iter 502790: loss 5.8887, time 126.28ms
iter 502800: loss 6.5509, time 125.46ms
iter 502810: loss 5.9078, time 126.07ms
iter 502820: loss 6.0104, time 125.97ms
iter 502830: loss 5.6906, time 126.01ms
iter 502840: loss 6.5654, time 126.21ms
iter 502850: loss 5.4710, time 126.00ms
iter 502860: loss 6.2868, time 126.23ms
iter 502870: loss 6.2128, time 125.97ms
iter 502880: loss 5.9334, time 126.21ms
iter 502890: loss 6.3289, time 125.95ms
iter 502900: loss 5.0151, time 125.67ms
iter 502910: loss 5.0392, time 125.92ms
iter 502920: loss 5.8548, time 125.55ms
iter 502930: loss 5.7391, time 125.85ms
iter 502940: loss 5.3728, time 125.85ms
iter 502950: loss 5.8219, time 125.66ms
iter 502960: loss 6.1262, time 125.79ms
iter 502970: loss 6.1299, time 125.78ms
iter 502980: loss 5.6453, time 125.71ms
iter 502990: loss 5.9586, time 127.06ms
step 503000: train loss 5.5977, val loss 5.5692
saving checkpoint to out-shakespeare-char
iter 503000: loss 6.0397, time 2906.17ms
iter 503010: loss 6.3358, time 125.61ms
iter 503020: loss 6.3002, time 122.10ms
iter 503030: loss 5.5999, time 123.10ms
iter 503040: loss 6.2433, time 122.44ms
iter 503050: loss 6.0994, time 121.90ms
iter 503060: loss 5.6487, time 122.10ms
iter 503070: loss 5.8301, time 124.25ms
iter 503080: loss 6.7943, time 121.75ms
iter 503090: loss 5.3775, time 122.16ms
iter 503100: loss 6.3300, time 125.09ms
iter 503110: loss 5.6683, time 121.99ms
iter 503120: loss 5.9935, time 121.81ms
iter 503130: loss 5.5734, time 122.01ms
iter 503140: loss 6.1836, time 122.16ms
iter 503150: loss 5.3904, time 122.01ms
iter 503160: loss 5.5038, time 122.53ms
iter 503170: loss 5.7946, time 123.18ms
iter 503180: loss 5.4702, time 121.19ms
iter 503190: loss 5.9836, time 122.02ms
iter 503200: loss 6.2881, time 123.04ms
iter 503210: loss 6.2647, time 122.09ms
iter 503220: loss 5.9389, time 122.09ms
iter 503230: loss 6.0495, time 123.38ms
iter 503240: loss 6.5836, time 121.82ms
step 503250: train loss 5.5215, val loss 5.5406
saving checkpoint to out-shakespeare-char
iter 503250: loss 6.2785, time 2916.74ms
iter 503260: loss 5.9825, time 128.62ms
iter 503270: loss 5.2185, time 125.80ms
iter 503280: loss 5.2228, time 126.05ms
iter 503290: loss 6.0174, time 125.30ms
iter 503300: loss 5.7985, time 125.77ms
iter 503310: loss 6.1843, time 125.65ms
iter 503320: loss 6.1864, time 125.85ms
iter 503330: loss 5.6802, time 126.22ms
iter 503340: loss 5.3173, time 125.92ms
iter 503350: loss 5.6047, time 125.69ms
iter 503360: loss 5.8695, time 124.68ms
iter 503370: loss 6.3065, time 125.69ms
iter 503380: loss 5.5758, time 125.31ms
iter 503390: loss 6.3788, time 124.27ms
iter 503400: loss 6.5443, time 125.61ms
iter 503410: loss 5.6290, time 125.20ms
iter 503420: loss 6.2146, time 125.18ms
iter 503430: loss 6.5778, time 126.03ms
iter 503440: loss 6.0497, time 124.78ms
iter 503450: loss 6.6215, time 125.24ms
iter 503460: loss 6.0068, time 125.31ms
iter 503470: loss 5.9515, time 125.52ms
iter 503480: loss 5.9626, time 124.96ms
iter 503490: loss 6.2240, time 125.31ms
step 503500: train loss 5.5110, val loss 5.5420
saving checkpoint to out-shakespeare-char
iter 503500: loss 5.9676, time 2866.35ms
iter 503510: loss 6.1601, time 121.58ms
iter 503520: loss 5.9132, time 121.92ms
iter 503530: loss 6.8439, time 121.11ms
iter 503540: loss 6.7991, time 127.97ms
iter 503550: loss 5.8104, time 125.37ms
iter 503560: loss 6.5422, time 127.38ms
iter 503570: loss 6.5108, time 125.71ms
iter 503580: loss 6.4209, time 127.89ms
iter 503590: loss 6.2557, time 125.58ms
iter 503600: loss 5.9170, time 127.92ms
iter 503610: loss 5.8212, time 124.72ms
iter 503620: loss 6.3593, time 126.83ms
iter 503630: loss 6.0191, time 124.92ms
iter 503640: loss 6.1513, time 124.77ms
iter 503650: loss 5.9288, time 125.34ms
iter 503660: loss 5.5961, time 124.84ms
iter 503670: loss 6.1101, time 125.01ms
iter 503680: loss 5.7047, time 125.00ms
iter 503690: loss 6.8392, time 125.47ms
iter 503700: loss 6.1627, time 125.09ms
iter 503710: loss 5.9685, time 125.33ms
iter 503720: loss 5.9804, time 125.21ms
iter 503730: loss 5.9709, time 125.52ms
iter 503740: loss 5.9172, time 125.22ms
step 503750: train loss 5.5140, val loss 5.5660
saving checkpoint to out-shakespeare-char
iter 503750: loss 5.3376, time 2890.22ms
iter 503760: loss 6.1957, time 125.30ms
iter 503770: loss 6.7117, time 125.48ms
iter 503780: loss 5.9654, time 125.11ms
iter 503790: loss 5.8872, time 125.78ms
iter 503800: loss 5.9783, time 125.11ms
iter 503810: loss 6.4615, time 124.86ms
iter 503820: loss 5.5282, time 124.65ms
iter 503830: loss 5.6103, time 125.30ms
iter 503840: loss 6.3274, time 125.03ms
iter 503850: loss 5.6839, time 125.18ms
iter 503860: loss 5.9760, time 125.75ms
iter 503870: loss 6.2100, time 125.12ms
iter 503880: loss 6.6171, time 125.20ms
iter 503890: loss 6.0374, time 125.96ms
iter 503900: loss 5.9068, time 126.43ms
iter 503910: loss 6.2637, time 126.24ms
iter 503920: loss 5.7649, time 125.65ms
iter 503930: loss 5.9896, time 125.59ms
iter 503940: loss 6.1313, time 125.84ms
iter 503950: loss 5.9721, time 125.86ms
iter 503960: loss 6.2378, time 126.23ms
iter 503970: loss 6.7268, time 122.13ms
iter 503980: loss 7.6956, time 122.77ms
iter 503990: loss 5.0732, time 121.43ms
step 504000: train loss 5.5616, val loss 5.6275
saving checkpoint to out-shakespeare-char
iter 504000: loss 5.5453, time 2857.00ms
iter 504010: loss 5.9094, time 128.34ms
iter 504020: loss 5.9412, time 126.03ms
iter 504030: loss 6.1288, time 121.73ms
iter 504040: loss 6.4252, time 121.53ms
iter 504050: loss 5.6120, time 121.39ms
iter 504060: loss 6.2508, time 121.43ms
iter 504070: loss 5.6192, time 122.46ms
iter 504080: loss 6.5688, time 121.23ms
iter 504090: loss 5.6944, time 121.89ms
iter 504100: loss 6.3913, time 122.36ms
iter 504110: loss 5.9548, time 122.00ms
iter 504120: loss 5.8430, time 121.38ms
iter 504130: loss 6.2194, time 123.87ms
iter 504140: loss 5.0452, time 121.36ms
iter 504150: loss 6.2526, time 121.43ms
iter 504160: loss 6.0318, time 121.43ms
iter 504170: loss 6.0331, time 121.94ms
iter 504180: loss 5.7351, time 121.29ms
iter 504190: loss 5.9714, time 121.43ms
iter 504200: loss 6.7391, time 122.32ms
iter 504210: loss 6.1087, time 121.24ms
iter 504220: loss 6.7564, time 121.56ms
iter 504230: loss 5.8868, time 124.34ms
iter 504240: loss 6.2393, time 121.38ms
step 504250: train loss 5.5206, val loss 5.6170
saving checkpoint to out-shakespeare-char
iter 504250: loss 5.7230, time 2892.47ms
iter 504260: loss 5.6875, time 120.90ms
iter 504270: loss 7.3670, time 124.28ms
iter 504280: loss 5.9099, time 121.40ms
iter 504290: loss 6.2096, time 121.71ms
iter 504300: loss 6.0072, time 121.94ms
iter 504310: loss 5.9833, time 122.46ms
iter 504320: loss 5.9536, time 121.52ms
iter 504330: loss 6.1956, time 121.71ms
iter 504340: loss 5.8043, time 122.97ms
iter 504350: loss 6.0178, time 121.74ms
iter 504360: loss 6.0760, time 122.01ms
iter 504370: loss 5.6654, time 122.86ms
iter 504380: loss 6.7138, time 121.63ms
iter 504390: loss 5.9854, time 121.71ms
iter 504400: loss 5.2532, time 124.28ms
iter 504410: loss 5.8717, time 121.32ms
iter 504420: loss 5.8728, time 121.79ms
iter 504430: loss 5.4328, time 121.12ms
iter 504440: loss 5.8393, time 121.81ms
iter 504450: loss 5.5430, time 121.39ms
iter 504460: loss 6.2664, time 121.43ms
iter 504470: loss 6.0210, time 122.58ms
iter 504480: loss 5.3980, time 121.21ms
iter 504490: loss 5.9682, time 121.50ms
step 504500: train loss 5.5556, val loss 5.5399
saving checkpoint to out-shakespeare-char
iter 504500: loss 5.6961, time 2900.73ms
iter 504510: loss 5.3116, time 125.26ms
iter 504520: loss 6.3012, time 124.62ms
iter 504530: loss 6.1318, time 125.14ms
iter 504540: loss 5.6198, time 124.97ms
iter 504550: loss 5.7803, time 124.93ms
iter 504560: loss 6.5482, time 125.05ms
iter 504570: loss 6.0786, time 125.17ms
iter 504580: loss 6.4141, time 124.86ms
iter 504590: loss 5.9914, time 125.31ms
iter 504600: loss 6.1626, time 125.06ms
iter 504610: loss 5.8610, time 125.26ms
iter 504620: loss 5.9113, time 125.19ms
iter 504630: loss 5.9718, time 125.20ms
iter 504640: loss 6.1717, time 125.08ms
iter 504650: loss 6.1864, time 125.31ms
iter 504660: loss 5.6058, time 124.97ms
iter 504670: loss 5.4958, time 125.19ms
iter 504680: loss 5.8275, time 125.33ms
iter 504690: loss 6.3841, time 125.14ms
iter 504700: loss 6.0985, time 124.98ms
iter 504710: loss 5.9713, time 125.35ms
iter 504720: loss 6.2624, time 125.72ms
iter 504730: loss 5.9886, time 126.14ms
iter 504740: loss 5.3495, time 124.25ms
step 504750: train loss 5.5818, val loss 5.5207
saving checkpoint to out-shakespeare-char
iter 504750: loss 6.4346, time 2913.20ms
iter 504760: loss 5.8520, time 126.02ms
iter 504770: loss 5.5475, time 124.99ms
iter 504780: loss 5.5394, time 125.73ms
iter 504790: loss 6.2924, time 125.36ms
iter 504800: loss 5.8633, time 125.31ms
iter 504810: loss 6.4281, time 124.91ms
iter 504820: loss 5.8253, time 125.35ms
iter 504830: loss 6.2470, time 125.42ms
iter 504840: loss 6.2475, time 125.27ms
iter 504850: loss 6.3750, time 125.11ms
iter 504860: loss 5.5650, time 125.43ms
iter 504870: loss 5.4328, time 125.19ms
iter 504880: loss 5.7107, time 125.27ms
iter 504890: loss 5.8010, time 125.16ms
iter 504900: loss 6.0740, time 124.98ms
iter 504910: loss 5.6448, time 125.08ms
iter 504920: loss 5.8317, time 125.14ms
iter 504930: loss 5.7133, time 125.14ms
iter 504940: loss 5.9320, time 125.97ms
iter 504950: loss 5.8439, time 126.66ms
iter 504960: loss 6.2116, time 128.86ms
iter 504970: loss 6.7055, time 125.63ms
iter 504980: loss 6.7325, time 128.63ms
iter 504990: loss 5.9530, time 125.57ms
step 505000: train loss 5.5128, val loss 5.6076
saving checkpoint to out-shakespeare-char
iter 505000: loss 5.6087, time 2894.98ms
iter 505010: loss 5.9024, time 127.08ms
iter 505020: loss 5.8960, time 125.60ms
iter 505030: loss 6.2289, time 125.78ms
iter 505040: loss 6.1906, time 125.25ms
iter 505050: loss 6.2389, time 125.71ms
iter 505060: loss 5.9162, time 125.88ms
iter 505070: loss 6.0242, time 125.64ms
iter 505080: loss 6.0253, time 125.73ms
iter 505090: loss 6.0484, time 125.81ms
iter 505100: loss 5.4368, time 125.79ms
iter 505110: loss 6.0121, time 125.61ms
iter 505120: loss 6.6374, time 128.30ms
iter 505130: loss 5.7878, time 125.97ms
iter 505140: loss 5.9740, time 128.21ms
iter 505150: loss 5.3940, time 125.69ms
iter 505160: loss 6.0250, time 128.20ms
iter 505170: loss 6.1707, time 125.61ms
iter 505180: loss 5.9890, time 128.36ms
iter 505190: loss 5.8622, time 125.58ms
iter 505200: loss 5.5567, time 128.39ms
iter 505210: loss 6.2850, time 125.85ms
iter 505220: loss 6.9302, time 125.87ms
iter 505230: loss 5.3989, time 125.88ms
iter 505240: loss 7.5937, time 125.74ms
step 505250: train loss 5.5046, val loss 5.5371
saving checkpoint to out-shakespeare-char
iter 505250: loss 5.6127, time 2874.83ms
iter 505260: loss 6.2402, time 125.93ms
iter 505270: loss 6.7900, time 125.97ms
iter 505280: loss 6.0610, time 125.49ms
iter 505290: loss 5.3100, time 125.81ms
iter 505300: loss 5.9051, time 126.32ms
iter 505310: loss 6.3923, time 125.28ms
iter 505320: loss 6.2894, time 124.97ms
iter 505330: loss 5.5715, time 126.27ms
iter 505340: loss 5.6551, time 125.34ms
iter 505350: loss 5.8495, time 125.34ms
iter 505360: loss 5.7857, time 125.56ms
iter 505370: loss 5.4759, time 126.17ms
iter 505380: loss 5.9497, time 124.89ms
iter 505390: loss 5.9870, time 126.30ms
iter 505400: loss 5.3545, time 125.74ms
iter 505410: loss 5.7465, time 125.92ms
iter 505420: loss 6.5677, time 125.23ms
iter 505430: loss 5.8314, time 125.28ms
iter 505440: loss 5.8513, time 125.19ms
iter 505450: loss 6.1814, time 125.28ms
iter 505460: loss 6.0351, time 125.26ms
iter 505470: loss 6.0498, time 125.41ms
iter 505480: loss 6.1590, time 125.25ms
iter 505490: loss 5.5495, time 125.58ms
step 505500: train loss 5.5612, val loss 5.5577
saving checkpoint to out-shakespeare-char
iter 505500: loss 5.3517, time 2913.51ms
iter 505510: loss 6.1391, time 128.40ms
iter 505520: loss 6.7561, time 125.48ms
iter 505530: loss 5.5836, time 127.74ms
iter 505540: loss 6.7742, time 125.54ms
iter 505550: loss 5.7538, time 128.39ms
iter 505560: loss 6.2341, time 125.32ms
iter 505570: loss 5.5283, time 128.30ms
iter 505580: loss 6.6421, time 125.58ms
iter 505590: loss 5.8259, time 127.90ms
iter 505600: loss 6.3231, time 125.62ms
iter 505610: loss 5.6190, time 128.04ms
iter 505620: loss 5.6987, time 125.12ms
iter 505630: loss 6.4918, time 127.77ms
iter 505640: loss 6.5998, time 125.30ms
iter 505650: loss 6.5486, time 127.73ms
iter 505660: loss 5.7837, time 125.46ms
iter 505670: loss 5.8534, time 127.43ms
iter 505680: loss 5.6837, time 125.01ms
iter 505690: loss 5.7854, time 127.83ms
iter 505700: loss 5.3566, time 125.24ms
iter 505710: loss 6.2297, time 128.12ms
iter 505720: loss 6.0645, time 125.46ms
iter 505730: loss 5.8119, time 124.74ms
iter 505740: loss 6.1345, time 124.20ms
step 505750: train loss 5.5855, val loss 5.5733
saving checkpoint to out-shakespeare-char
iter 505750: loss 6.1392, time 2895.46ms
iter 505760: loss 5.5787, time 125.86ms
iter 505770: loss 6.0111, time 125.45ms
iter 505780: loss 5.7094, time 125.47ms
iter 505790: loss 5.5460, time 125.44ms
iter 505800: loss 6.2806, time 125.28ms
iter 505810: loss 5.5447, time 124.45ms
iter 505820: loss 6.4123, time 125.57ms
iter 505830: loss 6.4377, time 125.22ms
iter 505840: loss 5.8313, time 125.28ms
iter 505850: loss 6.0628, time 127.89ms
iter 505860: loss 6.0579, time 125.29ms
iter 505870: loss 6.1845, time 125.07ms
iter 505880: loss 5.9728, time 125.32ms
iter 505890: loss 6.2344, time 125.46ms
iter 505900: loss 5.8121, time 125.34ms
iter 505910: loss 6.2628, time 125.45ms
iter 505920: loss 5.9395, time 125.27ms
iter 505930: loss 6.2855, time 125.48ms
iter 505940: loss 6.8156, time 124.93ms
iter 505950: loss 6.3236, time 125.51ms
iter 505960: loss 5.9270, time 125.21ms
iter 505970: loss 6.5763, time 125.49ms
iter 505980: loss 6.1267, time 125.52ms
iter 505990: loss 6.4374, time 125.68ms
step 506000: train loss 5.5400, val loss 5.5619
saving checkpoint to out-shakespeare-char
iter 506000: loss 6.1107, time 2900.47ms
iter 506010: loss 6.0893, time 125.90ms
iter 506020: loss 6.0641, time 125.89ms
iter 506030: loss 5.7197, time 126.26ms
iter 506040: loss 5.5360, time 126.24ms
iter 506050: loss 5.6241, time 125.62ms
iter 506060: loss 5.8626, time 126.34ms
iter 506070: loss 6.5707, time 125.64ms
iter 506080: loss 5.8830, time 126.02ms
iter 506090: loss 6.0779, time 125.29ms
iter 506100: loss 5.3251, time 125.32ms
iter 506110: loss 5.9419, time 125.05ms
iter 506120: loss 5.1925, time 125.45ms
iter 506130: loss 6.3334, time 125.02ms
iter 506140: loss 5.9707, time 125.63ms
iter 506150: loss 5.6692, time 125.55ms
iter 506160: loss 5.8035, time 125.75ms
iter 506170: loss 6.0689, time 125.70ms
iter 506180: loss 5.9652, time 125.82ms
iter 506190: loss 6.5308, time 125.78ms
iter 506200: loss 6.1486, time 125.39ms
iter 506210: loss 6.0832, time 125.74ms
iter 506220: loss 5.3840, time 125.42ms
iter 506230: loss 5.8787, time 125.36ms
iter 506240: loss 6.9065, time 125.47ms
step 506250: train loss 5.5355, val loss 5.5771
saving checkpoint to out-shakespeare-char
iter 506250: loss 5.4892, time 2901.08ms
iter 506260: loss 5.9966, time 131.34ms
iter 506270: loss 5.6398, time 125.88ms
iter 506280: loss 6.5909, time 127.99ms
iter 506290: loss 5.7018, time 124.28ms
iter 506300: loss 6.5187, time 128.14ms
iter 506310: loss 5.3685, time 124.77ms
iter 506320: loss 5.2891, time 127.98ms
iter 506330: loss 6.5303, time 125.77ms
iter 506340: loss 6.2306, time 125.48ms
iter 506350: loss 6.3560, time 125.22ms
iter 506360: loss 5.3433, time 125.21ms
iter 506370: loss 6.5321, time 125.31ms
iter 506380: loss 6.4203, time 125.40ms
iter 506390: loss 5.7859, time 125.70ms
iter 506400: loss 5.9413, time 125.42ms
iter 506410: loss 6.0450, time 125.33ms
iter 506420: loss 5.7084, time 125.15ms
iter 506430: loss 6.1018, time 125.55ms
iter 506440: loss 6.0361, time 125.73ms
iter 506450: loss 6.2557, time 125.68ms
iter 506460: loss 6.3272, time 125.95ms
iter 506470: loss 6.0768, time 125.57ms
iter 506480: loss 5.5708, time 125.73ms
iter 506490: loss 6.1121, time 125.34ms
step 506500: train loss 5.5290, val loss 5.5124
saving checkpoint to out-shakespeare-char
iter 506500: loss 5.9463, time 2895.07ms
iter 506510: loss 6.1486, time 125.40ms
iter 506520: loss 6.1288, time 125.44ms
iter 506530: loss 5.8643, time 125.09ms
iter 506540: loss 6.7726, time 124.94ms
iter 506550: loss 6.1639, time 125.03ms
iter 506560: loss 6.2979, time 125.09ms
iter 506570: loss 5.5024, time 125.13ms
iter 506580: loss 6.0414, time 124.24ms
iter 506590: loss 6.1894, time 124.79ms
iter 506600: loss 6.2922, time 125.52ms
iter 506610: loss 6.3681, time 125.21ms
iter 506620: loss 6.1157, time 125.05ms
iter 506630: loss 5.5608, time 125.56ms
iter 506640: loss 5.7778, time 125.54ms
iter 506650: loss 5.6149, time 125.21ms
iter 506660: loss 5.4503, time 125.36ms
iter 506670: loss 6.6361, time 125.14ms
iter 506680: loss 5.8108, time 125.29ms
iter 506690: loss 6.3568, time 125.32ms
iter 506700: loss 6.4229, time 125.11ms
iter 506710: loss 5.2706, time 124.86ms
iter 506720: loss 6.3358, time 125.28ms
iter 506730: loss 6.1989, time 125.19ms
iter 506740: loss 5.8088, time 125.31ms
step 506750: train loss 5.5625, val loss 5.5603
saving checkpoint to out-shakespeare-char
iter 506750: loss 5.3238, time 2874.39ms
iter 506760: loss 5.4641, time 126.00ms
iter 506770: loss 6.2585, time 125.77ms
iter 506780: loss 6.4413, time 125.79ms
iter 506790: loss 5.8159, time 125.86ms
iter 506800: loss 6.1141, time 125.84ms
iter 506810: loss 5.2705, time 125.53ms
iter 506820: loss 5.6641, time 125.04ms
iter 506830: loss 5.4484, time 125.31ms
iter 506840: loss 6.4170, time 125.39ms
iter 506850: loss 6.1516, time 125.22ms
iter 506860: loss 5.9869, time 125.65ms
iter 506870: loss 6.7904, time 125.53ms
iter 506880: loss 6.2759, time 124.80ms
iter 506890: loss 6.3206, time 121.51ms
iter 506900: loss 5.5089, time 121.71ms
iter 506910: loss 5.7778, time 123.94ms
iter 506920: loss 6.2818, time 121.51ms
iter 506930: loss 6.1980, time 121.40ms
iter 506940: loss 5.7553, time 121.65ms
iter 506950: loss 5.7256, time 122.80ms
iter 506960: loss 6.9455, time 121.62ms
iter 506970: loss 5.2297, time 121.48ms
iter 506980: loss 5.2451, time 123.92ms
iter 506990: loss 5.5983, time 121.62ms
step 507000: train loss 5.5776, val loss 5.5956
saving checkpoint to out-shakespeare-char
iter 507000: loss 6.0618, time 2892.51ms
iter 507010: loss 5.9613, time 121.60ms
iter 507020: loss 6.0986, time 121.35ms
iter 507030: loss 5.8042, time 121.31ms
iter 507040: loss 6.9456, time 120.99ms
iter 507050: loss 6.1497, time 122.60ms
iter 507060: loss 6.2710, time 122.44ms
iter 507070: loss 5.9166, time 121.51ms
iter 507080: loss 5.6488, time 121.50ms
iter 507090: loss 5.5271, time 123.23ms
iter 507100: loss 6.1711, time 121.42ms
iter 507110: loss 6.3064, time 120.75ms
iter 507120: loss 6.4071, time 121.49ms
iter 507130: loss 5.4859, time 121.74ms
iter 507140: loss 5.7503, time 121.86ms
iter 507150: loss 6.4252, time 121.42ms
iter 507160: loss 5.7651, time 123.49ms
iter 507170: loss 5.7620, time 121.86ms
iter 507180: loss 6.2053, time 121.59ms
iter 507190: loss 5.3874, time 121.46ms
iter 507200: loss 6.0075, time 121.41ms
iter 507210: loss 6.4465, time 121.34ms
iter 507220: loss 6.0153, time 121.29ms
iter 507230: loss 5.6891, time 122.51ms
iter 507240: loss 6.5818, time 121.37ms
step 507250: train loss 5.5794, val loss 5.5361
saving checkpoint to out-shakespeare-char
iter 507250: loss 5.6288, time 2893.66ms
iter 507260: loss 5.6585, time 125.51ms
iter 507270: loss 5.9527, time 125.26ms
iter 507280: loss 6.3821, time 125.19ms
iter 507290: loss 5.9048, time 125.16ms
iter 507300: loss 5.8855, time 125.44ms
iter 507310: loss 5.8812, time 125.40ms
iter 507320: loss 5.3972, time 125.27ms
iter 507330: loss 5.5232, time 125.13ms
iter 507340: loss 6.0378, time 124.97ms
iter 507350: loss 5.9715, time 125.34ms
iter 507360: loss 6.3216, time 125.35ms
iter 507370: loss 5.4719, time 125.02ms
iter 507380: loss 5.8548, time 125.29ms
iter 507390: loss 5.9942, time 125.03ms
iter 507400: loss 6.1864, time 125.26ms
iter 507410: loss 5.5445, time 125.07ms
iter 507420: loss 6.2383, time 125.31ms
iter 507430: loss 5.2486, time 125.47ms
iter 507440: loss 6.1283, time 125.29ms
iter 507450: loss 6.6511, time 125.03ms
iter 507460: loss 6.4292, time 125.03ms
iter 507470: loss 5.9053, time 125.20ms
iter 507480: loss 5.9700, time 125.07ms
iter 507490: loss 6.6374, time 124.76ms
step 507500: train loss 5.5547, val loss 5.5427
saving checkpoint to out-shakespeare-char
iter 507500: loss 5.3311, time 2884.64ms
iter 507510: loss 5.6898, time 125.30ms
iter 507520: loss 5.7415, time 125.81ms
iter 507530: loss 6.4281, time 125.63ms
iter 507540: loss 5.9101, time 125.61ms
iter 507550: loss 5.6889, time 125.75ms
iter 507560: loss 5.7497, time 126.01ms
iter 507570: loss 5.3565, time 125.65ms
iter 507580: loss 6.2845, time 125.92ms
iter 507590: loss 6.0439, time 125.85ms
iter 507600: loss 5.5181, time 125.53ms
iter 507610: loss 5.6908, time 125.67ms
iter 507620: loss 6.8339, time 124.78ms
iter 507630: loss 5.4166, time 125.41ms
iter 507640: loss 5.5877, time 125.55ms
iter 507650: loss 5.6787, time 125.71ms
iter 507660: loss 6.9907, time 125.21ms
iter 507670: loss 5.2942, time 125.67ms
iter 507680: loss 6.1206, time 125.48ms
iter 507690: loss 7.0825, time 125.51ms
iter 507700: loss 6.1782, time 125.67ms
iter 507710: loss 6.3682, time 124.98ms
iter 507720: loss 5.9177, time 125.68ms
iter 507730: loss 6.3494, time 124.97ms
iter 507740: loss 6.0011, time 125.60ms
step 507750: train loss 5.5240, val loss 5.5448
saving checkpoint to out-shakespeare-char
iter 507750: loss 6.2540, time 2894.01ms
iter 507760: loss 6.2332, time 126.03ms
iter 507770: loss 6.4666, time 125.47ms
iter 507780: loss 5.2004, time 125.74ms
iter 507790: loss 6.1967, time 125.35ms
iter 507800: loss 6.5457, time 125.71ms
iter 507810: loss 5.4937, time 125.66ms
iter 507820: loss 6.1375, time 125.93ms
iter 507830: loss 5.4988, time 125.34ms
iter 507840: loss 5.7523, time 125.72ms
iter 507850: loss 5.8853, time 125.44ms
iter 507860: loss 5.9477, time 126.14ms
iter 507870: loss 6.4618, time 125.34ms
iter 507880: loss 6.0882, time 125.64ms
iter 507890: loss 5.6084, time 125.58ms
iter 507900: loss 6.1320, time 125.65ms
iter 507910: loss 6.4831, time 125.32ms
iter 507920: loss 5.9882, time 124.61ms
iter 507930: loss 6.0380, time 125.34ms
iter 507940: loss 5.5705, time 125.80ms
iter 507950: loss 6.0479, time 125.51ms
iter 507960: loss 6.1608, time 125.27ms
iter 507970: loss 5.7372, time 125.38ms
iter 507980: loss 5.6984, time 125.51ms
iter 507990: loss 6.0798, time 125.36ms
step 508000: train loss 5.5061, val loss 5.5429
saving checkpoint to out-shakespeare-char
iter 508000: loss 5.7711, time 2881.92ms
iter 508010: loss 5.8124, time 120.73ms
iter 508020: loss 6.5376, time 122.83ms
iter 508030: loss 6.1619, time 121.64ms
iter 508040: loss 5.8674, time 121.69ms
iter 508050: loss 5.5539, time 123.81ms
iter 508060: loss 5.7392, time 121.30ms
iter 508070: loss 6.2682, time 121.83ms
iter 508080: loss 5.9490, time 121.64ms
iter 508090: loss 5.8715, time 121.20ms
iter 508100: loss 5.8584, time 120.60ms
iter 508110: loss 6.0912, time 121.51ms
iter 508120: loss 5.8958, time 121.99ms
iter 508130: loss 5.7953, time 121.50ms
iter 508140: loss 5.3883, time 121.69ms
iter 508150: loss 5.4871, time 124.20ms
iter 508160: loss 5.8846, time 121.24ms
iter 508170: loss 5.8808, time 121.52ms
iter 508180: loss 5.6181, time 121.35ms
iter 508190: loss 5.7128, time 122.36ms
iter 508200: loss 6.0367, time 121.37ms
iter 508210: loss 5.6447, time 120.98ms
iter 508220: loss 6.5174, time 123.93ms
iter 508230: loss 5.3444, time 121.56ms
iter 508240: loss 5.4251, time 121.09ms
step 508250: train loss 5.5470, val loss 5.5457
saving checkpoint to out-shakespeare-char
iter 508250: loss 6.1492, time 2887.07ms
iter 508260: loss 6.4808, time 122.83ms
iter 508270: loss 6.0115, time 121.52ms
iter 508280: loss 5.9102, time 121.52ms
iter 508290: loss 6.7039, time 121.68ms
iter 508300: loss 6.1119, time 121.65ms
iter 508310: loss 6.0099, time 121.54ms
iter 508320: loss 6.4493, time 121.46ms
iter 508330: loss 6.2488, time 122.58ms
iter 508340: loss 6.3047, time 120.72ms
iter 508350: loss 5.8142, time 121.37ms
iter 508360: loss 6.1533, time 123.55ms
iter 508370: loss 6.5973, time 121.49ms
iter 508380: loss 5.4492, time 121.43ms
iter 508390: loss 6.0598, time 121.50ms
iter 508400: loss 6.0286, time 121.31ms
iter 508410: loss 5.9602, time 121.73ms
iter 508420: loss 5.2004, time 121.64ms
iter 508430: loss 5.3008, time 123.09ms
iter 508440: loss 5.9354, time 121.57ms
iter 508450: loss 5.9301, time 121.52ms
iter 508460: loss 6.5590, time 121.39ms
iter 508470: loss 5.5049, time 121.50ms
iter 508480: loss 5.4331, time 121.76ms
iter 508490: loss 6.3106, time 121.58ms
step 508500: train loss 5.5220, val loss 5.5526
saving checkpoint to out-shakespeare-char
iter 508500: loss 6.2366, time 2906.81ms
iter 508510: loss 6.1711, time 125.84ms
iter 508520: loss 6.1689, time 125.58ms
iter 508530: loss 5.9937, time 125.67ms
iter 508540: loss 5.3923, time 125.57ms
iter 508550: loss 6.5430, time 125.65ms
iter 508560: loss 6.0954, time 125.64ms
iter 508570: loss 5.2714, time 125.78ms
iter 508580: loss 5.7655, time 125.63ms
iter 508590: loss 6.3955, time 124.79ms
iter 508600: loss 6.3243, time 125.08ms
iter 508610: loss 5.8271, time 124.57ms
iter 508620: loss 5.8727, time 125.41ms
iter 508630: loss 6.3108, time 125.01ms
iter 508640: loss 6.5718, time 125.39ms
iter 508650: loss 6.5897, time 124.77ms
iter 508660: loss 5.3231, time 124.96ms
iter 508670: loss 5.8742, time 126.21ms
iter 508680: loss 5.6605, time 124.81ms
iter 508690: loss 6.7608, time 124.95ms
iter 508700: loss 5.7744, time 124.96ms
iter 508710: loss 6.1848, time 125.21ms
iter 508720: loss 6.5113, time 124.91ms
iter 508730: loss 5.0485, time 125.50ms
iter 508740: loss 5.8174, time 125.84ms
step 508750: train loss 5.5373, val loss 5.5724
saving checkpoint to out-shakespeare-char
iter 508750: loss 7.0614, time 2888.78ms
iter 508760: loss 6.1292, time 125.66ms
iter 508770: loss 5.9996, time 125.35ms
iter 508780: loss 5.4337, time 125.31ms
iter 508790: loss 5.9267, time 125.90ms
iter 508800: loss 5.9180, time 125.39ms
iter 508810: loss 5.6600, time 125.99ms
iter 508820: loss 6.3623, time 124.99ms
iter 508830: loss 5.5039, time 124.91ms
iter 508840: loss 6.4743, time 125.08ms
iter 508850: loss 5.7371, time 125.25ms
iter 508860: loss 6.0660, time 125.07ms
iter 508870: loss 5.6769, time 124.55ms
iter 508880: loss 6.2472, time 125.25ms
iter 508890: loss 5.8203, time 125.60ms
iter 508900: loss 6.2786, time 126.60ms
iter 508910: loss 6.1819, time 125.34ms
iter 508920: loss 5.7256, time 125.20ms
iter 508930: loss 5.8568, time 125.52ms
iter 508940: loss 4.6984, time 125.20ms
iter 508950: loss 5.5819, time 125.01ms
iter 508960: loss 6.6557, time 124.99ms
iter 508970: loss 5.8051, time 125.31ms
iter 508980: loss 6.5353, time 124.90ms
iter 508990: loss 5.9780, time 124.35ms
step 509000: train loss 5.5393, val loss 5.5320
saving checkpoint to out-shakespeare-char
iter 509000: loss 5.6606, time 2911.47ms
iter 509010: loss 5.9846, time 125.06ms
iter 509020: loss 6.4274, time 125.01ms
iter 509030: loss 5.4279, time 124.87ms
iter 509040: loss 5.4624, time 125.10ms
iter 509050: loss 5.6590, time 124.08ms
iter 509060: loss 6.0131, time 125.46ms
iter 509070: loss 6.1735, time 124.11ms
iter 509080: loss 6.6695, time 125.37ms
iter 509090: loss 5.5717, time 125.06ms
iter 509100: loss 5.8030, time 125.21ms
iter 509110: loss 5.6101, time 125.30ms
iter 509120: loss 6.9201, time 125.16ms
iter 509130: loss 5.9852, time 125.05ms
iter 509140: loss 5.4882, time 125.23ms
iter 509150: loss 6.0085, time 125.46ms
iter 509160: loss 5.9416, time 124.28ms
iter 509170: loss 5.7224, time 125.08ms
iter 509180: loss 5.9665, time 125.05ms
iter 509190: loss 5.7924, time 124.97ms
iter 509200: loss 6.6643, time 124.46ms
iter 509210: loss 5.6797, time 125.13ms
iter 509220: loss 5.6299, time 125.13ms
iter 509230: loss 5.7843, time 125.21ms
iter 509240: loss 5.4387, time 123.96ms
step 509250: train loss 5.4961, val loss 5.5397
saving checkpoint to out-shakespeare-char
iter 509250: loss 6.2775, time 2893.22ms
iter 509260: loss 6.0478, time 125.67ms
iter 509270: loss 6.3777, time 125.73ms
iter 509280: loss 6.1239, time 125.46ms
iter 509290: loss 6.3617, time 125.41ms
iter 509300: loss 5.8615, time 125.88ms
iter 509310: loss 6.0556, time 125.50ms
iter 509320: loss 6.2064, time 125.72ms
iter 509330: loss 5.7354, time 125.51ms
iter 509340: loss 5.8473, time 125.40ms
iter 509350: loss 5.2123, time 125.52ms
iter 509360: loss 6.8405, time 124.97ms
iter 509370: loss 5.6558, time 125.86ms
iter 509380: loss 5.8556, time 126.74ms
iter 509390: loss 5.7700, time 126.02ms
iter 509400: loss 6.4650, time 125.21ms
iter 509410: loss 6.8454, time 124.98ms
iter 509420: loss 5.9472, time 124.60ms
iter 509430: loss 6.3029, time 125.19ms
iter 509440: loss 5.2573, time 125.26ms
iter 509450: loss 5.4483, time 124.66ms
iter 509460: loss 5.1084, time 125.52ms
iter 509470: loss 6.3351, time 125.60ms
iter 509480: loss 5.2137, time 124.95ms
iter 509490: loss 5.8884, time 124.86ms
step 509500: train loss 5.5435, val loss 5.5345
saving checkpoint to out-shakespeare-char
iter 509500: loss 5.4033, time 2904.67ms
iter 509510: loss 5.6467, time 124.94ms
iter 509520: loss 6.5485, time 125.09ms
iter 509530: loss 6.8065, time 124.68ms
iter 509540: loss 6.3285, time 125.15ms
iter 509550: loss 6.7908, time 125.14ms
iter 509560: loss 6.0620, time 123.90ms
iter 509570: loss 5.7270, time 123.83ms
iter 509580: loss 6.3335, time 125.03ms
iter 509590: loss 5.9502, time 124.83ms
iter 509600: loss 6.2779, time 125.03ms
iter 509610: loss 6.0962, time 124.88ms
iter 509620: loss 5.4011, time 125.20ms
iter 509630: loss 5.5043, time 123.72ms
iter 509640: loss 5.5521, time 125.04ms
iter 509650: loss 6.0313, time 123.88ms
iter 509660: loss 6.2383, time 125.07ms
iter 509670: loss 5.1340, time 126.16ms
iter 509680: loss 5.8186, time 124.94ms
iter 509690: loss 6.3504, time 124.77ms
iter 509700: loss 6.0403, time 124.81ms
iter 509710: loss 6.1993, time 124.99ms
iter 509720: loss 5.6796, time 125.14ms
iter 509730: loss 5.4946, time 125.25ms
iter 509740: loss 5.7343, time 125.21ms
step 509750: train loss 5.5811, val loss 5.5013
saving checkpoint to out-shakespeare-char
iter 509750: loss 5.5100, time 2905.34ms
iter 509760: loss 5.9288, time 125.35ms
iter 509770: loss 6.2692, time 125.20ms
iter 509780: loss 5.3921, time 125.20ms
iter 509790: loss 5.2312, time 125.40ms
iter 509800: loss 6.2482, time 126.13ms
iter 509810: loss 5.7340, time 125.32ms
iter 509820: loss 5.8714, time 124.40ms
iter 509830: loss 5.7546, time 125.26ms
iter 509840: loss 6.3213, time 125.53ms
iter 509850: loss 6.5794, time 125.30ms
iter 509860: loss 5.4101, time 125.28ms
iter 509870: loss 5.6093, time 125.31ms
iter 509880: loss 5.8322, time 125.32ms
iter 509890: loss 5.7764, time 125.23ms
iter 509900: loss 6.3487, time 124.93ms
iter 509910: loss 6.0785, time 125.75ms
iter 509920: loss 5.7330, time 125.72ms
iter 509930: loss 5.9525, time 125.69ms
iter 509940: loss 5.9411, time 125.97ms
iter 509950: loss 6.1061, time 125.89ms
iter 509960: loss 5.5437, time 125.44ms
iter 509970: loss 5.8820, time 124.27ms
iter 509980: loss 6.0085, time 125.36ms
iter 509990: loss 6.2313, time 125.07ms
step 510000: train loss 5.5838, val loss 5.5526
saving checkpoint to out-shakespeare-char
iter 510000: loss 5.4504, time 2887.03ms
iter 510010: loss 5.1982, time 125.39ms
iter 510020: loss 6.2558, time 125.66ms
iter 510030: loss 6.3965, time 126.04ms
iter 510040: loss 5.8649, time 125.55ms
iter 510050: loss 6.5691, time 124.79ms
iter 510060: loss 5.3105, time 126.12ms
iter 510070: loss 6.3141, time 124.93ms
iter 510080: loss 6.4582, time 125.77ms
iter 510090: loss 5.5565, time 127.74ms
iter 510100: loss 5.7636, time 125.07ms
iter 510110: loss 5.8546, time 125.23ms
iter 510120: loss 6.9386, time 125.32ms
iter 510130: loss 5.8508, time 125.87ms
iter 510140: loss 5.1240, time 125.28ms
iter 510150: loss 5.9370, time 125.35ms
iter 510160: loss 6.1581, time 125.67ms
iter 510170: loss 6.3033, time 124.80ms
iter 510180: loss 5.3484, time 125.28ms
iter 510190: loss 5.4501, time 126.31ms
iter 510200: loss 5.4897, time 120.90ms
iter 510210: loss 5.2805, time 120.96ms
iter 510220: loss 6.0577, time 123.91ms
iter 510230: loss 6.3967, time 121.61ms
iter 510240: loss 6.0245, time 121.52ms
step 510250: train loss 5.5659, val loss 5.5801
saving checkpoint to out-shakespeare-char
iter 510250: loss 6.4500, time 2919.44ms
iter 510260: loss 6.3682, time 121.37ms
iter 510270: loss 5.8743, time 121.33ms
iter 510280: loss 6.3339, time 124.14ms
iter 510290: loss 6.6133, time 121.58ms
iter 510300: loss 6.1579, time 121.21ms
iter 510310: loss 6.9671, time 121.38ms
iter 510320: loss 5.9087, time 121.59ms
iter 510330: loss 6.3960, time 121.25ms
iter 510340: loss 6.3671, time 121.21ms
iter 510350: loss 6.5761, time 123.05ms
iter 510360: loss 6.2808, time 121.31ms
iter 510370: loss 6.4490, time 121.31ms
iter 510380: loss 6.4433, time 122.85ms
iter 510390: loss 5.5155, time 121.23ms
iter 510400: loss 5.9304, time 121.30ms
iter 510410: loss 5.4156, time 124.25ms
iter 510420: loss 5.9786, time 121.42ms
iter 510430: loss 5.6502, time 121.28ms
iter 510440: loss 6.6435, time 121.51ms
iter 510450: loss 5.9914, time 121.98ms
iter 510460: loss 5.6695, time 122.71ms
iter 510470: loss 6.4747, time 120.78ms
iter 510480: loss 5.7766, time 122.52ms
iter 510490: loss 6.4022, time 121.12ms
step 510500: train loss 5.5620, val loss 5.5901
saving checkpoint to out-shakespeare-char
iter 510500: loss 5.8254, time 2916.02ms
iter 510510: loss 5.8171, time 121.44ms
iter 510520: loss 6.0159, time 122.65ms
iter 510530: loss 6.3913, time 121.25ms
iter 510540: loss 5.6757, time 121.32ms
iter 510550: loss 5.6959, time 124.08ms
iter 510560: loss 6.1601, time 121.39ms
iter 510570: loss 5.8673, time 121.17ms
iter 510580: loss 6.0756, time 121.55ms
iter 510590: loss 5.8011, time 121.68ms
iter 510600: loss 6.4406, time 121.14ms
iter 510610: loss 6.1013, time 121.33ms
iter 510620: loss 5.7581, time 122.54ms
iter 510630: loss 6.2914, time 121.34ms
iter 510640: loss 5.7733, time 121.53ms
iter 510650: loss 6.3938, time 122.62ms
iter 510660: loss 6.3598, time 121.27ms
iter 510670: loss 5.7371, time 121.46ms
iter 510680: loss 6.3505, time 124.18ms
iter 510690: loss 6.0413, time 121.48ms
iter 510700: loss 6.3810, time 121.24ms
iter 510710: loss 5.6649, time 121.79ms
iter 510720: loss 6.0260, time 120.95ms
iter 510730: loss 5.2281, time 121.37ms
iter 510740: loss 6.6465, time 121.21ms
step 510750: train loss 5.5224, val loss 5.5316
saving checkpoint to out-shakespeare-char
iter 510750: loss 5.6562, time 2906.69ms
iter 510760: loss 6.7052, time 122.51ms
iter 510770: loss 6.0183, time 121.36ms
iter 510780: loss 5.7546, time 124.11ms
iter 510790: loss 5.2362, time 121.41ms
iter 510800: loss 5.9157, time 121.38ms
iter 510810: loss 5.6784, time 121.50ms
iter 510820: loss 6.1479, time 121.50ms
iter 510830: loss 6.8832, time 121.42ms
iter 510840: loss 6.1806, time 121.19ms
iter 510850: loss 6.7040, time 122.45ms
iter 510860: loss 5.5868, time 121.22ms
iter 510870: loss 5.8741, time 121.29ms
iter 510880: loss 5.7625, time 123.03ms
iter 510890: loss 6.5145, time 121.17ms
iter 510900: loss 6.1367, time 121.23ms
iter 510910: loss 6.6344, time 123.98ms
iter 510920: loss 5.8865, time 121.29ms
iter 510930: loss 6.0607, time 121.21ms
iter 510940: loss 5.7317, time 121.29ms
iter 510950: loss 5.9878, time 121.50ms
iter 510960: loss 5.2621, time 120.69ms
iter 510970: loss 6.6619, time 121.15ms
iter 510980: loss 5.4347, time 121.80ms
iter 510990: loss 6.2577, time 121.53ms
step 511000: train loss 5.5035, val loss 5.5191
saving checkpoint to out-shakespeare-char
iter 511000: loss 5.6753, time 2903.49ms
iter 511010: loss 5.0010, time 122.71ms
iter 511020: loss 6.1643, time 121.39ms
iter 511030: loss 6.2499, time 121.23ms
iter 511040: loss 6.4513, time 124.02ms
iter 511050: loss 5.7224, time 121.58ms
iter 511060: loss 6.0669, time 121.25ms
iter 511070: loss 5.8202, time 121.33ms
iter 511080: loss 5.6562, time 121.45ms
iter 511090: loss 5.9107, time 121.04ms
iter 511100: loss 6.0596, time 121.10ms
iter 511110: loss 6.5757, time 122.55ms
iter 511120: loss 6.1121, time 121.38ms
iter 511130: loss 5.8876, time 121.23ms
iter 511140: loss 5.5473, time 122.60ms
iter 511150: loss 6.4089, time 121.31ms
iter 511160: loss 6.4527, time 121.33ms
iter 511170: loss 6.2672, time 124.02ms
iter 511180: loss 5.7017, time 121.27ms
iter 511190: loss 5.8921, time 121.13ms
iter 511200: loss 6.1438, time 120.27ms
iter 511210: loss 5.5349, time 121.49ms
iter 511220: loss 6.3288, time 121.10ms
iter 511230: loss 6.1890, time 121.27ms
iter 511240: loss 5.8737, time 122.51ms
step 511250: train loss 5.5305, val loss 5.6029
saving checkpoint to out-shakespeare-char
iter 511250: loss 5.7497, time 2905.25ms
iter 511260: loss 6.5913, time 125.60ms
iter 511270: loss 5.6132, time 125.85ms
iter 511280: loss 5.6887, time 125.41ms
iter 511290: loss 5.7533, time 125.06ms
iter 511300: loss 6.2907, time 125.59ms
iter 511310: loss 5.7621, time 125.35ms
iter 511320: loss 5.8241, time 125.38ms
iter 511330: loss 5.6620, time 124.67ms
iter 511340: loss 6.3017, time 125.34ms
iter 511350: loss 5.4497, time 124.90ms
iter 511360: loss 5.6626, time 125.61ms
iter 511370: loss 6.4658, time 125.30ms
iter 511380: loss 6.0049, time 125.62ms
iter 511390: loss 5.9432, time 125.19ms
iter 511400: loss 5.8861, time 125.75ms
iter 511410: loss 5.6265, time 125.50ms
iter 511420: loss 5.9638, time 125.64ms
iter 511430: loss 6.8287, time 125.87ms
iter 511440: loss 6.1880, time 125.66ms
iter 511450: loss 5.9969, time 125.83ms
iter 511460: loss 5.9699, time 125.61ms
iter 511470: loss 6.6467, time 125.48ms
iter 511480: loss 6.1591, time 125.78ms
iter 511490: loss 5.1843, time 125.77ms
step 511500: train loss 5.5705, val loss 5.5746
saving checkpoint to out-shakespeare-char
iter 511500: loss 5.6329, time 2897.32ms
iter 511510: loss 5.9248, time 126.52ms
iter 511520: loss 5.8785, time 128.01ms
iter 511530: loss 5.8459, time 125.60ms
iter 511540: loss 6.0650, time 127.75ms
iter 511550: loss 5.6401, time 125.44ms
iter 511560: loss 6.1291, time 128.06ms
iter 511570: loss 6.0055, time 125.84ms
iter 511580: loss 5.2893, time 129.28ms
iter 511590: loss 6.1810, time 127.12ms
iter 511600: loss 5.8563, time 125.59ms
iter 511610: loss 5.9765, time 125.95ms
iter 511620: loss 6.3986, time 125.45ms
iter 511630: loss 6.0853, time 119.39ms
iter 511640: loss 6.1313, time 119.24ms
iter 511650: loss 5.2443, time 119.67ms
iter 511660: loss 6.1961, time 119.16ms
iter 511670: loss 6.9017, time 118.89ms
iter 511680: loss 5.8452, time 119.37ms
iter 511690: loss 5.7830, time 119.61ms
iter 511700: loss 6.1532, time 119.38ms
iter 511710: loss 6.3691, time 120.74ms
iter 511720: loss 6.3701, time 122.03ms
iter 511730: loss 5.1906, time 120.69ms
iter 511740: loss 5.6673, time 119.67ms
step 511750: train loss 5.5292, val loss 5.5624
saving checkpoint to out-shakespeare-char
iter 511750: loss 5.9886, time 2867.83ms
iter 511760: loss 6.0826, time 119.84ms
iter 511770: loss 5.2174, time 120.82ms
iter 511780: loss 6.4801, time 119.58ms
iter 511790: loss 6.5715, time 120.59ms
iter 511800: loss 5.6291, time 119.67ms
iter 511810: loss 6.3753, time 119.63ms
iter 511820: loss 5.4338, time 119.99ms
iter 511830: loss 5.7421, time 120.97ms
iter 511840: loss 5.9218, time 120.78ms
iter 511850: loss 5.4365, time 119.62ms
iter 511860: loss 6.5729, time 119.51ms
iter 511870: loss 5.7159, time 119.85ms
iter 511880: loss 5.4054, time 121.39ms
iter 511890: loss 6.3691, time 119.66ms
iter 511900: loss 6.0219, time 120.57ms
iter 511910: loss 5.9578, time 119.76ms
iter 511920: loss 6.1878, time 119.66ms
iter 511930: loss 6.0351, time 120.88ms
iter 511940: loss 6.2424, time 119.71ms
iter 511950: loss 6.2848, time 120.61ms
iter 511960: loss 5.4283, time 119.29ms
iter 511970: loss 6.4458, time 120.40ms
iter 511980: loss 5.8660, time 119.74ms
iter 511990: loss 5.8631, time 120.59ms
step 512000: train loss 5.4922, val loss 5.5807
saving checkpoint to out-shakespeare-char
iter 512000: loss 6.1710, time 2896.81ms
iter 512010: loss 5.3225, time 119.68ms
iter 512020: loss 5.5508, time 119.52ms
iter 512030: loss 5.3646, time 120.78ms
iter 512040: loss 5.7150, time 119.95ms
iter 512050: loss 5.2673, time 119.53ms
iter 512060: loss 5.7747, time 119.62ms
iter 512070: loss 5.8634, time 122.93ms
iter 512080: loss 6.0274, time 121.91ms
iter 512090: loss 6.1111, time 121.79ms
iter 512100: loss 5.9626, time 121.82ms
iter 512110: loss 6.0339, time 122.00ms
iter 512120: loss 5.3320, time 121.79ms
iter 512130: loss 5.9273, time 121.04ms
iter 512140: loss 5.7003, time 122.76ms
iter 512150: loss 6.8049, time 121.72ms
iter 512160: loss 5.7134, time 121.83ms
iter 512170: loss 5.6911, time 124.43ms
iter 512180: loss 5.7747, time 121.84ms
iter 512190: loss 6.3437, time 121.61ms
iter 512200: loss 6.0154, time 121.64ms
iter 512210: loss 5.5737, time 121.73ms
iter 512220: loss 5.4298, time 121.89ms
iter 512230: loss 6.1510, time 121.80ms
iter 512240: loss 6.2018, time 122.85ms
step 512250: train loss 5.4666, val loss 5.5364
saving checkpoint to out-shakespeare-char
iter 512250: loss 5.9861, time 2891.89ms
iter 512260: loss 5.8410, time 121.71ms
iter 512270: loss 6.3938, time 121.71ms
iter 512280: loss 5.8293, time 122.36ms
iter 512290: loss 5.3715, time 122.10ms
iter 512300: loss 5.5995, time 121.82ms
iter 512310: loss 5.4409, time 122.87ms
iter 512320: loss 6.1258, time 121.98ms
iter 512330: loss 6.1631, time 122.17ms
iter 512340: loss 4.9528, time 122.46ms
iter 512350: loss 5.9311, time 122.27ms
iter 512360: loss 6.3445, time 122.24ms
iter 512370: loss 5.9870, time 124.15ms
iter 512380: loss 5.6879, time 121.72ms
iter 512390: loss 6.1293, time 121.67ms
iter 512400: loss 6.3844, time 121.79ms
iter 512410: loss 5.7822, time 121.60ms
iter 512420: loss 5.8039, time 121.74ms
iter 512430: loss 5.7826, time 121.59ms
iter 512440: loss 6.3848, time 122.83ms
iter 512450: loss 6.5418, time 121.78ms
iter 512460: loss 5.9034, time 121.73ms
iter 512470: loss 6.0812, time 122.72ms
iter 512480: loss 6.1945, time 122.03ms
iter 512490: loss 5.4319, time 121.77ms
step 512500: train loss 5.5467, val loss 5.5686
saving checkpoint to out-shakespeare-char
iter 512500: loss 6.0049, time 2879.14ms
iter 512510: loss 6.0430, time 125.78ms
iter 512520: loss 6.0232, time 125.65ms
iter 512530: loss 5.9260, time 125.45ms
iter 512540: loss 6.9464, time 125.66ms
iter 512550: loss 6.1743, time 125.49ms
iter 512560: loss 5.3373, time 124.99ms
iter 512570: loss 6.4512, time 125.32ms
iter 512580: loss 6.7399, time 125.74ms
iter 512590: loss 5.5364, time 125.45ms
iter 512600: loss 6.1336, time 125.46ms
iter 512610: loss 6.3102, time 125.65ms
iter 512620: loss 5.8736, time 125.64ms
iter 512630: loss 6.4049, time 125.47ms
iter 512640: loss 6.5443, time 125.31ms
iter 512650: loss 5.9083, time 124.88ms
iter 512660: loss 6.0752, time 125.32ms
iter 512670: loss 5.6843, time 125.89ms
iter 512680: loss 5.3809, time 125.74ms
iter 512690: loss 6.5123, time 125.49ms
iter 512700: loss 5.7178, time 125.89ms
iter 512710: loss 5.6977, time 125.77ms
iter 512720: loss 5.9453, time 125.41ms
iter 512730: loss 4.9420, time 125.65ms
iter 512740: loss 6.5038, time 125.63ms
step 512750: train loss 5.5031, val loss 5.5850
saving checkpoint to out-shakespeare-char
iter 512750: loss 6.0056, time 2892.55ms
iter 512760: loss 6.5888, time 122.91ms
iter 512770: loss 6.5824, time 121.88ms
iter 512780: loss 6.4964, time 121.79ms
iter 512790: loss 6.5293, time 124.28ms
iter 512800: loss 5.9315, time 121.67ms
iter 512810: loss 5.9635, time 121.71ms
iter 512820: loss 5.8050, time 121.78ms
iter 512830: loss 5.5802, time 121.71ms
iter 512840: loss 5.1687, time 121.94ms
iter 512850: loss 6.0169, time 121.73ms
iter 512860: loss 5.8713, time 122.97ms
iter 512870: loss 6.3727, time 121.68ms
iter 512880: loss 6.0036, time 121.88ms
iter 512890: loss 5.7052, time 122.82ms
iter 512900: loss 6.0048, time 121.86ms
iter 512910: loss 5.9193, time 121.85ms
iter 512920: loss 5.5703, time 124.21ms
iter 512930: loss 6.0193, time 121.68ms
iter 512940: loss 5.5736, time 121.80ms
iter 512950: loss 6.3432, time 121.81ms
iter 512960: loss 6.2736, time 122.82ms
iter 512970: loss 6.3327, time 121.59ms
iter 512980: loss 6.5093, time 121.62ms
iter 512990: loss 6.4552, time 122.81ms
step 513000: train loss 5.5103, val loss 5.5691
saving checkpoint to out-shakespeare-char
iter 513000: loss 6.4097, time 2898.80ms
iter 513010: loss 5.3731, time 121.84ms
iter 513020: loss 5.9310, time 121.80ms
iter 513030: loss 6.0512, time 121.62ms
iter 513040: loss 6.1948, time 121.74ms
iter 513050: loss 6.4379, time 121.73ms
iter 513060: loss 5.8904, time 122.85ms
iter 513070: loss 6.0820, time 121.67ms
iter 513080: loss 6.4856, time 121.95ms
iter 513090: loss 5.5463, time 123.11ms
iter 513100: loss 6.1592, time 121.69ms
iter 513110: loss 5.3081, time 121.75ms
iter 513120: loss 5.7089, time 124.37ms
iter 513130: loss 6.6422, time 121.63ms
iter 513140: loss 6.1650, time 121.68ms
iter 513150: loss 5.9863, time 121.75ms
iter 513160: loss 5.8933, time 121.82ms
iter 513170: loss 5.8891, time 121.83ms
iter 513180: loss 5.7237, time 121.77ms
iter 513190: loss 5.7900, time 123.21ms
iter 513200: loss 5.4748, time 121.98ms
iter 513210: loss 5.1544, time 120.82ms
iter 513220: loss 6.0410, time 122.74ms
iter 513230: loss 5.8417, time 121.35ms
iter 513240: loss 5.7064, time 121.96ms
step 513250: train loss 5.5848, val loss 5.5299
saving checkpoint to out-shakespeare-char
iter 513250: loss 6.9732, time 2894.47ms
iter 513260: loss 5.9413, time 125.13ms
iter 513270: loss 5.9525, time 125.19ms
iter 513280: loss 5.7211, time 124.95ms
iter 513290: loss 6.0788, time 124.94ms
iter 513300: loss 6.3406, time 125.05ms
iter 513310: loss 5.6608, time 125.17ms
iter 513320: loss 5.6797, time 125.23ms
iter 513330: loss 5.8588, time 124.18ms
iter 513340: loss 5.3464, time 125.07ms
iter 513350: loss 5.8467, time 125.08ms
iter 513360: loss 6.3686, time 124.94ms
iter 513370: loss 5.1608, time 125.43ms
iter 513380: loss 6.2012, time 124.48ms
iter 513390: loss 5.6266, time 125.64ms
iter 513400: loss 6.2557, time 124.89ms
iter 513410: loss 5.8041, time 125.18ms
iter 513420: loss 5.2376, time 125.49ms
iter 513430: loss 6.1534, time 125.38ms
iter 513440: loss 5.6716, time 125.32ms
iter 513450: loss 6.1750, time 125.76ms
iter 513460: loss 6.6837, time 125.40ms
iter 513470: loss 5.7267, time 125.68ms
iter 513480: loss 5.8256, time 125.57ms
iter 513490: loss 5.4861, time 125.24ms
step 513500: train loss 5.5897, val loss 5.5387
saving checkpoint to out-shakespeare-char
iter 513500: loss 5.8701, time 2888.58ms
iter 513510: loss 6.4418, time 124.22ms
iter 513520: loss 5.7404, time 124.53ms
iter 513530: loss 5.7917, time 125.78ms
iter 513540: loss 5.6124, time 125.77ms
iter 513550: loss 6.4029, time 125.55ms
iter 513560: loss 5.6684, time 125.87ms
iter 513570: loss 6.1073, time 125.58ms
iter 513580: loss 6.1504, time 125.59ms
iter 513590: loss 5.6397, time 125.49ms
iter 513600: loss 6.5791, time 126.43ms
iter 513610: loss 6.7019, time 125.56ms
iter 513620: loss 6.0541, time 126.52ms
iter 513630: loss 5.7989, time 125.35ms
iter 513640: loss 5.9928, time 125.57ms
iter 513650: loss 5.8094, time 126.36ms
iter 513660: loss 5.7929, time 125.26ms
iter 513670: loss 6.1413, time 125.39ms
iter 513680: loss 5.7450, time 124.92ms
iter 513690: loss 5.8589, time 125.04ms
iter 513700: loss 5.6787, time 126.84ms
iter 513710: loss 6.1385, time 125.42ms
iter 513720: loss 5.6860, time 125.45ms
iter 513730: loss 5.8428, time 125.70ms
iter 513740: loss 5.9337, time 125.48ms
step 513750: train loss 5.5562, val loss 5.5339
saving checkpoint to out-shakespeare-char
iter 513750: loss 5.8201, time 2882.59ms
iter 513760: loss 5.8894, time 125.38ms
iter 513770: loss 5.8682, time 125.61ms
iter 513780: loss 6.2589, time 125.27ms
iter 513790: loss 5.4125, time 125.29ms
iter 513800: loss 6.2352, time 125.19ms
iter 513810: loss 6.0785, time 124.54ms
iter 513820: loss 5.9431, time 126.02ms
iter 513830: loss 6.1588, time 126.72ms
iter 513840: loss 5.6427, time 124.87ms
iter 513850: loss 6.7537, time 121.67ms
iter 513860: loss 6.9223, time 122.37ms
iter 513870: loss 5.8101, time 123.36ms
iter 513880: loss 5.2929, time 121.64ms
iter 513890: loss 5.6981, time 121.14ms
iter 513900: loss 5.8957, time 122.25ms
iter 513910: loss 5.8984, time 121.44ms
iter 513920: loss 5.8828, time 121.24ms
iter 513930: loss 6.1410, time 124.19ms
iter 513940: loss 5.9032, time 121.25ms
iter 513950: loss 6.0960, time 121.49ms
iter 513960: loss 5.9287, time 121.53ms
iter 513970: loss 5.6368, time 121.30ms
iter 513980: loss 6.3227, time 121.18ms
iter 513990: loss 6.0294, time 121.59ms
step 514000: train loss 5.5939, val loss 5.5982
saving checkpoint to out-shakespeare-char
iter 514000: loss 5.8593, time 2896.04ms
iter 514010: loss 6.2251, time 125.09ms
iter 514020: loss 6.0008, time 125.28ms
iter 514030: loss 5.8246, time 124.78ms
iter 514040: loss 6.5630, time 124.86ms
iter 514050: loss 5.7474, time 125.24ms
iter 514060: loss 6.0223, time 125.10ms
iter 514070: loss 6.8169, time 125.56ms
iter 514080: loss 6.2835, time 125.13ms
iter 514090: loss 6.8074, time 125.20ms
iter 514100: loss 6.5852, time 124.68ms
iter 514110: loss 6.3500, time 124.78ms
iter 514120: loss 5.4001, time 123.35ms
iter 514130: loss 5.4251, time 125.14ms
iter 514140: loss 5.6524, time 124.48ms
iter 514150: loss 6.1690, time 124.14ms
iter 514160: loss 6.2011, time 124.00ms
iter 514170: loss 6.0871, time 125.32ms
iter 514180: loss 6.4870, time 123.30ms
iter 514190: loss 5.9076, time 125.16ms
iter 514200: loss 5.2172, time 124.72ms
iter 514210: loss 5.7877, time 125.33ms
iter 514220: loss 6.4266, time 124.57ms
iter 514230: loss 6.4695, time 125.36ms
iter 514240: loss 6.2761, time 125.68ms
step 514250: train loss 5.5575, val loss 5.5447
saving checkpoint to out-shakespeare-char
iter 514250: loss 6.4489, time 2898.38ms
iter 514260: loss 6.2218, time 128.30ms
iter 514270: loss 6.0063, time 125.47ms
iter 514280: loss 7.0021, time 127.83ms
iter 514290: loss 6.5690, time 125.40ms
iter 514300: loss 5.4981, time 127.97ms
iter 514310: loss 6.4650, time 125.85ms
iter 514320: loss 6.9881, time 128.35ms
iter 514330: loss 5.5927, time 125.45ms
iter 514340: loss 6.1417, time 128.33ms
iter 514350: loss 6.2112, time 125.93ms
iter 514360: loss 5.4242, time 127.80ms
iter 514370: loss 5.1443, time 126.07ms
iter 514380: loss 5.7016, time 126.14ms
iter 514390: loss 5.7177, time 125.66ms
iter 514400: loss 5.5453, time 125.79ms
iter 514410: loss 5.7013, time 125.62ms
iter 514420: loss 5.7933, time 125.45ms
iter 514430: loss 6.4715, time 125.42ms
iter 514440: loss 6.4014, time 128.12ms
iter 514450: loss 6.3914, time 125.45ms
iter 514460: loss 5.5821, time 127.65ms
iter 514470: loss 6.8199, time 125.26ms
iter 514480: loss 5.3796, time 127.86ms
iter 514490: loss 6.4716, time 125.72ms
step 514500: train loss 5.5753, val loss 5.5564
saving checkpoint to out-shakespeare-char
iter 514500: loss 6.1328, time 2884.94ms
iter 514510: loss 5.4651, time 125.85ms
iter 514520: loss 4.9471, time 125.63ms
iter 514530: loss 6.0930, time 125.87ms
iter 514540: loss 5.7788, time 125.16ms
iter 514550: loss 6.1900, time 125.70ms
iter 514560: loss 5.7287, time 126.23ms
iter 514570: loss 5.7271, time 125.66ms
iter 514580: loss 5.8152, time 125.83ms
iter 514590: loss 5.7721, time 125.68ms
iter 514600: loss 6.2975, time 125.80ms
iter 514610: loss 6.2016, time 125.68ms
iter 514620: loss 6.2616, time 125.59ms
iter 514630: loss 6.0742, time 125.72ms
iter 514640: loss 6.5336, time 125.47ms
iter 514650: loss 5.8961, time 125.46ms
iter 514660: loss 6.0430, time 125.80ms
iter 514670: loss 5.8289, time 125.65ms
iter 514680: loss 5.5983, time 125.89ms
iter 514690: loss 6.4723, time 125.53ms
iter 514700: loss 6.4790, time 125.71ms
iter 514710: loss 5.6499, time 125.58ms
iter 514720: loss 6.3732, time 125.97ms
iter 514730: loss 6.4824, time 126.17ms
iter 514740: loss 6.4421, time 125.76ms
step 514750: train loss 5.5854, val loss 5.5173
saving checkpoint to out-shakespeare-char
iter 514750: loss 5.9516, time 2885.56ms
iter 514760: loss 5.7909, time 125.88ms
iter 514770: loss 6.2019, time 125.82ms
iter 514780: loss 6.0034, time 125.57ms
iter 514790: loss 5.5571, time 125.64ms
iter 514800: loss 6.1265, time 125.61ms
iter 514810: loss 5.9527, time 125.73ms
iter 514820: loss 6.0412, time 126.04ms
iter 514830: loss 6.4030, time 125.59ms
iter 514840: loss 6.3706, time 120.79ms
iter 514850: loss 5.5866, time 122.33ms
iter 514860: loss 6.0354, time 121.26ms
iter 514870: loss 6.2027, time 121.33ms
iter 514880: loss 6.7458, time 121.33ms
iter 514890: loss 6.1577, time 121.06ms
iter 514900: loss 5.6716, time 121.56ms
iter 514910: loss 6.3442, time 121.43ms
iter 514920: loss 6.2050, time 122.72ms
iter 514930: loss 6.8896, time 121.07ms
iter 514940: loss 5.6776, time 121.71ms
iter 514950: loss 6.3629, time 124.13ms
iter 514960: loss 5.1661, time 121.55ms
iter 514970: loss 6.4349, time 121.75ms
iter 514980: loss 5.9925, time 120.92ms
iter 514990: loss 6.1226, time 123.31ms
step 515000: train loss 5.5866, val loss 5.5171
saving checkpoint to out-shakespeare-char
iter 515000: loss 6.0308, time 2893.95ms
iter 515010: loss 6.4624, time 121.64ms
iter 515020: loss 5.6846, time 121.78ms
iter 515030: loss 6.9582, time 121.03ms
iter 515040: loss 6.1525, time 121.70ms
iter 515050: loss 5.3727, time 121.26ms
iter 515060: loss 5.7618, time 124.39ms
iter 515070: loss 5.8222, time 121.63ms
iter 515080: loss 6.4135, time 121.69ms
iter 515090: loss 5.8281, time 121.88ms
iter 515100: loss 5.8337, time 121.68ms
iter 515110: loss 6.5403, time 122.07ms
iter 515120: loss 6.6658, time 121.73ms
iter 515130: loss 5.6721, time 121.79ms
iter 515140: loss 6.1183, time 121.34ms
iter 515150: loss 6.1580, time 120.72ms
iter 515160: loss 6.4962, time 121.27ms
iter 515170: loss 6.1698, time 120.83ms
iter 515180: loss 5.9244, time 121.83ms
iter 515190: loss 6.0997, time 121.87ms
iter 515200: loss 6.2542, time 121.75ms
iter 515210: loss 6.5343, time 122.36ms
iter 515220: loss 5.2344, time 122.16ms
iter 515230: loss 6.0786, time 122.25ms
iter 515240: loss 6.1010, time 124.07ms
step 515250: train loss 5.4902, val loss 5.4592
saving checkpoint to out-shakespeare-char
iter 515250: loss 6.7153, time 2888.35ms
iter 515260: loss 5.7892, time 125.84ms
iter 515270: loss 5.4566, time 126.11ms
iter 515280: loss 6.1141, time 125.73ms
iter 515290: loss 5.2474, time 125.72ms
iter 515300: loss 5.6100, time 125.54ms
iter 515310: loss 5.6662, time 125.97ms
iter 515320: loss 5.9281, time 125.44ms
iter 515330: loss 6.1694, time 125.70ms
iter 515340: loss 6.0806, time 125.88ms
iter 515350: loss 6.2387, time 126.00ms
iter 515360: loss 6.1531, time 126.07ms
iter 515370: loss 5.9876, time 125.98ms
iter 515380: loss 5.0315, time 125.64ms
iter 515390: loss 6.0624, time 125.94ms
iter 515400: loss 5.7453, time 126.03ms
iter 515410: loss 5.7434, time 126.01ms
iter 515420: loss 5.8961, time 124.39ms
iter 515430: loss 6.4853, time 125.48ms
iter 515440: loss 6.2526, time 125.32ms
iter 515450: loss 5.8810, time 126.08ms
iter 515460: loss 6.0256, time 125.25ms
iter 515470: loss 5.1942, time 125.23ms
iter 515480: loss 6.6298, time 125.12ms
iter 515490: loss 6.1035, time 125.80ms
step 515500: train loss 5.5214, val loss 5.5252
saving checkpoint to out-shakespeare-char
iter 515500: loss 6.1557, time 2891.18ms
iter 515510: loss 5.8282, time 125.60ms
iter 515520: loss 5.8631, time 125.58ms
iter 515530: loss 6.0083, time 124.68ms
iter 515540: loss 6.3471, time 126.11ms
iter 515550: loss 6.6423, time 125.99ms
iter 515560: loss 5.9418, time 125.48ms
iter 515570: loss 6.3808, time 125.89ms
iter 515580: loss 5.7889, time 125.67ms
iter 515590: loss 5.3736, time 125.53ms
iter 515600: loss 6.4009, time 125.81ms
iter 515610: loss 6.6129, time 125.84ms
iter 515620: loss 6.2750, time 125.56ms
iter 515630: loss 5.6250, time 124.82ms
iter 515640: loss 5.9779, time 128.01ms
iter 515650: loss 5.5352, time 125.69ms
iter 515660: loss 5.9688, time 128.54ms
iter 515670: loss 5.7202, time 126.18ms
iter 515680: loss 5.8328, time 128.01ms
iter 515690: loss 6.1509, time 125.91ms
iter 515700: loss 5.4918, time 127.91ms
iter 515710: loss 6.6497, time 125.80ms
iter 515720: loss 5.9209, time 128.46ms
iter 515730: loss 6.4852, time 125.72ms
iter 515740: loss 5.6261, time 128.07ms
step 515750: train loss 5.5709, val loss 5.5899
saving checkpoint to out-shakespeare-char
iter 515750: loss 5.7463, time 2890.47ms
iter 515760: loss 5.7371, time 124.41ms
iter 515770: loss 5.9786, time 124.82ms
iter 515780: loss 5.9010, time 123.99ms
iter 515790: loss 5.7046, time 124.41ms
iter 515800: loss 6.6929, time 123.83ms
iter 515810: loss 5.8562, time 124.45ms
iter 515820: loss 6.1897, time 125.47ms
iter 515830: loss 5.7964, time 125.13ms
iter 515840: loss 5.1826, time 126.47ms
iter 515850: loss 5.9776, time 124.93ms
iter 515860: loss 6.2607, time 125.22ms
iter 515870: loss 6.5687, time 124.63ms
iter 515880: loss 6.7625, time 125.03ms
iter 515890: loss 5.8626, time 125.48ms
iter 515900: loss 5.6577, time 125.92ms
iter 515910: loss 5.9619, time 125.07ms
iter 515920: loss 5.1768, time 125.08ms
iter 515930: loss 6.5853, time 124.94ms
iter 515940: loss 5.7047, time 125.48ms
iter 515950: loss 5.7009, time 124.71ms
iter 515960: loss 6.0249, time 124.46ms
iter 515970: loss 6.5583, time 124.87ms
iter 515980: loss 6.0434, time 124.39ms
iter 515990: loss 6.3884, time 124.86ms
step 516000: train loss 5.5140, val loss 5.5316
saving checkpoint to out-shakespeare-char
iter 516000: loss 5.6098, time 2900.91ms
iter 516010: loss 5.8563, time 125.08ms
iter 516020: loss 5.9517, time 129.66ms
iter 516030: loss 6.4494, time 126.10ms
iter 516040: loss 5.3568, time 128.35ms
iter 516050: loss 6.1387, time 123.93ms
iter 516060: loss 5.5013, time 127.64ms
iter 516070: loss 6.4617, time 125.29ms
iter 516080: loss 5.4586, time 126.72ms
iter 516090: loss 5.9920, time 125.36ms
iter 516100: loss 6.0946, time 128.06ms
iter 516110: loss 6.5770, time 124.91ms
iter 516120: loss 6.0422, time 126.79ms
iter 516130: loss 5.6578, time 124.95ms
iter 516140: loss 6.5788, time 127.60ms
iter 516150: loss 6.0145, time 125.29ms
iter 516160: loss 6.0963, time 126.66ms
iter 516170: loss 5.3609, time 124.40ms
iter 516180: loss 5.6260, time 127.57ms
iter 516190: loss 6.2332, time 124.82ms
iter 516200: loss 5.5314, time 127.43ms
iter 516210: loss 5.6907, time 125.13ms
iter 516220: loss 5.7139, time 127.49ms
iter 516230: loss 6.0636, time 125.46ms
iter 516240: loss 5.2826, time 125.75ms
step 516250: train loss 5.5117, val loss 5.4982
saving checkpoint to out-shakespeare-char
iter 516250: loss 5.7031, time 2889.00ms
iter 516260: loss 6.6660, time 124.77ms
iter 516270: loss 5.7397, time 124.91ms
iter 516280: loss 5.8072, time 123.98ms
iter 516290: loss 6.5852, time 125.32ms
iter 516300: loss 6.2335, time 125.05ms
iter 516310: loss 6.0080, time 125.10ms
iter 516320: loss 6.1394, time 123.56ms
iter 516330: loss 6.4091, time 125.32ms
iter 516340: loss 6.0898, time 125.79ms
iter 516350: loss 6.0601, time 124.58ms
iter 516360: loss 5.7334, time 122.98ms
iter 516370: loss 6.4762, time 125.42ms
iter 516380: loss 6.0145, time 123.41ms
iter 516390: loss 7.0124, time 125.37ms
iter 516400: loss 6.3039, time 123.21ms
iter 516410: loss 5.9088, time 125.73ms
iter 516420: loss 5.9514, time 123.84ms
iter 516430: loss 6.5503, time 125.45ms
iter 516440: loss 6.1156, time 125.26ms
iter 516450: loss 6.7532, time 125.04ms
iter 516460: loss 6.4574, time 125.50ms
iter 516470: loss 5.6100, time 126.02ms
iter 516480: loss 6.4740, time 125.56ms
iter 516490: loss 6.3824, time 125.48ms
step 516500: train loss 5.5360, val loss 5.5357
saving checkpoint to out-shakespeare-char
iter 516500: loss 6.2195, time 2879.65ms
iter 516510: loss 5.6377, time 125.69ms
iter 516520: loss 5.6088, time 126.19ms
iter 516530: loss 5.6212, time 123.47ms
iter 516540: loss 6.2607, time 124.75ms
iter 516550: loss 5.8306, time 124.05ms
iter 516560: loss 6.6915, time 125.04ms
iter 516570: loss 5.5379, time 124.44ms
iter 516580: loss 5.3372, time 125.11ms
iter 516590: loss 5.6634, time 125.17ms
iter 516600: loss 6.0870, time 124.93ms
iter 516610: loss 6.1888, time 124.27ms
iter 516620: loss 5.5515, time 124.67ms
iter 516630: loss 6.3745, time 124.74ms
iter 516640: loss 6.0945, time 124.84ms
iter 516650: loss 5.8289, time 124.49ms
iter 516660: loss 5.4332, time 124.00ms
iter 516670: loss 6.2274, time 124.59ms
iter 516680: loss 5.9098, time 124.35ms
iter 516690: loss 5.7689, time 123.76ms
iter 516700: loss 6.2585, time 124.95ms
iter 516710: loss 6.1438, time 124.14ms
iter 516720: loss 5.7787, time 125.72ms
iter 516730: loss 5.5623, time 124.45ms
iter 516740: loss 6.1363, time 125.45ms
step 516750: train loss 5.5475, val loss 5.5149
saving checkpoint to out-shakespeare-char
iter 516750: loss 5.9558, time 2870.03ms
iter 516760: loss 5.7332, time 125.80ms
iter 516770: loss 5.6378, time 125.66ms
iter 516780: loss 5.8767, time 124.14ms
iter 516790: loss 5.4344, time 125.17ms
iter 516800: loss 6.2308, time 125.20ms
iter 516810: loss 5.8382, time 124.59ms
iter 516820: loss 6.3757, time 124.07ms
iter 516830: loss 6.7700, time 125.68ms
iter 516840: loss 6.3720, time 125.34ms
iter 516850: loss 6.3928, time 125.50ms
iter 516860: loss 6.5105, time 124.33ms
iter 516870: loss 5.9801, time 125.09ms
iter 516880: loss 6.2649, time 125.42ms
iter 516890: loss 5.8622, time 125.67ms
iter 516900: loss 5.8063, time 127.01ms
iter 516910: loss 6.0678, time 125.09ms
iter 516920: loss 6.9836, time 127.79ms
iter 516930: loss 5.3395, time 125.30ms
iter 516940: loss 6.0033, time 127.74ms
iter 516950: loss 6.3150, time 124.97ms
iter 516960: loss 5.6092, time 127.65ms
iter 516970: loss 5.6705, time 124.81ms
iter 516980: loss 6.4499, time 128.19ms
iter 516990: loss 5.6355, time 124.99ms
step 517000: train loss 5.5148, val loss 5.5433
saving checkpoint to out-shakespeare-char
iter 517000: loss 5.9711, time 2910.59ms
iter 517010: loss 5.8485, time 125.29ms
iter 517020: loss 5.4263, time 125.11ms
iter 517030: loss 6.3031, time 124.67ms
iter 517040: loss 6.1075, time 125.08ms
iter 517050: loss 6.3641, time 125.34ms
iter 517060: loss 5.3422, time 125.21ms
iter 517070: loss 6.5535, time 125.62ms
iter 517080: loss 6.3976, time 125.36ms
iter 517090: loss 6.9454, time 125.24ms
iter 517100: loss 5.5019, time 125.19ms
iter 517110: loss 5.7143, time 125.24ms
iter 517120: loss 6.4313, time 127.70ms
iter 517130: loss 6.3728, time 125.06ms
iter 517140: loss 6.2062, time 128.08ms
iter 517150: loss 5.2424, time 125.06ms
iter 517160: loss 5.7100, time 127.78ms
iter 517170: loss 5.9322, time 125.03ms
iter 517180: loss 5.5250, time 127.68ms
iter 517190: loss 5.7561, time 125.11ms
iter 517200: loss 6.3651, time 127.56ms
iter 517210: loss 6.3048, time 125.11ms
iter 517220: loss 5.9768, time 127.66ms
iter 517230: loss 5.7140, time 124.99ms
iter 517240: loss 5.8179, time 126.58ms
step 517250: train loss 5.5641, val loss 5.5559
saving checkpoint to out-shakespeare-char
iter 517250: loss 5.5374, time 2889.33ms
iter 517260: loss 5.7919, time 125.45ms
iter 517270: loss 5.5013, time 125.01ms
iter 517280: loss 6.3564, time 124.83ms
iter 517290: loss 6.0569, time 125.05ms
iter 517300: loss 5.9094, time 125.10ms
iter 517310: loss 5.8459, time 125.13ms
iter 517320: loss 6.0768, time 127.69ms
iter 517330: loss 6.3825, time 125.13ms
iter 517340: loss 7.1074, time 127.53ms
iter 517350: loss 6.7271, time 125.39ms
iter 517360: loss 6.3122, time 127.76ms
iter 517370: loss 5.9153, time 125.45ms
iter 517380: loss 5.9713, time 128.11ms
iter 517390: loss 6.0074, time 124.74ms
iter 517400: loss 6.3781, time 127.59ms
iter 517410: loss 6.2884, time 125.36ms
iter 517420: loss 5.8517, time 125.14ms
iter 517430: loss 5.3865, time 125.70ms
iter 517440: loss 5.6429, time 125.26ms
iter 517450: loss 5.8750, time 125.24ms
iter 517460: loss 6.9698, time 125.09ms
iter 517470: loss 6.1511, time 125.85ms
iter 517480: loss 5.8033, time 124.95ms
iter 517490: loss 6.1672, time 124.76ms
step 517500: train loss 5.5450, val loss 5.5280
saving checkpoint to out-shakespeare-char
iter 517500: loss 5.6849, time 2874.41ms
iter 517510: loss 5.8631, time 125.18ms
iter 517520: loss 5.9821, time 125.34ms
iter 517530: loss 6.1083, time 126.24ms
iter 517540: loss 5.1928, time 125.19ms
iter 517550: loss 6.1864, time 125.37ms
iter 517560: loss 5.8924, time 124.82ms
iter 517570: loss 5.7054, time 125.62ms
iter 517580: loss 5.7056, time 125.21ms
iter 517590: loss 5.3165, time 125.15ms
iter 517600: loss 5.7828, time 125.38ms
iter 517610: loss 6.6570, time 125.15ms
iter 517620: loss 5.6440, time 125.13ms
iter 517630: loss 5.6807, time 125.23ms
iter 517640: loss 6.0375, time 124.87ms
iter 517650: loss 5.3426, time 125.23ms
iter 517660: loss 5.5087, time 125.16ms
iter 517670: loss 5.8569, time 125.18ms
iter 517680: loss 5.4810, time 124.99ms
iter 517690: loss 5.8285, time 125.45ms
iter 517700: loss 6.5692, time 125.04ms
iter 517710: loss 5.8217, time 125.26ms
iter 517720: loss 5.4426, time 124.85ms
iter 517730: loss 6.2066, time 125.20ms
iter 517740: loss 6.6419, time 125.13ms
step 517750: train loss 5.5235, val loss 5.5457
saving checkpoint to out-shakespeare-char
iter 517750: loss 5.9105, time 2884.04ms
iter 517760: loss 6.0809, time 121.96ms
iter 517770: loss 6.1933, time 121.86ms
iter 517780: loss 5.3048, time 122.96ms
iter 517790: loss 5.7332, time 121.14ms
iter 517800: loss 5.9170, time 121.69ms
iter 517810: loss 6.3404, time 122.85ms
iter 517820: loss 5.5478, time 121.67ms
iter 517830: loss 5.4335, time 121.70ms
iter 517840: loss 5.3549, time 124.56ms
iter 517850: loss 5.5867, time 121.17ms
iter 517860: loss 6.2594, time 121.87ms
iter 517870: loss 5.5869, time 121.70ms
iter 517880: loss 6.1876, time 120.81ms
iter 517890: loss 5.9558, time 121.61ms
iter 517900: loss 5.3987, time 121.36ms
iter 517910: loss 5.8264, time 122.02ms
iter 517920: loss 5.5117, time 121.88ms
iter 517930: loss 6.0341, time 121.80ms
iter 517940: loss 5.6683, time 122.92ms
iter 517950: loss 6.1087, time 121.85ms
iter 517960: loss 5.7830, time 120.63ms
iter 517970: loss 6.4462, time 124.21ms
iter 517980: loss 6.4497, time 121.65ms
iter 517990: loss 6.1441, time 121.82ms
step 518000: train loss 5.5603, val loss 5.5881
saving checkpoint to out-shakespeare-char
iter 518000: loss 6.0496, time 2883.87ms
iter 518010: loss 6.3593, time 121.74ms
iter 518020: loss 6.2878, time 121.77ms
iter 518030: loss 5.8890, time 121.40ms
iter 518040: loss 6.7905, time 123.06ms
iter 518050: loss 5.8530, time 121.71ms
iter 518060: loss 6.1367, time 121.86ms
iter 518070: loss 5.5791, time 124.42ms
iter 518080: loss 5.5898, time 121.38ms
iter 518090: loss 6.1622, time 121.83ms
iter 518100: loss 5.6838, time 121.66ms
iter 518110: loss 5.8976, time 121.86ms
iter 518120: loss 5.7130, time 121.91ms
iter 518130: loss 5.7159, time 121.02ms
iter 518140: loss 6.4238, time 123.64ms
iter 518150: loss 6.7382, time 121.51ms
iter 518160: loss 5.9075, time 121.57ms
iter 518170: loss 5.4401, time 120.85ms
iter 518180: loss 4.8364, time 122.63ms
iter 518190: loss 6.1723, time 121.68ms
iter 518200: loss 5.6708, time 121.41ms
iter 518210: loss 6.2315, time 121.51ms
iter 518220: loss 6.0562, time 121.66ms
iter 518230: loss 6.2708, time 121.40ms
iter 518240: loss 5.2272, time 121.39ms
step 518250: train loss 5.5068, val loss 5.5121
saving checkpoint to out-shakespeare-char
iter 518250: loss 6.5516, time 2898.44ms
iter 518260: loss 6.4993, time 121.08ms
iter 518270: loss 5.8741, time 121.57ms
iter 518280: loss 6.1142, time 126.32ms
iter 518290: loss 6.1827, time 125.75ms
iter 518300: loss 6.1538, time 125.50ms
iter 518310: loss 6.3256, time 125.01ms
iter 518320: loss 5.1903, time 125.55ms
iter 518330: loss 6.0442, time 125.44ms
iter 518340: loss 6.0143, time 126.37ms
iter 518350: loss 5.3515, time 125.52ms
iter 518360: loss 5.8824, time 125.62ms
iter 518370: loss 5.8047, time 121.50ms
iter 518380: loss 6.4811, time 122.93ms
iter 518390: loss 5.9245, time 121.40ms
iter 518400: loss 6.3347, time 121.92ms
iter 518410: loss 5.9428, time 123.87ms
iter 518420: loss 6.4872, time 122.19ms
iter 518430: loss 5.9162, time 121.72ms
iter 518440: loss 5.8748, time 120.59ms
iter 518450: loss 5.6943, time 122.08ms
iter 518460: loss 6.8674, time 121.95ms
iter 518470: loss 5.9737, time 121.33ms
iter 518480: loss 6.2191, time 122.82ms
iter 518490: loss 6.2354, time 121.58ms
step 518500: train loss 5.5592, val loss 5.5567
saving checkpoint to out-shakespeare-char
iter 518500: loss 5.9929, time 2884.10ms
iter 518510: loss 5.8555, time 126.20ms
iter 518520: loss 5.6057, time 125.39ms
iter 518530: loss 5.4841, time 125.91ms
iter 518540: loss 6.0680, time 125.04ms
iter 518550: loss 5.9486, time 125.52ms
iter 518560: loss 6.1521, time 124.62ms
iter 518570: loss 5.9548, time 124.67ms
iter 518580: loss 6.0345, time 124.79ms
iter 518590: loss 6.1919, time 124.44ms
iter 518600: loss 6.4379, time 124.50ms
iter 518610: loss 6.2465, time 124.76ms
iter 518620: loss 6.6606, time 124.84ms
iter 518630: loss 5.8447, time 124.23ms
iter 518640: loss 6.0202, time 124.46ms
iter 518650: loss 5.6422, time 124.55ms
iter 518660: loss 6.2853, time 124.79ms
iter 518670: loss 5.7891, time 124.83ms
iter 518680: loss 6.2235, time 124.47ms
iter 518690: loss 6.4402, time 124.75ms
iter 518700: loss 6.1734, time 124.53ms
iter 518710: loss 5.1384, time 124.52ms
iter 518720: loss 6.1954, time 124.97ms
iter 518730: loss 6.1748, time 124.41ms
iter 518740: loss 4.8387, time 124.95ms
step 518750: train loss 5.5727, val loss 5.5870
saving checkpoint to out-shakespeare-char
iter 518750: loss 6.0629, time 2891.45ms
iter 518760: loss 6.0012, time 125.64ms
iter 518770: loss 5.5634, time 125.68ms
iter 518780: loss 5.7812, time 125.40ms
iter 518790: loss 6.0372, time 125.71ms
iter 518800: loss 5.2424, time 125.23ms
iter 518810: loss 6.5462, time 125.36ms
iter 518820: loss 5.2516, time 125.83ms
iter 518830: loss 6.3211, time 125.71ms
iter 518840: loss 5.6784, time 125.70ms
iter 518850: loss 5.2054, time 126.09ms
iter 518860: loss 5.6791, time 125.63ms
iter 518870: loss 6.3964, time 125.43ms
iter 518880: loss 6.3278, time 125.64ms
iter 518890: loss 5.8679, time 126.82ms
iter 518900: loss 5.7405, time 125.52ms
iter 518910: loss 6.3154, time 125.59ms
iter 518920: loss 5.5440, time 125.81ms
iter 518930: loss 5.7618, time 126.15ms
iter 518940: loss 6.1746, time 124.69ms
iter 518950: loss 5.6220, time 124.47ms
iter 518960: loss 5.9855, time 124.52ms
iter 518970: loss 5.8151, time 125.02ms
iter 518980: loss 6.5320, time 121.42ms
iter 518990: loss 5.7987, time 122.21ms
step 519000: train loss 5.5297, val loss 5.4732
saving checkpoint to out-shakespeare-char
iter 519000: loss 6.1053, time 2861.47ms
iter 519010: loss 6.1774, time 121.12ms
iter 519020: loss 6.0852, time 121.21ms
iter 519030: loss 5.8510, time 121.41ms
iter 519040: loss 5.8765, time 122.49ms
iter 519050: loss 5.7009, time 121.42ms
iter 519060: loss 6.1160, time 121.28ms
iter 519070: loss 5.4721, time 120.75ms
iter 519080: loss 5.9632, time 121.25ms
iter 519090: loss 5.7061, time 121.51ms
iter 519100: loss 5.9419, time 121.42ms
iter 519110: loss 6.0833, time 123.42ms
iter 519120: loss 6.5459, time 121.60ms
iter 519130: loss 6.0887, time 121.58ms
iter 519140: loss 6.0486, time 123.99ms
iter 519150: loss 7.0895, time 122.61ms
iter 519160: loss 5.9007, time 121.46ms
iter 519170: loss 5.7041, time 121.51ms
iter 519180: loss 6.2265, time 122.56ms
iter 519190: loss 6.0915, time 121.57ms
iter 519200: loss 6.1132, time 121.56ms
iter 519210: loss 6.2765, time 124.44ms
iter 519220: loss 6.2128, time 121.61ms
iter 519230: loss 5.4790, time 121.28ms
iter 519240: loss 5.5964, time 121.71ms
step 519250: train loss 5.4986, val loss 5.5370
saving checkpoint to out-shakespeare-char
iter 519250: loss 6.4443, time 2907.04ms
iter 519260: loss 4.8607, time 125.01ms
iter 519270: loss 6.5002, time 124.81ms
iter 519280: loss 6.0097, time 125.00ms
iter 519290: loss 5.3035, time 125.11ms
iter 519300: loss 6.3513, time 124.88ms
iter 519310: loss 5.1295, time 125.46ms
iter 519320: loss 6.7111, time 124.87ms
iter 519330: loss 6.1812, time 125.04ms
iter 519340: loss 5.8903, time 124.64ms
iter 519350: loss 5.7428, time 124.92ms
iter 519360: loss 6.5810, time 124.83ms
iter 519370: loss 6.2925, time 124.94ms
iter 519380: loss 5.9217, time 124.89ms
iter 519390: loss 6.3467, time 125.18ms
iter 519400: loss 6.2954, time 124.82ms
iter 519410: loss 6.2483, time 124.91ms
iter 519420: loss 6.5151, time 124.79ms
iter 519430: loss 5.6772, time 125.07ms
iter 519440: loss 6.7568, time 125.10ms
iter 519450: loss 5.5152, time 124.33ms
iter 519460: loss 6.2902, time 125.01ms
iter 519470: loss 6.1082, time 126.22ms
iter 519480: loss 6.2291, time 124.61ms
iter 519490: loss 5.8165, time 124.13ms
step 519500: train loss 5.5701, val loss 5.5232
saving checkpoint to out-shakespeare-char
iter 519500: loss 5.5641, time 2893.70ms
iter 519510: loss 5.9890, time 126.80ms
iter 519520: loss 5.4478, time 125.31ms
iter 519530: loss 6.9479, time 127.89ms
iter 519540: loss 5.2228, time 125.18ms
iter 519550: loss 6.2793, time 127.70ms
iter 519560: loss 6.3847, time 125.19ms
iter 519570: loss 6.8966, time 125.07ms
iter 519580: loss 5.4984, time 124.68ms
iter 519590: loss 6.2513, time 124.91ms
iter 519600: loss 6.2219, time 125.01ms
iter 519610: loss 6.8434, time 125.59ms
iter 519620: loss 5.9638, time 125.14ms
iter 519630: loss 4.9485, time 124.85ms
iter 519640: loss 6.5287, time 124.41ms
iter 519650: loss 5.5180, time 127.52ms
iter 519660: loss 5.8908, time 124.94ms
iter 519670: loss 6.6771, time 124.90ms
iter 519680: loss 6.4038, time 124.82ms
iter 519690: loss 5.7484, time 125.22ms
iter 519700: loss 6.4353, time 125.01ms
iter 519710: loss 5.8074, time 124.46ms
iter 519720: loss 5.0855, time 125.01ms
iter 519730: loss 5.6722, time 124.89ms
iter 519740: loss 5.7896, time 124.25ms
step 519750: train loss 5.5516, val loss 5.5624
saving checkpoint to out-shakespeare-char
iter 519750: loss 6.2578, time 2897.44ms
iter 519760: loss 5.5649, time 125.07ms
iter 519770: loss 5.6751, time 124.91ms
iter 519780: loss 6.0599, time 125.21ms
iter 519790: loss 6.0746, time 125.02ms
iter 519800: loss 6.0684, time 124.86ms
iter 519810: loss 5.7157, time 125.07ms
iter 519820: loss 6.0675, time 125.03ms
iter 519830: loss 6.0072, time 125.21ms
iter 519840: loss 5.4828, time 124.94ms
iter 519850: loss 6.8407, time 125.23ms
iter 519860: loss 6.0428, time 125.33ms
iter 519870: loss 5.8673, time 124.92ms
iter 519880: loss 5.6267, time 124.84ms
iter 519890: loss 5.2591, time 125.36ms
iter 519900: loss 5.7399, time 125.03ms
iter 519910: loss 5.4133, time 124.99ms
iter 519920: loss 5.9191, time 125.04ms
iter 519930: loss 6.7130, time 124.83ms
iter 519940: loss 6.1683, time 125.20ms
iter 519950: loss 5.8738, time 124.77ms
iter 519960: loss 6.0790, time 124.63ms
iter 519970: loss 6.0734, time 124.84ms
iter 519980: loss 6.3203, time 124.71ms
iter 519990: loss 5.8335, time 124.97ms
step 520000: train loss 5.5704, val loss 5.5445
saving checkpoint to out-shakespeare-char
iter 520000: loss 5.5313, time 2881.58ms
iter 520010: loss 5.7009, time 125.49ms
iter 520020: loss 6.0804, time 125.13ms
iter 520030: loss 6.1488, time 124.37ms
iter 520040: loss 5.4633, time 125.33ms
iter 520050: loss 6.7185, time 125.37ms
iter 520060: loss 5.9944, time 125.48ms
iter 520070: loss 6.0443, time 125.02ms
iter 520080: loss 6.2322, time 124.33ms
iter 520090: loss 5.9238, time 124.78ms
iter 520100: loss 6.2973, time 125.06ms
iter 520110: loss 6.3011, time 124.97ms
iter 520120: loss 6.3639, time 125.19ms
iter 520130: loss 6.3366, time 125.18ms
iter 520140: loss 6.1039, time 124.69ms
iter 520150: loss 5.8578, time 125.49ms
iter 520160: loss 5.6994, time 124.94ms
iter 520170: loss 5.4417, time 125.05ms
iter 520180: loss 6.4990, time 124.72ms
iter 520190: loss 5.4720, time 125.20ms
iter 520200: loss 6.4364, time 124.38ms
iter 520210: loss 5.7640, time 125.11ms
iter 520220: loss 5.9345, time 124.96ms
iter 520230: loss 6.3771, time 124.79ms
iter 520240: loss 6.1252, time 124.75ms
step 520250: train loss 5.5402, val loss 5.4955
saving checkpoint to out-shakespeare-char
iter 520250: loss 6.4619, time 2889.80ms
iter 520260: loss 5.6834, time 125.72ms
iter 520270: loss 5.5938, time 123.64ms
iter 520280: loss 6.4035, time 125.10ms
iter 520290: loss 5.6189, time 124.61ms
iter 520300: loss 5.8819, time 125.14ms
iter 520310: loss 5.6108, time 124.93ms
iter 520320: loss 5.8452, time 125.06ms
iter 520330: loss 5.5181, time 124.77ms
iter 520340: loss 5.9097, time 124.76ms
iter 520350: loss 5.9612, time 121.35ms
iter 520360: loss 5.3460, time 121.29ms
iter 520370: loss 5.8265, time 122.50ms
iter 520380: loss 5.2349, time 121.46ms
iter 520390: loss 5.7169, time 121.42ms
iter 520400: loss 5.7485, time 121.32ms
iter 520410: loss 6.1521, time 122.67ms
iter 520420: loss 5.7963, time 121.55ms
iter 520430: loss 5.9146, time 121.47ms
iter 520440: loss 5.9216, time 123.99ms
iter 520450: loss 5.6318, time 121.49ms
iter 520460: loss 6.0735, time 121.58ms
iter 520470: loss 6.1016, time 121.79ms
iter 520480: loss 5.7611, time 121.69ms
iter 520490: loss 5.7085, time 122.09ms
step 520500: train loss 5.5460, val loss 5.5379
saving checkpoint to out-shakespeare-char
iter 520500: loss 6.2756, time 2903.58ms
iter 520510: loss 6.1245, time 126.71ms
iter 520520: loss 5.7862, time 125.73ms
iter 520530: loss 5.8546, time 125.62ms
iter 520540: loss 6.6288, time 125.01ms
iter 520550: loss 5.6791, time 124.88ms
iter 520560: loss 5.4574, time 125.13ms
iter 520570: loss 6.1262, time 125.71ms
iter 520580: loss 5.4186, time 125.10ms
iter 520590: loss 5.3945, time 124.21ms
iter 520600: loss 6.5245, time 125.05ms
iter 520610: loss 6.0815, time 125.25ms
iter 520620: loss 6.2477, time 125.11ms
iter 520630: loss 5.9034, time 124.18ms
iter 520640: loss 6.3714, time 124.90ms
iter 520650: loss 5.0199, time 124.78ms
iter 520660: loss 5.8410, time 125.03ms
iter 520670: loss 5.4431, time 124.11ms
iter 520680: loss 5.8277, time 124.99ms
iter 520690: loss 5.5866, time 125.26ms
iter 520700: loss 5.7119, time 124.87ms
iter 520710: loss 5.7548, time 124.69ms
iter 520720: loss 6.3102, time 125.17ms
iter 520730: loss 5.2627, time 124.22ms
iter 520740: loss 5.8623, time 125.16ms
step 520750: train loss 5.5461, val loss 5.5601
saving checkpoint to out-shakespeare-char
iter 520750: loss 6.1989, time 2896.22ms
iter 520760: loss 5.8882, time 123.65ms
iter 520770: loss 5.9680, time 124.69ms
iter 520780: loss 5.6603, time 126.46ms
iter 520790: loss 5.2889, time 125.06ms
iter 520800: loss 5.3952, time 126.48ms
iter 520810: loss 5.5363, time 125.03ms
iter 520820: loss 6.1722, time 126.76ms
iter 520830: loss 6.0980, time 125.21ms
iter 520840: loss 6.2887, time 126.51ms
iter 520850: loss 5.5209, time 126.76ms
iter 520860: loss 5.2579, time 123.94ms
iter 520870: loss 6.3536, time 124.97ms
iter 520880: loss 5.9893, time 124.37ms
iter 520890: loss 6.1736, time 124.97ms
iter 520900: loss 5.8146, time 124.21ms
iter 520910: loss 6.7364, time 124.98ms
iter 520920: loss 6.0475, time 123.82ms
iter 520930: loss 5.3948, time 124.78ms
iter 520940: loss 6.4859, time 124.73ms
iter 520950: loss 5.8518, time 123.90ms
iter 520960: loss 5.7762, time 125.04ms
iter 520970: loss 6.1352, time 125.19ms
iter 520980: loss 6.0365, time 124.90ms
iter 520990: loss 5.8657, time 124.80ms
step 521000: train loss 5.5450, val loss 5.5083
saving checkpoint to out-shakespeare-char
iter 521000: loss 6.1205, time 2893.17ms
iter 521010: loss 5.0170, time 123.40ms
iter 521020: loss 6.0561, time 121.59ms
iter 521030: loss 5.4522, time 121.62ms
iter 521040: loss 5.6184, time 122.61ms
iter 521050: loss 6.5403, time 122.06ms
iter 521060: loss 5.4778, time 121.87ms
iter 521070: loss 6.7009, time 124.10ms
iter 521080: loss 6.4284, time 121.61ms
iter 521090: loss 5.6863, time 121.37ms
iter 521100: loss 5.8765, time 121.85ms
iter 521110: loss 5.8981, time 121.96ms
iter 521120: loss 6.1949, time 121.77ms
iter 521130: loss 5.3749, time 121.31ms
iter 521140: loss 5.8582, time 122.74ms
iter 521150: loss 6.1281, time 121.00ms
iter 521160: loss 6.4931, time 121.44ms
iter 521170: loss 5.4641, time 123.24ms
iter 521180: loss 6.5980, time 121.13ms
iter 521190: loss 6.0482, time 121.41ms
iter 521200: loss 4.9272, time 121.35ms
iter 521210: loss 5.6028, time 122.39ms
iter 521220: loss 6.4918, time 121.52ms
iter 521230: loss 5.5969, time 121.66ms
iter 521240: loss 6.5822, time 122.65ms
step 521250: train loss 5.5159, val loss 5.5473
saving checkpoint to out-shakespeare-char
iter 521250: loss 5.6829, time 2894.39ms
iter 521260: loss 6.0637, time 121.46ms
iter 521270: loss 6.9697, time 121.34ms
iter 521280: loss 6.2040, time 122.87ms
iter 521290: loss 6.2633, time 121.55ms
iter 521300: loss 6.0330, time 120.87ms
iter 521310: loss 5.7517, time 124.20ms
iter 521320: loss 6.1943, time 122.29ms
iter 521330: loss 5.4801, time 121.47ms
iter 521340: loss 6.5261, time 121.84ms
iter 521350: loss 5.9294, time 121.58ms
iter 521360: loss 6.5709, time 121.41ms
iter 521370: loss 6.4301, time 121.67ms
iter 521380: loss 5.6834, time 123.14ms
iter 521390: loss 5.7050, time 121.55ms
iter 521400: loss 6.0087, time 121.66ms
iter 521410: loss 6.3746, time 122.55ms
iter 521420: loss 5.8087, time 120.73ms
iter 521430: loss 6.8250, time 122.05ms
iter 521440: loss 5.8996, time 124.65ms
iter 521450: loss 5.9249, time 122.04ms
iter 521460: loss 5.6075, time 121.54ms
iter 521470: loss 5.9498, time 120.88ms
iter 521480: loss 5.4850, time 121.64ms
iter 521490: loss 5.0280, time 121.67ms
step 521500: train loss 5.5264, val loss 5.6165
saving checkpoint to out-shakespeare-char
iter 521500: loss 6.2348, time 2903.18ms
iter 521510: loss 6.0798, time 124.16ms
iter 521520: loss 5.9043, time 121.57ms
iter 521530: loss 5.9988, time 121.77ms
iter 521540: loss 6.1186, time 121.37ms
iter 521550: loss 5.6716, time 122.38ms
iter 521560: loss 5.7186, time 121.82ms
iter 521570: loss 6.1037, time 121.62ms
iter 521580: loss 5.9539, time 122.76ms
iter 521590: loss 5.7128, time 121.46ms
iter 521600: loss 5.7013, time 121.49ms
iter 521610: loss 6.2979, time 124.06ms
iter 521620: loss 5.9852, time 121.56ms
iter 521630: loss 5.6945, time 121.64ms
iter 521640: loss 6.2654, time 121.52ms
iter 521650: loss 6.0353, time 121.44ms
iter 521660: loss 6.3901, time 121.66ms
iter 521670: loss 5.6491, time 120.66ms
iter 521680: loss 5.2374, time 123.23ms
iter 521690: loss 5.4635, time 121.82ms
iter 521700: loss 5.8662, time 121.79ms
iter 521710: loss 6.3977, time 122.96ms
iter 521720: loss 5.1433, time 121.43ms
iter 521730: loss 6.0513, time 121.63ms
iter 521740: loss 5.9856, time 123.86ms
step 521750: train loss 5.5324, val loss 5.5454
saving checkpoint to out-shakespeare-char
iter 521750: loss 6.0045, time 2916.11ms
iter 521760: loss 5.9347, time 125.64ms
iter 521770: loss 5.4263, time 124.85ms
iter 521780: loss 4.9643, time 125.21ms
iter 521790: loss 5.2449, time 125.33ms
iter 521800: loss 5.9056, time 125.41ms
iter 521810: loss 5.9295, time 125.03ms
iter 521820: loss 5.8988, time 125.39ms
iter 521830: loss 5.8122, time 125.21ms
iter 521840: loss 5.4939, time 125.54ms
iter 521850: loss 5.8816, time 126.03ms
iter 521860: loss 5.9012, time 125.84ms
iter 521870: loss 5.7203, time 125.24ms
iter 521880: loss 5.5898, time 125.37ms
iter 521890: loss 5.5590, time 124.51ms
iter 521900: loss 6.3895, time 125.14ms
iter 521910: loss 6.0680, time 125.15ms
iter 521920: loss 5.8943, time 125.18ms
iter 521930: loss 6.5096, time 124.62ms
iter 521940: loss 5.9287, time 125.42ms
iter 521950: loss 6.2848, time 125.19ms
iter 521960: loss 5.0405, time 125.20ms
iter 521970: loss 6.4255, time 125.24ms
iter 521980: loss 6.0972, time 125.17ms
iter 521990: loss 5.9175, time 125.24ms
step 522000: train loss 5.5784, val loss 5.5239
saving checkpoint to out-shakespeare-char
iter 522000: loss 5.7606, time 2866.30ms
iter 522010: loss 5.2656, time 121.56ms
iter 522020: loss 6.4743, time 122.94ms
iter 522030: loss 5.5597, time 121.40ms
iter 522040: loss 5.7324, time 121.49ms
iter 522050: loss 4.9547, time 121.73ms
iter 522060: loss 6.3939, time 122.69ms
iter 522070: loss 5.6137, time 122.47ms
iter 522080: loss 5.0790, time 121.67ms
iter 522090: loss 6.3366, time 124.07ms
iter 522100: loss 5.5975, time 122.18ms
iter 522110: loss 5.7647, time 121.31ms
iter 522120: loss 6.4018, time 121.70ms
iter 522130: loss 6.0688, time 121.63ms
iter 522140: loss 5.8111, time 121.48ms
iter 522150: loss 5.7510, time 121.58ms
iter 522160: loss 5.8407, time 122.64ms
iter 522170: loss 5.7726, time 122.00ms
iter 522180: loss 5.9128, time 121.68ms
iter 522190: loss 5.4258, time 122.64ms
iter 522200: loss 5.9039, time 122.39ms
iter 522210: loss 5.6627, time 121.58ms
iter 522220: loss 5.7702, time 124.08ms
iter 522230: loss 6.3087, time 121.51ms
iter 522240: loss 5.8089, time 121.99ms
step 522250: train loss 5.5296, val loss 5.5917
saving checkpoint to out-shakespeare-char
iter 522250: loss 5.8933, time 2909.25ms
iter 522260: loss 5.2858, time 122.11ms
iter 522270: loss 5.8584, time 121.79ms
iter 522280: loss 6.8072, time 121.89ms
iter 522290: loss 6.5420, time 122.57ms
iter 522300: loss 6.1395, time 121.17ms
iter 522310: loss 5.5788, time 122.65ms
iter 522320: loss 6.2398, time 122.53ms
iter 522330: loss 6.0828, time 121.52ms
iter 522340: loss 5.9518, time 121.40ms
iter 522350: loss 5.8530, time 124.00ms
iter 522360: loss 6.2051, time 122.13ms
iter 522370: loss 6.7575, time 121.61ms
iter 522380: loss 5.6857, time 121.39ms
iter 522390: loss 5.9382, time 121.25ms
iter 522400: loss 6.1442, time 121.59ms
iter 522410: loss 5.6028, time 121.86ms
iter 522420: loss 5.3111, time 122.85ms
iter 522430: loss 5.8870, time 121.07ms
iter 522440: loss 6.1909, time 121.66ms
iter 522450: loss 5.9882, time 122.87ms
iter 522460: loss 5.7851, time 121.63ms
iter 522470: loss 6.0225, time 121.28ms
iter 522480: loss 6.1524, time 124.08ms
iter 522490: loss 6.1151, time 121.56ms
step 522500: train loss 5.5441, val loss 5.5161
saving checkpoint to out-shakespeare-char
iter 522500: loss 5.8785, time 2909.85ms
iter 522510: loss 5.8789, time 121.82ms
iter 522520: loss 6.6145, time 123.67ms
iter 522530: loss 5.8318, time 121.70ms
iter 522540: loss 6.2063, time 121.57ms
iter 522550: loss 6.3699, time 125.44ms
iter 522560: loss 5.4598, time 121.00ms
iter 522570: loss 6.2655, time 126.03ms
iter 522580: loss 5.8475, time 126.93ms
iter 522590: loss 5.5658, time 124.49ms
iter 522600: loss 5.8292, time 125.36ms
iter 522610: loss 6.4541, time 125.32ms
iter 522620: loss 6.3785, time 125.79ms
iter 522630: loss 6.2447, time 125.52ms
iter 522640: loss 5.8756, time 125.70ms
iter 522650: loss 6.2188, time 125.60ms
iter 522660: loss 5.8102, time 125.32ms
iter 522670: loss 5.6026, time 125.10ms
iter 522680: loss 7.1106, time 125.22ms
iter 522690: loss 5.7596, time 127.32ms
iter 522700: loss 5.6354, time 125.99ms
iter 522710: loss 6.4810, time 125.27ms
iter 522720: loss 5.2945, time 125.91ms
iter 522730: loss 6.6906, time 125.61ms
iter 522740: loss 5.9165, time 125.41ms
step 522750: train loss 5.5675, val loss 5.5377
saving checkpoint to out-shakespeare-char
iter 522750: loss 6.7304, time 2879.15ms
iter 522760: loss 5.9073, time 125.84ms
iter 522770: loss 6.2122, time 125.72ms
iter 522780: loss 5.9132, time 125.59ms
iter 522790: loss 5.5370, time 128.49ms
iter 522800: loss 5.5676, time 125.53ms
iter 522810: loss 6.3167, time 128.97ms
iter 522820: loss 5.5128, time 125.64ms
iter 522830: loss 5.8643, time 125.81ms
iter 522840: loss 5.5343, time 125.78ms
iter 522850: loss 6.3301, time 125.36ms
iter 522860: loss 5.7387, time 125.39ms
iter 522870: loss 6.7223, time 126.03ms
iter 522880: loss 6.2809, time 126.08ms
iter 522890: loss 5.9526, time 125.53ms
iter 522900: loss 5.9916, time 124.84ms
iter 522910: loss 5.2513, time 125.67ms
iter 522920: loss 5.4202, time 125.40ms
iter 522930: loss 5.8814, time 125.81ms
iter 522940: loss 5.7822, time 125.54ms
iter 522950: loss 6.2715, time 126.00ms
iter 522960: loss 5.6813, time 125.28ms
iter 522970: loss 6.2277, time 125.49ms
iter 522980: loss 5.4497, time 125.53ms
iter 522990: loss 6.2715, time 126.14ms
step 523000: train loss 5.5560, val loss 5.5290
saving checkpoint to out-shakespeare-char
iter 523000: loss 6.8771, time 2899.61ms
iter 523010: loss 5.9605, time 128.52ms
iter 523020: loss 6.0252, time 125.64ms
iter 523030: loss 5.8737, time 128.38ms
iter 523040: loss 6.4128, time 126.14ms
iter 523050: loss 5.9335, time 126.10ms
iter 523060: loss 5.6348, time 125.69ms
iter 523070: loss 6.1864, time 125.98ms
iter 523080: loss 5.5181, time 125.80ms
iter 523090: loss 6.1275, time 126.13ms
iter 523100: loss 5.7219, time 125.55ms
iter 523110: loss 5.7687, time 126.18ms
iter 523120: loss 6.3137, time 125.99ms
iter 523130: loss 6.3042, time 125.39ms
iter 523140: loss 6.2157, time 125.02ms
iter 523150: loss 6.0126, time 125.29ms
iter 523160: loss 6.1474, time 125.05ms
iter 523170: loss 6.1679, time 125.56ms
iter 523180: loss 5.8298, time 125.09ms
iter 523190: loss 5.2977, time 125.86ms
iter 523200: loss 6.1076, time 125.67ms
iter 523210: loss 5.1509, time 125.92ms
iter 523220: loss 6.5234, time 125.55ms
iter 523230: loss 6.1439, time 125.79ms
iter 523240: loss 6.3461, time 125.08ms
step 523250: train loss 5.5887, val loss 5.5452
saving checkpoint to out-shakespeare-char
iter 523250: loss 6.1521, time 2882.24ms
iter 523260: loss 6.2842, time 125.20ms
iter 523270: loss 6.4172, time 125.60ms
iter 523280: loss 5.8011, time 119.44ms
iter 523290: loss 6.0113, time 119.51ms
iter 523300: loss 5.7361, time 120.68ms
iter 523310: loss 6.1941, time 123.40ms
iter 523320: loss 5.4224, time 119.72ms
iter 523330: loss 5.7369, time 119.51ms
iter 523340: loss 6.0833, time 121.51ms
iter 523350: loss 6.5177, time 124.02ms
iter 523360: loss 6.3069, time 121.81ms
iter 523370: loss 6.3154, time 121.55ms
iter 523380: loss 5.9717, time 121.33ms
iter 523390: loss 5.9114, time 121.44ms
iter 523400: loss 6.4058, time 121.40ms
iter 523410: loss 6.0573, time 121.44ms
iter 523420: loss 6.4177, time 122.67ms
iter 523430: loss 6.6284, time 121.68ms
iter 523440: loss 5.4775, time 121.75ms
iter 523450: loss 5.8620, time 122.54ms
iter 523460: loss 5.4291, time 121.43ms
iter 523470: loss 5.9161, time 121.64ms
iter 523480: loss 5.6602, time 123.99ms
iter 523490: loss 5.5632, time 121.79ms
step 523500: train loss 5.5407, val loss 5.5051
saving checkpoint to out-shakespeare-char
iter 523500: loss 5.4876, time 2886.99ms
iter 523510: loss 5.4873, time 121.59ms
iter 523520: loss 5.8212, time 123.93ms
iter 523530: loss 5.8214, time 120.78ms
iter 523540: loss 5.4240, time 121.44ms
iter 523550: loss 5.6145, time 121.84ms
iter 523560: loss 6.0933, time 122.95ms
iter 523570: loss 6.4080, time 121.55ms
iter 523580: loss 5.4835, time 121.36ms
iter 523590: loss 5.5701, time 119.56ms
iter 523600: loss 6.2583, time 122.29ms
iter 523610: loss 5.9002, time 121.56ms
iter 523620: loss 5.9618, time 121.47ms
iter 523630: loss 6.1844, time 121.41ms
iter 523640: loss 5.7338, time 122.06ms
iter 523650: loss 5.8166, time 121.43ms
iter 523660: loss 5.8896, time 121.72ms
iter 523670: loss 6.7688, time 124.50ms
iter 523680: loss 6.5996, time 121.96ms
iter 523690: loss 6.7812, time 121.67ms
iter 523700: loss 6.2396, time 121.44ms
iter 523710: loss 5.7005, time 121.92ms
iter 523720: loss 6.2096, time 121.93ms
iter 523730: loss 5.6897, time 121.42ms
iter 523740: loss 6.4154, time 123.99ms
step 523750: train loss 5.5009, val loss 5.5479
saving checkpoint to out-shakespeare-char
iter 523750: loss 6.6099, time 2892.10ms
iter 523760: loss 6.2936, time 121.52ms
iter 523770: loss 6.0886, time 121.39ms
iter 523780: loss 6.8688, time 121.30ms
iter 523790: loss 5.8982, time 122.13ms
iter 523800: loss 6.6424, time 121.26ms
iter 523810: loss 5.2586, time 121.85ms
iter 523820: loss 5.7955, time 120.79ms
iter 523830: loss 5.8758, time 121.07ms
iter 523840: loss 5.5762, time 122.15ms
iter 523850: loss 5.9043, time 122.13ms
iter 523860: loss 5.8512, time 121.58ms
iter 523870: loss 5.4023, time 121.46ms
iter 523880: loss 6.8551, time 124.25ms
iter 523890: loss 6.6353, time 121.65ms
iter 523900: loss 5.9967, time 120.92ms
iter 523910: loss 5.4242, time 121.92ms
iter 523920: loss 6.1414, time 122.01ms
iter 523930: loss 5.9306, time 121.26ms
iter 523940: loss 6.5832, time 122.01ms
iter 523950: loss 6.5593, time 123.85ms
iter 523960: loss 6.0666, time 121.93ms
iter 523970: loss 5.4312, time 122.04ms
iter 523980: loss 5.8217, time 121.44ms
iter 523990: loss 5.5630, time 121.48ms
step 524000: train loss 5.5848, val loss 5.4984
saving checkpoint to out-shakespeare-char
iter 524000: loss 5.8730, time 2911.54ms
iter 524010: loss 5.5761, time 128.88ms
iter 524020: loss 5.6722, time 125.35ms
iter 524030: loss 5.7194, time 127.87ms
iter 524040: loss 5.7383, time 121.75ms
iter 524050: loss 6.3810, time 121.35ms
iter 524060: loss 5.3471, time 124.35ms
iter 524070: loss 6.0335, time 125.83ms
iter 524080: loss 6.2349, time 125.45ms
iter 524090: loss 6.0515, time 126.03ms
iter 524100: loss 5.7087, time 124.36ms
iter 524110: loss 6.3022, time 125.76ms
iter 524120: loss 6.4470, time 125.85ms
iter 524130: loss 5.3627, time 125.52ms
iter 524140: loss 6.2686, time 125.34ms
iter 524150: loss 5.6804, time 125.39ms
iter 524160: loss 6.2580, time 129.06ms
iter 524170: loss 6.3686, time 124.80ms
iter 524180: loss 6.4181, time 126.60ms
iter 524190: loss 5.9199, time 124.48ms
iter 524200: loss 5.4804, time 124.44ms
iter 524210: loss 5.8690, time 124.52ms
iter 524220: loss 6.6491, time 124.71ms
iter 524230: loss 6.3875, time 124.27ms
iter 524240: loss 5.7821, time 124.86ms
step 524250: train loss 5.5280, val loss 5.5215
saving checkpoint to out-shakespeare-char
iter 524250: loss 6.2158, time 2887.34ms
iter 524260: loss 6.2975, time 125.11ms
iter 524270: loss 5.5030, time 125.21ms
iter 524280: loss 6.5121, time 124.99ms
iter 524290: loss 5.6882, time 125.33ms
iter 524300: loss 5.3847, time 125.15ms
iter 524310: loss 5.7552, time 125.39ms
iter 524320: loss 6.8325, time 124.77ms
iter 524330: loss 5.7102, time 125.00ms
iter 524340: loss 6.6048, time 125.35ms
iter 524350: loss 5.9641, time 125.19ms
iter 524360: loss 5.9222, time 125.09ms
iter 524370: loss 5.7661, time 125.34ms
iter 524380: loss 6.0062, time 126.87ms
iter 524390: loss 6.1346, time 125.48ms
iter 524400: loss 6.1575, time 125.26ms
iter 524410: loss 6.0411, time 125.11ms
iter 524420: loss 6.0209, time 125.46ms
iter 524430: loss 6.3162, time 127.40ms
iter 524440: loss 5.7193, time 125.02ms
iter 524450: loss 5.9134, time 125.05ms
iter 524460: loss 5.2562, time 124.52ms
iter 524470: loss 6.0154, time 124.05ms
iter 524480: loss 6.2701, time 124.89ms
iter 524490: loss 6.2805, time 126.55ms
step 524500: train loss 5.5447, val loss 5.5837
saving checkpoint to out-shakespeare-char
iter 524500: loss 5.5798, time 2870.22ms
iter 524510: loss 6.1980, time 125.29ms
iter 524520: loss 5.6651, time 124.52ms
iter 524530: loss 6.0948, time 124.48ms
iter 524540: loss 5.5082, time 125.19ms
iter 524550: loss 5.7529, time 124.81ms
iter 524560: loss 5.7992, time 124.24ms
iter 524570: loss 5.6293, time 124.80ms
iter 524580: loss 5.8000, time 125.11ms
iter 524590: loss 6.1066, time 124.84ms
iter 524600: loss 5.4625, time 124.33ms
iter 524610: loss 5.5095, time 125.08ms
iter 524620: loss 6.3618, time 127.65ms
iter 524630: loss 5.8882, time 125.00ms
iter 524640: loss 5.9659, time 126.82ms
iter 524650: loss 7.0891, time 124.59ms
iter 524660: loss 6.5903, time 127.27ms
iter 524670: loss 6.3539, time 124.90ms
iter 524680: loss 5.4722, time 126.88ms
iter 524690: loss 5.8137, time 124.70ms
iter 524700: loss 6.2340, time 127.56ms
iter 524710: loss 6.2424, time 124.63ms
iter 524720: loss 5.3513, time 126.25ms
iter 524730: loss 5.6170, time 124.85ms
iter 524740: loss 6.3364, time 127.26ms
step 524750: train loss 5.5545, val loss 5.5611
saving checkpoint to out-shakespeare-char
iter 524750: loss 6.0906, time 2882.34ms
iter 524760: loss 6.6661, time 125.16ms
iter 524770: loss 5.7712, time 125.51ms
iter 524780: loss 6.7648, time 124.94ms
iter 524790: loss 5.7985, time 125.00ms
iter 524800: loss 5.5511, time 125.21ms
iter 524810: loss 5.7257, time 125.09ms
iter 524820: loss 5.2869, time 124.83ms
iter 524830: loss 5.7842, time 124.78ms
iter 524840: loss 5.8454, time 124.82ms
iter 524850: loss 5.4729, time 124.60ms
iter 524860: loss 6.1172, time 125.31ms
iter 524870: loss 5.4494, time 124.71ms
iter 524880: loss 5.6664, time 124.86ms
iter 524890: loss 6.2775, time 124.80ms
iter 524900: loss 6.2347, time 125.20ms
iter 524910: loss 5.8919, time 126.04ms
iter 524920: loss 6.3746, time 124.61ms
iter 524930: loss 5.6453, time 125.29ms
iter 524940: loss 6.3122, time 125.78ms
iter 524950: loss 5.9781, time 125.48ms
iter 524960: loss 5.8697, time 125.54ms
iter 524970: loss 5.9249, time 125.94ms
iter 524980: loss 6.0415, time 125.83ms
iter 524990: loss 5.7769, time 125.54ms
step 525000: train loss 5.5957, val loss 5.5912
saving checkpoint to out-shakespeare-char
iter 525000: loss 5.5665, time 2857.16ms
iter 525010: loss 5.9736, time 125.84ms
iter 525020: loss 6.1998, time 125.68ms
iter 525030: loss 5.8657, time 125.50ms
iter 525040: loss 6.3716, time 125.39ms
iter 525050: loss 5.9543, time 125.93ms
iter 525060: loss 5.7862, time 126.13ms
iter 525070: loss 5.7233, time 125.53ms
iter 525080: loss 6.5052, time 125.42ms
iter 525090: loss 5.9619, time 125.58ms
iter 525100: loss 5.7636, time 125.32ms
iter 525110: loss 5.6593, time 124.92ms
iter 525120: loss 6.2172, time 125.47ms
iter 525130: loss 6.3116, time 125.80ms
iter 525140: loss 5.7383, time 126.02ms
iter 525150: loss 5.6388, time 125.70ms
iter 525160: loss 5.6099, time 126.02ms
iter 525170: loss 6.0496, time 125.18ms
iter 525180: loss 6.2018, time 126.61ms
iter 525190: loss 6.0982, time 126.00ms
iter 525200: loss 6.3850, time 125.33ms
iter 525210: loss 5.9274, time 125.67ms
iter 525220: loss 5.7231, time 125.29ms
iter 525230: loss 5.9061, time 125.85ms
iter 525240: loss 5.7660, time 125.80ms
step 525250: train loss 5.5226, val loss 5.5920
saving checkpoint to out-shakespeare-char
iter 525250: loss 5.6993, time 2862.68ms
iter 525260: loss 6.5814, time 125.60ms
iter 525270: loss 5.2464, time 125.60ms
iter 525280: loss 6.1475, time 125.56ms
iter 525290: loss 6.2297, time 125.51ms
iter 525300: loss 5.8486, time 125.08ms
iter 525310: loss 6.1557, time 124.86ms
iter 525320: loss 5.8882, time 124.90ms
iter 525330: loss 6.1900, time 125.00ms
iter 525340: loss 6.0559, time 124.56ms
iter 525350: loss 6.8350, time 123.43ms
iter 525360: loss 5.7658, time 124.77ms
iter 525370: loss 5.5273, time 125.01ms
iter 525380: loss 6.0070, time 124.77ms
iter 525390: loss 6.3479, time 124.41ms
iter 525400: loss 5.1989, time 124.58ms
iter 525410: loss 5.5360, time 124.63ms
iter 525420: loss 5.8907, time 124.61ms
iter 525430: loss 5.9738, time 124.75ms
iter 525440: loss 5.9360, time 124.80ms
iter 525450: loss 5.4161, time 124.73ms
iter 525460: loss 6.9082, time 124.52ms
iter 525470: loss 5.5567, time 124.63ms
iter 525480: loss 6.1246, time 124.50ms
iter 525490: loss 6.1489, time 124.80ms
step 525500: train loss 5.5057, val loss 5.5271
saving checkpoint to out-shakespeare-char
iter 525500: loss 5.5688, time 2891.75ms
iter 525510: loss 6.1277, time 126.20ms
iter 525520: loss 5.5975, time 125.47ms
iter 525530: loss 6.3045, time 125.52ms
iter 525540: loss 6.2363, time 125.71ms
iter 525550: loss 6.4750, time 126.12ms
iter 525560: loss 5.7732, time 126.32ms
iter 525570: loss 6.4666, time 125.64ms
iter 525580: loss 6.3820, time 125.71ms
iter 525590: loss 6.1143, time 125.34ms
iter 525600: loss 6.1154, time 125.32ms
iter 525610: loss 5.9137, time 128.29ms
iter 525620: loss 5.8853, time 125.39ms
iter 525630: loss 5.5342, time 128.17ms
iter 525640: loss 6.1391, time 125.38ms
iter 525650: loss 6.5200, time 127.90ms
iter 525660: loss 6.3590, time 125.46ms
iter 525670: loss 6.3427, time 128.04ms
iter 525680: loss 5.8974, time 126.48ms
iter 525690: loss 5.8093, time 125.41ms
iter 525700: loss 5.8086, time 125.36ms
iter 525710: loss 5.4314, time 125.50ms
iter 525720: loss 6.5206, time 125.35ms
iter 525730: loss 6.1146, time 125.45ms
iter 525740: loss 5.8935, time 125.48ms
step 525750: train loss 5.5577, val loss 5.5320
saving checkpoint to out-shakespeare-char
iter 525750: loss 5.7654, time 2893.71ms
iter 525760: loss 6.3889, time 125.15ms
iter 525770: loss 5.8251, time 128.19ms
iter 525780: loss 5.7564, time 125.48ms
iter 525790: loss 6.1684, time 128.23ms
iter 525800: loss 6.1938, time 125.46ms
iter 525810: loss 5.4813, time 128.34ms
iter 525820: loss 6.1652, time 125.31ms
iter 525830: loss 6.4983, time 126.47ms
iter 525840: loss 5.4432, time 125.03ms
iter 525850: loss 6.1611, time 124.90ms
iter 525860: loss 6.3453, time 126.23ms
iter 525870: loss 5.9869, time 127.37ms
iter 525880: loss 5.3247, time 125.07ms
iter 525890: loss 5.9481, time 127.69ms
iter 525900: loss 6.0012, time 125.09ms
iter 525910: loss 5.8977, time 127.72ms
iter 525920: loss 5.6355, time 124.96ms
iter 525930: loss 6.2670, time 127.66ms
iter 525940: loss 5.9874, time 125.34ms
iter 525950: loss 5.8735, time 127.87ms
iter 525960: loss 5.8701, time 124.83ms
iter 525970: loss 6.0982, time 128.21ms
iter 525980: loss 5.9572, time 124.96ms
iter 525990: loss 5.8420, time 127.59ms
step 526000: train loss 5.5849, val loss 5.5516
saving checkpoint to out-shakespeare-char
iter 526000: loss 6.0632, time 2889.97ms
iter 526010: loss 6.1535, time 124.98ms
iter 526020: loss 6.1891, time 125.00ms
iter 526030: loss 5.9298, time 124.49ms
iter 526040: loss 5.9103, time 125.02ms
iter 526050: loss 5.3374, time 124.99ms
iter 526060: loss 5.3591, time 124.30ms
iter 526070: loss 6.4345, time 124.98ms
iter 526080: loss 5.7163, time 124.73ms
iter 526090: loss 6.0859, time 124.89ms
iter 526100: loss 6.2939, time 124.21ms
iter 526110: loss 5.8842, time 127.69ms
iter 526120: loss 6.1038, time 124.37ms
iter 526130: loss 5.9723, time 125.25ms
iter 526140: loss 5.9933, time 125.73ms
iter 526150: loss 6.0655, time 125.02ms
iter 526160: loss 5.5877, time 124.78ms
iter 526170: loss 6.3907, time 125.35ms
iter 526180: loss 6.5304, time 125.14ms
iter 526190: loss 6.7842, time 125.88ms
iter 526200: loss 6.3496, time 125.37ms
iter 526210: loss 6.1334, time 126.39ms
iter 526220: loss 6.1067, time 125.06ms
iter 526230: loss 6.1793, time 125.06ms
iter 526240: loss 6.0824, time 125.14ms
step 526250: train loss 5.4986, val loss 5.5266
saving checkpoint to out-shakespeare-char
iter 526250: loss 6.1187, time 2902.34ms
iter 526260: loss 6.3821, time 125.63ms
iter 526270: loss 6.3327, time 124.97ms
iter 526280: loss 6.3008, time 125.37ms
iter 526290: loss 5.3630, time 124.46ms
iter 526300: loss 5.5369, time 123.59ms
iter 526310: loss 5.9305, time 124.95ms
iter 526320: loss 6.0335, time 125.12ms
iter 526330: loss 6.5276, time 125.19ms
iter 526340: loss 6.1639, time 125.36ms
iter 526350: loss 6.7637, time 124.46ms
iter 526360: loss 5.4240, time 125.07ms
iter 526370: loss 5.3161, time 125.33ms
iter 526380: loss 6.7178, time 125.01ms
iter 526390: loss 6.0953, time 125.18ms
iter 526400: loss 5.9541, time 125.15ms
iter 526410: loss 5.1954, time 125.17ms
iter 526420: loss 6.2079, time 124.98ms
iter 526430: loss 5.4094, time 125.77ms
iter 526440: loss 5.9717, time 125.59ms
iter 526450: loss 5.9864, time 125.42ms
iter 526460: loss 6.3602, time 125.34ms
iter 526470: loss 5.9353, time 125.30ms
iter 526480: loss 5.4195, time 124.99ms
iter 526490: loss 7.0758, time 125.74ms
step 526500: train loss 5.5419, val loss 5.5215
saving checkpoint to out-shakespeare-char
iter 526500: loss 6.2578, time 2909.51ms
iter 526510: loss 6.2581, time 126.68ms
iter 526520: loss 5.5624, time 124.62ms
iter 526530: loss 5.9296, time 125.31ms
iter 526540: loss 5.8571, time 125.94ms
iter 526550: loss 5.7958, time 125.25ms
iter 526560: loss 6.3290, time 124.89ms
iter 526570: loss 6.3879, time 124.79ms
iter 526580: loss 6.1939, time 127.33ms
iter 526590: loss 5.9136, time 121.54ms
iter 526600: loss 5.7001, time 122.74ms
iter 526610: loss 6.0459, time 121.59ms
iter 526620: loss 5.9209, time 122.30ms
iter 526630: loss 6.0023, time 123.74ms
iter 526640: loss 6.1004, time 121.16ms
iter 526650: loss 6.0172, time 121.21ms
iter 526660: loss 6.2543, time 121.25ms
iter 526670: loss 6.0481, time 121.89ms
iter 526680: loss 6.0752, time 121.84ms
iter 526690: loss 6.2482, time 120.91ms
iter 526700: loss 5.7977, time 121.16ms
iter 526710: loss 5.6014, time 120.62ms
iter 526720: loss 5.5707, time 121.14ms
iter 526730: loss 5.6395, time 121.18ms
iter 526740: loss 6.5622, time 122.44ms
step 526750: train loss 5.5659, val loss 5.5137
saving checkpoint to out-shakespeare-char
iter 526750: loss 6.6099, time 2896.06ms
iter 526760: loss 5.5175, time 121.03ms
iter 526770: loss 5.7971, time 122.42ms
iter 526780: loss 6.1347, time 121.30ms
iter 526790: loss 5.6962, time 121.41ms
iter 526800: loss 5.9453, time 123.79ms
iter 526810: loss 5.7138, time 121.35ms
iter 526820: loss 6.5266, time 121.40ms
iter 526830: loss 6.1640, time 121.56ms
iter 526840: loss 6.2348, time 121.26ms
iter 526850: loss 6.0002, time 121.28ms
iter 526860: loss 5.8886, time 121.38ms
iter 526870: loss 6.0463, time 122.47ms
iter 526880: loss 6.7495, time 121.46ms
iter 526890: loss 6.1660, time 120.44ms
iter 526900: loss 5.9260, time 122.58ms
iter 526910: loss 5.5353, time 121.80ms
iter 526920: loss 5.4941, time 122.00ms
iter 526930: loss 5.8136, time 124.12ms
iter 526940: loss 5.8977, time 121.90ms
iter 526950: loss 5.7899, time 121.36ms
iter 526960: loss 5.3793, time 121.85ms
iter 526970: loss 5.8451, time 120.44ms
iter 526980: loss 5.9964, time 121.33ms
iter 526990: loss 6.1123, time 121.40ms
step 527000: train loss 5.5499, val loss 5.5568
saving checkpoint to out-shakespeare-char
iter 527000: loss 5.8047, time 2901.59ms
iter 527010: loss 6.4543, time 121.37ms
iter 527020: loss 5.8357, time 121.41ms
iter 527030: loss 5.5152, time 121.43ms
iter 527040: loss 6.2877, time 121.55ms
iter 527050: loss 5.7843, time 121.58ms
iter 527060: loss 5.9002, time 121.41ms
iter 527070: loss 5.9763, time 122.23ms
iter 527080: loss 5.4995, time 121.64ms
iter 527090: loss 5.8317, time 121.46ms
iter 527100: loss 5.5097, time 123.42ms
iter 527110: loss 5.6283, time 122.19ms
iter 527120: loss 5.6163, time 121.84ms
iter 527130: loss 5.6220, time 124.23ms
iter 527140: loss 7.0040, time 121.47ms
iter 527150: loss 5.3490, time 120.81ms
iter 527160: loss 6.8214, time 121.79ms
iter 527170: loss 6.6712, time 121.34ms
iter 527180: loss 6.2861, time 121.51ms
iter 527190: loss 5.8450, time 121.60ms
iter 527200: loss 5.9598, time 122.91ms
iter 527210: loss 5.9869, time 121.29ms
iter 527220: loss 5.7681, time 120.48ms
iter 527230: loss 5.8655, time 122.48ms
iter 527240: loss 6.2622, time 121.41ms
step 527250: train loss 5.5445, val loss 5.5339
saving checkpoint to out-shakespeare-char
iter 527250: loss 6.1988, time 2899.86ms
iter 527260: loss 6.5678, time 121.42ms
iter 527270: loss 6.3894, time 121.57ms
iter 527280: loss 6.4919, time 121.20ms
iter 527290: loss 6.2752, time 122.39ms
iter 527300: loss 6.3010, time 121.45ms
iter 527310: loss 6.1572, time 121.39ms
iter 527320: loss 5.5925, time 122.80ms
iter 527330: loss 5.7969, time 121.75ms
iter 527340: loss 5.4184, time 121.94ms
iter 527350: loss 5.6284, time 124.03ms
iter 527360: loss 6.2351, time 121.37ms
iter 527370: loss 5.8908, time 121.39ms
iter 527380: loss 6.3331, time 121.36ms
iter 527390: loss 5.3894, time 120.56ms
iter 527400: loss 5.2516, time 120.51ms
iter 527410: loss 6.1552, time 121.47ms
iter 527420: loss 5.2788, time 122.55ms
iter 527430: loss 5.8448, time 121.28ms
iter 527440: loss 5.3903, time 121.86ms
iter 527450: loss 5.2066, time 122.71ms
iter 527460: loss 6.1169, time 121.36ms
iter 527470: loss 5.7041, time 121.30ms
iter 527480: loss 5.6321, time 123.83ms
iter 527490: loss 6.0838, time 121.23ms
step 527500: train loss 5.5718, val loss 5.5448
saving checkpoint to out-shakespeare-char
iter 527500: loss 6.0840, time 2903.90ms
iter 527510: loss 6.3912, time 121.62ms
iter 527520: loss 6.1187, time 121.94ms
iter 527530: loss 5.9774, time 121.72ms
iter 527540: loss 6.1872, time 122.59ms
iter 527550: loss 6.2295, time 122.81ms
iter 527560: loss 5.8132, time 121.99ms
iter 527570: loss 6.2015, time 121.80ms
iter 527580: loss 6.2968, time 122.80ms
iter 527590: loss 5.7782, time 121.78ms
iter 527600: loss 5.2947, time 123.01ms
iter 527610: loss 5.1556, time 124.38ms
iter 527620: loss 6.1786, time 121.70ms
iter 527630: loss 6.2055, time 122.00ms
iter 527640: loss 6.0822, time 121.77ms
iter 527650: loss 6.5773, time 121.74ms
iter 527660: loss 6.4295, time 121.75ms
iter 527670: loss 5.6406, time 121.98ms
iter 527680: loss 5.5730, time 123.20ms
iter 527690: loss 6.1367, time 121.59ms
iter 527700: loss 5.7729, time 122.14ms
iter 527710: loss 6.0182, time 121.73ms
iter 527720: loss 5.8103, time 122.08ms
iter 527730: loss 5.7671, time 122.26ms
iter 527740: loss 5.3814, time 124.04ms
step 527750: train loss 5.5520, val loss 5.5523
saving checkpoint to out-shakespeare-char
iter 527750: loss 5.7947, time 2901.48ms
iter 527760: loss 6.4361, time 121.58ms
iter 527770: loss 5.8002, time 122.79ms
iter 527780: loss 6.2159, time 121.43ms
iter 527790: loss 5.6918, time 121.92ms
iter 527800: loss 5.6769, time 123.83ms
iter 527810: loss 5.9648, time 121.38ms
iter 527820: loss 5.7326, time 121.77ms
iter 527830: loss 6.3182, time 121.22ms
iter 527840: loss 6.5425, time 121.40ms
iter 527850: loss 5.9144, time 121.24ms
iter 527860: loss 6.0819, time 121.26ms
iter 527870: loss 5.2635, time 122.64ms
iter 527880: loss 5.7141, time 121.42ms
iter 527890: loss 6.3555, time 121.33ms
iter 527900: loss 5.6727, time 122.54ms
iter 527910: loss 5.4865, time 121.29ms
iter 527920: loss 5.6126, time 121.40ms
iter 527930: loss 5.3078, time 123.92ms
iter 527940: loss 6.1979, time 121.34ms
iter 527950: loss 6.0200, time 122.02ms
iter 527960: loss 5.6570, time 121.43ms
iter 527970: loss 6.1815, time 123.02ms
iter 527980: loss 5.9857, time 121.40ms
iter 527990: loss 6.3281, time 121.23ms
step 528000: train loss 5.5335, val loss 5.5523
saving checkpoint to out-shakespeare-char
iter 528000: loss 6.1029, time 2897.98ms
iter 528010: loss 6.3279, time 121.67ms
iter 528020: loss 6.4314, time 121.64ms
iter 528030: loss 5.5055, time 121.43ms
iter 528040: loss 5.5056, time 121.50ms
iter 528050: loss 6.2703, time 121.49ms
iter 528060: loss 5.5776, time 121.50ms
iter 528070: loss 5.5885, time 122.92ms
iter 528080: loss 5.5463, time 121.42ms
iter 528090: loss 6.2227, time 121.58ms
iter 528100: loss 6.1779, time 123.29ms
iter 528110: loss 5.7995, time 122.05ms
iter 528120: loss 6.2766, time 121.44ms
iter 528130: loss 6.7813, time 124.02ms
iter 528140: loss 6.1543, time 121.73ms
iter 528150: loss 6.2669, time 121.32ms
iter 528160: loss 5.5501, time 121.48ms
iter 528170: loss 5.4911, time 121.32ms
iter 528180: loss 6.1092, time 121.37ms
iter 528190: loss 5.5320, time 121.86ms
iter 528200: loss 6.2742, time 122.49ms
iter 528210: loss 5.8134, time 121.57ms
iter 528220: loss 6.0152, time 121.53ms
iter 528230: loss 5.8175, time 122.45ms
iter 528240: loss 6.0252, time 121.52ms
step 528250: train loss 5.5117, val loss 5.5331
saving checkpoint to out-shakespeare-char
iter 528250: loss 5.7062, time 2898.06ms
iter 528260: loss 6.4050, time 121.52ms
iter 528270: loss 5.7770, time 122.50ms
iter 528280: loss 6.2580, time 121.28ms
iter 528290: loss 5.5926, time 121.21ms
iter 528300: loss 5.9335, time 123.10ms
iter 528310: loss 6.0348, time 121.26ms
iter 528320: loss 5.3490, time 121.87ms
iter 528330: loss 6.3790, time 121.45ms
iter 528340: loss 6.6334, time 121.62ms
iter 528350: loss 6.0321, time 121.38ms
iter 528360: loss 5.5988, time 121.39ms
iter 528370: loss 6.5437, time 122.50ms
iter 528380: loss 5.9774, time 121.13ms
iter 528390: loss 6.0455, time 121.85ms
iter 528400: loss 5.9946, time 122.59ms
iter 528410: loss 5.8612, time 121.40ms
iter 528420: loss 6.4779, time 121.33ms
iter 528430: loss 5.7280, time 123.48ms
iter 528440: loss 6.2121, time 121.52ms
iter 528450: loss 5.8188, time 121.31ms
iter 528460: loss 6.0106, time 124.35ms
iter 528470: loss 5.9741, time 121.61ms
iter 528480: loss 6.7024, time 122.60ms
iter 528490: loss 6.0223, time 124.29ms
step 528500: train loss 5.5384, val loss 5.5411
saving checkpoint to out-shakespeare-char
iter 528500: loss 6.2801, time 2892.87ms
iter 528510: loss 5.2893, time 121.22ms
iter 528520: loss 6.1354, time 121.16ms
iter 528530: loss 6.6067, time 121.56ms
iter 528540: loss 5.7598, time 121.29ms
iter 528550: loss 5.6581, time 121.23ms
iter 528560: loss 5.8660, time 121.27ms
iter 528570: loss 5.8074, time 121.65ms
iter 528580: loss 5.9372, time 121.16ms
iter 528590: loss 6.1616, time 121.36ms
iter 528600: loss 6.2420, time 120.88ms
iter 528610: loss 6.2287, time 121.39ms
iter 528620: loss 6.0988, time 121.70ms
iter 528630: loss 5.5822, time 123.90ms
iter 528640: loss 5.9975, time 121.38ms
iter 528650: loss 5.3599, time 121.35ms
iter 528660: loss 5.8991, time 121.40ms
iter 528670: loss 5.8239, time 122.00ms
iter 528680: loss 5.5781, time 121.32ms
iter 528690: loss 5.6310, time 122.60ms
iter 528700: loss 6.2559, time 121.23ms
iter 528710: loss 5.1841, time 121.30ms
iter 528720: loss 5.9757, time 121.31ms
iter 528730: loss 6.1039, time 121.44ms
iter 528740: loss 6.0719, time 120.66ms
step 528750: train loss 5.5256, val loss 5.4835
saving checkpoint to out-shakespeare-char
iter 528750: loss 6.1686, time 2887.53ms
iter 528760: loss 5.8383, time 121.33ms
iter 528770: loss 5.7349, time 123.96ms
iter 528780: loss 5.6110, time 121.54ms
iter 528790: loss 6.0844, time 121.37ms
iter 528800: loss 5.7282, time 121.20ms
iter 528810: loss 6.0901, time 121.18ms
iter 528820: loss 4.9415, time 121.34ms
iter 528830: loss 6.6505, time 121.35ms
iter 528840: loss 6.0732, time 122.36ms
iter 528850: loss 5.8095, time 121.21ms
iter 528860: loss 5.7630, time 121.17ms
iter 528870: loss 5.4566, time 122.60ms
iter 528880: loss 5.5329, time 121.22ms
iter 528890: loss 5.7892, time 121.39ms
iter 528900: loss 6.3566, time 123.75ms
iter 528910: loss 6.9061, time 121.30ms
iter 528920: loss 5.9524, time 121.18ms
iter 528930: loss 6.1040, time 121.54ms
iter 528940: loss 5.8682, time 121.44ms
iter 528950: loss 6.2351, time 121.31ms
iter 528960: loss 6.0948, time 120.44ms
iter 528970: loss 5.6379, time 122.43ms
iter 528980: loss 6.3013, time 121.05ms
iter 528990: loss 6.1330, time 121.23ms
step 529000: train loss 5.5513, val loss 5.5747
saving checkpoint to out-shakespeare-char
iter 529000: loss 5.7296, time 2879.93ms
iter 529010: loss 6.4069, time 125.96ms
iter 529020: loss 6.0680, time 129.46ms
iter 529030: loss 6.4953, time 125.82ms
iter 529040: loss 6.2817, time 128.84ms
iter 529050: loss 6.2882, time 125.74ms
iter 529060: loss 6.0305, time 126.05ms
iter 529070: loss 5.6594, time 127.04ms
iter 529080: loss 5.8684, time 125.69ms
iter 529090: loss 5.6857, time 125.49ms
iter 529100: loss 5.4423, time 126.07ms
iter 529110: loss 6.0188, time 125.67ms
iter 529120: loss 6.4304, time 125.89ms
iter 529130: loss 6.4362, time 125.95ms
iter 529140: loss 5.7887, time 125.94ms
iter 529150: loss 5.5215, time 125.56ms
iter 529160: loss 6.1837, time 125.71ms
iter 529170: loss 5.2734, time 125.83ms
iter 529180: loss 5.6373, time 125.75ms
iter 529190: loss 6.1396, time 125.84ms
iter 529200: loss 5.1372, time 125.83ms
iter 529210: loss 5.9117, time 125.70ms
iter 529220: loss 6.1064, time 125.65ms
iter 529230: loss 5.8245, time 125.72ms
iter 529240: loss 6.0000, time 125.78ms
step 529250: train loss 5.5779, val loss 5.5209
saving checkpoint to out-shakespeare-char
iter 529250: loss 6.0047, time 2896.25ms
iter 529260: loss 5.1501, time 125.36ms
iter 529270: loss 5.5582, time 125.25ms
iter 529280: loss 6.5603, time 125.14ms
iter 529290: loss 6.1247, time 125.09ms
iter 529300: loss 5.7540, time 125.19ms
iter 529310: loss 5.6298, time 124.95ms
iter 529320: loss 6.1062, time 125.15ms
iter 529330: loss 5.6693, time 125.09ms
iter 529340: loss 6.5158, time 124.74ms
iter 529350: loss 6.2215, time 125.34ms
iter 529360: loss 6.1868, time 125.27ms
iter 529370: loss 6.1998, time 125.84ms
iter 529380: loss 5.3456, time 127.27ms
iter 529390: loss 5.8963, time 125.54ms
iter 529400: loss 6.4082, time 125.29ms
iter 529410: loss 5.3911, time 125.59ms
iter 529420: loss 6.3254, time 127.39ms
iter 529430: loss 5.8136, time 125.00ms
iter 529440: loss 5.2272, time 127.59ms
iter 529450: loss 5.9773, time 125.34ms
iter 529460: loss 5.9869, time 127.92ms
iter 529470: loss 5.7601, time 125.19ms
iter 529480: loss 6.2575, time 127.84ms
iter 529490: loss 5.9157, time 124.87ms
step 529500: train loss 5.5951, val loss 5.5094
saving checkpoint to out-shakespeare-char
iter 529500: loss 5.7499, time 2892.86ms
iter 529510: loss 5.5494, time 125.72ms
iter 529520: loss 6.5088, time 125.13ms
iter 529530: loss 6.6375, time 124.60ms
iter 529540: loss 5.6942, time 125.16ms
iter 529550: loss 6.2400, time 124.65ms
iter 529560: loss 6.0164, time 124.81ms
iter 529570: loss 6.1187, time 124.95ms
iter 529580: loss 5.7316, time 124.80ms
iter 529590: loss 5.9991, time 124.83ms
iter 529600: loss 6.1276, time 125.10ms
iter 529610: loss 7.0568, time 124.76ms
iter 529620: loss 6.1922, time 124.68ms
iter 529630: loss 6.3918, time 124.84ms
iter 529640: loss 5.7682, time 124.86ms
iter 529650: loss 5.3166, time 124.35ms
iter 529660: loss 5.6978, time 124.84ms
iter 529670: loss 5.7686, time 125.05ms
iter 529680: loss 5.7858, time 124.80ms
iter 529690: loss 6.4259, time 123.98ms
iter 529700: loss 6.1285, time 124.19ms
iter 529710: loss 5.5450, time 125.48ms
iter 529720: loss 6.4219, time 125.30ms
iter 529730: loss 6.0685, time 125.12ms
iter 529740: loss 5.6902, time 125.47ms
step 529750: train loss 5.5656, val loss 5.5420
saving checkpoint to out-shakespeare-char
iter 529750: loss 6.3940, time 2893.87ms
iter 529760: loss 5.5446, time 127.56ms
iter 529770: loss 6.5861, time 124.93ms
iter 529780: loss 5.3933, time 127.66ms
iter 529790: loss 6.7510, time 124.82ms
iter 529800: loss 5.7275, time 128.62ms
iter 529810: loss 5.8488, time 125.23ms
iter 529820: loss 6.4415, time 127.72ms
iter 529830: loss 5.4904, time 124.64ms
iter 529840: loss 5.6825, time 127.79ms
iter 529850: loss 5.5029, time 125.11ms
iter 529860: loss 5.8612, time 127.82ms
iter 529870: loss 6.2679, time 124.39ms
iter 529880: loss 5.7929, time 127.41ms
iter 529890: loss 6.0814, time 124.83ms
iter 529900: loss 6.2504, time 127.37ms
iter 529910: loss 6.4785, time 123.52ms
iter 529920: loss 5.5213, time 124.80ms
iter 529930: loss 6.4041, time 125.06ms
iter 529940: loss 6.6127, time 125.98ms
iter 529950: loss 5.8860, time 125.80ms
iter 529960: loss 5.8010, time 126.23ms
iter 529970: loss 5.9149, time 125.82ms
iter 529980: loss 5.3848, time 125.90ms
iter 529990: loss 6.1934, time 126.08ms
step 530000: train loss 5.5446, val loss 5.5326
saving checkpoint to out-shakespeare-char
iter 530000: loss 6.0107, time 2894.48ms
iter 530010: loss 5.8549, time 124.60ms
iter 530020: loss 5.3714, time 124.47ms
iter 530030: loss 5.8558, time 124.77ms
iter 530040: loss 6.3006, time 125.16ms
iter 530050: loss 5.9318, time 124.83ms
iter 530060: loss 6.1071, time 124.53ms
iter 530070: loss 5.9985, time 125.13ms
iter 530080: loss 6.4769, time 126.16ms
iter 530090: loss 5.9434, time 126.39ms
iter 530100: loss 6.3640, time 125.28ms
iter 530110: loss 6.4768, time 124.04ms
iter 530120: loss 6.1719, time 124.50ms
iter 530130: loss 6.0618, time 125.15ms
iter 530140: loss 5.8602, time 124.60ms
iter 530150: loss 5.9627, time 124.54ms
iter 530160: loss 5.6212, time 124.66ms
iter 530170: loss 5.9437, time 124.56ms
iter 530180: loss 6.0482, time 124.04ms
iter 530190: loss 6.4514, time 124.41ms
iter 530200: loss 5.0040, time 123.90ms
iter 530210: loss 5.7911, time 124.37ms
iter 530220: loss 6.2595, time 124.03ms
iter 530230: loss 6.6700, time 125.57ms
iter 530240: loss 5.5733, time 125.35ms
step 530250: train loss 5.5016, val loss 5.5416
saving checkpoint to out-shakespeare-char
iter 530250: loss 5.7753, time 2895.38ms
iter 530260: loss 5.8305, time 125.41ms
iter 530270: loss 6.1008, time 125.62ms
iter 530280: loss 6.0751, time 125.48ms
iter 530290: loss 6.1462, time 125.09ms
iter 530300: loss 6.2395, time 125.30ms
iter 530310: loss 5.6893, time 126.34ms
iter 530320: loss 5.8744, time 125.52ms
iter 530330: loss 6.6418, time 125.46ms
iter 530340: loss 6.0906, time 125.60ms
iter 530350: loss 5.2800, time 125.16ms
iter 530360: loss 5.9272, time 125.08ms
iter 530370: loss 6.2980, time 125.84ms
iter 530380: loss 6.0215, time 125.56ms
iter 530390: loss 5.6207, time 125.24ms
iter 530400: loss 6.2248, time 125.46ms
iter 530410: loss 6.0347, time 125.34ms
iter 530420: loss 5.9989, time 125.13ms
iter 530430: loss 6.0277, time 125.00ms
iter 530440: loss 5.6841, time 125.55ms
iter 530450: loss 5.9398, time 124.90ms
iter 530460: loss 5.5042, time 125.44ms
iter 530470: loss 6.2605, time 125.21ms
iter 530480: loss 6.6077, time 125.26ms
iter 530490: loss 6.3542, time 125.39ms
step 530500: train loss 5.5110, val loss 5.5209
saving checkpoint to out-shakespeare-char
iter 530500: loss 5.8167, time 2885.10ms
iter 530510: loss 6.0924, time 125.38ms
iter 530520: loss 5.5787, time 127.58ms
iter 530530: loss 6.3930, time 125.04ms
iter 530540: loss 5.8607, time 127.60ms
iter 530550: loss 6.2853, time 125.36ms
iter 530560: loss 6.0229, time 127.42ms
iter 530570: loss 5.4636, time 125.35ms
iter 530580: loss 5.6276, time 128.54ms
iter 530590: loss 5.7820, time 125.33ms
iter 530600: loss 5.9382, time 127.79ms
iter 530610: loss 6.0122, time 125.52ms
iter 530620: loss 6.4990, time 128.24ms
iter 530630: loss 5.7460, time 125.58ms
iter 530640: loss 6.0053, time 127.48ms
iter 530650: loss 5.8020, time 125.61ms
iter 530660: loss 5.7616, time 127.43ms
iter 530670: loss 5.8393, time 125.64ms
iter 530680: loss 6.2563, time 128.09ms
iter 530690: loss 5.5234, time 125.65ms
iter 530700: loss 6.2128, time 128.68ms
iter 530710: loss 5.6597, time 125.75ms
iter 530720: loss 6.4206, time 128.50ms
iter 530730: loss 6.7996, time 125.51ms
iter 530740: loss 5.4907, time 128.26ms
step 530750: train loss 5.5613, val loss 5.5232
saving checkpoint to out-shakespeare-char
iter 530750: loss 5.9293, time 2877.10ms
iter 530760: loss 5.9510, time 125.57ms
iter 530770: loss 6.8210, time 125.78ms
iter 530780: loss 5.9868, time 125.46ms
iter 530790: loss 6.0026, time 125.40ms
iter 530800: loss 6.3407, time 125.42ms
iter 530810: loss 5.6986, time 125.56ms
iter 530820: loss 5.5065, time 125.26ms
iter 530830: loss 6.2477, time 125.60ms
iter 530840: loss 5.5142, time 126.23ms
iter 530850: loss 5.5904, time 125.87ms
iter 530860: loss 6.7210, time 125.43ms
iter 530870: loss 5.1407, time 125.44ms
iter 530880: loss 6.6362, time 125.38ms
iter 530890: loss 5.4529, time 125.83ms
iter 530900: loss 5.7218, time 125.41ms
iter 530910: loss 6.5266, time 125.53ms
iter 530920: loss 5.9767, time 125.78ms
iter 530930: loss 6.3586, time 126.15ms
iter 530940: loss 6.3354, time 127.64ms
iter 530950: loss 6.0905, time 125.40ms
iter 530960: loss 6.4495, time 125.47ms
iter 530970: loss 6.9783, time 125.51ms
iter 530980: loss 6.1373, time 125.48ms
iter 530990: loss 5.9926, time 125.62ms
step 531000: train loss 5.5408, val loss 5.5670
saving checkpoint to out-shakespeare-char
iter 531000: loss 5.5654, time 2879.31ms
iter 531010: loss 5.2900, time 125.09ms
iter 531020: loss 6.1401, time 125.14ms
iter 531030: loss 6.5010, time 124.49ms
iter 531040: loss 6.3898, time 125.28ms
iter 531050: loss 6.1910, time 125.00ms
iter 531060: loss 5.8830, time 125.46ms
iter 531070: loss 5.9712, time 125.06ms
iter 531080: loss 5.5140, time 125.31ms
iter 531090: loss 6.1180, time 125.17ms
iter 531100: loss 5.8109, time 125.15ms
iter 531110: loss 6.9071, time 124.65ms
iter 531120: loss 5.5424, time 125.13ms
iter 531130: loss 6.0471, time 125.23ms
iter 531140: loss 5.4097, time 125.75ms
iter 531150: loss 5.3908, time 125.21ms
iter 531160: loss 5.5964, time 125.09ms
iter 531170: loss 5.7843, time 125.54ms
iter 531180: loss 5.8660, time 125.71ms
iter 531190: loss 5.7643, time 125.59ms
iter 531200: loss 5.8317, time 125.44ms
iter 531210: loss 6.4077, time 125.17ms
iter 531220: loss 5.8367, time 125.83ms
iter 531230: loss 5.5240, time 125.28ms
iter 531240: loss 5.9151, time 126.42ms
step 531250: train loss 5.5129, val loss 5.5730
saving checkpoint to out-shakespeare-char
iter 531250: loss 6.9294, time 2865.39ms
iter 531260: loss 5.5210, time 128.18ms
iter 531270: loss 6.7211, time 125.74ms
iter 531280: loss 6.3569, time 128.20ms
iter 531290: loss 6.2515, time 125.67ms
iter 531300: loss 5.8634, time 128.25ms
iter 531310: loss 6.1530, time 125.50ms
iter 531320: loss 5.9374, time 128.52ms
iter 531330: loss 6.0824, time 125.04ms
iter 531340: loss 5.9096, time 128.25ms
iter 531350: loss 6.0832, time 125.82ms
iter 531360: loss 5.5983, time 128.46ms
iter 531370: loss 5.1489, time 125.71ms
iter 531380: loss 5.7285, time 128.09ms
iter 531390: loss 5.9697, time 125.51ms
iter 531400: loss 6.1657, time 128.78ms
iter 531410: loss 6.2458, time 126.23ms
iter 531420: loss 5.5447, time 128.36ms
iter 531430: loss 6.7867, time 125.77ms
iter 531440: loss 5.6530, time 128.20ms
iter 531450: loss 5.6462, time 125.69ms
iter 531460: loss 6.3020, time 125.64ms
iter 531470: loss 5.8973, time 125.68ms
iter 531480: loss 6.6852, time 126.30ms
iter 531490: loss 6.1976, time 125.83ms
step 531500: train loss 5.5422, val loss 5.5245
saving checkpoint to out-shakespeare-char
iter 531500: loss 6.1118, time 2895.59ms
iter 531510: loss 5.7507, time 126.10ms
iter 531520: loss 6.5111, time 125.36ms
iter 531530: loss 6.2508, time 125.42ms
iter 531540: loss 5.7333, time 125.35ms
iter 531550: loss 5.5682, time 125.59ms
iter 531560: loss 5.9544, time 126.17ms
iter 531570: loss 5.7033, time 126.09ms
iter 531580: loss 6.1990, time 126.57ms
iter 531590: loss 6.1784, time 124.81ms
iter 531600: loss 5.4846, time 126.45ms
iter 531610: loss 5.8287, time 125.20ms
iter 531620: loss 6.0684, time 125.72ms
iter 531630: loss 6.4387, time 125.26ms
iter 531640: loss 6.6610, time 125.69ms
iter 531650: loss 5.9923, time 125.41ms
iter 531660: loss 6.4552, time 125.99ms
iter 531670: loss 6.2648, time 125.53ms
iter 531680: loss 5.9878, time 125.83ms
iter 531690: loss 5.8455, time 125.48ms
iter 531700: loss 6.3703, time 126.02ms
iter 531710: loss 6.8957, time 125.57ms
iter 531720: loss 5.7660, time 122.84ms
iter 531730: loss 6.1224, time 120.86ms
iter 531740: loss 6.0133, time 121.86ms
step 531750: train loss 5.4761, val loss 5.5504
saving checkpoint to out-shakespeare-char
iter 531750: loss 5.9985, time 2913.60ms
iter 531760: loss 6.6450, time 121.64ms
iter 531770: loss 6.3697, time 121.55ms
iter 531780: loss 5.8547, time 122.14ms
iter 531790: loss 6.5382, time 121.70ms
iter 531800: loss 6.0603, time 121.72ms
iter 531810: loss 5.6391, time 121.84ms
iter 531820: loss 5.6385, time 122.75ms
iter 531830: loss 6.1054, time 121.44ms
iter 531840: loss 5.6769, time 121.93ms
iter 531850: loss 5.7762, time 124.18ms
iter 531860: loss 5.8564, time 120.76ms
iter 531870: loss 5.4218, time 121.46ms
iter 531880: loss 6.3467, time 121.81ms
iter 531890: loss 6.0436, time 121.61ms
iter 531900: loss 6.1688, time 121.54ms
iter 531910: loss 5.9877, time 121.30ms
iter 531920: loss 6.1974, time 121.97ms
iter 531930: loss 6.0105, time 120.82ms
iter 531940: loss 5.8136, time 121.37ms
iter 531950: loss 5.8223, time 122.04ms
iter 531960: loss 5.8763, time 121.37ms
iter 531970: loss 6.4776, time 121.21ms
iter 531980: loss 5.9762, time 124.09ms
iter 531990: loss 5.7468, time 121.27ms
step 532000: train loss 5.4869, val loss 5.4953
saving checkpoint to out-shakespeare-char
iter 532000: loss 5.6042, time 2908.06ms
iter 532010: loss 6.4109, time 120.75ms
iter 532020: loss 6.4339, time 122.59ms
iter 532030: loss 5.7820, time 121.63ms
iter 532040: loss 5.8136, time 120.85ms
iter 532050: loss 5.3730, time 122.33ms
iter 532060: loss 6.3268, time 122.06ms
iter 532070: loss 6.0293, time 121.43ms
iter 532080: loss 6.0115, time 124.66ms
iter 532090: loss 5.6216, time 121.51ms
iter 532100: loss 6.3691, time 121.09ms
iter 532110: loss 6.1246, time 122.05ms
iter 532120: loss 5.8156, time 121.44ms
iter 532130: loss 6.3123, time 121.43ms
iter 532140: loss 5.4203, time 121.51ms
iter 532150: loss 5.7283, time 122.47ms
iter 532160: loss 5.3453, time 121.63ms
iter 532170: loss 6.1015, time 121.43ms
iter 532180: loss 6.0924, time 122.37ms
iter 532190: loss 6.3096, time 121.44ms
iter 532200: loss 5.8734, time 121.37ms
iter 532210: loss 6.2139, time 124.00ms
iter 532220: loss 5.7469, time 121.78ms
iter 532230: loss 5.7943, time 121.50ms
iter 532240: loss 6.0273, time 120.64ms
step 532250: train loss 5.5490, val loss 5.5529
saving checkpoint to out-shakespeare-char
iter 532250: loss 6.2062, time 2894.64ms
iter 532260: loss 5.9661, time 121.59ms
iter 532270: loss 6.1007, time 121.56ms
iter 532280: loss 6.6043, time 123.61ms
iter 532290: loss 5.7107, time 121.75ms
iter 532300: loss 5.6527, time 121.52ms
iter 532310: loss 5.8550, time 121.50ms
iter 532320: loss 5.7988, time 122.54ms
iter 532330: loss 5.0953, time 121.40ms
iter 532340: loss 5.7908, time 121.84ms
iter 532350: loss 6.3996, time 122.89ms
iter 532360: loss 5.9666, time 121.54ms
iter 532370: loss 5.9691, time 121.58ms
iter 532380: loss 5.4779, time 121.59ms
iter 532390: loss 5.6482, time 121.53ms
iter 532400: loss 6.4079, time 121.33ms
iter 532410: loss 6.2309, time 121.52ms
iter 532420: loss 5.6846, time 122.72ms
iter 532430: loss 5.0014, time 121.46ms
iter 532440: loss 5.6717, time 121.24ms
iter 532450: loss 6.1961, time 122.63ms
iter 532460: loss 5.8278, time 121.64ms
iter 532470: loss 6.3480, time 121.55ms
iter 532480: loss 6.1513, time 123.39ms
iter 532490: loss 5.5566, time 122.05ms
step 532500: train loss 5.5331, val loss 5.5101
saving checkpoint to out-shakespeare-char
iter 532500: loss 6.2601, time 2920.74ms
iter 532510: loss 6.9080, time 128.61ms
iter 532520: loss 6.3488, time 125.89ms
iter 532530: loss 5.9805, time 128.28ms
iter 532540: loss 6.0466, time 125.68ms
iter 532550: loss 5.8399, time 128.23ms
iter 532560: loss 6.4042, time 125.87ms
iter 532570: loss 6.7602, time 128.49ms
iter 532580: loss 6.4902, time 125.86ms
iter 532590: loss 5.8997, time 128.55ms
iter 532600: loss 5.8502, time 125.73ms
iter 532610: loss 5.5563, time 128.40ms
iter 532620: loss 6.5210, time 125.77ms
iter 532630: loss 5.5052, time 128.42ms
iter 532640: loss 5.7502, time 125.46ms
iter 532650: loss 5.9523, time 128.03ms
iter 532660: loss 6.0651, time 125.61ms
iter 532670: loss 5.3336, time 128.24ms
iter 532680: loss 6.0548, time 124.93ms
iter 532690: loss 5.4755, time 128.08ms
iter 532700: loss 5.5159, time 126.26ms
iter 532710: loss 5.2336, time 128.29ms
iter 532720: loss 6.0800, time 125.60ms
iter 532730: loss 6.1109, time 128.06ms
iter 532740: loss 5.7509, time 125.73ms
step 532750: train loss 5.5736, val loss 5.5176
saving checkpoint to out-shakespeare-char
iter 532750: loss 6.1420, time 2902.30ms
iter 532760: loss 6.5521, time 126.12ms
iter 532770: loss 6.1526, time 124.91ms
iter 532780: loss 6.5081, time 124.53ms
iter 532790: loss 5.3897, time 125.22ms
iter 532800: loss 6.0248, time 125.31ms
iter 532810: loss 5.8477, time 125.00ms
iter 532820: loss 6.1827, time 125.04ms
iter 532830: loss 6.3812, time 124.90ms
iter 532840: loss 6.4817, time 125.28ms
iter 532850: loss 6.5273, time 125.24ms
iter 532860: loss 6.8184, time 125.10ms
iter 532870: loss 6.0678, time 124.93ms
iter 532880: loss 6.9121, time 124.93ms
iter 532890: loss 6.0126, time 125.00ms
iter 532900: loss 5.3577, time 125.14ms
iter 532910: loss 5.4548, time 125.09ms
iter 532920: loss 5.7285, time 125.07ms
iter 532930: loss 5.7943, time 124.76ms
iter 532940: loss 6.2224, time 125.66ms
iter 532950: loss 5.9523, time 124.96ms
iter 532960: loss 6.2740, time 124.96ms
iter 532970: loss 5.3375, time 125.16ms
iter 532980: loss 6.2902, time 124.06ms
iter 532990: loss 6.5458, time 124.74ms
step 533000: train loss 5.5423, val loss 5.5912
saving checkpoint to out-shakespeare-char
iter 533000: loss 6.2334, time 2905.19ms
iter 533010: loss 6.0768, time 125.39ms
iter 533020: loss 6.5566, time 125.35ms
iter 533030: loss 5.3905, time 125.14ms
iter 533040: loss 6.7051, time 125.37ms
iter 533050: loss 5.9625, time 125.14ms
iter 533060: loss 6.1454, time 125.39ms
iter 533070: loss 5.4707, time 125.08ms
iter 533080: loss 5.6472, time 125.54ms
iter 533090: loss 6.4309, time 125.05ms
iter 533100: loss 6.0352, time 125.50ms
iter 533110: loss 5.6891, time 125.76ms
iter 533120: loss 5.4815, time 125.49ms
iter 533130: loss 4.8465, time 125.33ms
iter 533140: loss 6.7668, time 125.77ms
iter 533150: loss 6.9392, time 127.03ms
iter 533160: loss 6.0585, time 125.68ms
iter 533170: loss 5.2994, time 125.60ms
iter 533180: loss 6.0505, time 126.11ms
iter 533190: loss 5.8809, time 125.60ms
iter 533200: loss 6.8101, time 125.87ms
iter 533210: loss 5.5908, time 125.81ms
iter 533220: loss 5.5688, time 125.75ms
iter 533230: loss 6.4139, time 125.90ms
iter 533240: loss 6.4952, time 125.79ms
step 533250: train loss 5.5100, val loss 5.5343
saving checkpoint to out-shakespeare-char
iter 533250: loss 6.1537, time 2901.69ms
iter 533260: loss 6.3399, time 128.17ms
iter 533270: loss 6.0036, time 125.86ms
iter 533280: loss 6.6184, time 128.47ms
iter 533290: loss 5.9019, time 125.47ms
iter 533300: loss 6.1005, time 128.05ms
iter 533310: loss 5.9268, time 125.51ms
iter 533320: loss 5.7300, time 128.20ms
iter 533330: loss 5.7712, time 125.69ms
iter 533340: loss 5.8641, time 128.27ms
iter 533350: loss 6.9787, time 125.75ms
iter 533360: loss 5.6788, time 127.68ms
iter 533370: loss 5.6913, time 125.44ms
iter 533380: loss 6.8204, time 128.08ms
iter 533390: loss 6.3914, time 124.97ms
iter 533400: loss 5.6633, time 128.06ms
iter 533410: loss 5.9365, time 125.29ms
iter 533420: loss 6.3380, time 128.09ms
iter 533430: loss 5.9167, time 125.58ms
iter 533440: loss 5.7327, time 128.23ms
iter 533450: loss 5.1815, time 125.58ms
iter 533460: loss 5.8485, time 126.12ms
iter 533470: loss 5.7354, time 125.67ms
iter 533480: loss 6.3244, time 125.86ms
iter 533490: loss 5.5241, time 125.17ms
step 533500: train loss 5.4941, val loss 5.5451
saving checkpoint to out-shakespeare-char
iter 533500: loss 5.7343, time 2885.95ms
iter 533510: loss 7.0500, time 122.39ms
iter 533520: loss 5.5139, time 120.25ms
iter 533530: loss 6.3193, time 121.22ms
iter 533540: loss 5.5640, time 122.00ms
iter 533550: loss 6.0223, time 122.32ms
iter 533560: loss 5.9498, time 121.49ms
iter 533570: loss 5.5917, time 121.00ms
iter 533580: loss 5.8554, time 124.09ms
iter 533590: loss 5.4143, time 121.49ms
iter 533600: loss 6.3186, time 121.47ms
iter 533610: loss 6.1884, time 121.72ms
iter 533620: loss 6.1223, time 121.66ms
iter 533630: loss 6.2141, time 121.54ms
iter 533640: loss 5.6601, time 121.59ms
iter 533650: loss 6.2800, time 122.62ms
iter 533660: loss 6.4342, time 120.73ms
iter 533670: loss 6.0836, time 121.32ms
iter 533680: loss 5.2518, time 123.97ms
iter 533690: loss 5.7768, time 122.29ms
iter 533700: loss 6.2584, time 121.56ms
iter 533710: loss 5.5980, time 121.65ms
iter 533720: loss 6.5705, time 122.52ms
iter 533730: loss 6.3199, time 121.50ms
iter 533740: loss 5.5456, time 121.20ms
step 533750: train loss 5.5338, val loss 5.5462
saving checkpoint to out-shakespeare-char
iter 533750: loss 6.0236, time 2878.78ms
iter 533760: loss 6.2313, time 122.86ms
iter 533770: loss 5.4263, time 121.84ms
iter 533780: loss 6.4131, time 121.81ms
iter 533790: loss 6.5871, time 123.07ms
iter 533800: loss 5.3318, time 121.79ms
iter 533810: loss 6.5383, time 121.48ms
iter 533820: loss 5.4193, time 121.62ms
iter 533830: loss 6.0531, time 122.60ms
iter 533840: loss 6.7150, time 120.64ms
iter 533850: loss 5.9126, time 121.99ms
iter 533860: loss 6.3982, time 122.60ms
iter 533870: loss 5.7843, time 121.42ms
iter 533880: loss 6.8455, time 120.84ms
iter 533890: loss 5.5654, time 124.11ms
iter 533900: loss 5.8605, time 121.54ms
iter 533910: loss 5.3791, time 122.59ms
iter 533920: loss 5.9382, time 121.48ms
iter 533930: loss 6.2179, time 122.99ms
iter 533940: loss 5.6278, time 121.51ms
iter 533950: loss 6.0477, time 121.40ms
iter 533960: loss 5.8376, time 121.66ms
iter 533970: loss 6.3608, time 122.53ms
iter 533980: loss 5.3557, time 121.69ms
iter 533990: loss 6.0966, time 121.76ms
step 534000: train loss 5.5546, val loss 5.5472
saving checkpoint to out-shakespeare-char
iter 534000: loss 6.2739, time 2879.07ms
iter 534010: loss 5.3786, time 122.31ms
iter 534020: loss 4.9430, time 120.71ms
iter 534030: loss 6.8532, time 122.77ms
iter 534040: loss 5.8344, time 121.09ms
iter 534050: loss 5.3686, time 123.18ms
iter 534060: loss 5.6801, time 121.52ms
iter 534070: loss 5.8768, time 121.19ms
iter 534080: loss 5.4007, time 122.69ms
iter 534090: loss 5.5162, time 121.49ms
iter 534100: loss 6.1241, time 121.69ms
iter 534110: loss 6.2462, time 123.59ms
iter 534120: loss 5.8082, time 121.76ms
iter 534130: loss 5.3082, time 121.59ms
iter 534140: loss 6.2419, time 121.44ms
iter 534150: loss 5.4848, time 122.50ms
iter 534160: loss 6.5636, time 121.28ms
iter 534170: loss 6.1713, time 121.23ms
iter 534180: loss 6.1356, time 123.68ms
iter 534190: loss 5.6765, time 122.32ms
iter 534200: loss 6.0652, time 122.01ms
iter 534210: loss 6.1335, time 121.41ms
iter 534220: loss 6.5702, time 122.79ms
iter 534230: loss 6.3965, time 122.02ms
iter 534240: loss 5.0978, time 121.30ms
step 534250: train loss 5.5333, val loss 5.5725
saving checkpoint to out-shakespeare-char
iter 534250: loss 6.1978, time 2897.66ms
iter 534260: loss 6.3774, time 122.31ms
iter 534270: loss 4.9322, time 121.33ms
iter 534280: loss 6.1238, time 121.30ms
iter 534290: loss 6.1418, time 121.88ms
iter 534300: loss 5.6394, time 121.30ms
iter 534310: loss 6.2098, time 121.44ms
iter 534320: loss 6.2139, time 124.02ms
iter 534330: loss 5.8595, time 122.68ms
iter 534340: loss 5.4781, time 121.33ms
iter 534350: loss 5.8383, time 121.30ms
iter 534360: loss 6.3252, time 122.25ms
iter 534370: loss 5.0843, time 121.38ms
iter 534380: loss 6.1245, time 121.61ms
iter 534390: loss 6.3128, time 122.78ms
iter 534400: loss 5.7629, time 121.35ms
iter 534410: loss 5.6335, time 121.43ms
iter 534420: loss 6.0145, time 121.54ms
iter 534430: loss 6.1135, time 121.66ms
iter 534440: loss 5.5643, time 121.29ms
iter 534450: loss 6.0241, time 121.15ms
iter 534460: loss 6.6145, time 122.86ms
iter 534470: loss 5.6063, time 121.81ms
iter 534480: loss 6.3530, time 121.23ms
iter 534490: loss 6.5311, time 121.14ms
step 534500: train loss 5.5613, val loss 5.5566
saving checkpoint to out-shakespeare-char
iter 534500: loss 5.7390, time 2905.96ms
iter 534510: loss 5.5298, time 124.57ms
iter 534520: loss 6.3824, time 126.59ms
iter 534530: loss 5.8424, time 123.93ms
iter 534540: loss 5.4215, time 122.44ms
iter 534550: loss 5.7699, time 123.42ms
iter 534560: loss 5.6390, time 124.41ms
iter 534570: loss 6.6307, time 123.90ms
iter 534580: loss 5.9168, time 124.48ms
iter 534590: loss 5.7212, time 123.90ms
iter 534600: loss 6.2118, time 124.56ms
iter 534610: loss 6.1466, time 124.45ms
iter 534620: loss 5.9101, time 124.90ms
iter 534630: loss 6.4459, time 124.11ms
iter 534640: loss 6.4689, time 124.24ms
iter 534650: loss 6.3956, time 124.05ms
iter 534660: loss 6.7304, time 124.77ms
iter 534670: loss 6.0727, time 124.10ms
iter 534680: loss 5.7410, time 124.13ms
iter 534690: loss 5.7464, time 123.99ms
iter 534700: loss 5.9370, time 124.82ms
iter 534710: loss 6.5173, time 123.92ms
iter 534720: loss 5.4662, time 123.24ms
iter 534730: loss 5.5518, time 124.18ms
iter 534740: loss 5.5864, time 124.04ms
step 534750: train loss 5.4986, val loss 5.5592
saving checkpoint to out-shakespeare-char
iter 534750: loss 6.1120, time 2889.83ms
iter 534760: loss 6.0019, time 124.11ms
iter 534770: loss 6.5122, time 124.88ms
iter 534780: loss 5.7204, time 124.53ms
iter 534790: loss 5.0473, time 125.17ms
iter 534800: loss 5.9313, time 125.04ms
iter 534810: loss 5.8007, time 125.57ms
iter 534820: loss 6.4281, time 125.40ms
iter 534830: loss 6.0514, time 125.44ms
iter 534840: loss 5.6746, time 124.95ms
iter 534850: loss 6.3977, time 124.43ms
iter 534860: loss 5.9535, time 125.21ms
iter 534870: loss 6.0323, time 125.75ms
iter 534880: loss 6.0382, time 126.52ms
iter 534890: loss 5.4187, time 125.39ms
iter 534900: loss 6.0740, time 125.35ms
iter 534910: loss 6.2068, time 125.44ms
iter 534920: loss 5.8132, time 126.19ms
iter 534930: loss 6.0844, time 124.60ms
iter 534940: loss 6.0437, time 125.18ms
iter 534950: loss 5.2436, time 123.38ms
iter 534960: loss 5.8579, time 124.91ms
iter 534970: loss 5.7697, time 124.98ms
iter 534980: loss 6.3571, time 124.91ms
iter 534990: loss 6.2008, time 125.03ms
step 535000: train loss 5.5486, val loss 5.5958
saving checkpoint to out-shakespeare-char
iter 535000: loss 6.2082, time 2876.57ms
iter 535010: loss 6.5356, time 125.31ms
iter 535020: loss 5.5344, time 123.95ms
iter 535030: loss 6.6332, time 125.07ms
iter 535040: loss 5.9321, time 125.02ms
iter 535050: loss 5.9187, time 124.86ms
iter 535060: loss 5.3637, time 123.42ms
iter 535070: loss 5.5067, time 124.75ms
iter 535080: loss 5.8201, time 125.93ms
iter 535090: loss 5.6418, time 124.89ms
iter 535100: loss 5.9105, time 124.04ms
iter 535110: loss 6.3662, time 125.61ms
iter 535120: loss 6.0907, time 124.21ms
iter 535130: loss 6.0416, time 125.34ms
iter 535140: loss 5.6140, time 125.58ms
iter 535150: loss 5.2873, time 124.53ms
iter 535160: loss 5.8685, time 124.60ms
iter 535170: loss 6.1385, time 125.24ms
iter 535180: loss 6.4874, time 125.43ms
iter 535190: loss 6.2315, time 124.90ms
iter 535200: loss 6.5853, time 133.07ms
iter 535210: loss 6.0317, time 125.64ms
iter 535220: loss 6.2825, time 125.12ms
iter 535230: loss 5.5697, time 125.36ms
iter 535240: loss 6.0543, time 125.13ms
step 535250: train loss 5.5150, val loss 5.4795
saving checkpoint to out-shakespeare-char
iter 535250: loss 6.8198, time 2861.23ms
iter 535260: loss 6.0518, time 125.08ms
iter 535270: loss 5.7423, time 125.79ms
iter 535280: loss 6.4784, time 125.17ms
iter 535290: loss 5.7838, time 124.98ms
iter 535300: loss 6.9028, time 125.15ms
iter 535310: loss 6.0003, time 125.01ms
iter 535320: loss 5.8165, time 124.95ms
iter 535330: loss 5.6941, time 125.39ms
iter 535340: loss 5.9281, time 125.51ms
iter 535350: loss 6.1834, time 125.41ms
iter 535360: loss 5.7230, time 124.99ms
iter 535370: loss 5.8964, time 124.62ms
iter 535380: loss 5.0132, time 125.48ms
iter 535390: loss 5.7120, time 125.43ms
iter 535400: loss 6.2257, time 124.92ms
iter 535410: loss 5.1577, time 125.44ms
iter 535420: loss 5.8432, time 125.10ms
iter 535430: loss 5.7287, time 124.55ms
iter 535440: loss 6.3873, time 124.06ms
iter 535450: loss 5.7815, time 125.03ms
iter 535460: loss 6.2093, time 125.14ms
iter 535470: loss 6.1365, time 125.18ms
iter 535480: loss 6.7213, time 121.93ms
iter 535490: loss 5.4795, time 121.46ms
step 535500: train loss 5.5368, val loss 5.5634
saving checkpoint to out-shakespeare-char
iter 535500: loss 5.7766, time 2897.37ms
iter 535510: loss 6.9421, time 125.89ms
iter 535520: loss 5.5698, time 126.28ms
iter 535530: loss 6.3789, time 124.97ms
iter 535540: loss 5.9633, time 125.67ms
iter 535550: loss 5.9920, time 125.54ms
iter 535560: loss 5.3331, time 126.28ms
iter 535570: loss 6.2093, time 125.77ms
iter 535580: loss 5.9906, time 125.57ms
iter 535590: loss 5.6478, time 125.18ms
iter 535600: loss 5.3181, time 126.11ms
iter 535610: loss 6.2098, time 126.08ms
iter 535620: loss 6.2651, time 125.77ms
iter 535630: loss 6.2938, time 123.72ms
iter 535640: loss 5.5354, time 125.69ms
iter 535650: loss 6.4119, time 125.38ms
iter 535660: loss 5.8215, time 124.90ms
iter 535670: loss 4.9965, time 122.67ms
iter 535680: loss 5.8968, time 125.43ms
iter 535690: loss 5.6518, time 125.40ms
iter 535700: loss 5.6174, time 125.93ms
iter 535710: loss 5.9839, time 124.12ms
iter 535720: loss 5.3556, time 125.57ms
iter 535730: loss 6.5025, time 125.58ms
iter 535740: loss 6.4011, time 125.52ms
step 535750: train loss 5.5041, val loss 5.5544
saving checkpoint to out-shakespeare-char
iter 535750: loss 5.4586, time 2889.44ms
iter 535760: loss 5.9955, time 125.69ms
iter 535770: loss 6.0110, time 125.39ms
iter 535780: loss 5.7904, time 124.61ms
iter 535790: loss 5.7021, time 125.43ms
iter 535800: loss 6.0137, time 125.51ms
iter 535810: loss 6.3744, time 125.37ms
iter 535820: loss 5.2386, time 125.14ms
iter 535830: loss 5.9183, time 124.86ms
iter 535840: loss 6.1463, time 124.49ms
iter 535850: loss 5.9054, time 125.43ms
iter 535860: loss 5.6855, time 124.93ms
iter 535870: loss 5.7104, time 125.57ms
iter 535880: loss 5.6864, time 125.28ms
iter 535890: loss 6.4004, time 125.08ms
iter 535900: loss 6.6465, time 126.20ms
iter 535910: loss 6.1412, time 124.49ms
iter 535920: loss 5.6863, time 126.25ms
iter 535930: loss 5.2281, time 124.83ms
iter 535940: loss 5.2897, time 125.24ms
iter 535950: loss 6.0399, time 125.07ms
iter 535960: loss 5.8245, time 125.09ms
iter 535970: loss 6.1247, time 125.11ms
iter 535980: loss 6.4230, time 125.21ms
iter 535990: loss 6.3329, time 125.21ms
step 536000: train loss 5.5450, val loss 5.5523
saving checkpoint to out-shakespeare-char
iter 536000: loss 6.5432, time 2904.42ms
iter 536010: loss 5.2215, time 125.89ms
iter 536020: loss 6.0698, time 126.27ms
iter 536030: loss 5.3131, time 127.17ms
iter 536040: loss 6.2622, time 125.75ms
iter 536050: loss 5.9812, time 128.30ms
iter 536060: loss 6.1145, time 125.70ms
iter 536070: loss 5.9238, time 128.22ms
iter 536080: loss 5.8168, time 125.62ms
iter 536090: loss 6.3390, time 128.48ms
iter 536100: loss 5.7619, time 125.66ms
iter 536110: loss 5.0430, time 128.42ms
iter 536120: loss 6.0071, time 125.64ms
iter 536130: loss 6.2806, time 125.76ms
iter 536140: loss 6.2564, time 126.08ms
iter 536150: loss 6.0398, time 125.72ms
iter 536160: loss 5.7811, time 125.33ms
iter 536170: loss 6.2857, time 124.94ms
iter 536180: loss 6.2621, time 125.42ms
iter 536190: loss 5.9155, time 125.66ms
iter 536200: loss 5.9195, time 124.95ms
iter 536210: loss 5.4194, time 125.82ms
iter 536220: loss 6.2972, time 125.65ms
iter 536230: loss 6.0154, time 125.27ms
iter 536240: loss 5.9502, time 125.64ms
step 536250: train loss 5.5214, val loss 5.5207
saving checkpoint to out-shakespeare-char
iter 536250: loss 5.9901, time 2901.85ms
iter 536260: loss 5.8969, time 125.42ms
iter 536270: loss 5.5617, time 125.36ms
iter 536280: loss 5.5866, time 125.71ms
iter 536290: loss 6.2613, time 125.53ms
iter 536300: loss 5.6770, time 125.76ms
iter 536310: loss 6.0239, time 125.86ms
iter 536320: loss 5.6110, time 125.67ms
iter 536330: loss 5.9723, time 125.91ms
iter 536340: loss 5.1884, time 125.71ms
iter 536350: loss 5.3779, time 125.14ms
iter 536360: loss 6.4560, time 124.82ms
iter 536370: loss 5.9322, time 125.79ms
iter 536380: loss 5.8152, time 125.47ms
iter 536390: loss 6.0227, time 125.42ms
iter 536400: loss 6.3570, time 125.69ms
iter 536410: loss 6.5036, time 126.26ms
iter 536420: loss 6.7967, time 125.54ms
iter 536430: loss 5.8452, time 124.69ms
iter 536440: loss 6.1480, time 125.55ms
iter 536450: loss 5.8181, time 126.02ms
iter 536460: loss 6.3894, time 124.49ms
iter 536470: loss 6.1747, time 124.04ms
iter 536480: loss 5.9343, time 125.69ms
iter 536490: loss 5.8084, time 125.86ms
step 536500: train loss 5.6391, val loss 5.5222
saving checkpoint to out-shakespeare-char
iter 536500: loss 5.4472, time 2886.37ms
iter 536510: loss 6.3689, time 128.23ms
iter 536520: loss 6.0556, time 125.77ms
iter 536530: loss 5.9875, time 128.30ms
iter 536540: loss 5.5358, time 125.50ms
iter 536550: loss 6.0248, time 128.10ms
iter 536560: loss 6.6061, time 124.71ms
iter 536570: loss 5.9065, time 127.54ms
iter 536580: loss 5.9400, time 125.82ms
iter 536590: loss 6.3652, time 127.78ms
iter 536600: loss 6.5163, time 125.63ms
iter 536610: loss 6.1352, time 128.23ms
iter 536620: loss 5.9213, time 125.76ms
iter 536630: loss 6.0862, time 128.11ms
iter 536640: loss 5.7146, time 125.84ms
iter 536650: loss 6.2088, time 128.74ms
iter 536660: loss 6.6849, time 125.71ms
iter 536670: loss 5.6185, time 125.71ms
iter 536680: loss 5.6612, time 125.84ms
iter 536690: loss 5.4910, time 124.69ms
iter 536700: loss 6.4795, time 125.59ms
iter 536710: loss 5.9964, time 125.69ms
iter 536720: loss 5.8068, time 125.63ms
iter 536730: loss 6.4914, time 126.08ms
iter 536740: loss 6.4058, time 121.22ms
step 536750: train loss 5.5427, val loss 5.5697
saving checkpoint to out-shakespeare-char
iter 536750: loss 6.4469, time 2880.21ms
iter 536760: loss 5.6834, time 121.66ms
iter 536770: loss 5.6370, time 122.19ms
iter 536780: loss 6.0698, time 120.71ms
iter 536790: loss 6.2510, time 121.91ms
iter 536800: loss 5.8867, time 124.03ms
iter 536810: loss 6.6433, time 122.13ms
iter 536820: loss 6.3437, time 121.49ms
iter 536830: loss 6.0512, time 120.27ms
iter 536840: loss 5.4961, time 121.57ms
iter 536850: loss 5.6339, time 121.44ms
iter 536860: loss 5.6321, time 121.45ms
iter 536870: loss 5.9373, time 122.70ms
iter 536880: loss 5.4850, time 121.88ms
iter 536890: loss 6.0575, time 120.71ms
iter 536900: loss 5.3671, time 122.59ms
iter 536910: loss 5.4640, time 121.97ms
iter 536920: loss 6.6448, time 121.37ms
iter 536930: loss 5.8970, time 124.29ms
iter 536940: loss 6.0887, time 121.46ms
iter 536950: loss 6.1683, time 121.81ms
iter 536960: loss 6.1311, time 122.21ms
iter 536970: loss 5.9166, time 122.47ms
iter 536980: loss 5.4786, time 121.40ms
iter 536990: loss 5.9952, time 120.49ms
step 537000: train loss 5.5439, val loss 5.5725
saving checkpoint to out-shakespeare-char
iter 537000: loss 5.6458, time 2907.57ms
iter 537010: loss 5.8940, time 121.46ms
iter 537020: loss 6.1455, time 122.08ms
iter 537030: loss 6.2707, time 124.01ms
iter 537040: loss 6.4659, time 121.43ms
iter 537050: loss 5.3072, time 121.47ms
iter 537060: loss 5.6623, time 121.90ms
iter 537070: loss 6.2683, time 121.51ms
iter 537080: loss 6.0465, time 121.51ms
iter 537090: loss 5.7679, time 121.51ms
iter 537100: loss 7.0358, time 122.65ms
iter 537110: loss 5.2511, time 121.19ms
iter 537120: loss 5.9998, time 121.41ms
iter 537130: loss 6.0667, time 124.32ms
iter 537140: loss 6.0487, time 121.42ms
iter 537150: loss 5.9319, time 121.17ms
iter 537160: loss 5.3241, time 121.45ms
iter 537170: loss 6.3480, time 121.50ms
iter 537180: loss 5.5369, time 121.55ms
iter 537190: loss 6.4596, time 121.52ms
iter 537200: loss 5.2969, time 122.78ms
iter 537210: loss 5.9470, time 121.63ms
iter 537220: loss 5.6602, time 121.56ms
iter 537230: loss 6.0862, time 122.76ms
iter 537240: loss 5.9035, time 121.66ms
step 537250: train loss 5.5736, val loss 5.5281
saving checkpoint to out-shakespeare-char
iter 537250: loss 6.5415, time 2908.86ms
iter 537260: loss 5.8613, time 122.77ms
iter 537270: loss 6.8171, time 121.32ms
iter 537280: loss 6.4029, time 121.76ms
iter 537290: loss 6.3434, time 121.60ms
iter 537300: loss 6.4730, time 122.81ms
iter 537310: loss 5.9653, time 121.61ms
iter 537320: loss 5.3786, time 121.85ms
iter 537330: loss 6.1195, time 123.02ms
iter 537340: loss 5.7890, time 122.22ms
iter 537350: loss 5.9051, time 121.54ms
iter 537360: loss 5.2501, time 124.47ms
iter 537370: loss 6.0245, time 121.81ms
iter 537380: loss 5.9369, time 121.74ms
iter 537390: loss 5.7966, time 121.99ms
iter 537400: loss 6.4915, time 121.74ms
iter 537410: loss 5.8899, time 121.45ms
iter 537420: loss 6.0527, time 121.44ms
iter 537430: loss 6.2114, time 122.14ms
iter 537440: loss 5.5938, time 121.54ms
iter 537450: loss 5.8174, time 121.61ms
iter 537460: loss 6.2610, time 122.69ms
iter 537470: loss 5.5312, time 120.51ms
iter 537480: loss 5.5052, time 121.77ms
iter 537490: loss 6.3609, time 124.13ms
step 537500: train loss 5.5616, val loss 5.5353
saving checkpoint to out-shakespeare-char
iter 537500: loss 6.6942, time 2904.09ms
iter 537510: loss 6.0402, time 120.96ms
iter 537520: loss 5.8302, time 121.46ms
iter 537530: loss 6.0813, time 121.38ms
iter 537540: loss 5.5145, time 121.51ms
iter 537550: loss 6.6992, time 121.52ms
iter 537560: loss 5.1863, time 121.93ms
iter 537570: loss 6.5046, time 122.94ms
iter 537580: loss 6.2117, time 121.25ms
iter 537590: loss 5.3697, time 121.49ms
iter 537600: loss 5.4917, time 123.57ms
iter 537610: loss 5.7685, time 121.65ms
iter 537620: loss 5.9218, time 121.43ms
iter 537630: loss 5.2038, time 121.35ms
iter 537640: loss 6.2600, time 121.78ms
iter 537650: loss 5.9075, time 121.59ms
iter 537660: loss 6.1239, time 121.84ms
iter 537670: loss 6.1004, time 122.96ms
iter 537680: loss 6.1557, time 121.28ms
iter 537690: loss 6.3622, time 121.85ms
iter 537700: loss 6.6040, time 124.90ms
iter 537710: loss 5.6942, time 122.00ms
iter 537720: loss 7.0676, time 121.60ms
iter 537730: loss 6.1304, time 121.44ms
iter 537740: loss 5.6025, time 121.63ms
step 537750: train loss 5.5380, val loss 5.5473
saving checkpoint to out-shakespeare-char
iter 537750: loss 5.7812, time 2909.57ms
iter 537760: loss 6.1575, time 121.50ms
iter 537770: loss 5.8783, time 124.07ms
iter 537780: loss 5.5037, time 121.59ms
iter 537790: loss 6.4403, time 121.57ms
iter 537800: loss 5.5574, time 120.84ms
iter 537810: loss 6.5747, time 123.04ms
iter 537820: loss 5.8301, time 121.08ms
iter 537830: loss 6.0269, time 121.50ms
iter 537840: loss 5.5978, time 121.86ms
iter 537850: loss 5.9848, time 121.57ms
iter 537860: loss 5.7130, time 121.59ms
iter 537870: loss 5.8012, time 120.63ms
iter 537880: loss 6.1555, time 120.96ms
iter 537890: loss 6.7704, time 121.88ms
iter 537900: loss 4.8727, time 122.08ms
iter 537910: loss 6.1765, time 124.74ms
iter 537920: loss 5.4354, time 121.03ms
iter 537930: loss 5.5336, time 121.32ms
iter 537940: loss 5.6637, time 121.68ms
iter 537950: loss 6.1540, time 121.37ms
iter 537960: loss 6.9119, time 121.53ms
iter 537970: loss 5.7023, time 121.48ms
iter 537980: loss 5.6996, time 122.19ms
iter 537990: loss 6.5301, time 121.90ms
step 538000: train loss 5.5533, val loss 5.5674
saving checkpoint to out-shakespeare-char
iter 538000: loss 6.1358, time 2900.01ms
iter 538010: loss 5.3438, time 121.36ms
iter 538020: loss 6.8251, time 122.77ms
iter 538030: loss 6.2027, time 121.47ms
iter 538040: loss 6.5364, time 121.01ms
iter 538050: loss 5.9866, time 124.26ms
iter 538060: loss 6.0909, time 121.36ms
iter 538070: loss 6.1948, time 121.41ms
iter 538080: loss 5.0170, time 123.44ms
iter 538090: loss 6.4008, time 121.51ms
iter 538100: loss 6.1665, time 121.78ms
iter 538110: loss 6.8113, time 121.85ms
iter 538120: loss 5.7771, time 121.62ms
iter 538130: loss 5.8416, time 121.63ms
iter 538140: loss 6.5709, time 122.04ms
iter 538150: loss 5.6631, time 121.93ms
iter 538160: loss 5.9941, time 121.59ms
iter 538170: loss 6.2585, time 120.59ms
iter 538180: loss 5.6910, time 121.20ms
iter 538190: loss 6.1348, time 120.39ms
iter 538200: loss 6.0193, time 120.28ms
iter 538210: loss 5.5103, time 122.58ms
iter 538220: loss 5.5946, time 121.57ms
iter 538230: loss 6.3476, time 119.41ms
iter 538240: loss 5.4744, time 119.93ms
step 538250: train loss 5.5394, val loss 5.5464
saving checkpoint to out-shakespeare-char
iter 538250: loss 6.1682, time 2916.62ms
iter 538260: loss 6.2656, time 125.71ms
iter 538270: loss 6.5615, time 125.14ms
iter 538280: loss 6.4821, time 125.56ms
iter 538290: loss 6.1816, time 122.79ms
iter 538300: loss 5.6679, time 121.20ms
iter 538310: loss 6.4712, time 124.69ms
iter 538320: loss 6.0376, time 121.30ms
iter 538330: loss 5.8911, time 121.33ms
iter 538340: loss 5.2506, time 121.68ms
iter 538350: loss 5.8074, time 121.52ms
iter 538360: loss 4.7327, time 121.67ms
iter 538370: loss 6.2397, time 121.62ms
iter 538380: loss 5.7777, time 122.68ms
iter 538390: loss 6.8691, time 120.57ms
iter 538400: loss 5.7587, time 121.42ms
iter 538410: loss 6.2862, time 122.87ms
iter 538420: loss 6.1412, time 121.40ms
iter 538430: loss 5.5097, time 121.26ms
iter 538440: loss 6.0933, time 124.33ms
iter 538450: loss 6.1208, time 121.87ms
iter 538460: loss 6.6910, time 121.42ms
iter 538470: loss 6.6382, time 121.60ms
iter 538480: loss 6.0126, time 121.88ms
iter 538490: loss 5.6435, time 121.34ms
step 538500: train loss 5.5500, val loss 5.5517
saving checkpoint to out-shakespeare-char
iter 538500: loss 5.5140, time 2900.91ms
iter 538510: loss 6.2800, time 120.64ms
iter 538520: loss 6.1485, time 121.49ms
iter 538530: loss 5.9050, time 121.51ms
iter 538540: loss 5.9442, time 121.68ms
iter 538550: loss 6.5447, time 121.41ms
iter 538560: loss 6.1217, time 121.43ms
iter 538570: loss 6.6823, time 122.97ms
iter 538580: loss 6.1710, time 122.04ms
iter 538590: loss 5.9893, time 121.52ms
iter 538600: loss 6.2092, time 123.27ms
iter 538610: loss 6.0717, time 121.53ms
iter 538620: loss 5.7615, time 121.69ms
iter 538630: loss 6.6359, time 124.31ms
iter 538640: loss 6.3641, time 121.71ms
iter 538650: loss 5.9238, time 121.24ms
iter 538660: loss 6.6085, time 120.77ms
iter 538670: loss 5.4504, time 121.64ms
iter 538680: loss 5.5262, time 121.41ms
iter 538690: loss 6.5855, time 121.29ms
iter 538700: loss 5.8149, time 122.85ms
iter 538710: loss 5.1002, time 121.81ms
iter 538720: loss 5.9285, time 121.35ms
iter 538730: loss 6.4369, time 123.09ms
iter 538740: loss 6.1731, time 121.34ms
step 538750: train loss 5.5522, val loss 5.5576
saving checkpoint to out-shakespeare-char
iter 538750: loss 6.1367, time 2899.59ms
iter 538760: loss 6.1791, time 121.78ms
iter 538770: loss 6.0597, time 123.04ms
iter 538780: loss 5.6716, time 121.39ms
iter 538790: loss 7.2803, time 121.03ms
iter 538800: loss 5.7964, time 122.55ms
iter 538810: loss 5.5042, time 121.40ms
iter 538820: loss 5.2913, time 121.33ms
iter 538830: loss 5.6903, time 124.20ms
iter 538840: loss 5.8852, time 121.60ms
iter 538850: loss 6.2830, time 121.49ms
iter 538860: loss 6.3438, time 121.56ms
iter 538870: loss 5.3558, time 121.77ms
iter 538880: loss 5.9534, time 121.42ms
iter 538890: loss 5.5385, time 121.60ms
iter 538900: loss 6.0904, time 122.49ms
iter 538910: loss 5.7221, time 121.26ms
iter 538920: loss 5.9611, time 121.50ms
iter 538930: loss 5.6302, time 122.48ms
iter 538940: loss 5.8674, time 121.78ms
iter 538950: loss 5.5809, time 121.40ms
iter 538960: loss 5.5047, time 124.04ms
iter 538970: loss 6.2755, time 121.57ms
iter 538980: loss 5.6035, time 122.11ms
iter 538990: loss 6.4409, time 120.57ms
step 539000: train loss 5.5610, val loss 5.5376
saving checkpoint to out-shakespeare-char
iter 539000: loss 5.6685, time 2900.33ms
iter 539010: loss 6.5994, time 121.52ms
iter 539020: loss 6.1020, time 121.51ms
iter 539030: loss 6.6851, time 124.12ms
iter 539040: loss 5.5530, time 121.57ms
iter 539050: loss 6.1839, time 121.54ms
iter 539060: loss 6.0403, time 121.41ms
iter 539070: loss 6.8068, time 121.55ms
iter 539080: loss 5.7430, time 121.55ms
iter 539090: loss 5.4818, time 121.62ms
iter 539100: loss 5.7085, time 123.03ms
iter 539110: loss 6.2979, time 121.38ms
iter 539120: loss 5.8901, time 121.49ms
iter 539130: loss 5.4869, time 123.52ms
iter 539140: loss 5.1555, time 121.74ms
iter 539150: loss 6.1522, time 122.30ms
iter 539160: loss 5.8514, time 121.23ms
iter 539170: loss 6.2573, time 121.47ms
iter 539180: loss 5.8729, time 121.67ms
iter 539190: loss 5.8657, time 121.30ms
iter 539200: loss 6.6246, time 122.53ms
iter 539210: loss 5.5741, time 122.74ms
iter 539220: loss 5.0416, time 121.79ms
iter 539230: loss 6.0189, time 122.86ms
iter 539240: loss 6.0301, time 121.48ms
step 539250: train loss 5.5650, val loss 5.5631
saving checkpoint to out-shakespeare-char
iter 539250: loss 5.4545, time 2895.81ms
iter 539260: loss 6.3351, time 121.34ms
iter 539270: loss 5.7851, time 121.46ms
iter 539280: loss 6.1364, time 122.11ms
iter 539290: loss 5.8947, time 121.60ms
iter 539300: loss 6.3405, time 121.81ms
iter 539310: loss 6.0858, time 121.49ms
iter 539320: loss 6.2406, time 121.29ms
iter 539330: loss 5.5024, time 122.55ms
iter 539340: loss 5.5866, time 121.72ms
iter 539350: loss 5.6267, time 121.61ms
iter 539360: loss 6.4776, time 121.24ms
iter 539370: loss 6.5182, time 121.58ms
iter 539380: loss 5.8182, time 121.58ms
iter 539390: loss 5.7063, time 121.37ms
iter 539400: loss 6.2972, time 122.54ms
iter 539410: loss 6.3890, time 121.49ms
iter 539420: loss 5.2104, time 121.59ms
iter 539430: loss 5.7837, time 122.62ms
iter 539440: loss 6.1322, time 121.11ms
iter 539450: loss 5.7102, time 121.56ms
iter 539460: loss 5.9416, time 124.03ms
iter 539470: loss 5.2097, time 122.22ms
iter 539480: loss 6.4996, time 121.55ms
iter 539490: loss 5.5876, time 121.26ms
step 539500: train loss 5.5515, val loss 5.5378
saving checkpoint to out-shakespeare-char
iter 539500: loss 5.9019, time 2902.92ms
iter 539510: loss 6.5744, time 121.63ms
iter 539520: loss 6.6413, time 121.25ms
iter 539530: loss 5.6740, time 121.51ms
iter 539540: loss 6.1918, time 123.05ms
iter 539550: loss 5.9187, time 121.65ms
iter 539560: loss 6.0016, time 122.05ms
iter 539570: loss 5.8366, time 122.77ms
iter 539580: loss 6.3376, time 121.61ms
iter 539590: loss 6.4736, time 121.61ms
iter 539600: loss 6.2078, time 124.05ms
iter 539610: loss 5.7400, time 121.52ms
iter 539620: loss 5.7871, time 121.65ms
iter 539630: loss 5.3809, time 121.27ms
iter 539640: loss 6.0118, time 121.46ms
iter 539650: loss 5.6669, time 121.53ms
iter 539660: loss 6.4258, time 121.37ms
iter 539670: loss 5.7639, time 122.93ms
iter 539680: loss 6.2544, time 121.48ms
iter 539690: loss 6.4444, time 121.70ms
iter 539700: loss 5.3452, time 122.98ms
iter 539710: loss 6.5332, time 121.33ms
iter 539720: loss 5.7512, time 121.08ms
iter 539730: loss 6.2513, time 123.99ms
iter 539740: loss 6.3451, time 121.57ms
step 539750: train loss 5.5817, val loss 5.5357
saving checkpoint to out-shakespeare-char
iter 539750: loss 6.3938, time 2901.75ms
iter 539760: loss 6.5484, time 121.58ms
iter 539770: loss 5.5970, time 123.25ms
iter 539780: loss 6.8655, time 121.71ms
iter 539790: loss 5.0945, time 121.78ms
iter 539800: loss 5.6860, time 124.36ms
iter 539810: loss 5.8302, time 121.58ms
iter 539820: loss 5.6185, time 121.66ms
iter 539830: loss 6.0939, time 121.33ms
iter 539840: loss 5.5722, time 122.00ms
iter 539850: loss 6.5771, time 121.61ms
iter 539860: loss 5.5381, time 121.53ms
iter 539870: loss 5.6606, time 123.20ms
iter 539880: loss 6.0294, time 121.30ms
iter 539890: loss 6.3421, time 121.52ms
iter 539900: loss 7.5478, time 123.90ms
iter 539910: loss 6.1931, time 121.55ms
iter 539920: loss 6.3008, time 121.56ms
iter 539930: loss 5.1885, time 124.41ms
iter 539940: loss 5.1141, time 121.55ms
iter 539950: loss 6.6278, time 121.65ms
iter 539960: loss 5.7937, time 121.87ms
iter 539970: loss 6.4108, time 121.69ms
iter 539980: loss 6.2404, time 121.95ms
iter 539990: loss 6.5047, time 120.79ms
step 540000: train loss 5.5416, val loss 5.5804
saving checkpoint to out-shakespeare-char
iter 540000: loss 5.4472, time 2897.50ms
iter 540010: loss 5.6917, time 121.51ms
iter 540020: loss 5.6469, time 120.92ms
iter 540030: loss 6.5280, time 121.74ms
iter 540040: loss 5.8728, time 121.25ms
iter 540050: loss 5.9767, time 121.79ms
iter 540060: loss 6.4943, time 121.44ms
iter 540070: loss 5.5761, time 121.44ms
iter 540080: loss 5.8889, time 122.80ms
iter 540090: loss 5.9291, time 121.18ms
iter 540100: loss 5.4555, time 121.49ms
iter 540110: loss 5.8831, time 123.15ms
iter 540120: loss 5.7705, time 121.50ms
iter 540130: loss 6.5769, time 121.76ms
iter 540140: loss 6.1071, time 123.97ms
iter 540150: loss 6.0725, time 121.71ms
iter 540160: loss 6.1034, time 121.16ms
iter 540170: loss 5.4570, time 122.45ms
iter 540180: loss 5.6414, time 121.45ms
iter 540190: loss 5.8011, time 121.60ms
iter 540200: loss 6.1740, time 120.67ms
iter 540210: loss 6.6404, time 122.62ms
iter 540220: loss 6.4172, time 121.40ms
iter 540230: loss 5.7280, time 121.56ms
iter 540240: loss 6.1777, time 122.04ms
step 540250: train loss 5.5635, val loss 5.5254
saving checkpoint to out-shakespeare-char
iter 540250: loss 6.1457, time 2895.77ms
iter 540260: loss 6.6048, time 121.49ms
iter 540270: loss 5.9503, time 122.68ms
iter 540280: loss 6.4620, time 121.38ms
iter 540290: loss 5.7835, time 121.53ms
iter 540300: loss 6.2277, time 121.22ms
iter 540310: loss 6.5176, time 122.01ms
iter 540320: loss 6.5647, time 121.58ms
iter 540330: loss 6.0083, time 120.98ms
iter 540340: loss 6.0815, time 122.76ms
iter 540350: loss 6.0338, time 121.95ms
iter 540360: loss 6.0978, time 121.49ms
iter 540370: loss 5.5724, time 122.27ms
iter 540380: loss 5.6778, time 121.59ms
iter 540390: loss 6.0972, time 121.64ms
iter 540400: loss 5.9801, time 124.12ms
iter 540410: loss 5.9071, time 121.72ms
iter 540420: loss 5.3134, time 121.53ms
iter 540430: loss 6.0059, time 121.79ms
iter 540440: loss 5.9092, time 121.95ms
iter 540450: loss 6.3588, time 121.32ms
iter 540460: loss 6.2287, time 121.60ms
iter 540470: loss 6.1323, time 123.06ms
iter 540480: loss 5.6731, time 121.76ms
iter 540490: loss 5.9473, time 121.72ms
step 540500: train loss 5.5159, val loss 5.5112
saving checkpoint to out-shakespeare-char
iter 540500: loss 6.4963, time 2915.32ms
iter 540510: loss 6.2880, time 122.11ms
iter 540520: loss 5.9000, time 122.24ms
iter 540530: loss 6.0525, time 121.90ms
iter 540540: loss 5.9686, time 122.46ms
iter 540550: loss 6.6689, time 121.74ms
iter 540560: loss 6.2866, time 121.87ms
iter 540570: loss 6.0127, time 123.02ms
iter 540580: loss 5.9632, time 121.69ms
iter 540590: loss 5.0988, time 121.90ms
iter 540600: loss 5.6554, time 124.59ms
iter 540610: loss 5.6345, time 122.09ms
iter 540620: loss 6.2543, time 121.09ms
iter 540630: loss 5.7860, time 121.77ms
iter 540640: loss 6.6810, time 121.91ms
iter 540650: loss 5.6968, time 121.73ms
iter 540660: loss 5.8511, time 122.08ms
iter 540670: loss 6.0161, time 122.95ms
iter 540680: loss 5.6705, time 121.66ms
iter 540690: loss 6.2508, time 121.96ms
iter 540700: loss 5.4431, time 122.34ms
iter 540710: loss 6.2320, time 121.85ms
iter 540720: loss 6.3030, time 121.90ms
iter 540730: loss 6.6339, time 124.47ms
iter 540740: loss 6.0394, time 121.62ms
step 540750: train loss 5.5718, val loss 5.5376
saving checkpoint to out-shakespeare-char
iter 540750: loss 5.6321, time 2902.36ms
iter 540760: loss 6.4836, time 121.77ms
iter 540770: loss 5.9657, time 122.93ms
iter 540780: loss 6.5597, time 120.93ms
iter 540790: loss 5.8234, time 121.52ms
iter 540800: loss 6.6602, time 122.68ms
iter 540810: loss 5.0601, time 121.52ms
iter 540820: loss 5.8216, time 120.65ms
iter 540830: loss 5.8234, time 124.08ms
iter 540840: loss 6.5714, time 121.51ms
iter 540850: loss 5.6913, time 121.01ms
iter 540860: loss 6.4246, time 121.22ms
iter 540870: loss 5.8376, time 121.74ms
iter 540880: loss 6.3072, time 121.63ms
iter 540890: loss 5.5933, time 121.77ms
iter 540900: loss 5.8460, time 122.60ms
iter 540910: loss 5.4499, time 120.82ms
iter 540920: loss 5.9154, time 121.57ms
iter 540930: loss 6.1510, time 121.52ms
iter 540940: loss 6.0269, time 121.81ms
iter 540950: loss 5.2413, time 121.69ms
iter 540960: loss 6.2349, time 121.34ms
iter 540970: loss 5.8072, time 122.71ms
iter 540980: loss 5.7800, time 121.26ms
iter 540990: loss 6.4112, time 121.71ms
step 541000: train loss 5.5366, val loss 5.5243
saving checkpoint to out-shakespeare-char
iter 541000: loss 6.4026, time 2903.27ms
iter 541010: loss 6.5691, time 124.93ms
iter 541020: loss 6.2748, time 128.01ms
iter 541030: loss 5.4638, time 125.80ms
iter 541040: loss 6.4499, time 128.43ms
iter 541050: loss 5.7281, time 125.41ms
iter 541060: loss 6.4111, time 128.18ms
iter 541070: loss 5.7808, time 125.82ms
iter 541080: loss 6.3477, time 128.13ms
iter 541090: loss 5.5249, time 125.82ms
iter 541100: loss 5.3570, time 125.98ms
iter 541110: loss 5.8824, time 125.85ms
iter 541120: loss 6.0851, time 125.50ms
iter 541130: loss 6.6346, time 125.47ms
iter 541140: loss 6.7059, time 125.81ms
iter 541150: loss 5.7050, time 125.94ms
iter 541160: loss 5.9704, time 125.72ms
iter 541170: loss 6.0175, time 125.46ms
iter 541180: loss 6.1656, time 126.05ms
iter 541190: loss 6.1801, time 125.88ms
iter 541200: loss 5.8248, time 125.76ms
iter 541210: loss 5.4457, time 125.49ms
iter 541220: loss 6.2519, time 125.79ms
iter 541230: loss 5.7654, time 125.99ms
iter 541240: loss 6.6986, time 125.72ms
step 541250: train loss 5.5279, val loss 5.5294
saving checkpoint to out-shakespeare-char
iter 541250: loss 5.9044, time 2897.94ms
iter 541260: loss 5.9277, time 125.73ms
iter 541270: loss 5.8501, time 125.25ms
iter 541280: loss 6.0862, time 125.20ms
iter 541290: loss 5.7206, time 124.67ms
iter 541300: loss 5.9623, time 125.40ms
iter 541310: loss 6.4282, time 125.25ms
iter 541320: loss 5.5279, time 125.95ms
iter 541330: loss 6.1920, time 125.33ms
iter 541340: loss 6.1954, time 124.92ms
iter 541350: loss 5.5264, time 124.94ms
iter 541360: loss 6.2033, time 125.34ms
iter 541370: loss 5.7715, time 123.40ms
iter 541380: loss 5.4203, time 125.54ms
iter 541390: loss 6.3217, time 124.74ms
iter 541400: loss 5.9681, time 125.10ms
iter 541410: loss 6.3957, time 124.17ms
iter 541420: loss 5.7737, time 124.98ms
iter 541430: loss 6.1576, time 124.77ms
iter 541440: loss 5.8897, time 125.33ms
iter 541450: loss 5.5605, time 124.76ms
iter 541460: loss 5.8617, time 125.09ms
iter 541470: loss 5.7405, time 124.71ms
iter 541480: loss 5.7046, time 125.07ms
iter 541490: loss 6.2093, time 124.25ms
step 541500: train loss 5.5717, val loss 5.5688
saving checkpoint to out-shakespeare-char
iter 541500: loss 5.9240, time 2877.14ms
iter 541510: loss 5.9682, time 124.80ms
iter 541520: loss 5.7249, time 125.22ms
iter 541530: loss 5.7431, time 123.69ms
iter 541540: loss 5.7108, time 124.23ms
iter 541550: loss 5.3735, time 124.27ms
iter 541560: loss 5.8664, time 124.77ms
iter 541570: loss 5.9718, time 124.65ms
iter 541580: loss 5.9295, time 123.84ms
iter 541590: loss 6.0714, time 124.29ms
iter 541600: loss 4.9498, time 124.94ms
iter 541610: loss 5.7795, time 124.78ms
iter 541620: loss 6.4442, time 123.62ms
iter 541630: loss 5.4074, time 124.21ms
iter 541640: loss 5.3865, time 124.86ms
iter 541650: loss 5.3790, time 125.18ms
iter 541660: loss 5.6974, time 124.19ms
iter 541670: loss 5.8516, time 124.72ms
iter 541680: loss 5.9617, time 123.91ms
iter 541690: loss 6.6730, time 125.42ms
iter 541700: loss 5.8428, time 125.22ms
iter 541710: loss 5.8311, time 123.68ms
iter 541720: loss 5.8160, time 124.04ms
iter 541730: loss 6.8127, time 125.12ms
iter 541740: loss 6.6326, time 125.08ms
step 541750: train loss 5.5428, val loss 5.5926
saving checkpoint to out-shakespeare-char
iter 541750: loss 6.0872, time 2859.41ms
iter 541760: loss 5.9449, time 125.12ms
iter 541770: loss 6.5932, time 123.72ms
iter 541780: loss 6.1762, time 124.44ms
iter 541790: loss 6.3191, time 124.98ms
iter 541800: loss 5.9612, time 124.84ms
iter 541810: loss 5.7457, time 123.69ms
iter 541820: loss 5.9256, time 124.36ms
iter 541830: loss 5.7896, time 125.09ms
iter 541840: loss 5.5537, time 124.16ms
iter 541850: loss 6.4224, time 124.06ms
iter 541860: loss 5.8521, time 124.84ms
iter 541870: loss 5.6952, time 125.23ms
iter 541880: loss 5.8683, time 124.65ms
iter 541890: loss 5.6209, time 124.24ms
iter 541900: loss 6.1807, time 124.77ms
iter 541910: loss 5.5921, time 125.47ms
iter 541920: loss 5.8943, time 126.00ms
iter 541930: loss 5.7949, time 125.63ms
iter 541940: loss 6.3362, time 125.36ms
iter 541950: loss 6.2771, time 125.71ms
iter 541960: loss 6.3684, time 126.92ms
iter 541970: loss 5.9939, time 125.49ms
iter 541980: loss 6.2872, time 125.75ms
iter 541990: loss 6.3338, time 125.67ms
step 542000: train loss 5.5034, val loss 5.5577
saving checkpoint to out-shakespeare-char
iter 542000: loss 5.8767, time 2898.20ms
iter 542010: loss 6.3654, time 119.26ms
iter 542020: loss 5.6262, time 122.84ms
iter 542030: loss 6.3332, time 121.79ms
iter 542040: loss 5.5492, time 121.79ms
iter 542050: loss 6.1256, time 119.10ms
iter 542060: loss 6.3551, time 124.29ms
iter 542070: loss 5.6514, time 124.40ms
iter 542080: loss 6.2778, time 119.81ms
iter 542090: loss 5.6347, time 120.85ms
iter 542100: loss 5.9783, time 121.32ms
iter 542110: loss 6.0334, time 124.52ms
iter 542120: loss 6.0085, time 121.83ms
iter 542130: loss 5.8510, time 121.90ms
iter 542140: loss 6.2991, time 121.93ms
iter 542150: loss 5.9495, time 121.93ms
iter 542160: loss 6.5355, time 121.83ms
iter 542170: loss 5.9733, time 122.21ms
iter 542180: loss 6.1123, time 122.21ms
iter 542190: loss 5.0171, time 121.91ms
iter 542200: loss 6.3318, time 121.21ms
iter 542210: loss 5.5683, time 121.03ms
iter 542220: loss 6.0631, time 119.88ms
iter 542230: loss 6.0006, time 120.04ms
iter 542240: loss 6.3828, time 119.85ms
step 542250: train loss 5.4727, val loss 5.5362
saving checkpoint to out-shakespeare-char
iter 542250: loss 5.9667, time 2902.25ms
iter 542260: loss 6.0738, time 121.14ms
iter 542270: loss 5.5898, time 119.57ms
iter 542280: loss 6.3773, time 119.53ms
iter 542290: loss 6.2635, time 119.52ms
iter 542300: loss 5.5958, time 119.55ms
iter 542310: loss 5.9976, time 119.69ms
iter 542320: loss 7.1873, time 120.74ms
iter 542330: loss 6.1556, time 120.62ms
iter 542340: loss 6.0473, time 124.08ms
iter 542350: loss 5.8389, time 122.17ms
iter 542360: loss 5.8328, time 121.90ms
iter 542370: loss 5.8507, time 121.76ms
iter 542380: loss 5.8703, time 121.36ms
iter 542390: loss 6.0992, time 121.50ms
iter 542400: loss 6.7836, time 121.46ms
iter 542410: loss 6.2688, time 122.65ms
iter 542420: loss 5.7136, time 121.73ms
iter 542430: loss 5.8114, time 121.42ms
iter 542440: loss 6.4477, time 123.30ms
iter 542450: loss 5.9922, time 121.79ms
iter 542460: loss 6.3813, time 121.46ms
iter 542470: loss 5.5874, time 124.47ms
iter 542480: loss 5.7374, time 121.70ms
iter 542490: loss 5.8584, time 121.72ms
step 542500: train loss 5.5350, val loss 5.5517
saving checkpoint to out-shakespeare-char
iter 542500: loss 6.2164, time 2900.47ms
iter 542510: loss 6.5138, time 121.64ms
iter 542520: loss 5.7395, time 121.63ms
iter 542530: loss 5.8832, time 124.19ms
iter 542540: loss 5.7110, time 121.59ms
iter 542550: loss 6.7418, time 122.17ms
iter 542560: loss 5.9271, time 121.67ms
iter 542570: loss 6.1473, time 121.56ms
iter 542580: loss 5.4741, time 122.18ms
iter 542590: loss 7.1086, time 121.95ms
iter 542600: loss 5.4134, time 123.06ms
iter 542610: loss 5.6216, time 121.61ms
iter 542620: loss 6.0751, time 121.51ms
iter 542630: loss 6.7516, time 122.10ms
iter 542640: loss 5.6333, time 121.50ms
iter 542650: loss 5.8081, time 121.37ms
iter 542660: loss 5.3070, time 124.05ms
iter 542670: loss 5.7255, time 121.42ms
iter 542680: loss 6.6078, time 121.50ms
iter 542690: loss 6.2142, time 121.50ms
iter 542700: loss 5.8711, time 121.42ms
iter 542710: loss 5.6494, time 121.54ms
iter 542720: loss 5.5772, time 121.30ms
iter 542730: loss 5.7438, time 122.97ms
iter 542740: loss 5.9845, time 122.00ms
step 542750: train loss 5.5331, val loss 5.5671
saving checkpoint to out-shakespeare-char
iter 542750: loss 5.7679, time 2897.64ms
iter 542760: loss 6.1067, time 122.57ms
iter 542770: loss 5.5671, time 120.95ms
iter 542780: loss 5.9771, time 122.57ms
iter 542790: loss 5.8231, time 123.99ms
iter 542800: loss 6.2492, time 121.52ms
iter 542810: loss 6.3607, time 121.51ms
iter 542820: loss 5.4741, time 124.30ms
iter 542830: loss 5.6135, time 122.51ms
iter 542840: loss 6.1575, time 121.51ms
iter 542850: loss 6.2438, time 121.47ms
iter 542860: loss 5.0999, time 123.27ms
iter 542870: loss 5.3601, time 121.42ms
iter 542880: loss 6.5373, time 122.19ms
iter 542890: loss 5.7635, time 122.77ms
iter 542900: loss 6.1277, time 121.53ms
iter 542910: loss 6.2068, time 121.53ms
iter 542920: loss 6.0172, time 123.22ms
iter 542930: loss 6.2006, time 122.24ms
iter 542940: loss 5.3672, time 121.85ms
iter 542950: loss 5.7408, time 121.51ms
iter 542960: loss 5.8363, time 121.38ms
iter 542970: loss 5.3681, time 121.39ms
iter 542980: loss 6.2886, time 121.62ms
iter 542990: loss 5.7995, time 122.73ms
step 543000: train loss 5.5465, val loss 5.5877
saving checkpoint to out-shakespeare-char
iter 543000: loss 5.9036, time 2882.50ms
iter 543010: loss 5.6490, time 122.50ms
iter 543020: loss 5.9308, time 121.35ms
iter 543030: loss 5.6542, time 122.64ms
iter 543040: loss 6.0353, time 121.61ms
iter 543050: loss 5.9600, time 121.49ms
iter 543060: loss 6.0224, time 122.66ms
iter 543070: loss 4.9255, time 121.65ms
iter 543080: loss 6.2706, time 121.61ms
iter 543090: loss 6.2815, time 124.15ms
iter 543100: loss 6.2127, time 121.52ms
iter 543110: loss 6.8895, time 120.90ms
iter 543120: loss 5.8559, time 122.50ms
iter 543130: loss 6.0334, time 122.81ms
iter 543140: loss 6.0079, time 121.61ms
iter 543150: loss 6.0409, time 121.48ms
iter 543160: loss 6.1876, time 123.89ms
iter 543170: loss 5.9134, time 121.79ms
iter 543180: loss 6.0584, time 121.51ms
iter 543190: loss 6.3469, time 121.93ms
iter 543200: loss 5.9007, time 122.46ms
iter 543210: loss 6.2172, time 121.51ms
iter 543220: loss 6.2272, time 121.42ms
iter 543230: loss 5.7175, time 123.95ms
iter 543240: loss 5.8015, time 122.28ms
step 543250: train loss 5.5795, val loss 5.5450
saving checkpoint to out-shakespeare-char
iter 543250: loss 6.1960, time 2887.44ms
iter 543260: loss 6.2790, time 121.37ms
iter 543270: loss 6.2205, time 122.77ms
iter 543280: loss 6.2541, time 120.87ms
iter 543290: loss 5.8635, time 121.31ms
iter 543300: loss 6.8736, time 120.63ms
iter 543310: loss 6.0318, time 120.08ms
iter 543320: loss 5.6955, time 121.28ms
iter 543330: loss 6.2253, time 121.37ms
iter 543340: loss 5.5645, time 121.47ms
iter 543350: loss 6.1190, time 121.86ms
iter 543360: loss 5.8297, time 122.57ms
iter 543370: loss 6.1098, time 121.59ms
iter 543380: loss 5.9768, time 119.87ms
iter 543390: loss 6.3157, time 121.06ms
iter 543400: loss 6.0513, time 120.45ms
iter 543410: loss 5.8914, time 121.77ms
iter 543420: loss 6.3028, time 121.44ms
iter 543430: loss 5.2824, time 121.54ms
iter 543440: loss 6.5251, time 122.41ms
iter 543450: loss 6.0779, time 121.80ms
iter 543460: loss 5.7117, time 120.73ms
iter 543470: loss 6.2442, time 121.46ms
iter 543480: loss 5.9776, time 122.85ms
iter 543490: loss 6.0077, time 121.51ms
step 543500: train loss 5.5000, val loss 5.5030
saving checkpoint to out-shakespeare-char
iter 543500: loss 5.7978, time 2903.53ms
iter 543510: loss 5.5051, time 122.88ms
iter 543520: loss 5.6881, time 121.98ms
iter 543530: loss 5.5242, time 121.68ms
iter 543540: loss 5.7990, time 124.51ms
iter 543550: loss 5.7898, time 121.54ms
iter 543560: loss 5.9101, time 121.79ms
iter 543570: loss 6.5861, time 121.80ms
iter 543580: loss 5.8351, time 122.05ms
iter 543590: loss 5.5164, time 122.27ms
iter 543600: loss 5.5142, time 121.77ms
iter 543610: loss 6.4022, time 123.12ms
iter 543620: loss 6.0070, time 121.66ms
iter 543630: loss 6.0803, time 121.81ms
iter 543640: loss 4.9063, time 123.03ms
iter 543650: loss 5.7238, time 122.20ms
iter 543660: loss 6.3120, time 121.05ms
iter 543670: loss 6.0971, time 124.48ms
iter 543680: loss 6.1954, time 121.60ms
iter 543690: loss 5.5282, time 121.86ms
iter 543700: loss 6.1442, time 121.84ms
iter 543710: loss 5.4523, time 121.86ms
iter 543720: loss 6.2383, time 121.80ms
iter 543730: loss 6.5729, time 121.88ms
iter 543740: loss 6.3121, time 123.01ms
step 543750: train loss 5.5482, val loss 5.5557
saving checkpoint to out-shakespeare-char
iter 543750: loss 6.2820, time 2900.85ms
iter 543760: loss 6.0647, time 121.66ms
iter 543770: loss 5.9320, time 121.20ms
iter 543780: loss 5.5123, time 121.27ms
iter 543790: loss 5.5348, time 121.62ms
iter 543800: loss 5.5168, time 121.48ms
iter 543810: loss 6.1639, time 121.56ms
iter 543820: loss 6.0784, time 121.39ms
iter 543830: loss 6.4206, time 121.45ms
iter 543840: loss 5.5984, time 121.09ms
iter 543850: loss 6.7209, time 122.12ms
iter 543860: loss 5.8737, time 121.27ms
iter 543870: loss 5.8749, time 121.16ms
iter 543880: loss 6.5557, time 123.76ms
iter 543890: loss 5.3866, time 121.28ms
iter 543900: loss 5.8472, time 121.22ms
iter 543910: loss 6.1522, time 120.62ms
iter 543920: loss 5.4775, time 122.53ms
iter 543930: loss 5.9133, time 121.23ms
iter 543940: loss 6.0978, time 121.03ms
iter 543950: loss 5.8949, time 123.92ms
iter 543960: loss 5.2251, time 121.15ms
iter 543970: loss 5.6002, time 122.34ms
iter 543980: loss 5.8371, time 121.13ms
iter 543990: loss 6.1143, time 121.74ms
step 544000: train loss 5.5512, val loss 5.5894
saving checkpoint to out-shakespeare-char
iter 544000: loss 5.3487, time 2902.09ms
iter 544010: loss 6.4772, time 121.44ms
iter 544020: loss 5.9181, time 121.24ms
iter 544030: loss 6.0002, time 124.20ms
iter 544040: loss 6.5592, time 121.58ms
iter 544050: loss 6.3224, time 120.61ms
iter 544060: loss 5.5913, time 121.26ms
iter 544070: loss 5.6108, time 122.02ms
iter 544080: loss 5.4383, time 122.31ms
iter 544090: loss 6.4200, time 121.56ms
iter 544100: loss 6.2780, time 122.63ms
iter 544110: loss 6.0712, time 121.45ms
iter 544120: loss 5.7930, time 121.62ms
iter 544130: loss 5.3153, time 122.77ms
iter 544140: loss 5.7243, time 121.30ms
iter 544150: loss 5.7564, time 121.21ms
iter 544160: loss 5.8461, time 121.38ms
iter 544170: loss 6.3366, time 121.40ms
iter 544180: loss 5.1406, time 121.53ms
iter 544190: loss 6.1350, time 121.48ms
iter 544200: loss 5.2347, time 122.66ms
iter 544210: loss 6.0592, time 121.40ms
iter 544220: loss 5.9752, time 121.31ms
iter 544230: loss 5.6994, time 122.65ms
iter 544240: loss 5.7627, time 121.47ms
step 544250: train loss 5.5796, val loss 5.5804
saving checkpoint to out-shakespeare-char
iter 544250: loss 6.4775, time 2913.55ms
iter 544260: loss 6.2786, time 124.73ms
iter 544270: loss 5.5918, time 121.89ms
iter 544280: loss 5.6993, time 121.68ms
iter 544290: loss 6.0562, time 121.95ms
iter 544300: loss 5.6497, time 121.61ms
iter 544310: loss 6.8101, time 121.72ms
iter 544320: loss 5.5987, time 121.68ms
iter 544330: loss 5.9582, time 123.19ms
iter 544340: loss 6.5883, time 121.66ms
iter 544350: loss 5.4879, time 122.05ms
iter 544360: loss 5.5210, time 122.87ms
iter 544370: loss 5.8942, time 121.87ms
iter 544380: loss 6.8085, time 121.94ms
iter 544390: loss 6.2543, time 124.82ms
iter 544400: loss 6.4677, time 122.17ms
iter 544410: loss 6.2564, time 122.89ms
iter 544420: loss 6.5596, time 121.90ms
iter 544430: loss 5.6447, time 121.73ms
iter 544440: loss 5.7939, time 121.74ms
iter 544450: loss 5.5161, time 121.71ms
iter 544460: loss 5.1511, time 123.24ms
iter 544470: loss 5.9671, time 122.69ms
iter 544480: loss 6.0767, time 121.68ms
iter 544490: loss 6.5703, time 123.17ms
step 544500: train loss 5.5850, val loss 5.5326
saving checkpoint to out-shakespeare-char
iter 544500: loss 6.2547, time 2905.13ms
iter 544510: loss 5.7597, time 121.73ms
iter 544520: loss 6.6008, time 120.85ms
iter 544530: loss 6.6455, time 121.77ms
iter 544540: loss 5.7995, time 121.77ms
iter 544550: loss 5.4268, time 122.02ms
iter 544560: loss 6.3617, time 123.19ms
iter 544570: loss 5.8311, time 121.72ms
iter 544580: loss 5.6379, time 121.97ms
iter 544590: loss 5.6984, time 123.67ms
iter 544600: loss 5.8752, time 121.79ms
iter 544610: loss 5.4919, time 122.36ms
iter 544620: loss 6.1862, time 124.27ms
iter 544630: loss 6.2955, time 121.35ms
iter 544640: loss 6.5833, time 121.69ms
iter 544650: loss 5.3640, time 121.86ms
iter 544660: loss 6.2299, time 121.66ms
iter 544670: loss 5.6842, time 121.86ms
iter 544680: loss 5.2286, time 121.96ms
iter 544690: loss 6.4432, time 122.95ms
iter 544700: loss 4.9924, time 121.79ms
iter 544710: loss 5.5903, time 120.79ms
iter 544720: loss 6.3445, time 123.18ms
iter 544730: loss 6.0014, time 121.84ms
iter 544740: loss 6.0013, time 121.76ms
step 544750: train loss 5.5381, val loss 5.5447
saving checkpoint to out-shakespeare-char
iter 544750: loss 5.8915, time 2905.70ms
iter 544760: loss 5.6597, time 122.71ms
iter 544770: loss 5.7635, time 121.36ms
iter 544780: loss 5.4462, time 121.74ms
iter 544790: loss 5.6580, time 122.81ms
iter 544800: loss 5.5814, time 120.57ms
iter 544810: loss 5.8572, time 121.38ms
iter 544820: loss 5.1509, time 123.81ms
iter 544830: loss 5.8529, time 121.43ms
iter 544840: loss 5.2813, time 121.50ms
iter 544850: loss 6.0916, time 120.86ms
iter 544860: loss 6.3059, time 121.55ms
iter 544870: loss 6.0576, time 121.64ms
iter 544880: loss 6.1890, time 121.27ms
iter 544890: loss 6.6498, time 124.08ms
iter 544900: loss 5.3338, time 121.30ms
iter 544910: loss 6.0288, time 121.51ms
iter 544920: loss 6.8328, time 121.34ms
iter 544930: loss 5.7632, time 123.03ms
iter 544940: loss 5.8268, time 121.37ms
iter 544950: loss 6.4336, time 121.48ms
iter 544960: loss 5.7802, time 123.93ms
iter 544970: loss 5.9631, time 121.26ms
iter 544980: loss 5.2524, time 121.29ms
iter 544990: loss 6.7018, time 121.68ms
step 545000: train loss 5.4894, val loss 5.4853
saving checkpoint to out-shakespeare-char
iter 545000: loss 6.2235, time 2927.40ms
iter 545010: loss 4.9546, time 124.78ms
iter 545020: loss 5.6481, time 125.33ms
iter 545030: loss 6.5440, time 124.79ms
iter 545040: loss 5.2876, time 124.94ms
iter 545050: loss 6.2604, time 125.12ms
iter 545060: loss 6.1943, time 124.90ms
iter 545070: loss 5.2306, time 124.87ms
iter 545080: loss 5.5456, time 125.20ms
iter 545090: loss 5.5855, time 124.98ms
iter 545100: loss 5.9481, time 125.24ms
iter 545110: loss 5.6775, time 124.90ms
iter 545120: loss 5.6329, time 125.14ms
iter 545130: loss 5.4771, time 124.97ms
iter 545140: loss 6.5913, time 125.10ms
iter 545150: loss 5.4965, time 124.97ms
iter 545160: loss 6.4818, time 124.99ms
iter 545170: loss 5.9422, time 124.04ms
iter 545180: loss 6.1643, time 125.07ms
iter 545190: loss 6.7417, time 125.19ms
iter 545200: loss 6.5520, time 125.28ms
iter 545210: loss 6.2815, time 125.08ms
iter 545220: loss 6.2629, time 124.96ms
iter 545230: loss 5.9473, time 121.24ms
iter 545240: loss 6.0217, time 121.36ms
step 545250: train loss 5.5287, val loss 5.5020
saving checkpoint to out-shakespeare-char
iter 545250: loss 5.9968, time 2906.26ms
iter 545260: loss 5.5788, time 121.62ms
iter 545270: loss 6.4342, time 121.57ms
iter 545280: loss 5.7652, time 123.94ms
iter 545290: loss 6.2272, time 121.06ms
iter 545300: loss 5.5294, time 121.53ms
iter 545310: loss 6.1257, time 120.60ms
iter 545320: loss 5.9373, time 121.43ms
iter 545330: loss 5.8928, time 121.62ms
iter 545340: loss 5.4046, time 121.40ms
iter 545350: loss 5.8065, time 122.53ms
iter 545360: loss 6.1296, time 121.50ms
iter 545370: loss 6.6877, time 120.51ms
iter 545380: loss 6.1987, time 124.13ms
iter 545390: loss 6.4726, time 121.45ms
iter 545400: loss 6.6930, time 121.46ms
iter 545410: loss 6.2021, time 121.43ms
iter 545420: loss 5.9440, time 122.68ms
iter 545430: loss 5.7154, time 121.54ms
iter 545440: loss 5.9792, time 121.40ms
iter 545450: loss 5.9398, time 122.47ms
iter 545460: loss 6.3586, time 121.47ms
iter 545470: loss 6.7557, time 121.33ms
iter 545480: loss 6.2969, time 121.50ms
iter 545490: loss 5.4158, time 121.44ms
step 545500: train loss 5.5239, val loss 5.5490
saving checkpoint to out-shakespeare-char
iter 545500: loss 6.2938, time 2902.40ms
iter 545510: loss 5.9283, time 121.84ms
iter 545520: loss 5.6403, time 122.70ms
iter 545530: loss 6.5647, time 121.14ms
iter 545540: loss 6.3370, time 121.19ms
iter 545550: loss 5.8425, time 124.11ms
iter 545560: loss 6.4626, time 121.40ms
iter 545570: loss 5.9342, time 121.60ms
iter 545580: loss 5.9178, time 120.72ms
iter 545590: loss 5.5848, time 122.91ms
iter 545600: loss 6.0941, time 121.65ms
iter 545610: loss 6.1669, time 121.53ms
iter 545620: loss 5.9676, time 124.47ms
iter 545630: loss 6.2967, time 121.65ms
iter 545640: loss 5.8586, time 121.67ms
iter 545650: loss 6.1920, time 121.57ms
iter 545660: loss 5.5885, time 122.77ms
iter 545670: loss 4.9294, time 121.74ms
iter 545680: loss 5.9096, time 121.60ms
iter 545690: loss 6.7446, time 124.12ms
iter 545700: loss 6.3286, time 121.38ms
iter 545710: loss 6.0356, time 121.60ms
iter 545720: loss 6.6369, time 121.45ms
iter 545730: loss 5.8476, time 121.52ms
iter 545740: loss 5.8725, time 121.33ms
step 545750: train loss 5.5434, val loss 5.5596
saving checkpoint to out-shakespeare-char
iter 545750: loss 6.2889, time 2912.73ms
iter 545760: loss 6.1070, time 121.40ms
iter 545770: loss 5.6055, time 121.27ms
iter 545780: loss 5.7072, time 121.61ms
iter 545790: loss 6.2771, time 122.94ms
iter 545800: loss 6.6589, time 121.44ms
iter 545810: loss 5.9159, time 121.70ms
iter 545820: loss 5.9616, time 123.82ms
iter 545830: loss 6.3515, time 121.44ms
iter 545840: loss 5.7689, time 121.47ms
iter 545850: loss 6.0706, time 121.81ms
iter 545860: loss 5.9031, time 122.83ms
iter 545870: loss 6.4544, time 121.33ms
iter 545880: loss 6.2384, time 121.26ms
iter 545890: loss 5.6225, time 121.48ms
iter 545900: loss 6.1176, time 121.28ms
iter 545910: loss 5.9898, time 121.47ms
iter 545920: loss 5.9798, time 121.48ms
iter 545930: loss 5.7787, time 122.63ms
iter 545940: loss 5.5190, time 121.38ms
iter 545950: loss 5.3629, time 120.70ms
iter 545960: loss 6.7431, time 122.51ms
iter 545970: loss 6.0706, time 121.33ms
iter 545980: loss 5.7382, time 121.42ms
iter 545990: loss 6.0744, time 121.57ms
step 546000: train loss 5.5863, val loss 5.5143
saving checkpoint to out-shakespeare-char
iter 546000: loss 6.3769, time 2892.21ms
iter 546010: loss 6.1702, time 121.80ms
iter 546020: loss 5.9932, time 121.70ms
iter 546030: loss 6.0795, time 123.89ms
iter 546040: loss 6.8407, time 121.84ms
iter 546050: loss 5.5279, time 121.57ms
iter 546060: loss 5.4139, time 121.49ms
iter 546070: loss 6.4741, time 121.45ms
iter 546080: loss 5.9053, time 121.33ms
iter 546090: loss 5.3619, time 121.37ms
iter 546100: loss 5.7084, time 122.44ms
iter 546110: loss 5.8630, time 122.10ms
iter 546120: loss 5.4329, time 121.39ms
iter 546130: loss 6.5672, time 124.09ms
iter 546140: loss 6.3306, time 121.46ms
iter 546150: loss 5.3737, time 121.46ms
iter 546160: loss 6.5653, time 121.12ms
iter 546170: loss 6.1300, time 121.28ms
iter 546180: loss 6.7728, time 121.45ms
iter 546190: loss 6.4111, time 121.87ms
iter 546200: loss 6.1353, time 122.42ms
iter 546210: loss 6.1338, time 121.18ms
iter 546220: loss 6.0951, time 121.39ms
iter 546230: loss 6.6521, time 122.34ms
iter 546240: loss 5.0342, time 121.36ms
step 546250: train loss 5.5669, val loss 5.4921
saving checkpoint to out-shakespeare-char
iter 546250: loss 6.6958, time 2906.80ms
iter 546260: loss 5.8866, time 121.76ms
iter 546270: loss 6.2243, time 124.03ms
iter 546280: loss 5.9778, time 121.47ms
iter 546290: loss 5.4885, time 121.52ms
iter 546300: loss 6.0247, time 122.20ms
iter 546310: loss 5.8812, time 121.43ms
iter 546320: loss 6.4570, time 121.75ms
iter 546330: loss 6.2941, time 121.39ms
iter 546340: loss 6.2321, time 122.58ms
iter 546350: loss 5.8550, time 121.72ms
iter 546360: loss 5.9174, time 121.84ms
iter 546370: loss 5.6460, time 122.81ms
iter 546380: loss 6.3296, time 121.89ms
iter 546390: loss 6.4665, time 122.09ms
iter 546400: loss 6.1371, time 123.00ms
iter 546410: loss 5.5716, time 121.57ms
iter 546420: loss 6.7350, time 121.75ms
iter 546430: loss 5.4607, time 121.65ms
iter 546440: loss 5.5516, time 121.40ms
iter 546450: loss 5.5434, time 121.82ms
iter 546460: loss 5.9046, time 121.38ms
iter 546470: loss 5.8356, time 122.57ms
iter 546480: loss 6.1034, time 121.76ms
iter 546490: loss 6.5256, time 121.60ms
step 546500: train loss 5.5135, val loss 5.5559
saving checkpoint to out-shakespeare-char
iter 546500: loss 6.2587, time 2890.75ms
iter 546510: loss 5.8740, time 122.01ms
iter 546520: loss 6.0697, time 122.73ms
iter 546530: loss 6.4496, time 121.60ms
iter 546540: loss 6.1056, time 120.41ms
iter 546550: loss 5.7146, time 121.84ms
iter 546560: loss 5.5085, time 122.40ms
iter 546570: loss 5.7503, time 121.66ms
iter 546580: loss 6.0038, time 122.99ms
iter 546590: loss 6.5817, time 121.72ms
iter 546600: loss 5.7861, time 121.61ms
iter 546610: loss 5.8058, time 121.56ms
iter 546620: loss 5.5582, time 121.60ms
iter 546630: loss 6.1505, time 122.62ms
iter 546640: loss 5.8833, time 120.66ms
iter 546650: loss 6.1835, time 123.27ms
iter 546660: loss 6.1155, time 122.01ms
iter 546670: loss 6.4951, time 121.48ms
iter 546680: loss 5.8077, time 124.01ms
iter 546690: loss 5.1899, time 121.83ms
iter 546700: loss 5.8855, time 121.63ms
iter 546710: loss 6.2931, time 121.55ms
iter 546720: loss 5.6560, time 120.88ms
iter 546730: loss 6.0105, time 122.66ms
iter 546740: loss 6.3368, time 121.75ms
step 546750: train loss 5.5505, val loss 5.5363
saving checkpoint to out-shakespeare-char
iter 546750: loss 6.1206, time 2910.70ms
iter 546760: loss 5.8704, time 121.88ms
iter 546770: loss 6.2329, time 121.95ms
iter 546780: loss 6.3672, time 121.84ms
iter 546790: loss 5.6935, time 121.41ms
iter 546800: loss 5.8219, time 121.48ms
iter 546810: loss 6.3692, time 121.52ms
iter 546820: loss 6.0906, time 122.63ms
iter 546830: loss 6.0373, time 121.33ms
iter 546840: loss 6.5451, time 121.47ms
iter 546850: loss 5.5307, time 123.07ms
iter 546860: loss 5.5936, time 121.53ms
iter 546870: loss 6.1594, time 121.89ms
iter 546880: loss 5.8187, time 124.22ms
iter 546890: loss 6.5698, time 121.73ms
iter 546900: loss 5.7419, time 121.36ms
iter 546910: loss 5.8605, time 121.30ms
iter 546920: loss 6.1910, time 121.61ms
iter 546930: loss 5.7544, time 121.90ms
iter 546940: loss 5.9536, time 121.82ms
iter 546950: loss 6.2802, time 123.25ms
iter 546960: loss 5.3732, time 121.52ms
iter 546970: loss 5.6021, time 121.61ms
iter 546980: loss 6.3378, time 123.26ms
iter 546990: loss 5.6056, time 121.54ms
step 547000: train loss 5.5449, val loss 5.5408
saving checkpoint to out-shakespeare-char
iter 547000: loss 5.9480, time 2895.68ms
iter 547010: loss 6.5083, time 122.20ms
iter 547020: loss 5.6502, time 122.53ms
iter 547030: loss 5.9701, time 121.48ms
iter 547040: loss 6.2457, time 121.35ms
iter 547050: loss 6.2583, time 123.94ms
iter 547060: loss 6.7328, time 121.36ms
iter 547070: loss 6.1316, time 121.35ms
iter 547080: loss 6.8286, time 121.35ms
iter 547090: loss 6.1521, time 122.60ms
iter 547100: loss 6.3624, time 121.42ms
iter 547110: loss 5.5136, time 121.68ms
iter 547120: loss 5.9348, time 123.18ms
iter 547130: loss 5.6481, time 122.14ms
iter 547140: loss 5.8078, time 121.34ms
iter 547150: loss 5.7608, time 121.46ms
iter 547160: loss 5.8756, time 122.47ms
iter 547170: loss 5.8188, time 121.18ms
iter 547180: loss 6.4200, time 122.39ms
iter 547190: loss 5.6215, time 123.57ms
iter 547200: loss 6.4087, time 121.55ms
iter 547210: loss 6.1986, time 121.60ms
iter 547220: loss 5.8311, time 121.52ms
iter 547230: loss 6.3658, time 121.60ms
iter 547240: loss 6.2713, time 121.61ms
step 547250: train loss 5.5668, val loss 5.4927
saving checkpoint to out-shakespeare-char
iter 547250: loss 6.7764, time 2916.94ms
iter 547260: loss 6.2676, time 120.55ms
iter 547270: loss 6.3115, time 121.70ms
iter 547280: loss 5.7681, time 121.77ms
iter 547290: loss 6.5842, time 122.74ms
iter 547300: loss 6.1682, time 121.44ms
iter 547310: loss 6.0491, time 121.76ms
iter 547320: loss 6.2395, time 122.65ms
iter 547330: loss 6.8279, time 121.79ms
iter 547340: loss 5.6864, time 121.63ms
iter 547350: loss 5.8326, time 124.17ms
iter 547360: loss 6.1141, time 121.45ms
iter 547370: loss 6.2043, time 121.48ms
iter 547380: loss 6.2116, time 121.37ms
iter 547390: loss 5.8168, time 121.60ms
iter 547400: loss 6.2336, time 121.46ms
iter 547410: loss 5.9876, time 121.51ms
iter 547420: loss 6.1487, time 122.52ms
iter 547430: loss 6.5126, time 121.36ms
iter 547440: loss 6.2484, time 121.42ms
iter 547450: loss 6.7774, time 122.61ms
iter 547460: loss 6.5833, time 122.16ms
iter 547470: loss 5.7758, time 121.45ms
iter 547480: loss 6.5924, time 121.47ms
iter 547490: loss 6.6888, time 122.65ms
step 547500: train loss 5.5759, val loss 5.5361
saving checkpoint to out-shakespeare-char
iter 547500: loss 5.9987, time 2900.46ms
iter 547510: loss 5.6909, time 122.49ms
iter 547520: loss 5.8172, time 121.43ms
iter 547530: loss 5.8996, time 121.37ms
iter 547540: loss 5.3269, time 123.95ms
iter 547550: loss 6.3667, time 120.93ms
iter 547560: loss 6.2788, time 120.64ms
iter 547570: loss 5.6075, time 121.45ms
iter 547580: loss 5.8983, time 122.81ms
iter 547590: loss 5.8649, time 121.25ms
iter 547600: loss 5.3737, time 121.61ms
iter 547610: loss 6.0384, time 124.05ms
iter 547620: loss 6.2012, time 121.65ms
iter 547630: loss 6.0051, time 122.43ms
iter 547640: loss 5.4014, time 121.49ms
iter 547650: loss 5.6781, time 121.37ms
iter 547660: loss 6.2882, time 121.38ms
iter 547670: loss 6.1616, time 121.45ms
iter 547680: loss 6.0111, time 124.12ms
iter 547690: loss 5.6293, time 121.58ms
iter 547700: loss 6.0637, time 121.75ms
iter 547710: loss 5.8550, time 122.78ms
iter 547720: loss 6.1225, time 121.40ms
iter 547730: loss 5.9522, time 121.35ms
iter 547740: loss 6.4755, time 124.35ms
step 547750: train loss 5.5450, val loss 5.5312
saving checkpoint to out-shakespeare-char
iter 547750: loss 5.1732, time 2905.66ms
iter 547760: loss 5.9214, time 121.55ms
iter 547770: loss 5.8026, time 121.33ms
iter 547780: loss 6.0499, time 122.40ms
iter 547790: loss 6.1393, time 121.41ms
iter 547800: loss 5.4331, time 121.48ms
iter 547810: loss 6.0748, time 121.42ms
iter 547820: loss 6.7130, time 121.38ms
iter 547830: loss 5.4990, time 121.52ms
iter 547840: loss 6.0583, time 121.54ms
iter 547850: loss 6.3361, time 122.79ms
iter 547860: loss 5.8332, time 121.38ms
iter 547870: loss 6.4194, time 121.35ms
iter 547880: loss 6.1166, time 124.01ms
iter 547890: loss 6.3223, time 121.15ms
iter 547900: loss 5.9073, time 121.38ms
iter 547910: loss 5.4196, time 121.20ms
iter 547920: loss 5.8814, time 122.49ms
iter 547930: loss 5.8281, time 121.50ms
iter 547940: loss 5.6512, time 121.37ms
iter 547950: loss 6.6911, time 122.35ms
iter 547960: loss 5.5636, time 121.25ms
iter 547970: loss 6.3963, time 121.56ms
iter 547980: loss 6.0444, time 121.71ms
iter 547990: loss 5.9248, time 121.19ms
step 548000: train loss 5.5785, val loss 5.5530
saving checkpoint to out-shakespeare-char
iter 548000: loss 6.0886, time 2893.07ms
iter 548010: loss 6.0339, time 124.97ms
iter 548020: loss 6.2572, time 125.01ms
iter 548030: loss 5.3618, time 124.96ms
iter 548040: loss 6.5911, time 126.22ms
iter 548050: loss 6.2742, time 124.94ms
iter 548060: loss 6.2558, time 126.18ms
iter 548070: loss 6.0898, time 125.01ms
iter 548080: loss 5.2097, time 124.48ms
iter 548090: loss 6.9971, time 124.91ms
iter 548100: loss 5.6625, time 125.05ms
iter 548110: loss 5.4504, time 124.73ms
iter 548120: loss 5.9904, time 126.11ms
iter 548130: loss 7.1844, time 125.32ms
iter 548140: loss 5.9197, time 126.09ms
iter 548150: loss 5.6365, time 125.30ms
iter 548160: loss 5.4628, time 125.58ms
iter 548170: loss 5.4239, time 125.78ms
iter 548180: loss 6.0763, time 125.90ms
iter 548190: loss 6.6386, time 125.73ms
iter 548200: loss 5.6242, time 125.74ms
iter 548210: loss 6.4217, time 124.84ms
iter 548220: loss 6.1449, time 125.80ms
iter 548230: loss 5.4960, time 125.47ms
iter 548240: loss 6.1334, time 126.03ms
step 548250: train loss 5.5447, val loss 5.6189
saving checkpoint to out-shakespeare-char
iter 548250: loss 5.8935, time 2894.93ms
iter 548260: loss 6.2346, time 124.43ms
iter 548270: loss 6.8014, time 125.01ms
iter 548280: loss 6.2477, time 124.36ms
iter 548290: loss 5.8234, time 124.08ms
iter 548300: loss 6.3774, time 124.08ms
iter 548310: loss 6.0425, time 125.06ms
iter 548320: loss 6.5457, time 124.25ms
iter 548330: loss 6.3644, time 124.56ms
iter 548340: loss 5.5768, time 124.98ms
iter 548350: loss 6.4742, time 124.11ms
iter 548360: loss 5.8083, time 124.99ms
iter 548370: loss 6.5206, time 123.88ms
iter 548380: loss 6.0922, time 125.01ms
iter 548390: loss 6.0158, time 123.80ms
iter 548400: loss 6.4918, time 125.11ms
iter 548410: loss 6.0422, time 124.09ms
iter 548420: loss 5.5234, time 124.29ms
iter 548430: loss 6.2158, time 124.72ms
iter 548440: loss 5.7597, time 124.42ms
iter 548450: loss 5.6985, time 124.89ms
iter 548460: loss 5.3995, time 124.20ms
iter 548470: loss 6.3227, time 124.90ms
iter 548480: loss 5.8919, time 124.00ms
iter 548490: loss 5.7080, time 124.88ms
step 548500: train loss 5.5260, val loss 5.5471
saving checkpoint to out-shakespeare-char
iter 548500: loss 6.0155, time 2894.00ms
iter 548510: loss 6.5470, time 125.01ms
iter 548520: loss 5.9823, time 124.98ms
iter 548530: loss 6.0101, time 123.98ms
iter 548540: loss 6.2179, time 124.88ms
iter 548550: loss 5.5830, time 124.88ms
iter 548560: loss 6.0067, time 124.77ms
iter 548570: loss 5.9490, time 124.91ms
iter 548580: loss 5.6182, time 124.45ms
iter 548590: loss 6.1972, time 124.77ms
iter 548600: loss 6.2083, time 123.26ms
iter 548610: loss 6.1407, time 124.63ms
iter 548620: loss 5.9039, time 126.59ms
iter 548630: loss 6.2144, time 124.74ms
iter 548640: loss 6.5251, time 125.98ms
iter 548650: loss 5.4265, time 124.14ms
iter 548660: loss 5.6795, time 124.81ms
iter 548670: loss 6.7719, time 124.01ms
iter 548680: loss 6.1424, time 124.87ms
iter 548690: loss 5.9250, time 123.98ms
iter 548700: loss 6.4252, time 125.10ms
iter 548710: loss 6.7115, time 124.73ms
iter 548720: loss 5.9847, time 124.08ms
iter 548730: loss 5.8481, time 125.09ms
iter 548740: loss 6.0093, time 124.03ms
step 548750: train loss 5.5355, val loss 5.5566
saving checkpoint to out-shakespeare-char
iter 548750: loss 6.1696, time 2904.99ms
iter 548760: loss 5.9055, time 125.69ms
iter 548770: loss 5.6577, time 124.76ms
iter 548780: loss 5.9576, time 125.70ms
iter 548790: loss 5.5925, time 125.69ms
iter 548800: loss 5.7987, time 126.16ms
iter 548810: loss 5.7044, time 125.24ms
iter 548820: loss 5.5454, time 125.84ms
iter 548830: loss 6.2769, time 125.43ms
iter 548840: loss 6.0365, time 125.69ms
iter 548850: loss 5.4962, time 124.38ms
iter 548860: loss 5.3642, time 125.70ms
iter 548870: loss 5.9691, time 125.58ms
iter 548880: loss 5.9752, time 125.71ms
iter 548890: loss 6.4076, time 125.57ms
iter 548900: loss 6.2046, time 125.69ms
iter 548910: loss 5.9271, time 125.77ms
iter 548920: loss 6.4007, time 125.67ms
iter 548930: loss 5.6481, time 125.59ms
iter 548940: loss 5.9936, time 125.78ms
iter 548950: loss 6.1506, time 125.70ms
iter 548960: loss 6.6779, time 125.62ms
iter 548970: loss 5.7116, time 125.51ms
iter 548980: loss 5.9351, time 125.83ms
iter 548990: loss 6.1141, time 125.62ms
step 549000: train loss 5.5332, val loss 5.4984
saving checkpoint to out-shakespeare-char
iter 549000: loss 5.1805, time 2888.23ms
iter 549010: loss 6.0545, time 121.36ms
iter 549020: loss 5.2020, time 121.53ms
iter 549030: loss 6.2881, time 121.48ms
iter 549040: loss 5.9915, time 122.54ms
iter 549050: loss 6.1332, time 120.60ms
iter 549060: loss 6.5481, time 121.51ms
iter 549070: loss 5.6008, time 122.60ms
iter 549080: loss 5.8196, time 121.40ms
iter 549090: loss 6.3838, time 121.53ms
iter 549100: loss 6.7031, time 121.39ms
iter 549110: loss 6.1819, time 121.84ms
iter 549120: loss 5.3765, time 121.24ms
iter 549130: loss 5.9563, time 121.51ms
iter 549140: loss 5.9682, time 122.55ms
iter 549150: loss 5.9756, time 121.42ms
iter 549160: loss 6.4088, time 121.36ms
iter 549170: loss 6.1861, time 122.70ms
iter 549180: loss 5.6110, time 121.43ms
iter 549190: loss 5.7025, time 121.33ms
iter 549200: loss 6.7032, time 121.31ms
iter 549210: loss 6.8257, time 121.05ms
iter 549220: loss 6.2186, time 121.17ms
iter 549230: loss 6.1701, time 121.32ms
iter 549240: loss 6.0124, time 122.47ms
step 549250: train loss 5.5009, val loss 5.5293
saving checkpoint to out-shakespeare-char
iter 549250: loss 6.4640, time 2898.58ms
iter 549260: loss 5.8522, time 121.40ms
iter 549270: loss 6.1708, time 120.93ms
iter 549280: loss 6.3440, time 122.49ms
iter 549290: loss 5.6428, time 121.31ms
iter 549300: loss 6.3457, time 123.80ms
iter 549310: loss 5.8556, time 121.33ms
iter 549320: loss 6.1424, time 122.37ms
iter 549330: loss 5.7689, time 121.55ms
iter 549340: loss 5.0125, time 122.60ms
iter 549350: loss 5.8904, time 121.54ms
iter 549360: loss 6.5525, time 121.49ms
iter 549370: loss 6.0008, time 123.98ms
iter 549380: loss 5.2979, time 121.43ms
iter 549390: loss 5.9996, time 121.39ms
iter 549400: loss 6.4097, time 121.40ms
iter 549410: loss 6.0926, time 122.83ms
iter 549420: loss 6.5061, time 121.30ms
iter 549430: loss 6.3901, time 121.46ms
iter 549440: loss 5.4490, time 121.34ms
iter 549450: loss 5.2678, time 121.39ms
iter 549460: loss 6.1167, time 122.14ms
iter 549470: loss 5.6640, time 121.77ms
iter 549480: loss 6.8755, time 123.30ms
iter 549490: loss 5.5756, time 121.68ms
step 549500: train loss 5.5486, val loss 5.5688
saving checkpoint to out-shakespeare-char
iter 549500: loss 5.8726, time 2895.76ms
iter 549510: loss 5.6850, time 125.71ms
iter 549520: loss 4.8968, time 125.36ms
iter 549530: loss 5.3187, time 125.55ms
iter 549540: loss 5.6676, time 125.41ms
iter 549550: loss 6.7709, time 122.58ms
iter 549560: loss 5.3811, time 121.51ms
iter 549570: loss 6.6900, time 121.46ms
iter 549580: loss 5.7995, time 122.76ms
iter 549590: loss 5.7432, time 121.61ms
iter 549600: loss 6.1107, time 121.60ms
iter 549610: loss 6.2281, time 123.92ms
iter 549620: loss 5.8986, time 121.43ms
iter 549630: loss 5.6213, time 121.39ms
iter 549640: loss 5.8173, time 121.46ms
iter 549650: loss 5.6815, time 121.42ms
iter 549660: loss 6.3805, time 121.48ms
iter 549670: loss 6.2859, time 121.51ms
iter 549680: loss 6.0100, time 122.62ms
iter 549690: loss 5.5618, time 121.34ms
iter 549700: loss 6.3361, time 121.55ms
iter 549710: loss 6.4206, time 122.53ms
iter 549720: loss 5.7500, time 121.64ms
iter 549730: loss 5.8693, time 121.54ms
iter 549740: loss 5.9479, time 123.93ms
step 549750: train loss 5.5060, val loss 5.5459
saving checkpoint to out-shakespeare-char
iter 549750: loss 5.3625, time 2891.97ms
iter 549760: loss 5.7557, time 119.47ms
iter 549770: loss 5.6429, time 120.52ms
iter 549780: loss 5.7118, time 120.52ms
iter 549790: loss 5.9352, time 121.38ms
iter 549800: loss 6.1837, time 120.67ms
iter 549810: loss 6.3045, time 119.51ms
iter 549820: loss 6.6722, time 122.01ms
iter 549830: loss 5.5494, time 120.52ms
iter 549840: loss 5.3372, time 119.35ms
iter 549850: loss 5.8672, time 119.13ms
iter 549860: loss 5.9763, time 119.39ms
iter 549870: loss 5.1408, time 119.56ms
iter 549880: loss 5.8379, time 119.40ms
iter 549890: loss 5.6493, time 120.61ms
iter 549900: loss 5.7208, time 120.48ms
iter 549910: loss 5.9888, time 119.58ms
iter 549920: loss 5.7356, time 120.64ms
iter 549930: loss 6.2937, time 119.30ms
iter 549940: loss 5.5580, time 120.94ms
iter 549950: loss 6.5096, time 120.62ms
iter 549960: loss 6.1102, time 120.58ms
iter 549970: loss 4.8713, time 120.67ms
iter 549980: loss 6.2219, time 119.57ms
iter 549990: loss 4.7113, time 120.47ms
step 550000: train loss 5.5189, val loss 5.5663
saving checkpoint to out-shakespeare-char
iter 550000: loss 6.1922, time 2890.40ms
iter 550010: loss 5.4230, time 121.43ms
iter 550020: loss 5.5578, time 119.50ms
iter 550030: loss 5.7339, time 119.28ms
iter 550040: loss 6.1141, time 121.57ms
iter 550050: loss 5.6274, time 119.70ms
iter 550060: loss 5.7139, time 119.55ms
iter 550070: loss 5.7542, time 125.25ms
iter 550080: loss 6.6889, time 123.95ms
iter 550090: loss 6.1360, time 124.01ms
iter 550100: loss 6.0960, time 124.63ms
iter 550110: loss 6.1057, time 123.82ms
iter 550120: loss 5.1878, time 124.90ms
iter 550130: loss 5.6670, time 124.35ms
iter 550140: loss 6.6759, time 124.35ms
iter 550150: loss 5.9356, time 123.99ms
iter 550160: loss 5.9919, time 124.34ms
iter 550170: loss 6.1767, time 123.80ms
iter 550180: loss 5.7933, time 124.29ms
iter 550190: loss 5.6984, time 124.11ms
iter 550200: loss 6.5391, time 124.34ms
iter 550210: loss 6.0285, time 123.68ms
iter 550220: loss 6.2716, time 124.71ms
iter 550230: loss 5.8059, time 123.81ms
iter 550240: loss 6.1480, time 124.34ms
step 550250: train loss 5.5923, val loss 5.5412
saving checkpoint to out-shakespeare-char
iter 550250: loss 6.4940, time 2902.57ms
iter 550260: loss 6.7247, time 124.51ms
iter 550270: loss 5.8823, time 125.25ms
iter 550280: loss 6.0403, time 125.02ms
iter 550290: loss 6.3861, time 124.89ms
iter 550300: loss 5.8309, time 125.24ms
iter 550310: loss 5.7209, time 125.18ms
iter 550320: loss 5.2588, time 125.35ms
iter 550330: loss 6.1105, time 125.27ms
iter 550340: loss 5.1787, time 125.04ms
iter 550350: loss 6.1566, time 125.14ms
iter 550360: loss 5.9590, time 125.22ms
iter 550370: loss 6.8622, time 124.18ms
iter 550380: loss 5.6432, time 124.77ms
iter 550390: loss 6.1611, time 124.76ms
iter 550400: loss 6.4118, time 124.83ms
iter 550410: loss 5.0673, time 124.71ms
iter 550420: loss 5.9786, time 124.81ms
iter 550430: loss 6.3457, time 125.15ms
iter 550440: loss 5.4000, time 124.76ms
iter 550450: loss 5.3896, time 125.33ms
iter 550460: loss 5.6628, time 125.15ms
iter 550470: loss 5.8346, time 125.30ms
iter 550480: loss 5.3594, time 125.29ms
iter 550490: loss 5.2093, time 124.66ms
step 550500: train loss 5.4734, val loss 5.5011
saving checkpoint to out-shakespeare-char
iter 550500: loss 5.1338, time 2891.64ms
iter 550510: loss 5.8817, time 125.74ms
iter 550520: loss 6.0719, time 125.52ms
iter 550530: loss 5.5392, time 125.33ms
iter 550540: loss 5.3885, time 125.20ms
iter 550550: loss 5.6952, time 125.49ms
iter 550560: loss 5.8839, time 125.21ms
iter 550570: loss 6.0899, time 125.43ms
iter 550580: loss 5.3849, time 125.24ms
iter 550590: loss 5.3675, time 125.38ms
iter 550600: loss 5.5703, time 125.17ms
iter 550610: loss 5.2688, time 125.36ms
iter 550620: loss 5.9820, time 125.29ms
iter 550630: loss 5.7402, time 125.43ms
iter 550640: loss 5.3619, time 125.18ms
iter 550650: loss 6.0616, time 125.28ms
iter 550660: loss 6.7542, time 125.14ms
iter 550670: loss 6.0420, time 125.54ms
iter 550680: loss 6.5849, time 125.51ms
iter 550690: loss 5.8520, time 125.71ms
iter 550700: loss 5.8900, time 125.63ms
iter 550710: loss 7.1121, time 125.57ms
iter 550720: loss 6.4270, time 125.44ms
iter 550730: loss 6.4315, time 125.79ms
iter 550740: loss 5.8281, time 125.42ms
step 550750: train loss 5.5622, val loss 5.5023
saving checkpoint to out-shakespeare-char
iter 550750: loss 5.8676, time 2880.91ms
iter 550760: loss 6.0419, time 123.65ms
iter 550770: loss 6.4808, time 125.71ms
iter 550780: loss 6.0627, time 124.34ms
iter 550790: loss 5.6535, time 125.69ms
iter 550800: loss 5.3007, time 124.42ms
iter 550810: loss 6.0391, time 125.48ms
iter 550820: loss 6.0222, time 125.20ms
iter 550830: loss 6.3302, time 125.74ms
iter 550840: loss 5.5124, time 124.55ms
iter 550850: loss 6.5649, time 125.67ms
iter 550860: loss 6.9097, time 125.30ms
iter 550870: loss 6.3424, time 126.42ms
iter 550880: loss 6.7775, time 124.62ms
iter 550890: loss 6.6930, time 125.56ms
iter 550900: loss 5.0173, time 125.28ms
iter 550910: loss 5.5994, time 125.58ms
iter 550920: loss 5.8977, time 125.60ms
iter 550930: loss 5.4491, time 125.76ms
iter 550940: loss 5.3472, time 125.47ms
iter 550950: loss 6.1458, time 124.65ms
iter 550960: loss 6.5768, time 125.47ms
iter 550970: loss 6.1198, time 125.60ms
iter 550980: loss 6.2310, time 125.58ms
iter 550990: loss 6.4252, time 120.35ms
step 551000: train loss 5.6097, val loss 5.5414
saving checkpoint to out-shakespeare-char
iter 551000: loss 5.0693, time 2888.52ms
iter 551010: loss 5.3238, time 122.28ms
iter 551020: loss 5.7525, time 121.15ms
iter 551030: loss 6.0573, time 121.28ms
iter 551040: loss 6.3141, time 121.34ms
iter 551050: loss 6.7073, time 122.24ms
iter 551060: loss 6.3392, time 121.21ms
iter 551070: loss 5.5695, time 121.66ms
iter 551080: loss 5.8690, time 122.85ms
iter 551090: loss 6.5293, time 121.14ms
iter 551100: loss 6.6258, time 121.40ms
iter 551110: loss 5.6784, time 121.14ms
iter 551120: loss 5.3261, time 122.67ms
iter 551130: loss 5.6016, time 121.25ms
iter 551140: loss 5.7981, time 120.75ms
iter 551150: loss 6.9457, time 121.18ms
iter 551160: loss 5.7256, time 122.13ms
iter 551170: loss 4.9374, time 120.99ms
iter 551180: loss 5.9233, time 121.16ms
iter 551190: loss 6.1390, time 121.29ms
iter 551200: loss 5.8038, time 122.14ms
iter 551210: loss 6.5078, time 121.15ms
iter 551220: loss 5.9741, time 121.37ms
iter 551230: loss 5.9685, time 121.44ms
iter 551240: loss 6.0614, time 123.76ms
step 551250: train loss 5.5178, val loss 5.5235
saving checkpoint to out-shakespeare-char
iter 551250: loss 5.3681, time 2885.39ms
iter 551260: loss 5.7771, time 121.26ms
iter 551270: loss 5.0860, time 121.27ms
iter 551280: loss 6.3537, time 122.47ms
iter 551290: loss 6.0564, time 122.04ms
iter 551300: loss 6.0411, time 121.06ms
iter 551310: loss 6.2603, time 121.16ms
iter 551320: loss 6.4721, time 121.00ms
iter 551330: loss 6.3818, time 122.46ms
iter 551340: loss 5.4215, time 120.31ms
iter 551350: loss 5.9083, time 121.07ms
iter 551360: loss 6.0687, time 121.19ms
iter 551370: loss 5.9035, time 122.48ms
iter 551380: loss 6.1033, time 121.17ms
iter 551390: loss 6.0649, time 120.77ms
iter 551400: loss 5.4427, time 121.02ms
iter 551410: loss 6.0458, time 121.25ms
iter 551420: loss 6.2644, time 121.09ms
iter 551430: loss 5.6066, time 122.70ms
iter 551440: loss 5.8355, time 123.63ms
iter 551450: loss 6.0987, time 121.34ms
iter 551460: loss 6.0846, time 121.14ms
iter 551470: loss 5.8857, time 121.06ms
iter 551480: loss 5.8008, time 122.41ms
iter 551490: loss 6.4037, time 120.97ms
step 551500: train loss 5.5088, val loss 5.5092
saving checkpoint to out-shakespeare-char
iter 551500: loss 6.3826, time 2880.03ms
iter 551510: loss 5.3750, time 121.51ms
iter 551520: loss 5.5394, time 123.00ms
iter 551530: loss 5.9946, time 121.30ms
iter 551540: loss 5.4323, time 121.26ms
iter 551550: loss 5.9713, time 121.39ms
iter 551560: loss 5.4101, time 121.37ms
iter 551570: loss 5.8861, time 121.22ms
iter 551580: loss 6.3818, time 121.83ms
iter 551590: loss 5.7487, time 121.27ms
iter 551600: loss 5.9889, time 122.37ms
iter 551610: loss 5.7080, time 121.25ms
iter 551620: loss 6.9038, time 121.36ms
iter 551630: loss 5.8367, time 122.73ms
iter 551640: loss 6.2993, time 121.40ms
iter 551650: loss 6.2573, time 121.39ms
iter 551660: loss 6.1638, time 122.56ms
iter 551670: loss 6.3016, time 121.47ms
iter 551680: loss 6.8719, time 121.77ms
iter 551690: loss 6.4225, time 121.29ms
iter 551700: loss 5.4678, time 122.51ms
iter 551710: loss 6.3525, time 121.94ms
iter 551720: loss 5.6273, time 122.72ms
iter 551730: loss 6.1837, time 121.65ms
iter 551740: loss 5.8684, time 121.46ms
step 551750: train loss 5.5771, val loss 5.5624
saving checkpoint to out-shakespeare-char
iter 551750: loss 5.9150, time 2904.69ms
iter 551760: loss 6.3560, time 125.25ms
iter 551770: loss 5.4394, time 124.78ms
iter 551780: loss 6.1908, time 124.82ms
iter 551790: loss 6.0351, time 124.94ms
iter 551800: loss 6.0602, time 124.85ms
iter 551810: loss 4.8925, time 124.15ms
iter 551820: loss 6.1621, time 124.95ms
iter 551830: loss 6.5918, time 124.66ms
iter 551840: loss 5.8278, time 124.85ms
iter 551850: loss 6.1259, time 124.72ms
iter 551860: loss 6.3199, time 127.51ms
iter 551870: loss 6.6087, time 124.83ms
iter 551880: loss 6.5873, time 127.45ms
iter 551890: loss 6.5884, time 124.96ms
iter 551900: loss 5.1764, time 127.60ms
iter 551910: loss 5.4948, time 124.69ms
iter 551920: loss 5.7730, time 121.43ms
iter 551930: loss 5.0955, time 122.78ms
iter 551940: loss 6.4375, time 121.35ms
iter 551950: loss 5.7191, time 121.43ms
iter 551960: loss 6.5729, time 124.10ms
iter 551970: loss 5.7273, time 121.25ms
iter 551980: loss 5.7474, time 121.47ms
iter 551990: loss 6.8978, time 121.46ms
step 552000: train loss 5.4899, val loss 5.5383
saving checkpoint to out-shakespeare-char
iter 552000: loss 6.2134, time 2909.93ms
iter 552010: loss 6.1649, time 123.17ms
iter 552020: loss 5.8057, time 122.11ms
iter 552030: loss 5.8514, time 121.71ms
iter 552040: loss 5.6881, time 120.83ms
iter 552050: loss 5.3796, time 121.81ms
iter 552060: loss 5.4023, time 121.67ms
iter 552070: loss 5.9360, time 121.79ms
iter 552080: loss 5.6212, time 123.17ms
iter 552090: loss 6.2283, time 121.80ms
iter 552100: loss 5.9412, time 121.69ms
iter 552110: loss 6.5455, time 122.87ms
iter 552120: loss 6.3213, time 122.10ms
iter 552130: loss 6.0809, time 122.24ms
iter 552140: loss 5.7984, time 124.28ms
iter 552150: loss 6.1965, time 121.65ms
iter 552160: loss 5.7150, time 121.96ms
iter 552170: loss 5.6545, time 120.82ms
iter 552180: loss 6.2749, time 121.79ms
iter 552190: loss 6.1580, time 121.79ms
iter 552200: loss 5.4952, time 121.67ms
iter 552210: loss 5.7977, time 122.90ms
iter 552220: loss 6.2194, time 121.92ms
iter 552230: loss 6.1176, time 121.73ms
iter 552240: loss 5.9483, time 122.89ms
step 552250: train loss 5.5368, val loss 5.6123
saving checkpoint to out-shakespeare-char
iter 552250: loss 5.8829, time 2903.78ms
iter 552260: loss 5.7550, time 121.80ms
iter 552270: loss 5.8619, time 121.85ms
iter 552280: loss 6.2711, time 121.72ms
iter 552290: loss 5.7104, time 121.80ms
iter 552300: loss 5.9120, time 121.36ms
iter 552310: loss 6.1293, time 122.67ms
iter 552320: loss 6.3994, time 121.69ms
iter 552330: loss 5.8842, time 121.71ms
iter 552340: loss 6.3382, time 123.10ms
iter 552350: loss 5.7455, time 121.57ms
iter 552360: loss 5.8917, time 121.45ms
iter 552370: loss 5.5483, time 124.31ms
iter 552380: loss 6.2630, time 121.88ms
iter 552390: loss 6.1180, time 121.82ms
iter 552400: loss 5.0032, time 121.34ms
iter 552410: loss 6.2043, time 121.89ms
iter 552420: loss 6.1310, time 121.71ms
iter 552430: loss 5.6868, time 121.72ms
iter 552440: loss 5.8157, time 122.91ms
iter 552450: loss 6.2171, time 121.74ms
iter 552460: loss 5.7134, time 121.85ms
iter 552470: loss 6.3307, time 124.39ms
iter 552480: loss 5.0816, time 121.83ms
iter 552490: loss 5.2940, time 121.75ms
step 552500: train loss 5.5233, val loss 5.5494
saving checkpoint to out-shakespeare-char
iter 552500: loss 5.7245, time 2895.15ms
iter 552510: loss 6.0084, time 122.61ms
iter 552520: loss 5.8646, time 121.18ms
iter 552530: loss 6.4127, time 121.08ms
iter 552540: loss 6.1813, time 123.76ms
iter 552550: loss 5.2766, time 121.36ms
iter 552560: loss 6.8530, time 121.27ms
iter 552570: loss 5.7885, time 121.20ms
iter 552580: loss 5.8728, time 121.19ms
iter 552590: loss 5.5481, time 121.36ms
iter 552600: loss 5.4397, time 121.99ms
iter 552610: loss 5.9034, time 122.80ms
iter 552620: loss 6.2996, time 122.41ms
iter 552630: loss 5.6212, time 121.25ms
iter 552640: loss 5.6093, time 124.08ms
iter 552650: loss 6.3324, time 120.59ms
iter 552660: loss 6.0928, time 121.48ms
iter 552670: loss 5.6185, time 121.63ms
iter 552680: loss 6.4436, time 121.49ms
iter 552690: loss 6.1961, time 121.69ms
iter 552700: loss 5.6137, time 121.35ms
iter 552710: loss 5.2726, time 124.42ms
iter 552720: loss 5.7758, time 121.51ms
iter 552730: loss 6.7141, time 121.38ms
iter 552740: loss 5.6148, time 121.69ms
step 552750: train loss 5.5265, val loss 5.4781
saving checkpoint to out-shakespeare-char
iter 552750: loss 5.2358, time 2900.01ms
iter 552760: loss 5.9890, time 121.68ms
iter 552770: loss 6.5964, time 121.63ms
iter 552780: loss 5.5818, time 122.47ms
iter 552790: loss 5.6666, time 121.50ms
iter 552800: loss 6.2034, time 121.34ms
iter 552810: loss 5.4699, time 121.79ms
iter 552820: loss 6.0165, time 121.50ms
iter 552830: loss 5.7491, time 121.40ms
iter 552840: loss 5.6994, time 123.84ms
iter 552850: loss 5.6852, time 121.58ms
iter 552860: loss 5.6302, time 121.77ms
iter 552870: loss 5.2557, time 121.31ms
iter 552880: loss 5.8931, time 122.52ms
iter 552890: loss 6.1508, time 121.36ms
iter 552900: loss 5.8879, time 121.58ms
iter 552910: loss 5.9018, time 124.39ms
iter 552920: loss 6.2440, time 121.40ms
iter 552930: loss 5.8515, time 121.56ms
iter 552940: loss 5.8409, time 121.34ms
iter 552950: loss 5.7669, time 121.41ms
iter 552960: loss 6.0137, time 122.01ms
iter 552970: loss 6.3563, time 121.55ms
iter 552980: loss 5.0678, time 123.01ms
iter 552990: loss 5.9630, time 121.46ms
step 553000: train loss 5.5761, val loss 5.5787
saving checkpoint to out-shakespeare-char
iter 553000: loss 6.0270, time 2894.85ms
iter 553010: loss 5.3766, time 121.79ms
iter 553020: loss 6.5107, time 123.54ms
iter 553030: loss 5.5759, time 121.05ms
iter 553040: loss 5.9717, time 121.76ms
iter 553050: loss 6.5873, time 123.26ms
iter 553060: loss 6.0978, time 121.58ms
iter 553070: loss 6.0746, time 121.72ms
iter 553080: loss 5.9260, time 123.94ms
iter 553090: loss 5.3720, time 121.71ms
iter 553100: loss 5.8890, time 120.94ms
iter 553110: loss 5.2337, time 121.83ms
iter 553120: loss 6.7049, time 121.64ms
iter 553130: loss 6.2429, time 121.83ms
iter 553140: loss 5.8660, time 121.23ms
iter 553150: loss 5.8536, time 123.11ms
iter 553160: loss 5.7812, time 121.71ms
iter 553170: loss 5.1499, time 121.74ms
iter 553180: loss 5.5133, time 122.30ms
iter 553190: loss 6.1099, time 121.80ms
iter 553200: loss 5.8180, time 121.73ms
iter 553210: loss 5.7751, time 124.32ms
iter 553220: loss 6.2551, time 121.61ms
iter 553230: loss 5.6679, time 121.76ms
iter 553240: loss 6.1922, time 121.67ms
step 553250: train loss 5.5509, val loss 5.5498
saving checkpoint to out-shakespeare-char
iter 553250: loss 6.2228, time 2900.34ms
iter 553260: loss 5.5375, time 121.23ms
iter 553270: loss 5.3843, time 121.70ms
iter 553280: loss 5.3462, time 122.29ms
iter 553290: loss 6.0419, time 121.53ms
iter 553300: loss 6.5172, time 121.20ms
iter 553310: loss 5.4538, time 121.35ms
iter 553320: loss 5.7935, time 121.39ms
iter 553330: loss 6.1599, time 121.11ms
iter 553340: loss 6.2206, time 121.24ms
iter 553350: loss 5.8612, time 122.27ms
iter 553360: loss 5.6729, time 120.47ms
iter 553370: loss 6.5408, time 122.75ms
iter 553380: loss 5.6343, time 120.64ms
iter 553390: loss 6.5524, time 121.94ms
iter 553400: loss 6.3085, time 121.34ms
iter 553410: loss 6.5100, time 121.31ms
iter 553420: loss 5.2430, time 122.40ms
iter 553430: loss 5.9050, time 121.37ms
iter 553440: loss 6.0774, time 121.33ms
iter 553450: loss 6.3259, time 121.36ms
iter 553460: loss 5.5533, time 122.43ms
iter 553470: loss 5.9531, time 121.38ms
iter 553480: loss 5.4797, time 121.28ms
iter 553490: loss 5.9752, time 122.52ms
step 553500: train loss 5.5161, val loss 5.5487
saving checkpoint to out-shakespeare-char
iter 553500: loss 6.0089, time 2892.26ms
iter 553510: loss 5.6302, time 125.93ms
iter 553520: loss 6.2461, time 125.78ms
iter 553530: loss 6.3479, time 125.52ms
iter 553540: loss 6.2420, time 125.58ms
iter 553550: loss 6.2636, time 125.63ms
iter 553560: loss 6.1454, time 125.39ms
iter 553570: loss 5.9893, time 125.38ms
iter 553580: loss 4.8256, time 125.81ms
iter 553590: loss 6.2237, time 125.72ms
iter 553600: loss 5.9952, time 126.07ms
iter 553610: loss 6.2601, time 125.57ms
iter 553620: loss 6.0943, time 125.64ms
iter 553630: loss 6.2036, time 125.67ms
iter 553640: loss 6.1532, time 125.68ms
iter 553650: loss 6.0756, time 125.45ms
iter 553660: loss 5.3758, time 125.62ms
iter 553670: loss 5.6580, time 125.77ms
iter 553680: loss 5.3771, time 125.56ms
iter 553690: loss 5.8667, time 124.58ms
iter 553700: loss 6.4823, time 125.37ms
iter 553710: loss 6.1158, time 125.48ms
iter 553720: loss 6.5238, time 126.25ms
iter 553730: loss 6.1003, time 125.44ms
iter 553740: loss 5.9731, time 125.60ms
step 553750: train loss 5.5282, val loss 5.5768
saving checkpoint to out-shakespeare-char
iter 553750: loss 5.8122, time 2885.09ms
iter 553760: loss 5.6327, time 125.63ms
iter 553770: loss 6.3160, time 125.94ms
iter 553780: loss 6.3403, time 126.63ms
iter 553790: loss 5.4267, time 124.69ms
iter 553800: loss 6.3415, time 125.43ms
iter 553810: loss 6.6276, time 125.01ms
iter 553820: loss 6.1493, time 125.00ms
iter 553830: loss 5.9713, time 125.44ms
iter 553840: loss 5.5610, time 125.16ms
iter 553850: loss 5.8137, time 125.15ms
iter 553860: loss 5.6032, time 125.46ms
iter 553870: loss 5.7399, time 125.32ms
iter 553880: loss 6.0821, time 124.63ms
iter 553890: loss 5.8928, time 125.15ms
iter 553900: loss 5.7017, time 125.05ms
iter 553910: loss 5.7999, time 125.15ms
iter 553920: loss 5.9941, time 125.12ms
iter 553930: loss 6.2207, time 125.27ms
iter 553940: loss 6.2334, time 126.17ms
iter 553950: loss 5.8249, time 126.40ms
iter 553960: loss 5.9437, time 125.53ms
iter 553970: loss 5.8351, time 125.01ms
iter 553980: loss 5.3089, time 125.57ms
iter 553990: loss 6.5874, time 125.57ms
step 554000: train loss 5.5764, val loss 5.5648
saving checkpoint to out-shakespeare-char
iter 554000: loss 6.4290, time 2882.36ms
iter 554010: loss 5.5982, time 125.49ms
iter 554020: loss 6.3917, time 124.95ms
iter 554030: loss 6.2128, time 124.41ms
iter 554040: loss 5.9880, time 125.18ms
iter 554050: loss 5.5693, time 125.07ms
iter 554060: loss 6.5404, time 125.06ms
iter 554070: loss 6.0417, time 124.96ms
iter 554080: loss 5.8397, time 124.92ms
iter 554090: loss 5.6668, time 125.08ms
iter 554100: loss 6.4696, time 124.95ms
iter 554110: loss 5.9329, time 124.57ms
iter 554120: loss 5.9320, time 125.22ms
iter 554130: loss 6.7324, time 125.09ms
iter 554140: loss 5.6135, time 124.60ms
iter 554150: loss 5.6623, time 125.03ms
iter 554160: loss 5.4296, time 125.51ms
iter 554170: loss 6.7526, time 125.31ms
iter 554180: loss 6.3474, time 125.49ms
iter 554190: loss 5.7059, time 125.09ms
iter 554200: loss 6.3754, time 124.31ms
iter 554210: loss 5.5911, time 121.84ms
iter 554220: loss 5.5873, time 121.54ms
iter 554230: loss 6.3590, time 122.53ms
iter 554240: loss 5.0909, time 122.29ms
step 554250: train loss 5.5853, val loss 5.5302
saving checkpoint to out-shakespeare-char
iter 554250: loss 6.0298, time 2887.08ms
iter 554260: loss 5.5288, time 125.79ms
iter 554270: loss 4.9272, time 126.11ms
iter 554280: loss 5.7209, time 127.22ms
iter 554290: loss 6.3616, time 125.75ms
iter 554300: loss 5.6514, time 128.73ms
iter 554310: loss 5.8438, time 119.66ms
iter 554320: loss 6.4000, time 121.45ms
iter 554330: loss 5.7497, time 122.46ms
iter 554340: loss 6.5555, time 122.65ms
iter 554350: loss 6.0910, time 121.65ms
iter 554360: loss 5.9122, time 122.53ms
iter 554370: loss 6.6720, time 121.39ms
iter 554380: loss 5.9684, time 121.61ms
iter 554390: loss 6.1448, time 122.58ms
iter 554400: loss 5.6060, time 122.70ms
iter 554410: loss 6.9873, time 121.44ms
iter 554420: loss 5.9465, time 121.54ms
iter 554430: loss 5.7731, time 122.43ms
iter 554440: loss 5.1976, time 121.45ms
iter 554450: loss 6.3093, time 121.65ms
iter 554460: loss 6.2741, time 124.45ms
iter 554470: loss 6.3997, time 121.56ms
iter 554480: loss 5.5127, time 121.12ms
iter 554490: loss 5.6857, time 120.58ms
step 554500: train loss 5.5193, val loss 5.5622
saving checkpoint to out-shakespeare-char
iter 554500: loss 7.1294, time 2883.00ms
iter 554510: loss 6.3307, time 122.00ms
iter 554520: loss 6.4076, time 121.94ms
iter 554530: loss 5.7488, time 121.55ms
iter 554540: loss 6.8047, time 123.06ms
iter 554550: loss 5.8009, time 121.52ms
iter 554560: loss 5.7637, time 122.45ms
iter 554570: loss 5.6088, time 121.58ms
iter 554580: loss 5.3649, time 123.02ms
iter 554590: loss 6.0008, time 122.99ms
iter 554600: loss 6.3022, time 122.27ms
iter 554610: loss 5.8443, time 122.83ms
iter 554620: loss 6.2180, time 121.88ms
iter 554630: loss 6.0257, time 123.20ms
iter 554640: loss 5.7474, time 121.61ms
iter 554650: loss 6.0242, time 121.61ms
iter 554660: loss 6.3740, time 123.16ms
iter 554670: loss 6.7418, time 121.75ms
iter 554680: loss 5.7266, time 122.16ms
iter 554690: loss 6.1291, time 123.71ms
iter 554700: loss 5.4494, time 121.79ms
iter 554710: loss 5.8386, time 121.79ms
iter 554720: loss 6.6260, time 121.90ms
iter 554730: loss 6.8999, time 122.30ms
iter 554740: loss 5.9341, time 121.48ms
step 554750: train loss 5.5412, val loss 5.5118
saving checkpoint to out-shakespeare-char
iter 554750: loss 6.2599, time 2890.00ms
iter 554760: loss 6.6710, time 120.76ms
iter 554770: loss 6.4624, time 121.27ms
iter 554780: loss 5.9312, time 121.34ms
iter 554790: loss 6.1348, time 121.43ms
iter 554800: loss 5.6186, time 121.57ms
iter 554810: loss 5.7772, time 122.10ms
iter 554820: loss 5.6315, time 120.75ms
iter 554830: loss 5.9310, time 120.37ms
iter 554840: loss 6.5349, time 120.90ms
iter 554850: loss 5.5768, time 122.30ms
iter 554860: loss 6.5162, time 121.34ms
iter 554870: loss 6.1209, time 121.33ms
iter 554880: loss 5.5249, time 124.07ms
iter 554890: loss 6.1664, time 121.13ms
iter 554900: loss 5.9613, time 121.32ms
iter 554910: loss 6.4271, time 120.72ms
iter 554920: loss 5.9124, time 121.64ms
iter 554930: loss 5.8353, time 121.42ms
iter 554940: loss 6.1737, time 121.34ms
iter 554950: loss 6.0822, time 122.49ms
iter 554960: loss 6.4914, time 121.26ms
iter 554970: loss 6.2715, time 121.78ms
iter 554980: loss 6.4579, time 123.57ms
iter 554990: loss 5.6803, time 121.07ms
step 555000: train loss 5.5338, val loss 5.5135
saving checkpoint to out-shakespeare-char
iter 555000: loss 6.9151, time 2881.13ms
iter 555010: loss 6.1524, time 125.84ms
iter 555020: loss 5.6324, time 125.94ms
iter 555030: loss 6.1586, time 128.09ms
iter 555040: loss 5.8902, time 125.71ms
iter 555050: loss 6.9940, time 127.93ms
iter 555060: loss 5.9366, time 125.70ms
iter 555070: loss 5.5872, time 128.18ms
iter 555080: loss 4.6443, time 125.47ms
iter 555090: loss 5.9685, time 128.26ms
iter 555100: loss 6.2770, time 125.71ms
iter 555110: loss 6.3737, time 128.09ms
iter 555120: loss 5.4819, time 125.44ms
iter 555130: loss 6.1869, time 128.18ms
iter 555140: loss 5.7735, time 125.59ms
iter 555150: loss 5.9856, time 128.38ms
iter 555160: loss 5.2537, time 125.57ms
iter 555170: loss 5.7155, time 128.26ms
iter 555180: loss 5.6865, time 125.82ms
iter 555190: loss 6.9665, time 128.02ms
iter 555200: loss 5.7167, time 125.46ms
iter 555210: loss 6.0488, time 127.94ms
iter 555220: loss 6.3502, time 125.86ms
iter 555230: loss 6.1690, time 127.24ms
iter 555240: loss 5.9755, time 124.66ms
step 555250: train loss 5.5345, val loss 5.5376
saving checkpoint to out-shakespeare-char
iter 555250: loss 5.7064, time 2906.89ms
iter 555260: loss 5.8626, time 125.53ms
iter 555270: loss 5.9313, time 124.78ms
iter 555280: loss 5.7656, time 125.44ms
iter 555290: loss 6.4681, time 125.57ms
iter 555300: loss 6.5300, time 125.70ms
iter 555310: loss 6.2505, time 125.42ms
iter 555320: loss 6.2446, time 125.54ms
iter 555330: loss 5.5013, time 125.41ms
iter 555340: loss 6.0975, time 125.34ms
iter 555350: loss 6.3369, time 125.39ms
iter 555360: loss 6.2415, time 125.63ms
iter 555370: loss 6.5111, time 125.52ms
iter 555380: loss 5.8160, time 125.89ms
iter 555390: loss 5.6471, time 125.40ms
iter 555400: loss 6.3788, time 125.54ms
iter 555410: loss 5.4489, time 125.43ms
iter 555420: loss 6.0266, time 125.47ms
iter 555430: loss 5.8225, time 125.20ms
iter 555440: loss 5.8147, time 125.24ms
iter 555450: loss 5.2804, time 125.32ms
iter 555460: loss 6.6025, time 125.34ms
iter 555470: loss 5.6043, time 125.42ms
iter 555480: loss 6.4321, time 125.45ms
iter 555490: loss 5.2855, time 125.24ms
step 555500: train loss 5.5398, val loss 5.5142
saving checkpoint to out-shakespeare-char
iter 555500: loss 6.8652, time 2877.89ms
iter 555510: loss 5.7720, time 124.86ms
iter 555520: loss 5.6907, time 125.44ms
iter 555530: loss 5.6668, time 125.38ms
iter 555540: loss 6.1161, time 125.46ms
iter 555550: loss 5.4532, time 125.44ms
iter 555560: loss 6.1185, time 125.55ms
iter 555570: loss 6.2026, time 125.25ms
iter 555580: loss 6.1887, time 125.56ms
iter 555590: loss 5.6916, time 125.27ms
iter 555600: loss 5.3984, time 126.27ms
iter 555610: loss 6.2126, time 125.70ms
iter 555620: loss 5.8304, time 125.70ms
iter 555630: loss 5.9801, time 125.47ms
iter 555640: loss 6.6012, time 125.67ms
iter 555650: loss 5.7780, time 127.25ms
iter 555660: loss 5.4553, time 125.05ms
iter 555670: loss 6.2034, time 121.94ms
iter 555680: loss 5.9355, time 121.03ms
iter 555690: loss 6.2329, time 122.61ms
iter 555700: loss 6.1555, time 121.43ms
iter 555710: loss 5.9264, time 121.31ms
iter 555720: loss 6.3598, time 122.98ms
iter 555730: loss 6.2470, time 121.38ms
iter 555740: loss 6.4910, time 121.49ms
step 555750: train loss 5.5466, val loss 5.5912
saving checkpoint to out-shakespeare-char
iter 555750: loss 6.8322, time 2903.51ms
iter 555760: loss 6.1682, time 121.46ms
iter 555770: loss 5.7133, time 121.51ms
iter 555780: loss 6.0077, time 121.43ms
iter 555790: loss 5.9514, time 122.46ms
iter 555800: loss 5.2386, time 121.39ms
iter 555810: loss 5.7537, time 121.36ms
iter 555820: loss 6.2297, time 122.33ms
iter 555830: loss 5.6592, time 121.40ms
iter 555840: loss 6.1837, time 121.22ms
iter 555850: loss 6.9354, time 121.28ms
iter 555860: loss 5.3418, time 121.12ms
iter 555870: loss 5.7519, time 121.41ms
iter 555880: loss 5.8354, time 121.38ms
iter 555890: loss 5.4726, time 121.78ms
iter 555900: loss 5.6726, time 121.49ms
iter 555910: loss 5.3290, time 121.64ms
iter 555920: loss 6.5854, time 121.45ms
iter 555930: loss 6.1680, time 121.45ms
iter 555940: loss 5.7541, time 121.42ms
iter 555950: loss 6.0356, time 121.28ms
iter 555960: loss 6.7051, time 121.46ms
iter 555970: loss 6.7485, time 121.95ms
iter 555980: loss 5.9272, time 121.12ms
iter 555990: loss 5.6565, time 122.43ms
step 556000: train loss 5.5360, val loss 5.5136
saving checkpoint to out-shakespeare-char
iter 556000: loss 6.3128, time 2891.33ms
iter 556010: loss 5.2879, time 121.47ms
iter 556020: loss 5.7219, time 121.55ms
iter 556030: loss 5.8902, time 121.88ms
iter 556040: loss 6.9510, time 122.00ms
iter 556050: loss 5.5170, time 121.46ms
iter 556060: loss 6.0089, time 123.08ms
iter 556070: loss 6.2338, time 121.98ms
iter 556080: loss 6.4813, time 121.50ms
iter 556090: loss 5.5790, time 122.99ms
iter 556100: loss 5.6362, time 121.53ms
iter 556110: loss 6.4490, time 121.57ms
iter 556120: loss 6.7670, time 122.68ms
iter 556130: loss 6.0588, time 122.04ms
iter 556140: loss 6.4994, time 121.66ms
iter 556150: loss 6.4923, time 124.44ms
iter 556160: loss 6.1723, time 123.28ms
iter 556170: loss 5.8205, time 121.78ms
iter 556180: loss 5.7234, time 122.10ms
iter 556190: loss 6.1651, time 122.89ms
iter 556200: loss 5.3353, time 122.00ms
iter 556210: loss 6.8431, time 121.30ms
iter 556220: loss 7.1758, time 124.04ms
iter 556230: loss 5.4326, time 121.84ms
iter 556240: loss 6.0223, time 121.48ms
step 556250: train loss 5.5404, val loss 5.5462
saving checkpoint to out-shakespeare-char
iter 556250: loss 5.9663, time 2911.80ms
iter 556260: loss 6.0778, time 121.79ms
iter 556270: loss 6.0997, time 121.24ms
iter 556280: loss 6.0539, time 123.84ms
iter 556290: loss 5.4921, time 121.46ms
iter 556300: loss 6.0838, time 121.29ms
iter 556310: loss 6.5726, time 121.47ms
iter 556320: loss 5.2466, time 121.24ms
iter 556330: loss 6.1002, time 121.00ms
iter 556340: loss 5.9253, time 121.14ms
iter 556350: loss 6.4571, time 121.70ms
iter 556360: loss 5.6875, time 120.46ms
iter 556370: loss 5.8826, time 120.89ms
iter 556380: loss 6.4282, time 120.39ms
iter 556390: loss 6.8297, time 122.71ms
iter 556400: loss 6.2610, time 121.06ms
iter 556410: loss 5.9726, time 120.60ms
iter 556420: loss 6.2484, time 121.53ms
iter 556430: loss 5.4495, time 121.13ms
iter 556440: loss 6.1159, time 121.04ms
iter 556450: loss 6.0697, time 121.37ms
iter 556460: loss 5.6486, time 120.91ms
iter 556470: loss 6.2998, time 120.70ms
iter 556480: loss 5.9146, time 120.71ms
iter 556490: loss 6.1267, time 123.10ms
step 556500: train loss 5.5602, val loss 5.4999
saving checkpoint to out-shakespeare-char
iter 556500: loss 6.2581, time 2896.42ms
iter 556510: loss 6.1991, time 122.11ms
iter 556520: loss 5.8423, time 121.69ms
iter 556530: loss 6.0284, time 121.60ms
iter 556540: loss 5.9809, time 121.74ms
iter 556550: loss 5.6064, time 121.73ms
iter 556560: loss 6.1381, time 123.52ms
iter 556570: loss 5.8638, time 121.71ms
iter 556580: loss 5.6621, time 122.21ms
iter 556590: loss 6.2435, time 122.91ms
iter 556600: loss 6.2841, time 121.91ms
iter 556610: loss 5.6290, time 123.38ms
iter 556620: loss 6.0861, time 124.11ms
iter 556630: loss 6.6812, time 121.54ms
iter 556640: loss 6.0576, time 121.40ms
iter 556650: loss 6.3087, time 121.39ms
iter 556660: loss 5.7440, time 122.55ms
iter 556670: loss 6.0173, time 121.50ms
iter 556680: loss 5.0866, time 121.19ms
iter 556690: loss 6.4373, time 122.60ms
iter 556700: loss 5.6419, time 121.46ms
iter 556710: loss 5.7999, time 122.32ms
iter 556720: loss 6.1576, time 123.77ms
iter 556730: loss 5.9653, time 121.53ms
iter 556740: loss 5.8499, time 121.43ms
step 556750: train loss 5.5725, val loss 5.5531
saving checkpoint to out-shakespeare-char
iter 556750: loss 6.1572, time 2895.19ms
iter 556760: loss 6.4054, time 123.08ms
iter 556770: loss 6.2435, time 121.79ms
iter 556780: loss 5.8416, time 121.02ms
iter 556790: loss 6.5427, time 122.77ms
iter 556800: loss 5.4236, time 121.63ms
iter 556810: loss 6.4601, time 121.34ms
iter 556820: loss 6.2162, time 124.24ms
iter 556830: loss 6.1635, time 121.80ms
iter 556840: loss 5.6534, time 121.72ms
iter 556850: loss 5.7239, time 121.38ms
iter 556860: loss 6.1073, time 121.65ms
iter 556870: loss 6.4563, time 121.79ms
iter 556880: loss 6.7092, time 121.36ms
iter 556890: loss 5.2345, time 122.61ms
iter 556900: loss 5.6528, time 121.29ms
iter 556910: loss 6.6769, time 121.22ms
iter 556920: loss 6.8932, time 122.86ms
iter 556930: loss 5.6860, time 121.20ms
iter 556940: loss 5.6769, time 121.12ms
iter 556950: loss 5.4299, time 121.13ms
iter 556960: loss 6.2192, time 121.29ms
iter 556970: loss 5.3039, time 121.49ms
iter 556980: loss 6.4200, time 121.06ms
iter 556990: loss 5.9727, time 122.68ms
step 557000: train loss 5.5791, val loss 5.5617
saving checkpoint to out-shakespeare-char
iter 557000: loss 5.7448, time 2897.70ms
iter 557010: loss 6.1302, time 121.87ms
iter 557020: loss 6.8931, time 123.89ms
iter 557030: loss 5.9359, time 121.51ms
iter 557040: loss 5.5230, time 121.42ms
iter 557050: loss 5.0364, time 121.29ms
iter 557060: loss 5.5689, time 121.40ms
iter 557070: loss 5.6867, time 121.45ms
iter 557080: loss 5.9023, time 121.29ms
iter 557090: loss 5.8516, time 123.96ms
iter 557100: loss 5.8784, time 121.45ms
iter 557110: loss 5.7893, time 121.08ms
iter 557120: loss 5.6684, time 121.09ms
iter 557130: loss 6.4087, time 122.27ms
iter 557140: loss 5.8791, time 121.10ms
iter 557150: loss 5.3950, time 121.15ms
iter 557160: loss 6.0965, time 122.82ms
iter 557170: loss 6.9228, time 121.22ms
iter 557180: loss 5.5704, time 121.13ms
iter 557190: loss 6.0110, time 123.82ms
iter 557200: loss 6.2717, time 121.18ms
iter 557210: loss 6.0748, time 121.85ms
iter 557220: loss 6.3690, time 121.23ms
iter 557230: loss 6.1386, time 122.25ms
iter 557240: loss 6.1113, time 121.11ms
step 557250: train loss 5.4927, val loss 5.5318
saving checkpoint to out-shakespeare-char
iter 557250: loss 6.3099, time 2895.74ms
iter 557260: loss 5.0726, time 123.86ms
iter 557270: loss 6.7456, time 121.51ms
iter 557280: loss 5.8244, time 121.55ms
iter 557290: loss 5.0806, time 121.94ms
iter 557300: loss 6.2605, time 121.59ms
iter 557310: loss 5.7920, time 121.47ms
iter 557320: loss 6.2059, time 121.42ms
iter 557330: loss 5.8712, time 121.95ms
iter 557340: loss 5.3696, time 122.69ms
iter 557350: loss 6.2558, time 121.78ms
iter 557360: loss 6.3632, time 121.62ms
iter 557370: loss 6.4337, time 124.19ms
iter 557380: loss 5.3934, time 121.49ms
iter 557390: loss 5.6748, time 121.59ms
iter 557400: loss 6.0310, time 121.67ms
iter 557410: loss 6.1641, time 122.91ms
iter 557420: loss 4.9517, time 121.81ms
iter 557430: loss 6.3228, time 121.97ms
iter 557440: loss 5.3640, time 124.26ms
iter 557450: loss 5.5797, time 121.46ms
iter 557460: loss 5.5801, time 121.49ms
iter 557470: loss 6.5685, time 121.63ms
iter 557480: loss 5.7188, time 121.52ms
iter 557490: loss 6.0996, time 121.65ms
step 557500: train loss 5.5227, val loss 5.5702
saving checkpoint to out-shakespeare-char
iter 557500: loss 5.4127, time 2910.89ms
iter 557510: loss 6.2658, time 121.60ms
iter 557520: loss 6.1582, time 121.54ms
iter 557530: loss 6.1420, time 121.31ms
iter 557540: loss 5.7777, time 122.81ms
iter 557550: loss 6.2576, time 121.81ms
iter 557560: loss 5.8259, time 121.83ms
iter 557570: loss 5.6568, time 122.51ms
iter 557580: loss 5.4346, time 121.72ms
iter 557590: loss 5.6602, time 121.29ms
iter 557600: loss 5.9627, time 121.61ms
iter 557610: loss 5.4819, time 121.49ms
iter 557620: loss 6.3276, time 121.57ms
iter 557630: loss 5.9848, time 121.71ms
iter 557640: loss 6.0326, time 123.00ms
iter 557650: loss 6.6938, time 121.66ms
iter 557660: loss 6.4718, time 121.79ms
iter 557670: loss 6.6654, time 122.68ms
iter 557680: loss 6.1278, time 121.71ms
iter 557690: loss 5.2930, time 121.58ms
iter 557700: loss 5.8352, time 123.36ms
iter 557710: loss 6.2215, time 121.45ms
iter 557720: loss 6.6786, time 121.39ms
iter 557730: loss 6.4568, time 121.41ms
iter 557740: loss 5.4670, time 122.83ms
step 557750: train loss 5.5480, val loss 5.5415
saving checkpoint to out-shakespeare-char
iter 557750: loss 6.1838, time 2895.51ms
iter 557760: loss 6.0146, time 121.21ms
iter 557770: loss 6.0820, time 121.56ms
iter 557780: loss 6.4695, time 121.21ms
iter 557790: loss 5.1780, time 121.38ms
iter 557800: loss 5.3480, time 121.35ms
iter 557810: loss 6.6617, time 122.25ms
iter 557820: loss 5.8497, time 121.87ms
iter 557830: loss 5.5880, time 121.29ms
iter 557840: loss 5.1752, time 122.39ms
iter 557850: loss 6.0327, time 122.08ms
iter 557860: loss 6.2344, time 121.94ms
iter 557870: loss 5.7003, time 124.14ms
iter 557880: loss 5.6237, time 121.67ms
iter 557890: loss 6.5363, time 121.63ms
iter 557900: loss 5.6815, time 121.54ms
iter 557910: loss 6.0729, time 121.88ms
iter 557920: loss 6.3490, time 121.50ms
iter 557930: loss 6.5987, time 121.84ms
iter 557940: loss 5.4745, time 122.88ms
iter 557950: loss 6.7510, time 121.83ms
iter 557960: loss 6.8880, time 121.56ms
iter 557970: loss 6.4739, time 122.99ms
iter 557980: loss 6.2795, time 121.56ms
iter 557990: loss 5.5296, time 121.65ms
step 558000: train loss 5.5450, val loss 5.5221
saving checkpoint to out-shakespeare-char
iter 558000: loss 6.1950, time 2902.13ms
iter 558010: loss 6.3124, time 121.48ms
iter 558020: loss 5.5970, time 122.83ms
iter 558030: loss 6.0707, time 121.90ms
iter 558040: loss 6.1673, time 122.10ms
iter 558050: loss 5.7341, time 124.06ms
iter 558060: loss 6.1353, time 121.64ms
iter 558070: loss 5.7097, time 121.74ms
iter 558080: loss 6.4142, time 121.34ms
iter 558090: loss 6.1472, time 122.59ms
iter 558100: loss 6.0794, time 121.20ms
iter 558110: loss 5.8171, time 121.25ms
iter 558120: loss 6.1572, time 123.69ms
iter 558130: loss 5.6288, time 121.08ms
iter 558140: loss 6.2743, time 121.13ms
iter 558150: loss 6.2096, time 121.20ms
iter 558160: loss 6.2027, time 121.16ms
iter 558170: loss 5.6005, time 121.23ms
iter 558180: loss 5.5441, time 121.19ms
iter 558190: loss 5.3756, time 124.00ms
iter 558200: loss 6.6820, time 122.19ms
iter 558210: loss 5.2015, time 121.01ms
iter 558220: loss 6.1472, time 121.41ms
iter 558230: loss 6.4522, time 123.87ms
iter 558240: loss 5.4555, time 120.66ms
step 558250: train loss 5.5516, val loss 5.5356
saving checkpoint to out-shakespeare-char
iter 558250: loss 6.3078, time 2904.11ms
iter 558260: loss 6.3297, time 122.50ms
iter 558270: loss 5.9427, time 121.41ms
iter 558280: loss 6.0559, time 121.25ms
iter 558290: loss 5.4079, time 123.96ms
iter 558300: loss 5.9611, time 120.42ms
iter 558310: loss 5.9914, time 120.28ms
iter 558320: loss 5.7749, time 121.24ms
iter 558330: loss 5.5084, time 121.47ms
iter 558340: loss 5.6941, time 121.56ms
iter 558350: loss 5.9357, time 121.48ms
iter 558360: loss 6.0282, time 122.45ms
iter 558370: loss 5.6987, time 121.36ms
iter 558380: loss 5.2187, time 121.25ms
iter 558390: loss 5.8424, time 123.26ms
iter 558400: loss 6.2556, time 121.58ms
iter 558410: loss 6.0633, time 121.44ms
iter 558420: loss 6.4393, time 124.52ms
iter 558430: loss 5.6754, time 121.93ms
iter 558440: loss 6.4610, time 121.57ms
iter 558450: loss 5.6533, time 121.77ms
iter 558460: loss 6.2846, time 121.35ms
iter 558470: loss 6.3839, time 120.56ms
iter 558480: loss 6.0215, time 121.67ms
iter 558490: loss 5.6538, time 122.43ms
step 558500: train loss 5.5052, val loss 5.5479
saving checkpoint to out-shakespeare-char
iter 558500: loss 5.3299, time 2897.04ms
iter 558510: loss 6.2508, time 121.53ms
iter 558520: loss 5.5139, time 124.13ms
iter 558530: loss 6.0151, time 121.37ms
iter 558540: loss 6.7477, time 121.59ms
iter 558550: loss 5.8594, time 121.53ms
iter 558560: loss 6.5478, time 121.54ms
iter 558570: loss 5.9088, time 121.65ms
iter 558580: loss 5.7899, time 121.55ms
iter 558590: loss 6.1347, time 122.64ms
iter 558600: loss 6.1520, time 121.49ms
iter 558610: loss 6.4172, time 120.55ms
iter 558620: loss 6.1603, time 122.61ms
iter 558630: loss 6.1195, time 121.49ms
iter 558640: loss 5.1647, time 121.55ms
iter 558650: loss 5.8211, time 121.53ms
iter 558660: loss 6.3154, time 121.47ms
iter 558670: loss 5.6653, time 121.74ms
iter 558680: loss 6.2024, time 121.85ms
iter 558690: loss 6.2173, time 122.48ms
iter 558700: loss 6.1791, time 121.47ms
iter 558710: loss 5.3906, time 121.51ms
iter 558720: loss 6.1732, time 122.56ms
iter 558730: loss 5.0334, time 121.63ms
iter 558740: loss 6.2987, time 121.50ms
step 558750: train loss 5.5051, val loss 5.4778
saving checkpoint to out-shakespeare-char
iter 558750: loss 6.0996, time 2904.20ms
iter 558760: loss 6.5711, time 122.52ms
iter 558770: loss 4.8279, time 121.35ms
iter 558780: loss 6.2051, time 121.58ms
iter 558790: loss 5.7594, time 122.63ms
iter 558800: loss 6.2091, time 121.37ms
iter 558810: loss 5.6667, time 121.43ms
iter 558820: loss 5.7568, time 124.00ms
iter 558830: loss 5.6243, time 121.40ms
iter 558840: loss 6.1521, time 121.32ms
iter 558850: loss 5.6294, time 121.76ms
iter 558860: loss 5.5525, time 122.45ms
iter 558870: loss 6.5640, time 121.33ms
iter 558880: loss 5.5959, time 121.29ms
iter 558890: loss 6.3149, time 123.89ms
iter 558900: loss 6.6378, time 121.38ms
iter 558910: loss 5.7001, time 121.55ms
iter 558920: loss 5.9850, time 122.52ms
iter 558930: loss 5.9418, time 122.47ms
iter 558940: loss 5.6331, time 121.51ms
iter 558950: loss 5.8818, time 121.64ms
iter 558960: loss 6.0087, time 122.42ms
iter 558970: loss 5.9835, time 121.61ms
iter 558980: loss 5.9296, time 121.40ms
iter 558990: loss 5.0368, time 121.41ms
step 559000: train loss 5.5533, val loss 5.5665
saving checkpoint to out-shakespeare-char
iter 559000: loss 5.7329, time 2903.61ms
iter 559010: loss 5.9822, time 124.63ms
iter 559020: loss 6.0842, time 124.57ms
iter 559030: loss 5.4613, time 124.89ms
iter 559040: loss 6.2607, time 125.28ms
iter 559050: loss 6.1181, time 125.26ms
iter 559060: loss 6.0939, time 125.33ms
iter 559070: loss 6.0468, time 125.03ms
iter 559080: loss 6.4384, time 125.18ms
iter 559090: loss 6.5299, time 125.54ms
iter 559100: loss 6.3665, time 125.79ms
iter 559110: loss 5.5757, time 125.53ms
iter 559120: loss 6.2396, time 124.60ms
iter 559130: loss 5.7711, time 125.45ms
iter 559140: loss 7.0931, time 125.12ms
iter 559150: loss 5.4193, time 125.53ms
iter 559160: loss 6.4171, time 125.25ms
iter 559170: loss 5.6496, time 121.67ms
iter 559180: loss 6.1479, time 122.76ms
iter 559190: loss 6.8589, time 121.75ms
iter 559200: loss 5.5223, time 121.86ms
iter 559210: loss 5.9837, time 124.43ms
iter 559220: loss 5.3085, time 121.78ms
iter 559230: loss 5.6603, time 121.91ms
iter 559240: loss 6.4444, time 121.56ms
step 559250: train loss 5.5405, val loss 5.4713
saving checkpoint to out-shakespeare-char
iter 559250: loss 6.6413, time 2888.63ms
iter 559260: loss 6.0080, time 121.39ms
iter 559270: loss 5.9118, time 121.10ms
iter 559280: loss 6.0358, time 121.68ms
iter 559290: loss 6.2156, time 121.21ms
iter 559300: loss 5.7986, time 121.68ms
iter 559310: loss 6.0360, time 120.42ms
iter 559320: loss 5.8954, time 121.08ms
iter 559330: loss 5.5990, time 122.65ms
iter 559340: loss 5.8576, time 120.97ms
iter 559350: loss 6.0841, time 121.08ms
iter 559360: loss 5.7817, time 122.43ms
iter 559370: loss 5.4096, time 120.18ms
iter 559380: loss 6.2821, time 121.55ms
iter 559390: loss 5.2531, time 121.14ms
iter 559400: loss 6.2199, time 121.17ms
iter 559410: loss 6.1249, time 121.43ms
iter 559420: loss 6.3722, time 121.93ms
iter 559430: loss 6.3873, time 121.11ms
iter 559440: loss 5.8756, time 121.55ms
iter 559450: loss 6.5323, time 121.22ms
iter 559460: loss 6.2036, time 121.43ms
iter 559470: loss 6.6178, time 121.17ms
iter 559480: loss 6.5431, time 121.05ms
iter 559490: loss 6.1986, time 123.89ms
step 559500: train loss 5.5416, val loss 5.5212
saving checkpoint to out-shakespeare-char
iter 559500: loss 6.5175, time 2898.44ms
iter 559510: loss 5.2200, time 119.36ms
iter 559520: loss 5.7168, time 119.47ms
iter 559530: loss 6.1949, time 120.59ms
iter 559540: loss 5.8859, time 119.41ms
iter 559550: loss 6.3229, time 120.56ms
iter 559560: loss 6.9170, time 120.06ms
iter 559570: loss 5.7773, time 120.92ms
iter 559580: loss 5.8754, time 120.70ms
iter 559590: loss 6.1380, time 120.61ms
iter 559600: loss 5.7098, time 121.87ms
iter 559610: loss 6.2539, time 123.18ms
iter 559620: loss 5.5511, time 121.82ms
iter 559630: loss 5.8863, time 122.05ms
iter 559640: loss 5.2713, time 120.68ms
iter 559650: loss 5.4170, time 120.66ms
iter 559660: loss 5.8817, time 119.32ms
iter 559670: loss 5.9760, time 119.30ms
iter 559680: loss 5.6619, time 119.66ms
iter 559690: loss 5.7137, time 120.63ms
iter 559700: loss 5.7265, time 119.59ms
iter 559710: loss 6.6024, time 119.46ms
iter 559720: loss 5.9451, time 119.44ms
iter 559730: loss 6.0951, time 120.50ms
iter 559740: loss 5.8249, time 119.50ms
step 559750: train loss 5.5566, val loss 5.5007
saving checkpoint to out-shakespeare-char
iter 559750: loss 5.3301, time 2884.75ms
iter 559760: loss 5.5072, time 121.77ms
iter 559770: loss 6.6248, time 121.29ms
iter 559780: loss 5.8480, time 121.28ms
iter 559790: loss 5.7005, time 122.43ms
iter 559800: loss 6.1541, time 121.31ms
iter 559810: loss 5.7841, time 121.85ms
iter 559820: loss 5.5837, time 122.32ms
iter 559830: loss 5.8129, time 121.58ms
iter 559840: loss 6.3780, time 121.50ms
iter 559850: loss 5.9151, time 123.88ms
iter 559860: loss 5.1299, time 121.17ms
iter 559870: loss 6.5067, time 121.27ms
iter 559880: loss 6.0889, time 121.43ms
iter 559890: loss 5.9432, time 121.20ms
iter 559900: loss 5.7386, time 122.41ms
iter 559910: loss 6.4291, time 121.34ms
iter 559920: loss 5.8429, time 124.27ms
iter 559930: loss 6.2550, time 121.34ms
iter 559940: loss 5.3001, time 121.27ms
iter 559950: loss 5.6555, time 121.24ms
iter 559960: loss 5.4824, time 122.38ms
iter 559970: loss 6.0039, time 121.33ms
iter 559980: loss 6.5098, time 121.40ms
iter 559990: loss 6.1344, time 121.37ms
step 560000: train loss 5.5231, val loss 5.5422
saving checkpoint to out-shakespeare-char
iter 560000: loss 6.7740, time 2890.46ms
iter 560010: loss 6.0464, time 126.01ms
iter 560020: loss 6.3675, time 128.80ms
iter 560030: loss 5.8898, time 125.96ms
iter 560040: loss 5.8622, time 128.52ms
iter 560050: loss 5.5972, time 126.95ms
iter 560060: loss 6.0851, time 128.49ms
iter 560070: loss 5.9272, time 125.91ms
iter 560080: loss 6.3374, time 128.11ms
iter 560090: loss 5.9008, time 125.50ms
iter 560100: loss 5.4347, time 128.02ms
iter 560110: loss 6.1725, time 125.68ms
iter 560120: loss 6.1411, time 128.39ms
iter 560130: loss 6.5691, time 125.77ms
iter 560140: loss 6.8246, time 127.77ms
iter 560150: loss 5.4466, time 125.56ms
iter 560160: loss 6.1371, time 128.07ms
iter 560170: loss 6.1165, time 125.56ms
iter 560180: loss 5.6273, time 128.23ms
iter 560190: loss 5.3387, time 124.86ms
iter 560200: loss 6.5885, time 124.13ms
iter 560210: loss 5.9202, time 124.95ms
iter 560220: loss 6.2737, time 124.97ms
iter 560230: loss 6.1894, time 124.85ms
iter 560240: loss 5.5582, time 124.15ms
step 560250: train loss 5.5384, val loss 5.5261
saving checkpoint to out-shakespeare-char
iter 560250: loss 5.5984, time 2883.77ms
iter 560260: loss 5.3621, time 125.75ms
iter 560270: loss 5.7542, time 125.63ms
iter 560280: loss 5.7010, time 128.18ms
iter 560290: loss 5.3089, time 125.48ms
iter 560300: loss 6.8428, time 128.25ms
iter 560310: loss 6.2237, time 125.74ms
iter 560320: loss 6.0377, time 128.15ms
iter 560330: loss 6.0099, time 125.63ms
iter 560340: loss 5.6645, time 128.39ms
iter 560350: loss 5.9461, time 125.66ms
iter 560360: loss 5.8653, time 128.25ms
iter 560370: loss 6.1624, time 123.91ms
iter 560380: loss 6.0818, time 128.26ms
iter 560390: loss 5.4040, time 125.73ms
iter 560400: loss 5.4838, time 128.40ms
iter 560410: loss 5.8149, time 125.83ms
iter 560420: loss 6.6146, time 128.52ms
iter 560430: loss 5.9117, time 125.72ms
iter 560440: loss 5.6714, time 125.60ms
iter 560450: loss 5.9885, time 125.66ms
iter 560460: loss 5.7041, time 128.50ms
iter 560470: loss 5.9867, time 125.57ms
iter 560480: loss 6.1544, time 128.22ms
iter 560490: loss 5.6210, time 126.04ms
step 560500: train loss 5.5691, val loss 5.5468
saving checkpoint to out-shakespeare-char
iter 560500: loss 5.3989, time 2885.84ms
iter 560510: loss 6.0213, time 125.24ms
iter 560520: loss 6.2513, time 127.73ms
iter 560530: loss 6.1907, time 125.21ms
iter 560540: loss 5.7214, time 127.92ms
iter 560550: loss 6.0017, time 125.11ms
iter 560560: loss 5.5247, time 127.86ms
iter 560570: loss 6.2689, time 125.23ms
iter 560580: loss 5.8116, time 127.63ms
iter 560590: loss 5.6354, time 125.02ms
iter 560600: loss 6.4530, time 127.82ms
iter 560610: loss 6.1274, time 125.05ms
iter 560620: loss 5.9977, time 127.80ms
iter 560630: loss 6.4013, time 124.64ms
iter 560640: loss 5.7934, time 127.82ms
iter 560650: loss 6.1241, time 123.55ms
iter 560660: loss 5.2626, time 127.13ms
iter 560670: loss 5.7769, time 125.00ms
iter 560680: loss 6.0547, time 125.13ms
iter 560690: loss 6.2783, time 124.31ms
iter 560700: loss 6.0141, time 124.83ms
iter 560710: loss 6.4752, time 125.34ms
iter 560720: loss 6.4990, time 124.82ms
iter 560730: loss 6.1533, time 122.77ms
iter 560740: loss 5.9401, time 124.86ms
step 560750: train loss 5.5463, val loss 5.5093
saving checkpoint to out-shakespeare-char
iter 560750: loss 6.6040, time 2897.65ms
iter 560760: loss 6.3891, time 125.31ms
iter 560770: loss 4.7213, time 125.31ms
iter 560780: loss 6.0385, time 125.22ms
iter 560790: loss 5.3461, time 125.49ms
iter 560800: loss 5.4447, time 125.56ms
iter 560810: loss 6.5157, time 125.72ms
iter 560820: loss 6.0996, time 125.61ms
iter 560830: loss 6.2698, time 125.93ms
iter 560840: loss 6.4394, time 125.65ms
iter 560850: loss 5.9894, time 125.59ms
iter 560860: loss 6.2733, time 126.00ms
iter 560870: loss 6.3133, time 125.83ms
iter 560880: loss 5.9228, time 125.54ms
iter 560890: loss 6.2623, time 125.84ms
iter 560900: loss 5.8697, time 125.76ms
iter 560910: loss 5.9461, time 126.17ms
iter 560920: loss 5.1920, time 125.65ms
iter 560930: loss 5.8087, time 123.79ms
iter 560940: loss 5.7532, time 125.70ms
iter 560950: loss 5.3953, time 125.74ms
iter 560960: loss 6.7271, time 125.59ms
iter 560970: loss 5.6042, time 126.24ms
iter 560980: loss 4.9302, time 125.61ms
iter 560990: loss 6.3621, time 125.80ms
step 561000: train loss 5.5387, val loss 5.5471
saving checkpoint to out-shakespeare-char
iter 561000: loss 5.2800, time 2890.63ms
iter 561010: loss 6.0214, time 128.00ms
iter 561020: loss 6.4987, time 125.84ms
iter 561030: loss 5.5010, time 128.25ms
iter 561040: loss 6.5675, time 121.28ms
iter 561050: loss 5.7112, time 121.30ms
iter 561060: loss 6.1624, time 122.21ms
iter 561070: loss 6.2471, time 122.98ms
iter 561080: loss 6.1775, time 122.00ms
iter 561090: loss 5.7199, time 121.96ms
iter 561100: loss 6.1164, time 124.59ms
iter 561110: loss 6.0568, time 123.01ms
iter 561120: loss 6.5075, time 121.75ms
iter 561130: loss 6.2477, time 121.81ms
iter 561140: loss 5.6157, time 122.83ms
iter 561150: loss 6.0583, time 121.56ms
iter 561160: loss 6.3342, time 121.73ms
iter 561170: loss 6.1218, time 122.78ms
iter 561180: loss 5.7871, time 122.06ms
iter 561190: loss 5.6705, time 121.77ms
iter 561200: loss 5.8984, time 124.34ms
iter 561210: loss 6.0200, time 121.70ms
iter 561220: loss 6.2923, time 121.67ms
iter 561230: loss 5.8058, time 121.70ms
iter 561240: loss 5.8336, time 122.14ms
step 561250: train loss 5.5132, val loss 5.5419
saving checkpoint to out-shakespeare-char
iter 561250: loss 6.2329, time 2891.86ms
iter 561260: loss 6.2366, time 125.53ms
iter 561270: loss 6.4458, time 124.94ms
iter 561280: loss 5.7471, time 125.02ms
iter 561290: loss 6.8823, time 125.29ms
iter 561300: loss 5.9132, time 125.39ms
iter 561310: loss 6.3048, time 125.89ms
iter 561320: loss 6.2047, time 125.66ms
iter 561330: loss 5.7406, time 125.17ms
iter 561340: loss 6.3635, time 125.53ms
iter 561350: loss 5.8477, time 125.64ms
iter 561360: loss 6.3615, time 124.60ms
iter 561370: loss 6.5535, time 125.90ms
iter 561380: loss 6.0548, time 125.81ms
iter 561390: loss 5.8474, time 126.20ms
iter 561400: loss 6.0342, time 125.30ms
iter 561410: loss 6.7715, time 125.50ms
iter 561420: loss 5.6691, time 125.35ms
iter 561430: loss 5.8429, time 125.19ms
iter 561440: loss 5.4085, time 124.23ms
iter 561450: loss 6.0580, time 125.24ms
iter 561460: loss 5.7816, time 124.69ms
iter 561470: loss 5.7324, time 125.01ms
iter 561480: loss 5.9923, time 124.48ms
iter 561490: loss 5.8909, time 125.29ms
step 561500: train loss 5.6169, val loss 5.5691
saving checkpoint to out-shakespeare-char
iter 561500: loss 5.9868, time 2898.16ms
iter 561510: loss 6.5511, time 125.27ms
iter 561520: loss 6.4326, time 125.02ms
iter 561530: loss 6.2121, time 122.79ms
iter 561540: loss 5.9771, time 125.32ms
iter 561550: loss 5.8621, time 125.64ms
iter 561560: loss 5.6932, time 125.57ms
iter 561570: loss 5.5617, time 123.14ms
iter 561580: loss 5.8807, time 125.09ms
iter 561590: loss 5.9755, time 125.32ms
iter 561600: loss 6.0595, time 125.26ms
iter 561610: loss 5.8032, time 122.81ms
iter 561620: loss 5.5865, time 125.66ms
iter 561630: loss 6.3666, time 125.05ms
iter 561640: loss 5.8615, time 125.34ms
iter 561650: loss 5.1183, time 123.77ms
iter 561660: loss 6.0927, time 125.01ms
iter 561670: loss 6.3536, time 125.25ms
iter 561680: loss 5.7722, time 125.14ms
iter 561690: loss 6.0072, time 124.60ms
iter 561700: loss 6.1560, time 125.16ms
iter 561710: loss 5.7485, time 125.28ms
iter 561720: loss 5.4154, time 125.11ms
iter 561730: loss 6.0256, time 124.48ms
iter 561740: loss 6.1140, time 125.41ms
step 561750: train loss 5.5458, val loss 5.5119
saving checkpoint to out-shakespeare-char
iter 561750: loss 6.1920, time 2893.81ms
iter 561760: loss 6.9811, time 125.43ms
iter 561770: loss 5.9495, time 125.17ms
iter 561780: loss 6.0603, time 125.04ms
iter 561790: loss 5.8371, time 125.29ms
iter 561800: loss 5.9301, time 125.44ms
iter 561810: loss 5.4243, time 125.20ms
iter 561820: loss 6.0799, time 124.42ms
iter 561830: loss 5.5674, time 124.76ms
iter 561840: loss 5.5585, time 124.90ms
iter 561850: loss 5.7318, time 125.03ms
iter 561860: loss 5.8424, time 125.19ms
iter 561870: loss 6.6397, time 125.15ms
iter 561880: loss 6.4975, time 125.71ms
iter 561890: loss 5.6113, time 125.58ms
iter 561900: loss 5.5579, time 125.45ms
iter 561910: loss 5.3688, time 125.57ms
iter 561920: loss 5.9977, time 125.40ms
iter 561930: loss 5.7424, time 125.50ms
iter 561940: loss 5.8036, time 126.11ms
iter 561950: loss 6.5912, time 125.57ms
iter 561960: loss 5.9938, time 125.56ms
iter 561970: loss 6.2531, time 128.15ms
iter 561980: loss 6.8181, time 125.63ms
iter 561990: loss 6.2830, time 128.05ms
step 562000: train loss 5.5475, val loss 5.5355
saving checkpoint to out-shakespeare-char
iter 562000: loss 5.9727, time 2900.56ms
iter 562010: loss 6.1945, time 124.90ms
iter 562020: loss 5.5712, time 125.43ms
iter 562030: loss 6.6146, time 126.18ms
iter 562040: loss 6.4258, time 126.73ms
iter 562050: loss 5.5390, time 126.05ms
iter 562060: loss 6.2797, time 125.81ms
iter 562070: loss 5.6927, time 125.39ms
iter 562080: loss 6.2077, time 125.63ms
iter 562090: loss 6.5423, time 125.80ms
iter 562100: loss 6.2622, time 125.39ms
iter 562110: loss 6.0952, time 124.79ms
iter 562120: loss 6.1149, time 125.57ms
iter 562130: loss 5.9745, time 125.54ms
iter 562140: loss 6.1413, time 125.70ms
iter 562150: loss 5.7016, time 129.12ms
iter 562160: loss 5.7860, time 124.48ms
iter 562170: loss 6.1239, time 124.51ms
iter 562180: loss 5.1389, time 124.85ms
iter 562190: loss 6.2409, time 124.88ms
iter 562200: loss 6.3720, time 124.81ms
iter 562210: loss 5.4923, time 125.02ms
iter 562220: loss 5.5928, time 125.03ms
iter 562230: loss 6.0233, time 125.44ms
iter 562240: loss 6.2413, time 125.09ms
step 562250: train loss 5.5071, val loss 5.5232
saving checkpoint to out-shakespeare-char
iter 562250: loss 5.7573, time 2897.31ms
iter 562260: loss 5.6104, time 125.80ms
iter 562270: loss 5.3904, time 125.48ms
iter 562280: loss 6.2342, time 125.66ms
iter 562290: loss 6.1088, time 125.58ms
iter 562300: loss 6.7147, time 125.85ms
iter 562310: loss 6.1446, time 120.56ms
iter 562320: loss 5.9743, time 119.55ms
iter 562330: loss 5.6812, time 120.79ms
iter 562340: loss 6.0842, time 119.18ms
iter 562350: loss 6.8998, time 120.76ms
iter 562360: loss 5.8018, time 119.66ms
iter 562370: loss 6.1644, time 119.47ms
iter 562380: loss 5.9883, time 120.59ms
iter 562390: loss 5.8174, time 119.35ms
iter 562400: loss 6.1861, time 120.54ms
iter 562410: loss 5.3910, time 121.57ms
iter 562420: loss 5.0880, time 123.96ms
iter 562430: loss 5.5419, time 121.43ms
iter 562440: loss 6.1339, time 121.42ms
iter 562450: loss 6.2686, time 121.24ms
iter 562460: loss 5.8963, time 121.52ms
iter 562470: loss 6.0571, time 121.40ms
iter 562480: loss 5.9407, time 121.79ms
iter 562490: loss 6.1048, time 122.58ms
step 562500: train loss 5.5311, val loss 5.5743
saving checkpoint to out-shakespeare-char
iter 562500: loss 6.0335, time 2906.10ms
iter 562510: loss 6.2125, time 125.76ms
iter 562520: loss 5.5208, time 125.48ms
iter 562530: loss 5.6407, time 125.73ms
iter 562540: loss 5.9374, time 125.58ms
iter 562550: loss 6.1347, time 125.58ms
iter 562560: loss 5.6764, time 125.44ms
iter 562570: loss 5.8058, time 125.56ms
iter 562580: loss 5.3711, time 126.61ms
iter 562590: loss 5.3013, time 125.43ms
iter 562600: loss 5.4675, time 125.65ms
iter 562610: loss 6.6936, time 125.28ms
iter 562620: loss 6.0784, time 125.51ms
iter 562630: loss 5.8701, time 125.62ms
iter 562640: loss 5.9977, time 125.76ms
iter 562650: loss 5.8094, time 125.46ms
iter 562660: loss 5.7656, time 125.70ms
iter 562670: loss 4.7712, time 125.54ms
iter 562680: loss 5.8306, time 125.46ms
iter 562690: loss 6.4279, time 125.44ms
iter 562700: loss 5.6612, time 125.57ms
iter 562710: loss 5.5669, time 126.25ms
iter 562720: loss 6.2181, time 125.74ms
iter 562730: loss 6.3037, time 126.27ms
iter 562740: loss 5.8210, time 125.50ms
step 562750: train loss 5.5026, val loss 5.5129
saving checkpoint to out-shakespeare-char
iter 562750: loss 5.6747, time 2895.99ms
iter 562760: loss 6.1398, time 125.82ms
iter 562770: loss 5.1784, time 127.01ms
iter 562780: loss 6.1832, time 125.84ms
iter 562790: loss 6.0230, time 126.04ms
iter 562800: loss 5.6598, time 125.75ms
iter 562810: loss 6.5052, time 125.84ms
iter 562820: loss 6.4946, time 126.27ms
iter 562830: loss 5.9003, time 125.71ms
iter 562840: loss 5.8008, time 125.41ms
iter 562850: loss 6.5786, time 125.82ms
iter 562860: loss 6.4766, time 125.83ms
iter 562870: loss 6.2622, time 125.55ms
iter 562880: loss 6.0574, time 125.85ms
iter 562890: loss 5.9029, time 125.88ms
iter 562900: loss 6.4842, time 125.99ms
iter 562910: loss 6.1533, time 125.80ms
iter 562920: loss 5.3552, time 125.76ms
iter 562930: loss 5.6139, time 125.83ms
iter 562940: loss 5.5249, time 125.91ms
iter 562950: loss 5.7135, time 126.05ms
iter 562960: loss 5.1595, time 126.09ms
iter 562970: loss 5.8933, time 126.24ms
iter 562980: loss 5.8035, time 126.19ms
iter 562990: loss 5.6733, time 125.52ms
step 563000: train loss 5.4939, val loss 5.5067
saving checkpoint to out-shakespeare-char
iter 563000: loss 6.0502, time 2885.72ms
iter 563010: loss 6.2020, time 125.18ms
iter 563020: loss 6.6960, time 125.81ms
iter 563030: loss 5.7257, time 125.43ms
iter 563040: loss 5.8267, time 125.61ms
iter 563050: loss 5.6695, time 125.96ms
iter 563060: loss 6.3060, time 125.25ms
iter 563070: loss 5.6516, time 125.26ms
iter 563080: loss 5.5614, time 125.05ms
iter 563090: loss 5.7077, time 125.32ms
iter 563100: loss 6.1953, time 125.09ms
iter 563110: loss 6.0104, time 124.86ms
iter 563120: loss 6.3368, time 125.53ms
iter 563130: loss 5.8710, time 124.12ms
iter 563140: loss 6.1799, time 125.21ms
iter 563150: loss 5.8620, time 125.26ms
iter 563160: loss 5.2370, time 124.21ms
iter 563170: loss 6.7318, time 125.26ms
iter 563180: loss 6.1286, time 124.77ms
iter 563190: loss 5.7520, time 125.12ms
iter 563200: loss 6.3035, time 124.05ms
iter 563210: loss 5.3726, time 124.60ms
iter 563220: loss 6.3138, time 125.83ms
iter 563230: loss 5.2734, time 124.77ms
iter 563240: loss 5.5091, time 125.95ms
step 563250: train loss 5.4889, val loss 5.5439
saving checkpoint to out-shakespeare-char
iter 563250: loss 6.2666, time 2915.84ms
iter 563260: loss 6.1387, time 125.15ms
iter 563270: loss 5.6365, time 125.08ms
iter 563280: loss 6.3621, time 125.33ms
iter 563290: loss 5.2323, time 126.76ms
iter 563300: loss 6.4677, time 125.46ms
iter 563310: loss 6.5801, time 127.66ms
iter 563320: loss 5.6616, time 125.45ms
iter 563330: loss 6.0554, time 126.69ms
iter 563340: loss 5.7834, time 125.68ms
iter 563350: loss 6.3960, time 126.82ms
iter 563360: loss 5.6869, time 125.27ms
iter 563370: loss 4.7700, time 126.75ms
iter 563380: loss 6.0214, time 124.97ms
iter 563390: loss 6.1818, time 126.38ms
iter 563400: loss 6.2533, time 124.94ms
iter 563410: loss 5.5488, time 127.70ms
iter 563420: loss 5.9292, time 124.14ms
iter 563430: loss 5.9276, time 127.64ms
iter 563440: loss 5.7221, time 125.08ms
iter 563450: loss 5.6983, time 127.88ms
iter 563460: loss 5.4890, time 124.88ms
iter 563470: loss 5.8997, time 127.82ms
iter 563480: loss 5.3676, time 125.26ms
iter 563490: loss 6.0452, time 127.48ms
step 563500: train loss 5.5629, val loss 5.5686
saving checkpoint to out-shakespeare-char
iter 563500: loss 5.7299, time 2911.33ms
iter 563510: loss 6.4169, time 125.46ms
iter 563520: loss 5.7155, time 125.09ms
iter 563530: loss 5.5389, time 125.41ms
iter 563540: loss 6.3214, time 124.73ms
iter 563550: loss 5.2159, time 125.21ms
iter 563560: loss 5.9988, time 125.14ms
iter 563570: loss 5.4885, time 125.93ms
iter 563580: loss 5.9853, time 125.94ms
iter 563590: loss 5.4866, time 125.75ms
iter 563600: loss 5.8495, time 125.74ms
iter 563610: loss 5.7815, time 125.68ms
iter 563620: loss 6.4032, time 125.78ms
iter 563630: loss 6.2890, time 125.90ms
iter 563640: loss 5.0822, time 125.20ms
iter 563650: loss 6.4276, time 125.22ms
iter 563660: loss 5.6976, time 126.62ms
iter 563670: loss 5.7088, time 125.61ms
iter 563680: loss 6.3080, time 125.29ms
iter 563690: loss 6.4292, time 125.74ms
iter 563700: loss 5.9002, time 125.66ms
iter 563710: loss 5.3288, time 125.74ms
iter 563720: loss 5.6528, time 125.26ms
iter 563730: loss 5.3116, time 125.51ms
iter 563740: loss 5.2724, time 125.94ms
step 563750: train loss 5.5514, val loss 5.5421
saving checkpoint to out-shakespeare-char
iter 563750: loss 5.8582, time 2905.88ms
iter 563760: loss 6.0646, time 122.26ms
iter 563770: loss 6.2800, time 121.80ms
iter 563780: loss 5.2556, time 121.82ms
iter 563790: loss 6.1482, time 123.10ms
iter 563800: loss 5.9243, time 121.69ms
iter 563810: loss 5.5426, time 122.27ms
iter 563820: loss 5.4275, time 122.84ms
iter 563830: loss 6.0483, time 121.76ms
iter 563840: loss 5.9375, time 121.93ms
iter 563850: loss 5.9722, time 124.63ms
iter 563860: loss 6.3471, time 121.75ms
iter 563870: loss 5.8860, time 121.68ms
iter 563880: loss 5.9165, time 121.80ms
iter 563890: loss 6.0777, time 121.71ms
iter 563900: loss 6.2424, time 121.03ms
iter 563910: loss 5.9853, time 121.72ms
iter 563920: loss 5.7524, time 122.89ms
iter 563930: loss 5.8268, time 121.71ms
iter 563940: loss 5.9873, time 122.24ms
iter 563950: loss 5.8758, time 122.65ms
iter 563960: loss 6.6014, time 121.70ms
iter 563970: loss 5.4677, time 121.90ms
iter 563980: loss 5.6922, time 124.32ms
iter 563990: loss 6.2864, time 121.81ms
step 564000: train loss 5.5067, val loss 5.5225
saving checkpoint to out-shakespeare-char
iter 564000: loss 5.5275, time 2889.27ms
iter 564010: loss 6.0792, time 121.47ms
iter 564020: loss 5.5950, time 122.81ms
iter 564030: loss 6.5085, time 121.32ms
iter 564040: loss 6.0122, time 121.60ms
iter 564050: loss 5.6710, time 123.84ms
iter 564060: loss 5.9896, time 121.36ms
iter 564070: loss 5.3706, time 121.43ms
iter 564080: loss 5.8022, time 121.47ms
iter 564090: loss 6.2241, time 121.73ms
iter 564100: loss 5.5759, time 121.49ms
iter 564110: loss 6.2781, time 121.34ms
iter 564120: loss 6.2078, time 122.42ms
iter 564130: loss 6.2874, time 121.43ms
iter 564140: loss 5.7381, time 122.86ms
iter 564150: loss 5.4012, time 121.35ms
iter 564160: loss 6.2783, time 122.50ms
iter 564170: loss 5.6956, time 121.18ms
iter 564180: loss 5.5446, time 121.55ms
iter 564190: loss 6.8397, time 121.24ms
iter 564200: loss 6.4660, time 121.45ms
iter 564210: loss 5.5307, time 121.86ms
iter 564220: loss 6.7369, time 121.49ms
iter 564230: loss 5.8113, time 122.46ms
iter 564240: loss 6.3125, time 121.35ms
step 564250: train loss 5.5221, val loss 5.5346
saving checkpoint to out-shakespeare-char
iter 564250: loss 6.2259, time 2897.19ms
iter 564260: loss 6.0795, time 121.45ms
iter 564270: loss 6.0995, time 122.94ms
iter 564280: loss 6.1757, time 120.56ms
iter 564290: loss 6.2729, time 121.29ms
iter 564300: loss 5.8640, time 121.46ms
iter 564310: loss 5.5628, time 121.60ms
iter 564320: loss 5.8018, time 121.77ms
iter 564330: loss 6.5642, time 121.55ms
iter 564340: loss 5.8937, time 121.59ms
iter 564350: loss 6.6999, time 121.52ms
iter 564360: loss 6.1662, time 121.57ms
iter 564370: loss 5.2588, time 121.51ms
iter 564380: loss 6.6910, time 121.50ms
iter 564390: loss 5.2063, time 121.52ms
iter 564400: loss 6.2753, time 121.37ms
iter 564410: loss 6.1434, time 122.56ms
iter 564420: loss 5.8883, time 121.95ms
iter 564430: loss 5.6938, time 121.45ms
iter 564440: loss 5.8893, time 121.85ms
iter 564450: loss 6.0753, time 121.33ms
iter 564460: loss 6.0059, time 120.84ms
iter 564470: loss 5.9330, time 123.87ms
iter 564480: loss 5.2027, time 121.89ms
iter 564490: loss 6.1247, time 121.66ms
step 564500: train loss 5.4887, val loss 5.5245
saving checkpoint to out-shakespeare-char
iter 564500: loss 5.0050, time 2903.37ms
iter 564510: loss 6.2602, time 124.42ms
iter 564520: loss 6.3067, time 125.21ms
iter 564530: loss 6.3554, time 125.24ms
iter 564540: loss 6.0876, time 125.23ms
iter 564550: loss 5.5578, time 125.19ms
iter 564560: loss 6.9890, time 125.20ms
iter 564570: loss 5.2212, time 125.31ms
iter 564580: loss 5.9743, time 125.76ms
iter 564590: loss 6.3682, time 125.27ms
iter 564600: loss 6.1394, time 125.49ms
iter 564610: loss 6.1619, time 125.67ms
iter 564620: loss 6.2406, time 125.50ms
iter 564630: loss 5.7570, time 124.70ms
iter 564640: loss 6.1956, time 124.77ms
iter 564650: loss 6.0192, time 125.86ms
iter 564660: loss 6.1880, time 125.44ms
iter 564670: loss 6.7876, time 124.74ms
iter 564680: loss 6.5119, time 125.27ms
iter 564690: loss 5.8980, time 125.67ms
iter 564700: loss 5.5375, time 125.42ms
iter 564710: loss 5.4667, time 124.29ms
iter 564720: loss 5.7715, time 125.44ms
iter 564730: loss 6.2864, time 125.96ms
iter 564740: loss 6.2216, time 126.62ms
step 564750: train loss 5.4849, val loss 5.5242
saving checkpoint to out-shakespeare-char
iter 564750: loss 6.1995, time 2882.64ms
iter 564760: loss 5.6284, time 120.33ms
iter 564770: loss 5.8273, time 121.12ms
iter 564780: loss 6.1410, time 121.47ms
iter 564790: loss 6.1477, time 120.71ms
iter 564800: loss 6.2623, time 121.49ms
iter 564810: loss 5.5987, time 120.72ms
iter 564820: loss 5.6949, time 122.48ms
iter 564830: loss 5.6554, time 121.53ms
iter 564840: loss 5.7907, time 121.47ms
iter 564850: loss 5.2065, time 123.90ms
iter 564860: loss 5.6864, time 121.03ms
iter 564870: loss 6.4785, time 121.39ms
iter 564880: loss 6.7330, time 121.73ms
iter 564890: loss 5.4345, time 122.91ms
iter 564900: loss 6.3376, time 121.38ms
iter 564910: loss 6.3015, time 122.79ms
iter 564920: loss 6.1994, time 123.87ms
iter 564930: loss 6.7895, time 121.27ms
iter 564940: loss 6.5449, time 121.45ms
iter 564950: loss 7.0348, time 119.92ms
iter 564960: loss 5.7580, time 121.09ms
iter 564970: loss 6.2864, time 120.87ms
iter 564980: loss 6.2246, time 121.32ms
iter 564990: loss 5.5665, time 122.47ms
step 565000: train loss 5.5333, val loss 5.5476
saving checkpoint to out-shakespeare-char
iter 565000: loss 5.8104, time 2888.44ms
iter 565010: loss 6.0020, time 120.80ms
iter 565020: loss 6.1162, time 121.59ms
iter 565030: loss 6.0891, time 124.08ms
iter 565040: loss 5.9501, time 122.54ms
iter 565050: loss 6.0030, time 121.25ms
iter 565060: loss 5.8980, time 122.17ms
iter 565070: loss 6.1621, time 120.52ms
iter 565080: loss 6.0267, time 121.96ms
iter 565090: loss 5.6114, time 121.65ms
iter 565100: loss 5.8745, time 119.76ms
iter 565110: loss 5.5031, time 121.97ms
iter 565120: loss 5.3852, time 121.68ms
iter 565130: loss 5.9361, time 122.58ms
iter 565140: loss 6.0098, time 120.59ms
iter 565150: loss 5.7175, time 120.86ms
iter 565160: loss 6.0337, time 122.19ms
iter 565170: loss 6.1672, time 121.52ms
iter 565180: loss 6.0223, time 121.51ms
iter 565190: loss 5.7738, time 122.49ms
iter 565200: loss 6.2069, time 120.93ms
iter 565210: loss 6.2606, time 120.38ms
iter 565220: loss 5.5922, time 122.00ms
iter 565230: loss 5.0730, time 120.92ms
iter 565240: loss 6.4074, time 122.74ms
step 565250: train loss 5.5070, val loss 5.5634
saving checkpoint to out-shakespeare-char
iter 565250: loss 6.1258, time 2898.90ms
iter 565260: loss 6.3163, time 125.34ms
iter 565270: loss 6.4792, time 124.62ms
iter 565280: loss 6.0962, time 124.84ms
iter 565290: loss 6.0369, time 124.99ms
iter 565300: loss 6.2069, time 125.07ms
iter 565310: loss 5.6827, time 126.38ms
iter 565320: loss 6.2555, time 124.90ms
iter 565330: loss 6.3011, time 125.86ms
iter 565340: loss 6.4076, time 125.43ms
iter 565350: loss 6.1918, time 127.27ms
iter 565360: loss 6.4174, time 125.15ms
iter 565370: loss 6.2845, time 127.58ms
iter 565380: loss 5.6818, time 124.46ms
iter 565390: loss 6.9024, time 128.05ms
iter 565400: loss 5.4029, time 124.72ms
iter 565410: loss 6.2875, time 126.32ms
iter 565420: loss 6.3845, time 122.50ms
iter 565430: loss 5.7118, time 120.83ms
iter 565440: loss 5.6945, time 120.39ms
iter 565450: loss 5.5377, time 122.04ms
iter 565460: loss 6.2764, time 120.60ms
iter 565470: loss 5.6279, time 121.68ms
iter 565480: loss 6.0465, time 121.09ms
iter 565490: loss 5.7576, time 122.23ms
step 565500: train loss 5.5817, val loss 5.5241
saving checkpoint to out-shakespeare-char
iter 565500: loss 6.0720, time 2888.56ms
iter 565510: loss 5.7494, time 120.82ms
iter 565520: loss 6.1354, time 121.28ms
iter 565530: loss 6.3770, time 121.19ms
iter 565540: loss 6.3527, time 121.44ms
iter 565550: loss 5.7817, time 121.75ms
iter 565560: loss 5.5555, time 121.98ms
iter 565570: loss 5.6928, time 122.47ms
iter 565580: loss 5.7564, time 122.41ms
iter 565590: loss 5.5923, time 121.26ms
iter 565600: loss 5.4739, time 121.47ms
iter 565610: loss 6.1879, time 122.37ms
iter 565620: loss 5.9820, time 121.29ms
iter 565630: loss 6.0902, time 121.30ms
iter 565640: loss 6.5018, time 121.36ms
iter 565650: loss 5.7534, time 122.23ms
iter 565660: loss 5.0943, time 121.32ms
iter 565670: loss 6.1872, time 121.26ms
iter 565680: loss 5.9998, time 121.30ms
iter 565690: loss 6.3437, time 122.48ms
iter 565700: loss 5.5417, time 121.66ms
iter 565710: loss 5.2952, time 121.73ms
iter 565720: loss 5.8481, time 122.85ms
iter 565730: loss 6.4346, time 121.31ms
iter 565740: loss 5.4982, time 121.81ms
step 565750: train loss 5.4957, val loss 5.5402
saving checkpoint to out-shakespeare-char
iter 565750: loss 5.5836, time 2895.46ms
iter 565760: loss 6.9451, time 122.74ms
iter 565770: loss 5.0570, time 121.51ms
iter 565780: loss 7.3921, time 121.94ms
iter 565790: loss 5.8307, time 123.70ms
iter 565800: loss 6.1048, time 121.90ms
iter 565810: loss 6.0232, time 121.33ms
iter 565820: loss 5.8959, time 122.15ms
iter 565830: loss 6.1723, time 121.80ms
iter 565840: loss 6.1963, time 121.42ms
iter 565850: loss 5.6329, time 121.47ms
iter 565860: loss 5.6549, time 119.42ms
iter 565870: loss 6.2852, time 121.96ms
iter 565880: loss 5.8533, time 121.62ms
iter 565890: loss 5.6476, time 120.09ms
iter 565900: loss 5.8325, time 121.17ms
iter 565910: loss 6.1862, time 119.69ms
iter 565920: loss 5.8475, time 120.50ms
iter 565930: loss 5.3688, time 119.52ms
iter 565940: loss 6.4946, time 120.61ms
iter 565950: loss 5.9245, time 119.62ms
iter 565960: loss 6.7117, time 119.40ms
iter 565970: loss 5.9922, time 119.48ms
iter 565980: loss 6.2954, time 122.36ms
iter 565990: loss 5.1581, time 121.93ms
step 566000: train loss 5.5339, val loss 5.5706
saving checkpoint to out-shakespeare-char
iter 566000: loss 5.4790, time 2900.12ms
iter 566010: loss 5.2098, time 124.02ms
iter 566020: loss 5.8985, time 120.99ms
iter 566030: loss 6.7056, time 121.45ms
iter 566040: loss 5.2906, time 121.45ms
iter 566050: loss 5.5038, time 121.74ms
iter 566060: loss 6.4527, time 121.49ms
iter 566070: loss 6.1528, time 121.38ms
iter 566080: loss 6.2658, time 122.44ms
iter 566090: loss 5.9123, time 121.42ms
iter 566100: loss 6.5103, time 121.50ms
iter 566110: loss 5.3633, time 123.89ms
iter 566120: loss 6.1212, time 121.32ms
iter 566130: loss 5.9619, time 121.32ms
iter 566140: loss 5.5049, time 121.79ms
iter 566150: loss 6.4279, time 122.36ms
iter 566160: loss 6.2510, time 121.36ms
iter 566170: loss 5.7567, time 121.35ms
iter 566180: loss 5.6845, time 121.38ms
iter 566190: loss 5.8533, time 121.33ms
iter 566200: loss 5.9139, time 121.34ms
iter 566210: loss 6.0101, time 121.51ms
iter 566220: loss 5.3528, time 122.84ms
iter 566230: loss 5.6242, time 121.54ms
iter 566240: loss 6.3394, time 121.37ms
step 566250: train loss 5.5148, val loss 5.5740
saving checkpoint to out-shakespeare-char
iter 566250: loss 6.9116, time 2906.66ms
iter 566260: loss 5.8988, time 121.81ms
iter 566270: loss 5.9904, time 121.36ms
iter 566280: loss 6.5950, time 121.56ms
iter 566290: loss 5.9614, time 122.68ms
iter 566300: loss 6.0250, time 121.38ms
iter 566310: loss 5.4310, time 121.74ms
iter 566320: loss 5.7057, time 124.33ms
iter 566330: loss 5.8697, time 120.16ms
iter 566340: loss 5.2855, time 121.37ms
iter 566350: loss 5.4365, time 121.54ms
iter 566360: loss 5.6350, time 121.61ms
iter 566370: loss 6.2336, time 121.52ms
iter 566380: loss 5.3477, time 121.49ms
iter 566390: loss 5.4923, time 121.30ms
iter 566400: loss 5.9475, time 121.39ms
iter 566410: loss 5.2609, time 121.47ms
iter 566420: loss 5.4713, time 122.91ms
iter 566430: loss 6.0143, time 121.58ms
iter 566440: loss 5.8992, time 121.29ms
iter 566450: loss 6.4718, time 121.51ms
iter 566460: loss 6.1250, time 121.35ms
iter 566470: loss 6.1158, time 121.34ms
iter 566480: loss 6.2216, time 121.13ms
iter 566490: loss 5.6904, time 121.25ms
step 566500: train loss 5.5108, val loss 5.5325
saving checkpoint to out-shakespeare-char
iter 566500: loss 6.0771, time 2895.11ms
iter 566510: loss 5.9728, time 122.34ms
iter 566520: loss 5.4347, time 121.42ms
iter 566530: loss 5.7321, time 122.86ms
iter 566540: loss 5.3289, time 121.46ms
iter 566550: loss 5.9128, time 122.51ms
iter 566560: loss 6.2543, time 121.34ms
iter 566570: loss 5.8796, time 120.96ms
iter 566580: loss 5.8868, time 121.35ms
iter 566590: loss 6.3045, time 121.45ms
iter 566600: loss 6.5020, time 121.40ms
iter 566610: loss 5.8757, time 120.92ms
iter 566620: loss 6.7658, time 122.46ms
iter 566630: loss 5.9473, time 121.31ms
iter 566640: loss 6.8637, time 121.37ms
iter 566650: loss 6.0987, time 121.37ms
iter 566660: loss 6.1053, time 122.43ms
iter 566670: loss 5.4188, time 120.57ms
iter 566680: loss 6.2399, time 121.42ms
iter 566690: loss 5.6840, time 121.42ms
iter 566700: loss 5.3626, time 121.43ms
iter 566710: loss 6.1472, time 119.08ms
iter 566720: loss 5.7450, time 121.35ms
iter 566730: loss 5.9350, time 122.76ms
iter 566740: loss 6.2019, time 121.37ms
step 566750: train loss 5.5320, val loss 5.5299
saving checkpoint to out-shakespeare-char
iter 566750: loss 6.2194, time 2888.51ms
iter 566760: loss 6.0636, time 121.44ms
iter 566770: loss 6.0310, time 122.68ms
iter 566780: loss 6.7172, time 121.16ms
iter 566790: loss 5.9071, time 122.77ms
iter 566800: loss 6.5355, time 121.51ms
iter 566810: loss 5.8592, time 122.34ms
iter 566820: loss 6.0827, time 121.43ms
iter 566830: loss 5.7047, time 121.56ms
iter 566840: loss 6.0451, time 121.51ms
iter 566850: loss 6.0015, time 122.49ms
iter 566860: loss 5.7436, time 121.37ms
iter 566870: loss 5.7738, time 121.50ms
iter 566880: loss 5.6728, time 124.20ms
iter 566890: loss 5.9410, time 121.70ms
iter 566900: loss 5.6042, time 122.36ms
iter 566910: loss 5.9347, time 120.72ms
iter 566920: loss 6.3574, time 122.49ms
iter 566930: loss 5.2540, time 121.25ms
iter 566940: loss 6.5269, time 121.32ms
iter 566950: loss 5.9706, time 120.37ms
iter 566960: loss 5.7503, time 122.50ms
iter 566970: loss 6.2272, time 121.39ms
iter 566980: loss 6.0410, time 121.41ms
iter 566990: loss 6.5588, time 122.56ms
step 567000: train loss 5.5151, val loss 5.5667
saving checkpoint to out-shakespeare-char
iter 567000: loss 5.6361, time 2894.10ms
iter 567010: loss 5.7523, time 122.68ms
iter 567020: loss 5.2546, time 121.26ms
iter 567030: loss 6.0863, time 122.47ms
iter 567040: loss 5.6923, time 121.35ms
iter 567050: loss 5.8928, time 120.88ms
iter 567060: loss 5.8304, time 121.30ms
iter 567070: loss 5.4804, time 122.52ms
iter 567080: loss 5.7864, time 121.25ms
iter 567090: loss 5.5424, time 121.21ms
iter 567100: loss 5.5642, time 124.00ms
iter 567110: loss 5.7840, time 122.54ms
iter 567120: loss 6.1518, time 121.11ms
iter 567130: loss 5.8675, time 121.47ms
iter 567140: loss 6.0473, time 121.46ms
iter 567150: loss 6.2086, time 123.25ms
iter 567160: loss 6.4762, time 121.83ms
iter 567170: loss 6.0022, time 121.36ms
iter 567180: loss 6.3993, time 123.84ms
iter 567190: loss 5.5882, time 119.68ms
iter 567200: loss 6.4751, time 121.56ms
iter 567210: loss 5.8731, time 121.44ms
iter 567220: loss 5.9673, time 121.29ms
iter 567230: loss 6.6833, time 121.36ms
iter 567240: loss 5.5528, time 121.40ms
step 567250: train loss 5.4794, val loss 5.5351
saving checkpoint to out-shakespeare-char
iter 567250: loss 5.4863, time 2896.35ms
iter 567260: loss 6.4013, time 121.22ms
iter 567270: loss 6.5769, time 121.50ms
iter 567280: loss 5.8178, time 121.09ms
iter 567290: loss 5.7703, time 121.14ms
iter 567300: loss 6.6374, time 121.88ms
iter 567310: loss 5.8903, time 120.96ms
iter 567320: loss 6.2328, time 121.82ms
iter 567330: loss 5.9753, time 121.65ms
iter 567340: loss 5.9870, time 122.68ms
iter 567350: loss 6.0782, time 122.08ms
iter 567360: loss 5.8117, time 120.72ms
iter 567370: loss 6.0688, time 121.80ms
iter 567380: loss 6.7224, time 120.78ms
iter 567390: loss 6.3424, time 123.52ms
iter 567400: loss 6.5213, time 121.47ms
iter 567410: loss 5.4727, time 121.41ms
iter 567420: loss 5.5833, time 121.21ms
iter 567430: loss 5.9031, time 122.89ms
iter 567440: loss 5.8948, time 122.28ms
iter 567450: loss 6.1653, time 121.46ms
iter 567460: loss 5.3860, time 121.24ms
iter 567470: loss 5.8166, time 121.44ms
iter 567480: loss 6.0701, time 122.55ms
iter 567490: loss 5.9490, time 121.11ms
step 567500: train loss 5.5744, val loss 5.5054
saving checkpoint to out-shakespeare-char
iter 567500: loss 6.0315, time 2900.58ms
iter 567510: loss 5.6635, time 122.20ms
iter 567520: loss 5.9048, time 121.26ms
iter 567530: loss 5.9764, time 122.52ms
iter 567540: loss 5.6805, time 122.58ms
iter 567550: loss 6.0492, time 122.63ms
iter 567560: loss 6.6741, time 122.56ms
iter 567570: loss 5.8470, time 121.38ms
iter 567580: loss 5.4414, time 119.76ms
iter 567590: loss 5.1164, time 120.88ms
iter 567600: loss 6.5075, time 123.40ms
iter 567610: loss 5.3369, time 121.35ms
iter 567620: loss 5.8961, time 120.50ms
iter 567630: loss 5.8514, time 121.60ms
iter 567640: loss 5.7747, time 122.76ms
iter 567650: loss 5.6358, time 120.79ms
iter 567660: loss 5.9131, time 121.70ms
iter 567670: loss 6.3481, time 121.77ms
iter 567680: loss 5.5936, time 122.83ms
iter 567690: loss 5.9301, time 121.41ms
iter 567700: loss 5.8134, time 122.36ms
iter 567710: loss 5.9575, time 121.36ms
iter 567720: loss 6.3538, time 121.83ms
iter 567730: loss 6.1742, time 120.38ms
iter 567740: loss 5.3150, time 121.45ms
step 567750: train loss 5.5293, val loss 5.5546
saving checkpoint to out-shakespeare-char
iter 567750: loss 6.2305, time 2869.17ms
iter 567760: loss 6.0425, time 121.97ms
iter 567770: loss 6.0590, time 120.76ms
iter 567780: loss 5.7461, time 122.68ms
iter 567790: loss 5.8300, time 121.63ms
iter 567800: loss 5.4101, time 122.91ms
iter 567810: loss 5.2394, time 121.41ms
iter 567820: loss 5.6310, time 121.13ms
iter 567830: loss 5.6914, time 120.92ms
iter 567840: loss 6.3291, time 122.14ms
iter 567850: loss 5.8323, time 121.42ms
iter 567860: loss 6.2362, time 121.47ms
iter 567870: loss 5.2288, time 120.41ms
iter 567880: loss 5.6969, time 122.53ms
iter 567890: loss 6.2585, time 121.49ms
iter 567900: loss 5.9582, time 120.74ms
iter 567910: loss 5.6977, time 121.28ms
iter 567920: loss 6.3420, time 122.64ms
iter 567930: loss 6.9792, time 121.44ms
iter 567940: loss 5.7303, time 121.43ms
iter 567950: loss 5.2509, time 123.35ms
iter 567960: loss 6.2773, time 121.68ms
iter 567970: loss 6.3295, time 121.39ms
iter 567980: loss 6.6547, time 123.97ms
iter 567990: loss 6.0582, time 121.05ms
step 568000: train loss 5.5188, val loss 5.5217
saving checkpoint to out-shakespeare-char
iter 568000: loss 5.9664, time 2900.17ms
iter 568010: loss 6.1365, time 121.61ms
iter 568020: loss 5.5593, time 122.51ms
iter 568030: loss 5.9945, time 122.30ms
iter 568040: loss 5.7567, time 121.91ms
iter 568050: loss 5.6236, time 121.92ms
iter 568060: loss 6.2629, time 122.76ms
iter 568070: loss 6.6445, time 121.66ms
iter 568080: loss 6.0379, time 120.62ms
iter 568090: loss 5.7088, time 124.08ms
iter 568100: loss 5.6839, time 120.93ms
iter 568110: loss 5.8678, time 122.08ms
iter 568120: loss 6.0048, time 121.44ms
iter 568130: loss 5.6042, time 121.30ms
iter 568140: loss 5.7193, time 121.41ms
iter 568150: loss 5.9950, time 121.43ms
iter 568160: loss 5.3381, time 123.26ms
iter 568170: loss 5.5106, time 121.44ms
iter 568180: loss 5.4548, time 121.99ms
iter 568190: loss 5.4804, time 122.50ms
iter 568200: loss 6.4003, time 121.61ms
iter 568210: loss 6.1709, time 122.09ms
iter 568220: loss 5.5420, time 124.46ms
iter 568230: loss 6.0871, time 121.78ms
iter 568240: loss 5.5883, time 121.91ms
step 568250: train loss 5.5650, val loss 5.5685
saving checkpoint to out-shakespeare-char
iter 568250: loss 6.6412, time 2911.30ms
iter 568260: loss 6.2101, time 126.01ms
iter 568270: loss 6.6704, time 125.23ms
iter 568280: loss 5.9002, time 122.71ms
iter 568290: loss 6.4572, time 126.10ms
iter 568300: loss 5.8189, time 125.82ms
iter 568310: loss 5.4401, time 125.66ms
iter 568320: loss 6.2261, time 125.75ms
iter 568330: loss 6.9341, time 125.50ms
iter 568340: loss 5.4026, time 125.79ms
iter 568350: loss 6.1906, time 126.23ms
iter 568360: loss 5.3083, time 125.62ms
iter 568370: loss 6.2960, time 125.46ms
iter 568380: loss 5.8685, time 125.96ms
iter 568390: loss 5.4839, time 125.70ms
iter 568400: loss 5.5340, time 125.65ms
iter 568410: loss 5.9534, time 125.58ms
iter 568420: loss 5.6991, time 125.77ms
iter 568430: loss 5.5742, time 125.49ms
iter 568440: loss 6.3655, time 125.70ms
iter 568450: loss 6.3820, time 125.28ms
iter 568460: loss 6.6908, time 125.81ms
iter 568470: loss 6.1464, time 125.59ms
iter 568480: loss 6.0788, time 126.22ms
iter 568490: loss 6.2930, time 125.54ms
step 568500: train loss 5.5277, val loss 5.5348
saving checkpoint to out-shakespeare-char
iter 568500: loss 5.6530, time 2888.58ms
iter 568510: loss 5.6796, time 126.59ms
iter 568520: loss 5.5725, time 126.57ms
iter 568530: loss 6.0098, time 126.44ms
iter 568540: loss 5.9808, time 123.71ms
iter 568550: loss 6.2989, time 123.09ms
iter 568560: loss 5.4090, time 121.95ms
iter 568570: loss 5.9143, time 123.10ms
iter 568580: loss 6.2570, time 122.08ms
iter 568590: loss 5.8783, time 121.80ms
iter 568600: loss 5.5789, time 124.73ms
iter 568610: loss 6.2069, time 121.90ms
iter 568620: loss 5.8598, time 121.87ms
iter 568630: loss 6.1391, time 121.69ms
iter 568640: loss 6.1375, time 121.82ms
iter 568650: loss 6.4072, time 122.02ms
iter 568660: loss 5.4904, time 121.92ms
iter 568670: loss 5.6212, time 123.23ms
iter 568680: loss 5.5201, time 121.51ms
iter 568690: loss 5.8928, time 121.77ms
iter 568700: loss 5.5746, time 122.81ms
iter 568710: loss 6.2000, time 121.80ms
iter 568720: loss 6.0355, time 121.79ms
iter 568730: loss 5.9030, time 124.31ms
iter 568740: loss 6.3820, time 121.81ms
step 568750: train loss 5.4842, val loss 5.5606
saving checkpoint to out-shakespeare-char
iter 568750: loss 6.6785, time 2885.79ms
iter 568760: loss 5.1251, time 121.27ms
iter 568770: loss 6.1318, time 123.34ms
iter 568780: loss 6.5708, time 121.23ms
iter 568790: loss 6.5703, time 121.32ms
iter 568800: loss 5.8745, time 121.30ms
iter 568810: loss 5.9809, time 122.39ms
iter 568820: loss 6.2064, time 121.19ms
iter 568830: loss 5.7594, time 121.39ms
iter 568840: loss 5.6213, time 121.44ms
iter 568850: loss 5.8645, time 122.59ms
iter 568860: loss 5.5859, time 121.53ms
iter 568870: loss 5.2201, time 121.47ms
iter 568880: loss 5.7679, time 122.74ms
iter 568890: loss 6.9195, time 121.67ms
iter 568900: loss 6.3304, time 121.54ms
iter 568910: loss 6.3749, time 121.20ms
iter 568920: loss 5.7947, time 122.58ms
iter 568930: loss 5.8596, time 120.72ms
iter 568940: loss 5.7874, time 121.28ms
iter 568950: loss 6.1681, time 121.56ms
iter 568960: loss 6.0589, time 122.56ms
iter 568970: loss 5.6380, time 121.37ms
iter 568980: loss 5.7669, time 120.99ms
iter 568990: loss 5.4037, time 121.56ms
step 569000: train loss 5.5031, val loss 5.5429
saving checkpoint to out-shakespeare-char
iter 569000: loss 6.3226, time 2914.69ms
iter 569010: loss 6.2832, time 122.75ms
iter 569020: loss 5.1424, time 122.40ms
iter 569030: loss 6.7562, time 121.35ms
iter 569040: loss 6.6322, time 122.85ms
iter 569050: loss 6.2211, time 121.35ms
iter 569060: loss 6.2896, time 121.28ms
iter 569070: loss 5.7986, time 124.02ms
iter 569080: loss 5.2911, time 121.46ms
iter 569090: loss 6.1983, time 121.40ms
iter 569100: loss 5.5991, time 121.73ms
iter 569110: loss 6.2612, time 121.37ms
iter 569120: loss 6.3406, time 121.31ms
iter 569130: loss 5.6793, time 120.88ms
iter 569140: loss 6.5777, time 122.95ms
iter 569150: loss 5.6803, time 121.44ms
iter 569160: loss 5.5509, time 121.36ms
iter 569170: loss 6.4534, time 122.49ms
iter 569180: loss 5.0712, time 121.65ms
iter 569190: loss 6.4270, time 121.36ms
iter 569200: loss 5.9831, time 119.48ms
iter 569210: loss 6.4442, time 120.37ms
iter 569220: loss 5.1336, time 120.71ms
iter 569230: loss 5.7053, time 121.12ms
iter 569240: loss 5.3521, time 119.93ms
step 569250: train loss 5.4908, val loss 5.5423
saving checkpoint to out-shakespeare-char
iter 569250: loss 6.0505, time 2888.55ms
iter 569260: loss 5.0949, time 122.61ms
iter 569270: loss 5.9083, time 121.45ms
iter 569280: loss 5.8709, time 121.34ms
iter 569290: loss 6.0335, time 123.94ms
iter 569300: loss 6.8024, time 121.71ms
iter 569310: loss 5.8213, time 121.38ms
iter 569320: loss 5.9826, time 121.30ms
iter 569330: loss 5.9754, time 122.36ms
iter 569340: loss 6.3825, time 121.42ms
iter 569350: loss 6.1166, time 121.43ms
iter 569360: loss 5.6726, time 122.40ms
iter 569370: loss 6.1220, time 121.75ms
iter 569380: loss 6.6409, time 121.40ms
iter 569390: loss 6.3801, time 124.41ms
iter 569400: loss 5.8994, time 121.39ms
iter 569410: loss 5.8005, time 120.51ms
iter 569420: loss 5.3068, time 121.58ms
iter 569430: loss 5.6494, time 122.56ms
iter 569440: loss 6.2870, time 120.74ms
iter 569450: loss 5.5859, time 121.46ms
iter 569460: loss 5.1149, time 123.91ms
iter 569470: loss 5.9939, time 121.44ms
iter 569480: loss 6.0442, time 121.52ms
iter 569490: loss 5.7607, time 121.55ms
step 569500: train loss 5.5091, val loss 5.5422
saving checkpoint to out-shakespeare-char
iter 569500: loss 5.7800, time 2892.44ms
iter 569510: loss 6.0772, time 121.64ms
iter 569520: loss 6.1676, time 121.66ms
iter 569530: loss 5.5998, time 121.65ms
iter 569540: loss 5.4335, time 121.62ms
iter 569550: loss 5.4144, time 121.72ms
iter 569560: loss 5.4927, time 121.70ms
iter 569570: loss 6.1921, time 123.12ms
iter 569580: loss 5.8350, time 121.27ms
iter 569590: loss 5.6066, time 123.19ms
iter 569600: loss 6.0392, time 122.83ms
iter 569610: loss 5.8416, time 125.63ms
iter 569620: loss 5.6189, time 128.29ms
iter 569630: loss 6.0331, time 124.95ms
iter 569640: loss 5.1393, time 125.73ms
iter 569650: loss 5.6902, time 124.54ms
iter 569660: loss 5.6485, time 124.73ms
iter 569670: loss 6.3925, time 125.43ms
iter 569680: loss 5.5487, time 127.74ms
iter 569690: loss 6.1232, time 124.96ms
iter 569700: loss 6.7687, time 126.78ms
iter 569710: loss 6.1703, time 125.31ms
iter 569720: loss 5.7263, time 127.89ms
iter 569730: loss 6.2254, time 125.42ms
iter 569740: loss 6.4084, time 125.60ms
step 569750: train loss 5.5277, val loss 5.5318
saving checkpoint to out-shakespeare-char
iter 569750: loss 6.2263, time 2908.25ms
iter 569760: loss 5.9581, time 125.02ms
iter 569770: loss 6.3522, time 124.91ms
iter 569780: loss 5.8828, time 125.24ms
iter 569790: loss 6.1770, time 125.00ms
iter 569800: loss 5.5756, time 124.79ms
iter 569810: loss 5.9699, time 125.60ms
iter 569820: loss 6.2869, time 125.52ms
iter 569830: loss 5.5823, time 125.13ms
iter 569840: loss 6.0537, time 125.16ms
iter 569850: loss 6.2892, time 125.17ms
iter 569860: loss 6.0456, time 125.32ms
iter 569870: loss 5.8280, time 125.05ms
iter 569880: loss 6.2944, time 125.59ms
iter 569890: loss 5.5968, time 125.08ms
iter 569900: loss 5.9045, time 125.08ms
iter 569910: loss 6.4822, time 125.30ms
iter 569920: loss 5.8694, time 125.59ms
iter 569930: loss 6.1047, time 125.42ms
iter 569940: loss 5.9844, time 123.67ms
iter 569950: loss 6.0758, time 125.07ms
iter 569960: loss 5.9514, time 125.13ms
iter 569970: loss 5.5891, time 125.25ms
iter 569980: loss 6.5163, time 125.25ms
iter 569990: loss 5.8429, time 125.19ms
step 570000: train loss 5.5249, val loss 5.5251
saving checkpoint to out-shakespeare-char
iter 570000: loss 5.9143, time 2913.44ms
iter 570010: loss 6.4172, time 124.99ms
iter 570020: loss 5.9673, time 125.76ms
iter 570030: loss 5.7869, time 125.34ms
iter 570040: loss 6.4609, time 125.88ms
iter 570050: loss 5.5640, time 126.30ms
iter 570060: loss 6.4861, time 125.95ms
iter 570070: loss 5.9014, time 125.92ms
iter 570080: loss 6.4378, time 125.42ms
iter 570090: loss 5.5108, time 125.54ms
iter 570100: loss 5.9628, time 125.94ms
iter 570110: loss 5.6761, time 125.62ms
iter 570120: loss 5.3995, time 125.53ms
iter 570130: loss 5.6775, time 125.47ms
iter 570140: loss 5.6746, time 125.79ms
iter 570150: loss 5.8758, time 125.85ms
iter 570160: loss 5.4532, time 125.60ms
iter 570170: loss 5.7808, time 125.51ms
iter 570180: loss 6.2149, time 125.52ms
iter 570190: loss 6.0937, time 126.67ms
iter 570200: loss 5.5155, time 126.93ms
iter 570210: loss 5.9707, time 124.69ms
iter 570220: loss 6.0598, time 125.19ms
iter 570230: loss 6.7478, time 125.59ms
iter 570240: loss 5.6587, time 125.70ms
step 570250: train loss 5.5167, val loss 5.5322
saving checkpoint to out-shakespeare-char
iter 570250: loss 6.2862, time 2896.67ms
iter 570260: loss 5.5287, time 125.60ms
iter 570270: loss 5.3837, time 125.28ms
iter 570280: loss 5.9492, time 125.23ms
iter 570290: loss 5.3200, time 125.54ms
iter 570300: loss 6.2824, time 125.37ms
iter 570310: loss 6.6101, time 125.47ms
iter 570320: loss 5.7305, time 125.89ms
iter 570330: loss 6.6227, time 125.51ms
iter 570340: loss 5.6141, time 125.35ms
iter 570350: loss 5.9104, time 125.57ms
iter 570360: loss 5.8578, time 125.58ms
iter 570370: loss 5.7417, time 126.34ms
iter 570380: loss 5.9344, time 125.32ms
iter 570390: loss 5.9440, time 125.62ms
iter 570400: loss 6.1070, time 125.77ms
iter 570410: loss 5.7281, time 125.55ms
iter 570420: loss 4.9036, time 125.72ms
iter 570430: loss 5.7153, time 127.82ms
iter 570440: loss 6.4786, time 125.52ms
iter 570450: loss 6.0661, time 127.99ms
iter 570460: loss 5.6636, time 125.48ms
iter 570470: loss 6.0011, time 127.92ms
iter 570480: loss 5.9280, time 125.21ms
iter 570490: loss 6.1757, time 128.28ms
step 570500: train loss 5.5634, val loss 5.5717
saving checkpoint to out-shakespeare-char
iter 570500: loss 6.0518, time 2901.28ms
iter 570510: loss 5.6950, time 127.68ms
iter 570520: loss 6.1424, time 125.50ms
iter 570530: loss 5.7949, time 128.10ms
iter 570540: loss 6.3199, time 125.77ms
iter 570550: loss 5.4764, time 128.29ms
iter 570560: loss 6.3753, time 125.46ms
iter 570570: loss 5.7359, time 128.13ms
iter 570580: loss 5.9336, time 125.49ms
iter 570590: loss 6.0637, time 128.00ms
iter 570600: loss 6.2930, time 125.25ms
iter 570610: loss 6.2004, time 128.15ms
iter 570620: loss 6.7379, time 120.65ms
iter 570630: loss 5.5324, time 119.69ms
iter 570640: loss 5.7901, time 119.85ms
iter 570650: loss 6.0142, time 120.80ms
iter 570660: loss 5.3249, time 120.61ms
iter 570670: loss 5.8675, time 120.83ms
iter 570680: loss 5.3256, time 119.60ms
iter 570690: loss 5.2747, time 121.90ms
iter 570700: loss 4.6176, time 121.02ms
iter 570710: loss 6.2877, time 121.71ms
iter 570720: loss 5.5045, time 121.49ms
iter 570730: loss 5.6795, time 122.46ms
iter 570740: loss 5.7024, time 121.55ms
step 570750: train loss 5.4932, val loss 5.5727
saving checkpoint to out-shakespeare-char
iter 570750: loss 5.9776, time 2889.50ms
iter 570760: loss 5.3177, time 124.40ms
iter 570770: loss 5.7304, time 121.46ms
iter 570780: loss 6.2027, time 121.53ms
iter 570790: loss 5.7814, time 121.54ms
iter 570800: loss 6.1250, time 121.49ms
iter 570810: loss 6.0031, time 121.63ms
iter 570820: loss 5.9861, time 121.53ms
iter 570830: loss 6.2212, time 122.58ms
iter 570840: loss 5.3866, time 121.18ms
iter 570850: loss 6.1061, time 119.89ms
iter 570860: loss 5.7621, time 119.87ms
iter 570870: loss 6.2333, time 121.71ms
iter 570880: loss 6.3530, time 120.06ms
iter 570890: loss 6.4248, time 119.97ms
iter 570900: loss 6.0639, time 120.89ms
iter 570910: loss 5.4843, time 120.65ms
iter 570920: loss 5.6821, time 120.13ms
iter 570930: loss 5.8548, time 120.29ms
iter 570940: loss 5.8496, time 123.53ms
iter 570950: loss 6.2512, time 119.86ms
iter 570960: loss 6.0842, time 119.87ms
iter 570970: loss 6.0863, time 121.15ms
iter 570980: loss 6.3888, time 121.43ms
iter 570990: loss 6.3625, time 119.85ms
step 571000: train loss 5.5289, val loss 5.5215
saving checkpoint to out-shakespeare-char
iter 571000: loss 6.2451, time 2901.80ms
iter 571010: loss 6.3066, time 119.50ms
iter 571020: loss 6.0672, time 122.29ms
iter 571030: loss 5.9736, time 120.63ms
iter 571040: loss 5.6881, time 119.71ms
iter 571050: loss 5.5877, time 119.64ms
iter 571060: loss 6.2620, time 119.55ms
iter 571070: loss 5.7458, time 120.28ms
iter 571080: loss 6.2164, time 120.28ms
iter 571090: loss 5.6895, time 121.05ms
iter 571100: loss 5.8277, time 119.50ms
iter 571110: loss 5.9529, time 120.69ms
iter 571120: loss 5.9770, time 121.80ms
iter 571130: loss 5.7699, time 120.77ms
iter 571140: loss 5.7070, time 120.14ms
iter 571150: loss 5.5864, time 119.66ms
iter 571160: loss 6.1789, time 119.60ms
iter 571170: loss 5.7727, time 120.75ms
iter 571180: loss 5.7972, time 120.42ms
iter 571190: loss 5.9054, time 121.44ms
iter 571200: loss 5.8002, time 119.54ms
iter 571210: loss 6.3953, time 120.01ms
iter 571220: loss 6.1722, time 120.62ms
iter 571230: loss 6.3788, time 124.04ms
iter 571240: loss 5.4136, time 121.58ms
step 571250: train loss 5.5391, val loss 5.4781
saving checkpoint to out-shakespeare-char
iter 571250: loss 5.8312, time 2900.16ms
iter 571260: loss 6.6194, time 121.58ms
iter 571270: loss 6.3291, time 122.50ms
iter 571280: loss 6.3839, time 121.16ms
iter 571290: loss 6.0713, time 121.38ms
iter 571300: loss 6.3714, time 122.51ms
iter 571310: loss 5.9316, time 121.48ms
iter 571320: loss 6.1512, time 121.35ms
iter 571330: loss 5.7641, time 124.01ms
iter 571340: loss 5.6762, time 121.46ms
iter 571350: loss 5.9581, time 121.81ms
iter 571360: loss 5.9194, time 121.40ms
iter 571370: loss 5.3584, time 121.71ms
iter 571380: loss 6.1388, time 121.35ms
iter 571390: loss 5.8005, time 121.52ms
iter 571400: loss 5.6427, time 122.92ms
iter 571410: loss 5.9076, time 121.60ms
iter 571420: loss 6.1114, time 121.75ms
iter 571430: loss 5.4631, time 121.71ms
iter 571440: loss 6.0996, time 122.47ms
iter 571450: loss 5.9963, time 121.32ms
iter 571460: loss 6.1491, time 121.39ms
iter 571470: loss 5.5429, time 121.43ms
iter 571480: loss 6.3536, time 121.19ms
iter 571490: loss 5.1525, time 121.56ms
step 571500: train loss 5.5407, val loss 5.5504
saving checkpoint to out-shakespeare-char
iter 571500: loss 6.8438, time 2889.28ms
iter 571510: loss 5.8505, time 121.31ms
iter 571520: loss 5.9432, time 121.40ms
iter 571530: loss 5.3193, time 123.88ms
iter 571540: loss 6.2644, time 121.67ms
iter 571550: loss 5.7573, time 121.95ms
iter 571560: loss 5.6869, time 122.00ms
iter 571570: loss 6.1557, time 121.00ms
iter 571580: loss 5.6999, time 121.21ms
iter 571590: loss 5.8165, time 121.79ms
iter 571600: loss 6.2130, time 123.82ms
iter 571610: loss 5.8477, time 121.58ms
iter 571620: loss 6.1100, time 121.41ms
iter 571630: loss 5.9952, time 121.21ms
iter 571640: loss 5.5510, time 121.23ms
iter 571650: loss 6.2600, time 121.17ms
iter 571660: loss 5.8597, time 120.74ms
iter 571670: loss 6.3704, time 120.65ms
iter 571680: loss 5.5310, time 121.22ms
iter 571690: loss 6.0653, time 121.07ms
iter 571700: loss 6.6383, time 122.43ms
iter 571710: loss 6.0477, time 121.22ms
iter 571720: loss 6.3515, time 120.25ms
iter 571730: loss 6.1317, time 120.70ms
iter 571740: loss 6.2739, time 120.50ms
step 571750: train loss 5.5656, val loss 5.5922
saving checkpoint to out-shakespeare-char
iter 571750: loss 5.8469, time 2894.56ms
iter 571760: loss 6.0127, time 124.26ms
iter 571770: loss 6.1245, time 121.55ms
iter 571780: loss 6.4117, time 121.47ms
iter 571790: loss 6.1031, time 121.59ms
iter 571800: loss 6.1050, time 119.94ms
iter 571810: loss 5.9317, time 120.87ms
iter 571820: loss 5.3518, time 120.39ms
iter 571830: loss 6.3529, time 122.81ms
iter 571840: loss 6.1415, time 121.84ms
iter 571850: loss 6.6849, time 122.15ms
iter 571860: loss 5.7351, time 124.16ms
iter 571870: loss 5.6961, time 121.41ms
iter 571880: loss 6.4668, time 122.11ms
iter 571890: loss 5.6706, time 121.58ms
iter 571900: loss 5.9839, time 121.71ms
iter 571910: loss 5.9408, time 121.52ms
iter 571920: loss 5.0099, time 121.51ms
iter 571930: loss 6.0435, time 122.59ms
iter 571940: loss 5.4280, time 121.75ms
iter 571950: loss 6.1168, time 121.68ms
iter 571960: loss 6.4823, time 124.06ms
iter 571970: loss 6.0161, time 121.51ms
iter 571980: loss 5.8153, time 121.44ms
iter 571990: loss 5.7414, time 121.05ms
step 572000: train loss 5.5461, val loss 5.6108
saving checkpoint to out-shakespeare-char
iter 572000: loss 6.5690, time 2896.23ms
iter 572010: loss 5.6591, time 121.64ms
iter 572020: loss 6.8676, time 121.69ms
iter 572030: loss 6.4509, time 120.85ms
iter 572040: loss 6.2693, time 123.09ms
iter 572050: loss 5.9534, time 120.28ms
iter 572060: loss 6.2638, time 120.78ms
iter 572070: loss 5.9388, time 120.81ms
iter 572080: loss 6.1644, time 120.84ms
iter 572090: loss 6.1090, time 121.60ms
iter 572100: loss 6.6397, time 119.72ms
iter 572110: loss 5.6921, time 119.76ms
iter 572120: loss 5.9554, time 120.51ms
iter 572130: loss 6.5511, time 122.03ms
iter 572140: loss 5.7177, time 120.43ms
iter 572150: loss 5.6453, time 121.68ms
iter 572160: loss 5.3316, time 119.72ms
iter 572170: loss 5.7543, time 121.11ms
iter 572180: loss 6.1314, time 119.59ms
iter 572190: loss 5.8279, time 123.17ms
iter 572200: loss 5.9623, time 120.87ms
iter 572210: loss 6.0579, time 119.70ms
iter 572220: loss 5.7405, time 119.57ms
iter 572230: loss 6.2982, time 119.67ms
iter 572240: loss 6.6983, time 119.34ms
step 572250: train loss 5.5085, val loss 5.4980
saving checkpoint to out-shakespeare-char
iter 572250: loss 5.7518, time 2902.98ms
iter 572260: loss 5.8114, time 121.19ms
iter 572270: loss 5.4152, time 119.62ms
iter 572280: loss 6.1347, time 119.65ms
iter 572290: loss 6.5525, time 119.51ms
iter 572300: loss 5.8303, time 120.68ms
iter 572310: loss 4.9597, time 120.68ms
iter 572320: loss 6.3546, time 119.44ms
iter 572330: loss 6.4238, time 119.65ms
iter 572340: loss 6.1000, time 119.66ms
iter 572350: loss 6.0788, time 123.42ms
iter 572360: loss 5.9256, time 121.66ms
iter 572370: loss 6.1240, time 119.76ms
iter 572380: loss 6.4630, time 119.49ms
iter 572390: loss 7.1548, time 120.69ms
iter 572400: loss 5.5356, time 119.75ms
iter 572410: loss 5.4063, time 120.62ms
iter 572420: loss 6.3108, time 119.63ms
iter 572430: loss 5.7267, time 121.42ms
iter 572440: loss 6.0434, time 121.63ms
iter 572450: loss 5.2785, time 121.89ms
iter 572460: loss 5.7155, time 120.47ms
iter 572470: loss 6.3898, time 121.04ms
iter 572480: loss 6.5535, time 119.45ms
iter 572490: loss 5.7975, time 119.57ms
step 572500: train loss 5.4930, val loss 5.5389
saving checkpoint to out-shakespeare-char
iter 572500: loss 6.0153, time 2898.52ms
iter 572510: loss 6.3391, time 119.78ms
iter 572520: loss 6.1155, time 120.24ms
iter 572530: loss 6.5821, time 121.46ms
iter 572540: loss 5.9643, time 122.57ms
iter 572550: loss 5.6202, time 121.58ms
iter 572560: loss 5.9639, time 121.57ms
iter 572570: loss 5.9251, time 123.93ms
iter 572580: loss 6.2126, time 121.64ms
iter 572590: loss 6.5443, time 121.60ms
iter 572600: loss 5.9211, time 121.46ms
iter 572610: loss 6.7578, time 122.66ms
iter 572620: loss 5.9495, time 122.01ms
iter 572630: loss 5.8377, time 121.70ms
iter 572640: loss 6.4489, time 122.58ms
iter 572650: loss 5.3280, time 121.94ms
iter 572660: loss 6.2020, time 121.52ms
iter 572670: loss 6.6586, time 121.63ms
iter 572680: loss 5.3826, time 121.51ms
iter 572690: loss 6.7139, time 121.42ms
iter 572700: loss 5.7015, time 121.55ms
iter 572710: loss 5.2945, time 122.53ms
iter 572720: loss 6.3153, time 121.36ms
iter 572730: loss 5.8783, time 121.52ms
iter 572740: loss 6.6936, time 122.82ms
step 572750: train loss 5.5817, val loss 5.5345
saving checkpoint to out-shakespeare-char
iter 572750: loss 5.8726, time 2917.44ms
iter 572760: loss 5.6454, time 124.81ms
iter 572770: loss 5.7763, time 125.69ms
iter 572780: loss 5.8544, time 125.16ms
iter 572790: loss 6.1134, time 125.02ms
iter 572800: loss 5.9642, time 124.71ms
iter 572810: loss 6.7343, time 125.51ms
iter 572820: loss 5.7713, time 124.25ms
iter 572830: loss 5.4861, time 125.18ms
iter 572840: loss 6.1405, time 124.70ms
iter 572850: loss 4.9662, time 124.52ms
iter 572860: loss 6.0839, time 123.05ms
iter 572870: loss 5.5911, time 125.27ms
iter 572880: loss 5.9768, time 125.19ms
iter 572890: loss 6.1322, time 125.18ms
iter 572900: loss 6.3157, time 125.36ms
iter 572910: loss 5.8300, time 125.10ms
iter 572920: loss 5.1611, time 125.29ms
iter 572930: loss 5.8554, time 125.11ms
iter 572940: loss 6.7879, time 125.47ms
iter 572950: loss 5.2576, time 125.31ms
iter 572960: loss 5.9554, time 124.80ms
iter 572970: loss 6.5610, time 125.00ms
iter 572980: loss 5.8528, time 125.23ms
iter 572990: loss 6.2181, time 125.09ms
step 573000: train loss 5.5806, val loss 5.5426
saving checkpoint to out-shakespeare-char
iter 573000: loss 5.8249, time 2883.18ms
iter 573010: loss 5.9842, time 124.40ms
iter 573020: loss 5.8779, time 125.22ms
iter 573030: loss 5.9099, time 125.08ms
iter 573040: loss 5.7038, time 124.46ms
iter 573050: loss 4.9263, time 125.29ms
iter 573060: loss 5.4847, time 125.03ms
iter 573070: loss 5.5183, time 124.12ms
iter 573080: loss 6.3500, time 124.93ms
iter 573090: loss 5.5229, time 124.24ms
iter 573100: loss 6.5978, time 124.91ms
iter 573110: loss 6.2128, time 125.32ms
iter 573120: loss 6.2905, time 124.80ms
iter 573130: loss 5.9103, time 124.92ms
iter 573140: loss 6.4371, time 126.28ms
iter 573150: loss 5.6620, time 124.83ms
iter 573160: loss 5.7624, time 124.87ms
iter 573170: loss 5.9780, time 124.97ms
iter 573180: loss 5.6547, time 124.96ms
iter 573190: loss 6.6970, time 126.28ms
iter 573200: loss 6.1420, time 125.15ms
iter 573210: loss 5.6416, time 127.62ms
iter 573220: loss 6.6687, time 125.18ms
iter 573230: loss 6.0722, time 127.38ms
iter 573240: loss 5.5746, time 124.90ms
step 573250: train loss 5.5145, val loss 5.5446
saving checkpoint to out-shakespeare-char
iter 573250: loss 5.0950, time 2882.37ms
iter 573260: loss 5.9328, time 125.19ms
iter 573270: loss 5.9894, time 125.13ms
iter 573280: loss 6.1996, time 125.06ms
iter 573290: loss 5.8760, time 124.86ms
iter 573300: loss 5.6114, time 125.36ms
iter 573310: loss 5.5455, time 126.56ms
iter 573320: loss 6.0525, time 125.22ms
iter 573330: loss 5.7539, time 124.86ms
iter 573340: loss 6.1269, time 124.69ms
iter 573350: loss 6.0165, time 125.11ms
iter 573360: loss 5.8340, time 124.40ms
iter 573370: loss 5.4373, time 124.72ms
iter 573380: loss 6.2191, time 124.85ms
iter 573390: loss 5.8921, time 124.44ms
iter 573400: loss 6.5376, time 124.69ms
iter 573410: loss 5.7959, time 125.16ms
iter 573420: loss 5.5931, time 124.91ms
iter 573430: loss 5.4406, time 125.28ms
iter 573440: loss 6.1291, time 125.44ms
iter 573450: loss 6.0780, time 125.06ms
iter 573460: loss 6.5094, time 124.88ms
iter 573470: loss 5.9931, time 124.78ms
iter 573480: loss 5.7941, time 124.71ms
iter 573490: loss 6.3157, time 124.73ms
step 573500: train loss 5.5489, val loss 5.5505
saving checkpoint to out-shakespeare-char
iter 573500: loss 5.5628, time 2893.54ms
iter 573510: loss 6.0867, time 125.11ms
iter 573520: loss 5.4606, time 124.77ms
iter 573530: loss 5.3826, time 124.93ms
iter 573540: loss 6.1003, time 124.81ms
iter 573550: loss 5.7197, time 125.57ms
iter 573560: loss 6.8404, time 125.04ms
iter 573570: loss 5.6618, time 124.75ms
iter 573580: loss 6.3303, time 124.75ms
iter 573590: loss 5.9979, time 125.02ms
iter 573600: loss 5.6530, time 124.87ms
iter 573610: loss 4.9642, time 125.33ms
iter 573620: loss 5.9154, time 125.04ms
iter 573630: loss 6.1139, time 125.06ms
iter 573640: loss 6.6652, time 125.33ms
iter 573650: loss 5.7796, time 124.96ms
iter 573660: loss 5.9319, time 124.99ms
iter 573670: loss 5.6996, time 124.78ms
iter 573680: loss 5.4080, time 125.01ms
iter 573690: loss 6.0103, time 124.22ms
iter 573700: loss 6.9977, time 124.87ms
iter 573710: loss 6.1093, time 124.88ms
iter 573720: loss 5.7718, time 124.88ms
iter 573730: loss 5.0889, time 124.16ms
iter 573740: loss 6.0896, time 124.81ms
step 573750: train loss 5.5308, val loss 5.5504
saving checkpoint to out-shakespeare-char
iter 573750: loss 6.1973, time 2901.53ms
iter 573760: loss 5.1184, time 124.80ms
iter 573770: loss 6.8946, time 124.89ms
iter 573780: loss 5.2375, time 124.59ms
iter 573790: loss 6.1020, time 124.79ms
iter 573800: loss 5.7855, time 124.61ms
iter 573810: loss 6.4626, time 124.91ms
iter 573820: loss 6.1099, time 124.41ms
iter 573830: loss 6.1416, time 124.93ms
iter 573840: loss 5.9910, time 124.67ms
iter 573850: loss 5.8247, time 124.78ms
iter 573860: loss 5.3320, time 125.02ms
iter 573870: loss 5.6527, time 125.05ms
iter 573880: loss 5.9047, time 124.63ms
iter 573890: loss 5.7055, time 124.86ms
iter 573900: loss 5.7513, time 124.75ms
iter 573910: loss 6.2221, time 125.11ms
iter 573920: loss 6.1828, time 124.75ms
iter 573930: loss 6.3501, time 124.87ms
iter 573940: loss 6.0698, time 124.88ms
iter 573950: loss 5.4897, time 124.94ms
iter 573960: loss 6.3357, time 124.74ms
iter 573970: loss 5.3891, time 124.67ms
iter 573980: loss 6.1197, time 124.86ms
iter 573990: loss 5.8761, time 125.01ms
step 574000: train loss 5.5295, val loss 5.5088
saving checkpoint to out-shakespeare-char
iter 574000: loss 6.1398, time 2905.68ms
iter 574010: loss 6.1549, time 127.29ms
iter 574020: loss 5.1990, time 125.52ms
iter 574030: loss 5.9843, time 123.50ms
iter 574040: loss 6.0389, time 124.85ms
iter 574050: loss 5.7772, time 125.48ms
iter 574060: loss 5.5776, time 124.79ms
iter 574070: loss 6.1388, time 124.65ms
iter 574080: loss 5.5981, time 124.64ms
iter 574090: loss 6.1264, time 123.70ms
iter 574100: loss 6.1842, time 124.48ms
iter 574110: loss 5.7359, time 125.08ms
iter 574120: loss 6.1737, time 124.89ms
iter 574130: loss 6.1469, time 124.58ms
iter 574140: loss 5.1134, time 124.87ms
iter 574150: loss 6.9753, time 124.74ms
iter 574160: loss 6.5904, time 124.84ms
iter 574170: loss 6.5040, time 124.69ms
iter 574180: loss 4.9661, time 124.66ms
iter 574190: loss 6.1612, time 124.81ms
iter 574200: loss 5.7074, time 124.86ms
iter 574210: loss 5.8836, time 124.81ms
iter 574220: loss 5.5367, time 124.43ms
iter 574230: loss 5.8056, time 124.57ms
iter 574240: loss 5.1796, time 124.62ms
step 574250: train loss 5.5009, val loss 5.5560
saving checkpoint to out-shakespeare-char
iter 574250: loss 5.9568, time 2900.15ms
iter 574260: loss 6.0444, time 124.93ms
iter 574270: loss 5.6952, time 124.77ms
iter 574280: loss 5.6707, time 124.31ms
iter 574290: loss 5.7063, time 124.32ms
iter 574300: loss 5.8271, time 124.75ms
iter 574310: loss 6.2359, time 125.13ms
iter 574320: loss 6.2434, time 124.97ms
iter 574330: loss 6.0174, time 124.80ms
iter 574340: loss 5.9938, time 125.85ms
iter 574350: loss 6.1335, time 124.90ms
iter 574360: loss 5.7449, time 123.63ms
iter 574370: loss 6.0669, time 124.77ms
iter 574380: loss 5.3938, time 124.80ms
iter 574390: loss 6.0189, time 124.58ms
iter 574400: loss 5.8638, time 123.55ms
iter 574410: loss 6.2708, time 124.46ms
iter 574420: loss 5.8206, time 125.67ms
iter 574430: loss 6.3052, time 124.68ms
iter 574440: loss 6.0113, time 124.70ms
iter 574450: loss 5.6691, time 124.68ms
iter 574460: loss 6.1401, time 124.67ms
iter 574470: loss 5.5111, time 124.84ms
iter 574480: loss 5.4489, time 125.30ms
iter 574490: loss 5.2060, time 124.75ms
step 574500: train loss 5.5385, val loss 5.5545
saving checkpoint to out-shakespeare-char
iter 574500: loss 6.5603, time 2883.59ms
iter 574510: loss 6.7304, time 125.02ms
iter 574520: loss 6.0190, time 124.86ms
iter 574530: loss 6.1754, time 124.41ms
iter 574540: loss 5.1796, time 124.20ms
iter 574550: loss 5.8554, time 124.66ms
iter 574560: loss 5.6964, time 124.61ms
iter 574570: loss 6.5914, time 124.92ms
iter 574580: loss 5.7568, time 124.76ms
iter 574590: loss 5.8542, time 124.71ms
iter 574600: loss 6.1352, time 124.60ms
iter 574610: loss 6.0478, time 124.83ms
iter 574620: loss 6.2831, time 124.69ms
iter 574630: loss 6.3968, time 124.75ms
iter 574640: loss 6.0987, time 124.72ms
iter 574650: loss 5.9248, time 124.83ms
iter 574660: loss 5.4460, time 124.77ms
iter 574670: loss 5.9145, time 124.70ms
iter 574680: loss 6.5749, time 124.55ms
iter 574690: loss 5.8320, time 124.56ms
iter 574700: loss 6.5879, time 125.30ms
iter 574710: loss 6.1029, time 125.11ms
iter 574720: loss 5.7002, time 124.70ms
iter 574730: loss 5.2725, time 124.58ms
iter 574740: loss 5.7425, time 124.56ms
step 574750: train loss 5.5439, val loss 5.5089
saving checkpoint to out-shakespeare-char
iter 574750: loss 5.9703, time 2890.38ms
iter 574760: loss 5.6576, time 124.81ms
iter 574770: loss 5.3275, time 124.68ms
iter 574780: loss 5.6187, time 124.60ms
iter 574790: loss 5.3493, time 124.53ms
iter 574800: loss 6.2828, time 124.99ms
iter 574810: loss 5.9265, time 124.77ms
iter 574820: loss 5.6326, time 124.06ms
iter 574830: loss 5.5074, time 124.68ms
iter 574840: loss 5.6308, time 124.68ms
iter 574850: loss 6.0657, time 124.76ms
iter 574860: loss 6.2119, time 124.80ms
iter 574870: loss 5.6810, time 124.80ms
iter 574880: loss 6.3452, time 125.00ms
iter 574890: loss 6.0799, time 124.73ms
iter 574900: loss 6.1333, time 125.36ms
iter 574910: loss 6.4423, time 124.82ms
iter 574920: loss 6.1311, time 125.57ms
iter 574930: loss 6.3857, time 124.83ms
iter 574940: loss 5.9925, time 125.22ms
iter 574950: loss 6.0434, time 124.84ms
iter 574960: loss 6.0329, time 125.06ms
iter 574970: loss 6.1013, time 124.85ms
iter 574980: loss 5.7619, time 125.03ms
iter 574990: loss 5.8893, time 124.83ms
step 575000: train loss 5.5859, val loss 5.5535
saving checkpoint to out-shakespeare-char
iter 575000: loss 5.3409, time 2869.12ms
iter 575010: loss 5.9715, time 124.81ms
iter 575020: loss 5.5930, time 125.21ms
iter 575030: loss 6.0284, time 125.60ms
iter 575040: loss 6.5044, time 124.73ms
iter 575050: loss 5.7999, time 125.04ms
iter 575060: loss 6.4856, time 125.03ms
iter 575070: loss 5.9643, time 125.45ms
iter 575080: loss 5.9223, time 124.98ms
iter 575090: loss 6.3431, time 124.97ms
iter 575100: loss 5.8635, time 124.82ms
iter 575110: loss 5.2416, time 125.55ms
iter 575120: loss 6.6413, time 124.96ms
iter 575130: loss 6.1611, time 124.92ms
iter 575140: loss 5.8064, time 125.03ms
iter 575150: loss 5.7820, time 124.92ms
iter 575160: loss 5.2877, time 124.95ms
iter 575170: loss 6.0793, time 124.79ms
iter 575180: loss 5.4683, time 124.92ms
iter 575190: loss 5.8426, time 124.71ms
iter 575200: loss 6.4308, time 124.60ms
iter 575210: loss 5.6316, time 124.78ms
iter 575220: loss 6.8529, time 124.77ms
iter 575230: loss 7.0489, time 124.80ms
iter 575240: loss 6.0821, time 124.70ms
step 575250: train loss 5.5226, val loss 5.5203
saving checkpoint to out-shakespeare-char
iter 575250: loss 5.7079, time 2882.89ms
iter 575260: loss 6.3874, time 122.73ms
iter 575270: loss 6.4046, time 120.65ms
iter 575280: loss 5.8724, time 121.26ms
iter 575290: loss 5.5001, time 124.22ms
iter 575300: loss 5.5777, time 121.56ms
iter 575310: loss 5.9532, time 121.73ms
iter 575320: loss 5.7406, time 121.38ms
iter 575330: loss 6.2713, time 120.57ms
iter 575340: loss 5.3426, time 121.84ms
iter 575350: loss 5.6227, time 121.39ms
iter 575360: loss 6.1267, time 122.53ms
iter 575370: loss 6.5026, time 121.40ms
iter 575380: loss 5.7776, time 121.89ms
iter 575390: loss 6.3935, time 124.38ms
iter 575400: loss 5.7963, time 121.33ms
iter 575410: loss 5.8642, time 121.31ms
iter 575420: loss 5.7767, time 121.62ms
iter 575430: loss 5.9207, time 121.40ms
iter 575440: loss 6.3727, time 121.59ms
iter 575450: loss 6.3198, time 121.47ms
iter 575460: loss 6.8424, time 122.52ms
iter 575470: loss 6.3864, time 121.35ms
iter 575480: loss 5.6743, time 121.46ms
iter 575490: loss 6.5686, time 122.47ms
step 575500: train loss 5.5037, val loss 5.5501
saving checkpoint to out-shakespeare-char
iter 575500: loss 5.8067, time 2896.89ms
iter 575510: loss 6.4362, time 121.87ms
iter 575520: loss 6.1792, time 121.38ms
iter 575530: loss 6.4871, time 122.51ms
iter 575540: loss 5.0714, time 121.72ms
iter 575550: loss 6.2061, time 120.65ms
iter 575560: loss 6.0374, time 122.57ms
iter 575570: loss 6.4199, time 121.06ms
iter 575580: loss 5.3198, time 121.36ms
iter 575590: loss 6.1971, time 121.41ms
iter 575600: loss 5.8508, time 121.51ms
iter 575610: loss 5.6854, time 121.52ms
iter 575620: loss 5.9952, time 121.59ms
iter 575630: loss 6.3849, time 122.96ms
iter 575640: loss 5.4648, time 120.55ms
iter 575650: loss 5.5692, time 121.58ms
iter 575660: loss 5.9648, time 122.54ms
iter 575670: loss 6.2165, time 121.54ms
iter 575680: loss 6.1826, time 121.73ms
iter 575690: loss 5.7729, time 123.84ms
iter 575700: loss 5.3120, time 121.47ms
iter 575710: loss 6.2129, time 121.39ms
iter 575720: loss 5.2241, time 121.48ms
iter 575730: loss 5.8103, time 122.73ms
iter 575740: loss 6.0075, time 121.00ms
step 575750: train loss 5.5589, val loss 5.5441
saving checkpoint to out-shakespeare-char
iter 575750: loss 5.7523, time 2909.64ms
iter 575760: loss 5.2929, time 121.56ms
iter 575770: loss 5.7870, time 121.78ms
iter 575780: loss 5.5539, time 121.66ms
iter 575790: loss 5.9833, time 121.69ms
iter 575800: loss 6.1777, time 122.72ms
iter 575810: loss 5.6660, time 121.66ms
iter 575820: loss 6.2125, time 121.35ms
iter 575830: loss 6.3134, time 122.73ms
iter 575840: loss 5.7609, time 121.58ms
iter 575850: loss 6.0765, time 121.50ms
iter 575860: loss 6.0939, time 124.41ms
iter 575870: loss 6.7946, time 121.43ms
iter 575880: loss 5.8893, time 121.60ms
iter 575890: loss 6.9364, time 121.41ms
iter 575900: loss 5.4778, time 121.68ms
iter 575910: loss 6.2670, time 121.57ms
iter 575920: loss 6.1004, time 121.46ms
iter 575930: loss 5.5258, time 122.73ms
iter 575940: loss 6.1627, time 122.06ms
iter 575950: loss 6.3497, time 121.32ms
iter 575960: loss 5.5338, time 124.31ms
iter 575970: loss 5.9048, time 121.65ms
iter 575980: loss 6.6782, time 120.83ms
iter 575990: loss 5.9928, time 121.34ms
step 576000: train loss 5.5494, val loss 5.5190
saving checkpoint to out-shakespeare-char
iter 576000: loss 5.8228, time 2913.64ms
iter 576010: loss 6.1088, time 121.85ms
iter 576020: loss 5.7062, time 121.35ms
iter 576030: loss 5.5005, time 121.05ms
iter 576040: loss 5.9716, time 123.02ms
iter 576050: loss 5.9058, time 121.38ms
iter 576060: loss 6.1035, time 122.03ms
iter 576070: loss 5.8010, time 122.85ms
iter 576080: loss 5.4844, time 121.03ms
iter 576090: loss 6.1869, time 120.78ms
iter 576100: loss 6.1592, time 124.23ms
iter 576110: loss 5.9424, time 121.76ms
iter 576120: loss 5.9700, time 121.41ms
iter 576130: loss 5.8170, time 121.47ms
iter 576140: loss 6.4470, time 122.50ms
iter 576150: loss 6.2196, time 121.39ms
iter 576160: loss 6.3592, time 121.99ms
iter 576170: loss 6.1452, time 123.01ms
iter 576180: loss 5.9600, time 121.41ms
iter 576190: loss 6.2350, time 121.54ms
iter 576200: loss 6.0430, time 120.88ms
iter 576210: loss 6.0379, time 121.40ms
iter 576220: loss 5.9597, time 120.62ms
iter 576230: loss 5.3835, time 121.39ms
iter 576240: loss 5.9213, time 122.76ms
step 576250: train loss 5.5012, val loss 5.5585
saving checkpoint to out-shakespeare-char
iter 576250: loss 5.7821, time 2898.19ms
iter 576260: loss 6.3478, time 121.48ms
iter 576270: loss 6.4388, time 124.42ms
iter 576280: loss 6.5310, time 120.85ms
iter 576290: loss 6.2189, time 121.67ms
iter 576300: loss 5.9470, time 121.88ms
iter 576310: loss 5.6013, time 121.52ms
iter 576320: loss 6.2497, time 121.81ms
iter 576330: loss 5.5466, time 121.44ms
iter 576340: loss 5.8894, time 122.97ms
iter 576350: loss 5.7165, time 121.45ms
iter 576360: loss 5.7281, time 121.39ms
iter 576370: loss 5.9380, time 122.90ms
iter 576380: loss 5.8951, time 121.39ms
iter 576390: loss 5.8128, time 120.71ms
iter 576400: loss 6.4160, time 124.24ms
iter 576410: loss 5.8422, time 121.51ms
iter 576420: loss 6.1677, time 121.51ms
iter 576430: loss 6.1968, time 121.60ms
iter 576440: loss 5.6541, time 121.35ms
iter 576450: loss 5.8795, time 120.70ms
iter 576460: loss 6.2732, time 121.57ms
iter 576470: loss 6.4440, time 121.60ms
iter 576480: loss 5.4292, time 121.48ms
iter 576490: loss 5.9156, time 121.42ms
step 576500: train loss 5.5418, val loss 5.5760
saving checkpoint to out-shakespeare-char
iter 576500: loss 5.9373, time 2895.93ms
iter 576510: loss 6.1104, time 121.76ms
iter 576520: loss 6.5728, time 121.63ms
iter 576530: loss 6.0250, time 121.56ms
iter 576540: loss 6.8032, time 122.93ms
iter 576550: loss 5.5877, time 121.65ms
iter 576560: loss 6.1993, time 121.61ms
iter 576570: loss 6.2194, time 122.60ms
iter 576580: loss 6.3769, time 122.21ms
iter 576590: loss 6.5752, time 121.67ms
iter 576600: loss 5.7196, time 124.06ms
iter 576610: loss 6.1208, time 120.31ms
iter 576620: loss 5.9217, time 121.67ms
iter 576630: loss 6.1240, time 122.70ms
iter 576640: loss 5.2239, time 121.53ms
iter 576650: loss 6.4646, time 121.30ms
iter 576660: loss 6.3807, time 120.89ms
iter 576670: loss 5.7581, time 122.57ms
iter 576680: loss 6.1951, time 121.58ms
iter 576690: loss 5.8746, time 121.40ms
iter 576700: loss 5.4602, time 122.68ms
iter 576710: loss 6.1625, time 121.33ms
iter 576720: loss 5.4012, time 121.47ms
iter 576730: loss 5.9183, time 124.14ms
iter 576740: loss 5.9314, time 121.81ms
step 576750: train loss 5.5506, val loss 5.5130
saving checkpoint to out-shakespeare-char
iter 576750: loss 6.2997, time 2900.17ms
iter 576760: loss 5.7811, time 122.25ms
iter 576770: loss 6.8113, time 119.68ms
iter 576780: loss 5.7153, time 120.12ms
iter 576790: loss 4.9863, time 119.69ms
iter 576800: loss 6.3953, time 119.86ms
iter 576810: loss 6.2128, time 120.02ms
iter 576820: loss 6.1183, time 119.49ms
iter 576830: loss 5.8592, time 120.81ms
iter 576840: loss 6.2574, time 119.73ms
iter 576850: loss 6.3443, time 119.95ms
iter 576860: loss 5.6354, time 119.96ms
iter 576870: loss 6.2318, time 120.99ms
iter 576880: loss 5.6658, time 119.80ms
iter 576890: loss 6.2241, time 119.82ms
iter 576900: loss 6.0110, time 122.34ms
iter 576910: loss 6.2416, time 120.03ms
iter 576920: loss 6.2918, time 119.92ms
iter 576930: loss 6.4861, time 119.67ms
iter 576940: loss 6.0858, time 121.16ms
iter 576950: loss 6.6818, time 119.81ms
iter 576960: loss 6.3933, time 120.08ms
iter 576970: loss 5.7132, time 119.83ms
iter 576980: loss 5.4884, time 119.70ms
iter 576990: loss 5.8895, time 119.64ms
step 577000: train loss 5.5512, val loss 5.5208
saving checkpoint to out-shakespeare-char
iter 577000: loss 6.2491, time 2899.76ms
iter 577010: loss 5.7833, time 121.47ms
iter 577020: loss 5.9333, time 120.03ms
iter 577030: loss 5.5872, time 119.70ms
iter 577040: loss 5.9514, time 122.86ms
iter 577050: loss 5.3489, time 121.47ms
iter 577060: loss 5.5915, time 121.65ms
iter 577070: loss 5.8991, time 122.62ms
iter 577080: loss 5.6220, time 121.69ms
iter 577090: loss 6.2096, time 121.37ms
iter 577100: loss 6.1413, time 121.61ms
iter 577110: loss 6.3008, time 122.81ms
iter 577120: loss 5.9620, time 122.28ms
iter 577130: loss 5.9917, time 121.51ms
iter 577140: loss 5.3757, time 122.60ms
iter 577150: loss 5.5726, time 121.21ms
iter 577160: loss 6.5626, time 121.33ms
iter 577170: loss 5.7568, time 124.21ms
iter 577180: loss 6.3121, time 121.31ms
iter 577190: loss 5.5239, time 121.80ms
iter 577200: loss 5.8479, time 121.58ms
iter 577210: loss 5.8672, time 121.61ms
iter 577220: loss 6.5703, time 121.62ms
iter 577230: loss 6.2941, time 121.55ms
iter 577240: loss 5.9004, time 122.75ms
step 577250: train loss 5.5305, val loss 5.5245
saving checkpoint to out-shakespeare-char
iter 577250: loss 5.8144, time 2909.98ms
iter 577260: loss 5.8718, time 121.46ms
iter 577270: loss 5.6609, time 124.08ms
iter 577280: loss 6.4178, time 121.34ms
iter 577290: loss 6.6198, time 121.50ms
iter 577300: loss 5.6889, time 121.39ms
iter 577310: loss 5.8914, time 121.53ms
iter 577320: loss 6.0613, time 121.59ms
iter 577330: loss 6.4968, time 120.64ms
iter 577340: loss 5.2738, time 122.59ms
iter 577350: loss 5.1940, time 120.56ms
iter 577360: loss 5.7535, time 121.32ms
iter 577370: loss 5.0469, time 122.69ms
iter 577380: loss 6.0180, time 121.26ms
iter 577390: loss 6.0849, time 121.25ms
iter 577400: loss 6.8688, time 124.12ms
iter 577410: loss 5.8201, time 121.72ms
iter 577420: loss 5.7112, time 121.46ms
iter 577430: loss 6.0013, time 121.26ms
iter 577440: loss 6.0835, time 121.60ms
iter 577450: loss 6.2620, time 121.50ms
iter 577460: loss 6.5698, time 121.41ms
iter 577470: loss 5.7852, time 122.65ms
iter 577480: loss 5.6970, time 121.99ms
iter 577490: loss 6.4483, time 121.61ms
step 577500: train loss 5.5084, val loss 5.5344
saving checkpoint to out-shakespeare-char
iter 577500: loss 6.0674, time 2912.85ms
iter 577510: loss 5.7644, time 120.89ms
iter 577520: loss 5.6231, time 121.47ms
iter 577530: loss 5.9523, time 124.16ms
iter 577540: loss 5.3279, time 121.49ms
iter 577550: loss 5.1821, time 121.48ms
iter 577560: loss 5.2078, time 124.72ms
iter 577570: loss 5.0371, time 124.74ms
iter 577580: loss 5.7636, time 124.42ms
iter 577590: loss 5.8900, time 124.92ms
iter 577600: loss 6.0274, time 124.29ms
iter 577610: loss 6.4948, time 124.85ms
iter 577620: loss 6.4431, time 124.45ms
iter 577630: loss 5.7145, time 124.55ms
iter 577640: loss 5.4649, time 124.93ms
iter 577650: loss 5.1169, time 124.93ms
iter 577660: loss 6.1091, time 124.10ms
iter 577670: loss 6.1999, time 124.72ms
iter 577680: loss 5.5726, time 123.92ms
iter 577690: loss 5.9434, time 124.56ms
iter 577700: loss 5.7232, time 124.86ms
iter 577710: loss 5.6881, time 124.91ms
iter 577720: loss 5.7119, time 125.28ms
iter 577730: loss 6.1868, time 125.00ms
iter 577740: loss 6.2167, time 124.92ms
step 577750: train loss 5.4939, val loss 5.5321
saving checkpoint to out-shakespeare-char
iter 577750: loss 5.3624, time 2889.68ms
iter 577760: loss 5.6993, time 124.80ms
iter 577770: loss 5.6757, time 125.60ms
iter 577780: loss 6.1488, time 125.55ms
iter 577790: loss 5.9417, time 124.59ms
iter 577800: loss 6.4103, time 125.60ms
iter 577810: loss 5.1547, time 124.63ms
iter 577820: loss 6.3546, time 125.43ms
iter 577830: loss 5.8189, time 125.87ms
iter 577840: loss 6.0484, time 125.08ms
iter 577850: loss 5.4532, time 125.16ms
iter 577860: loss 5.5496, time 124.91ms
iter 577870: loss 6.0653, time 125.23ms
iter 577880: loss 5.4719, time 124.84ms
iter 577890: loss 6.0077, time 125.04ms
iter 577900: loss 6.2594, time 124.96ms
iter 577910: loss 6.6673, time 125.94ms
iter 577920: loss 5.2651, time 124.83ms
iter 577930: loss 6.1216, time 125.18ms
iter 577940: loss 5.4034, time 125.02ms
iter 577950: loss 6.3588, time 124.95ms
iter 577960: loss 5.6775, time 124.89ms
iter 577970: loss 5.7205, time 124.89ms
iter 577980: loss 6.5330, time 124.62ms
iter 577990: loss 6.0296, time 124.97ms
step 578000: train loss 5.5507, val loss 5.5499
saving checkpoint to out-shakespeare-char
iter 578000: loss 5.7857, time 2907.02ms
iter 578010: loss 5.6259, time 124.84ms
iter 578020: loss 5.9238, time 124.87ms
iter 578030: loss 6.2304, time 125.02ms
iter 578040: loss 5.7110, time 124.62ms
iter 578050: loss 5.5932, time 125.24ms
iter 578060: loss 6.4700, time 125.23ms
iter 578070: loss 6.1365, time 126.56ms
iter 578080: loss 6.0658, time 125.43ms
iter 578090: loss 5.4772, time 125.28ms
iter 578100: loss 5.8032, time 125.01ms
iter 578110: loss 6.4653, time 125.14ms
iter 578120: loss 5.2320, time 125.11ms
iter 578130: loss 6.0628, time 124.72ms
iter 578140: loss 5.6225, time 125.05ms
iter 578150: loss 6.1490, time 124.85ms
iter 578160: loss 6.6748, time 124.81ms
iter 578170: loss 6.6443, time 125.10ms
iter 578180: loss 6.0390, time 124.85ms
iter 578190: loss 5.8338, time 124.66ms
iter 578200: loss 5.8700, time 125.24ms
iter 578210: loss 6.4691, time 124.54ms
iter 578220: loss 5.9566, time 125.35ms
iter 578230: loss 5.8722, time 125.23ms
iter 578240: loss 5.9427, time 125.14ms
step 578250: train loss 5.5463, val loss 5.4845
saving checkpoint to out-shakespeare-char
iter 578250: loss 5.7374, time 2886.82ms
iter 578260: loss 6.0528, time 125.14ms
iter 578270: loss 5.7209, time 124.98ms
iter 578280: loss 5.7002, time 124.54ms
iter 578290: loss 5.9896, time 125.67ms
iter 578300: loss 5.5989, time 125.08ms
iter 578310: loss 5.8913, time 124.94ms
iter 578320: loss 5.9416, time 125.13ms
iter 578330: loss 6.0717, time 125.29ms
iter 578340: loss 6.3429, time 124.76ms
iter 578350: loss 6.3318, time 125.15ms
iter 578360: loss 6.2759, time 124.96ms
iter 578370: loss 6.1518, time 124.99ms
iter 578380: loss 5.7791, time 125.52ms
iter 578390: loss 5.7347, time 125.29ms
iter 578400: loss 5.5593, time 124.84ms
iter 578410: loss 5.7984, time 125.30ms
iter 578420: loss 5.2525, time 126.62ms
iter 578430: loss 5.7238, time 124.85ms
iter 578440: loss 6.6215, time 124.87ms
iter 578450: loss 6.2572, time 125.06ms
iter 578460: loss 6.5472, time 125.33ms
iter 578470: loss 6.3227, time 124.70ms
iter 578480: loss 5.9736, time 125.07ms
iter 578490: loss 5.6452, time 124.92ms
step 578500: train loss 5.5329, val loss 5.6016
saving checkpoint to out-shakespeare-char
iter 578500: loss 5.9889, time 2904.08ms
iter 578510: loss 6.5469, time 123.05ms
iter 578520: loss 5.8892, time 128.80ms
iter 578530: loss 6.2033, time 125.05ms
iter 578540: loss 6.4175, time 125.81ms
iter 578550: loss 5.8840, time 126.05ms
iter 578560: loss 5.9508, time 128.97ms
iter 578570: loss 6.0468, time 125.78ms
iter 578580: loss 5.6305, time 128.31ms
iter 578590: loss 5.7628, time 125.21ms
iter 578600: loss 6.2186, time 127.87ms
iter 578610: loss 6.3505, time 125.32ms
iter 578620: loss 5.8643, time 127.96ms
iter 578630: loss 5.7707, time 125.24ms
iter 578640: loss 6.6077, time 127.51ms
iter 578650: loss 6.6032, time 124.86ms
iter 578660: loss 6.1886, time 127.48ms
iter 578670: loss 5.9060, time 124.77ms
iter 578680: loss 6.7651, time 127.64ms
iter 578690: loss 5.4214, time 124.74ms
iter 578700: loss 5.9462, time 127.28ms
iter 578710: loss 5.9195, time 124.95ms
iter 578720: loss 5.3743, time 124.81ms
iter 578730: loss 6.3245, time 124.87ms
iter 578740: loss 6.0617, time 125.39ms
step 578750: train loss 5.5455, val loss 5.4975
saving checkpoint to out-shakespeare-char
iter 578750: loss 5.9912, time 2894.55ms
iter 578760: loss 5.9337, time 125.34ms
iter 578770: loss 5.6904, time 125.48ms
iter 578780: loss 6.7185, time 125.50ms
iter 578790: loss 6.1956, time 126.53ms
iter 578800: loss 5.6966, time 125.77ms
iter 578810: loss 6.0158, time 125.90ms
iter 578820: loss 5.4559, time 125.81ms
iter 578830: loss 6.2310, time 126.22ms
iter 578840: loss 5.7632, time 125.72ms
iter 578850: loss 5.8766, time 125.66ms
iter 578860: loss 6.1303, time 125.69ms
iter 578870: loss 5.6118, time 125.87ms
iter 578880: loss 6.2501, time 125.86ms
iter 578890: loss 5.8441, time 125.81ms
iter 578900: loss 6.0899, time 125.89ms
iter 578910: loss 5.8913, time 125.97ms
iter 578920: loss 5.9137, time 125.72ms
iter 578930: loss 5.9606, time 125.95ms
iter 578940: loss 5.7604, time 124.99ms
iter 578950: loss 6.0723, time 125.68ms
iter 578960: loss 5.2393, time 125.03ms
iter 578970: loss 5.3909, time 125.64ms
iter 578980: loss 6.0577, time 125.35ms
iter 578990: loss 5.8976, time 125.73ms
step 579000: train loss 5.4682, val loss 5.5300
saving checkpoint to out-shakespeare-char
iter 579000: loss 6.1135, time 2881.87ms
iter 579010: loss 6.2044, time 124.13ms
iter 579020: loss 5.6174, time 121.63ms
iter 579030: loss 5.8464, time 121.50ms
iter 579040: loss 5.6544, time 121.47ms
iter 579050: loss 5.1717, time 121.34ms
iter 579060: loss 6.0401, time 121.51ms
iter 579070: loss 6.2975, time 121.46ms
iter 579080: loss 6.0060, time 122.67ms
iter 579090: loss 5.6989, time 121.55ms
iter 579100: loss 6.2767, time 121.05ms
iter 579110: loss 5.8949, time 122.86ms
iter 579120: loss 5.2929, time 121.44ms
iter 579130: loss 5.5474, time 121.48ms
iter 579140: loss 5.1515, time 123.99ms
iter 579150: loss 5.1487, time 121.65ms
iter 579160: loss 5.6119, time 121.49ms
iter 579170: loss 5.5720, time 121.37ms
iter 579180: loss 5.6836, time 122.70ms
iter 579190: loss 6.5268, time 121.34ms
iter 579200: loss 5.9690, time 122.78ms
iter 579210: loss 5.3495, time 124.35ms
iter 579220: loss 6.1771, time 121.39ms
iter 579230: loss 6.1578, time 121.46ms
iter 579240: loss 5.5773, time 121.34ms
step 579250: train loss 5.5315, val loss 5.5400
saving checkpoint to out-shakespeare-char
iter 579250: loss 5.9186, time 2905.52ms
iter 579260: loss 5.6198, time 121.69ms
iter 579270: loss 5.3667, time 121.68ms
iter 579280: loss 5.6247, time 121.57ms
iter 579290: loss 5.5553, time 122.71ms
iter 579300: loss 5.9115, time 121.59ms
iter 579310: loss 5.4831, time 121.87ms
iter 579320: loss 6.1191, time 123.94ms
iter 579330: loss 6.1815, time 121.61ms
iter 579340: loss 5.5894, time 121.07ms
iter 579350: loss 5.7545, time 124.05ms
iter 579360: loss 6.0788, time 121.18ms
iter 579370: loss 6.4763, time 125.66ms
iter 579380: loss 5.0395, time 127.82ms
iter 579390: loss 6.4526, time 125.69ms
iter 579400: loss 6.2778, time 128.43ms
iter 579410: loss 5.8641, time 125.80ms
iter 579420: loss 5.6226, time 128.28ms
iter 579430: loss 6.1898, time 125.77ms
iter 579440: loss 5.4220, time 127.45ms
iter 579450: loss 5.5883, time 125.26ms
iter 579460: loss 5.9126, time 127.62ms
iter 579470: loss 6.5580, time 125.22ms
iter 579480: loss 5.9161, time 127.83ms
iter 579490: loss 5.3206, time 125.47ms
step 579500: train loss 5.5778, val loss 5.5674
saving checkpoint to out-shakespeare-char
iter 579500: loss 6.4217, time 2865.22ms
iter 579510: loss 6.0384, time 126.06ms
iter 579520: loss 5.9067, time 125.80ms
iter 579530: loss 5.7732, time 125.92ms
iter 579540: loss 5.9375, time 125.82ms
iter 579550: loss 6.0059, time 126.05ms
iter 579560: loss 5.8678, time 125.75ms
iter 579570: loss 6.5867, time 124.95ms
iter 579580: loss 6.4099, time 124.05ms
iter 579590: loss 5.5987, time 126.29ms
iter 579600: loss 6.4762, time 125.69ms
iter 579610: loss 6.6505, time 125.64ms
iter 579620: loss 5.4050, time 125.63ms
iter 579630: loss 6.5064, time 124.76ms
iter 579640: loss 6.2241, time 125.33ms
iter 579650: loss 6.2119, time 124.89ms
iter 579660: loss 5.7330, time 124.78ms
iter 579670: loss 5.9066, time 124.49ms
iter 579680: loss 6.2926, time 124.85ms
iter 579690: loss 5.9972, time 124.94ms
iter 579700: loss 5.4324, time 124.91ms
iter 579710: loss 6.2935, time 125.04ms
iter 579720: loss 6.1540, time 125.13ms
iter 579730: loss 5.1525, time 124.91ms
iter 579740: loss 6.5826, time 124.77ms
step 579750: train loss 5.5668, val loss 5.5692
saving checkpoint to out-shakespeare-char
iter 579750: loss 5.0467, time 2908.36ms
iter 579760: loss 5.3496, time 124.24ms
iter 579770: loss 5.4477, time 125.45ms
iter 579780: loss 5.9633, time 125.02ms
iter 579790: loss 5.8703, time 125.35ms
iter 579800: loss 6.7684, time 124.98ms
iter 579810: loss 5.4797, time 124.35ms
iter 579820: loss 6.1343, time 125.12ms
iter 579830: loss 6.5979, time 125.18ms
iter 579840: loss 6.0412, time 125.26ms
iter 579850: loss 6.3962, time 125.62ms
iter 579860: loss 6.2770, time 125.28ms
iter 579870: loss 5.8608, time 125.38ms
iter 579880: loss 5.4011, time 125.56ms
iter 579890: loss 5.3143, time 125.32ms
iter 579900: loss 6.0241, time 125.12ms
iter 579910: loss 6.2895, time 125.17ms
iter 579920: loss 6.1348, time 125.65ms
iter 579930: loss 5.3845, time 125.34ms
iter 579940: loss 6.0817, time 124.93ms
iter 579950: loss 6.0079, time 125.26ms
iter 579960: loss 5.9505, time 125.37ms
iter 579970: loss 6.3834, time 125.27ms
iter 579980: loss 5.5683, time 124.82ms
iter 579990: loss 6.0106, time 125.07ms
step 580000: train loss 5.5550, val loss 5.5363
saving checkpoint to out-shakespeare-char
iter 580000: loss 6.0450, time 2904.15ms
iter 580010: loss 6.9515, time 128.11ms
iter 580020: loss 5.2041, time 125.47ms
iter 580030: loss 5.7853, time 127.91ms
iter 580040: loss 6.1416, time 125.23ms
iter 580050: loss 5.4765, time 127.89ms
iter 580060: loss 5.7797, time 125.18ms
iter 580070: loss 6.5330, time 127.87ms
iter 580080: loss 6.5495, time 125.20ms
iter 580090: loss 5.8736, time 128.31ms
iter 580100: loss 6.0800, time 125.36ms
iter 580110: loss 6.0562, time 127.81ms
iter 580120: loss 5.5026, time 125.39ms
iter 580130: loss 6.4435, time 128.05ms
iter 580140: loss 5.9992, time 125.19ms
iter 580150: loss 6.6675, time 127.66ms
iter 580160: loss 6.5124, time 125.19ms
iter 580170: loss 6.1694, time 127.82ms
iter 580180: loss 5.3104, time 125.07ms
iter 580190: loss 5.1976, time 127.77ms
iter 580200: loss 5.6097, time 125.55ms
iter 580210: loss 5.5632, time 127.66ms
iter 580220: loss 6.9514, time 125.23ms
iter 580230: loss 6.0814, time 127.72ms
iter 580240: loss 5.5532, time 125.13ms
step 580250: train loss 5.5252, val loss 5.5763
saving checkpoint to out-shakespeare-char
iter 580250: loss 5.1895, time 2900.67ms
iter 580260: loss 6.2115, time 122.59ms
iter 580270: loss 6.0862, time 121.55ms
iter 580280: loss 5.1875, time 121.48ms
iter 580290: loss 5.7535, time 122.72ms
iter 580300: loss 5.6784, time 121.33ms
iter 580310: loss 5.9333, time 121.63ms
iter 580320: loss 5.5423, time 124.09ms
iter 580330: loss 6.6063, time 121.61ms
iter 580340: loss 5.8276, time 121.57ms
iter 580350: loss 5.7315, time 121.44ms
iter 580360: loss 6.3255, time 121.96ms
iter 580370: loss 5.5481, time 121.70ms
iter 580380: loss 6.2331, time 121.52ms
iter 580390: loss 5.9997, time 122.75ms
iter 580400: loss 6.1220, time 120.65ms
iter 580410: loss 5.7282, time 121.48ms
iter 580420: loss 6.1753, time 124.24ms
iter 580430: loss 6.4566, time 121.60ms
iter 580440: loss 6.6119, time 122.26ms
iter 580450: loss 5.7513, time 120.74ms
iter 580460: loss 5.8283, time 121.47ms
iter 580470: loss 6.0541, time 121.46ms
iter 580480: loss 6.1335, time 121.52ms
iter 580490: loss 6.1417, time 121.90ms
step 580500: train loss 5.5272, val loss 5.5272
saving checkpoint to out-shakespeare-char
iter 580500: loss 6.2528, time 2893.19ms
iter 580510: loss 5.8806, time 121.57ms
iter 580520: loss 5.5895, time 121.14ms
iter 580530: loss 5.5660, time 121.04ms
iter 580540: loss 5.7961, time 122.07ms
iter 580550: loss 6.8979, time 121.31ms
iter 580560: loss 6.0809, time 123.77ms
iter 580570: loss 6.1193, time 121.37ms
iter 580580: loss 6.3764, time 121.42ms
iter 580590: loss 6.0882, time 121.22ms
iter 580600: loss 5.6405, time 122.59ms
iter 580610: loss 5.2251, time 121.32ms
iter 580620: loss 5.7946, time 121.53ms
iter 580630: loss 5.9714, time 122.46ms
iter 580640: loss 6.5124, time 120.82ms
iter 580650: loss 5.7977, time 121.73ms
iter 580660: loss 5.9334, time 121.40ms
iter 580670: loss 5.0941, time 121.40ms
iter 580680: loss 5.8814, time 122.90ms
iter 580690: loss 5.6443, time 120.56ms
iter 580700: loss 5.8287, time 122.44ms
iter 580710: loss 6.4349, time 122.34ms
iter 580720: loss 6.1613, time 121.34ms
iter 580730: loss 6.0567, time 122.22ms
iter 580740: loss 5.6779, time 121.55ms
step 580750: train loss 5.4708, val loss 5.5554
saving checkpoint to out-shakespeare-char
iter 580750: loss 5.8620, time 2912.48ms
iter 580760: loss 5.7345, time 124.05ms
iter 580770: loss 5.8803, time 121.88ms
iter 580780: loss 5.8824, time 121.38ms
iter 580790: loss 6.7028, time 121.92ms
iter 580800: loss 6.0742, time 121.29ms
iter 580810: loss 5.9515, time 121.37ms
iter 580820: loss 6.1057, time 121.39ms
iter 580830: loss 6.4098, time 122.48ms
iter 580840: loss 5.4511, time 121.42ms
iter 580850: loss 5.4095, time 121.96ms
iter 580860: loss 5.9949, time 122.39ms
iter 580870: loss 6.2019, time 121.66ms
iter 580880: loss 6.2948, time 122.01ms
iter 580890: loss 5.8890, time 121.45ms
iter 580900: loss 6.2921, time 121.96ms
iter 580910: loss 6.4305, time 121.57ms
iter 580920: loss 6.6722, time 121.46ms
iter 580930: loss 5.2059, time 123.74ms
iter 580940: loss 5.7918, time 122.92ms
iter 580950: loss 5.8812, time 121.72ms
iter 580960: loss 4.8945, time 122.37ms
iter 580970: loss 6.0679, time 121.75ms
iter 580980: loss 5.9677, time 121.69ms
iter 580990: loss 5.6729, time 122.44ms
step 581000: train loss 5.5451, val loss 5.5219
saving checkpoint to out-shakespeare-char
iter 581000: loss 5.2836, time 2910.69ms
iter 581010: loss 6.1396, time 121.73ms
iter 581020: loss 6.2835, time 121.75ms
iter 581030: loss 6.1846, time 121.85ms
iter 581040: loss 6.4768, time 121.73ms
iter 581050: loss 6.7400, time 121.66ms
iter 581060: loss 5.8962, time 121.93ms
iter 581070: loss 5.5627, time 122.92ms
iter 581080: loss 5.5923, time 121.84ms
iter 581090: loss 6.7363, time 121.83ms
iter 581100: loss 6.4808, time 123.13ms
iter 581110: loss 6.1416, time 121.82ms
iter 581120: loss 6.1800, time 121.85ms
iter 581130: loss 6.1371, time 124.32ms
iter 581140: loss 6.4554, time 121.90ms
iter 581150: loss 6.3234, time 121.79ms
iter 581160: loss 6.0370, time 121.92ms
iter 581170: loss 5.9231, time 121.86ms
iter 581180: loss 6.0133, time 121.79ms
iter 581190: loss 5.9205, time 121.88ms
iter 581200: loss 6.3190, time 123.30ms
iter 581210: loss 5.7478, time 122.20ms
iter 581220: loss 6.2208, time 121.79ms
iter 581230: loss 5.9058, time 123.16ms
iter 581240: loss 6.2123, time 121.36ms
step 581250: train loss 5.4975, val loss 5.5093
saving checkpoint to out-shakespeare-char
iter 581250: loss 6.4705, time 2910.94ms
iter 581260: loss 5.5689, time 125.59ms
iter 581270: loss 6.2151, time 125.76ms
iter 581280: loss 5.5097, time 125.92ms
iter 581290: loss 5.6456, time 126.41ms
iter 581300: loss 5.8387, time 123.00ms
iter 581310: loss 5.7835, time 121.81ms
iter 581320: loss 6.1008, time 121.79ms
iter 581330: loss 5.3633, time 124.34ms
iter 581340: loss 6.3865, time 121.84ms
iter 581350: loss 4.8910, time 121.66ms
iter 581360: loss 6.8855, time 121.63ms
iter 581370: loss 5.3113, time 121.82ms
iter 581380: loss 5.7338, time 121.17ms
iter 581390: loss 6.3931, time 122.02ms
iter 581400: loss 6.4539, time 124.46ms
iter 581410: loss 6.5578, time 122.11ms
iter 581420: loss 5.4355, time 121.83ms
iter 581430: loss 5.8361, time 121.87ms
iter 581440: loss 5.9608, time 121.62ms
iter 581450: loss 5.7794, time 126.74ms
iter 581460: loss 6.1037, time 125.82ms
iter 581470: loss 6.5259, time 125.75ms
iter 581480: loss 6.0369, time 124.82ms
iter 581490: loss 6.3918, time 125.59ms
step 581500: train loss 5.5579, val loss 5.5321
saving checkpoint to out-shakespeare-char
iter 581500: loss 6.5219, time 2903.12ms
iter 581510: loss 5.2765, time 125.76ms
iter 581520: loss 5.9443, time 124.67ms
iter 581530: loss 6.0116, time 125.43ms
iter 581540: loss 5.7614, time 125.39ms
iter 581550: loss 6.3595, time 125.67ms
iter 581560: loss 5.9205, time 125.26ms
iter 581570: loss 5.8375, time 125.14ms
iter 581580: loss 6.2260, time 125.48ms
iter 581590: loss 6.0193, time 125.59ms
iter 581600: loss 6.5041, time 126.75ms
iter 581610: loss 6.3447, time 125.37ms
iter 581620: loss 5.5801, time 125.49ms
iter 581630: loss 5.4272, time 125.16ms
iter 581640: loss 5.7954, time 124.57ms
iter 581650: loss 5.3461, time 125.46ms
iter 581660: loss 5.6601, time 125.49ms
iter 581670: loss 5.9414, time 125.29ms
iter 581680: loss 6.2352, time 126.93ms
iter 581690: loss 5.3182, time 121.24ms
iter 581700: loss 6.5292, time 121.48ms
iter 581710: loss 5.9934, time 122.12ms
iter 581720: loss 5.9515, time 121.35ms
iter 581730: loss 6.0646, time 122.42ms
iter 581740: loss 6.0382, time 121.38ms
step 581750: train loss 5.5145, val loss 5.5417
saving checkpoint to out-shakespeare-char
iter 581750: loss 6.2314, time 2897.87ms
iter 581760: loss 5.5171, time 125.51ms
iter 581770: loss 5.9413, time 125.44ms
iter 581780: loss 5.6240, time 126.23ms
iter 581790: loss 6.1829, time 127.15ms
iter 581800: loss 5.8578, time 125.07ms
iter 581810: loss 5.5634, time 125.51ms
iter 581820: loss 5.7813, time 124.91ms
iter 581830: loss 6.1898, time 126.11ms
iter 581840: loss 6.2036, time 124.97ms
iter 581850: loss 6.5280, time 125.40ms
iter 581860: loss 5.9775, time 125.70ms
iter 581870: loss 5.9682, time 125.41ms
iter 581880: loss 6.1117, time 128.86ms
iter 581890: loss 5.6220, time 124.53ms
iter 581900: loss 6.3973, time 126.96ms
iter 581910: loss 5.1836, time 125.81ms
iter 581920: loss 5.9073, time 125.74ms
iter 581930: loss 5.5059, time 125.80ms
iter 581940: loss 6.3108, time 124.33ms
iter 581950: loss 6.0452, time 125.62ms
iter 581960: loss 5.6129, time 125.60ms
iter 581970: loss 5.9342, time 125.84ms
iter 581980: loss 5.9463, time 125.92ms
iter 581990: loss 6.4609, time 126.33ms
step 582000: train loss 5.5576, val loss 5.5294
saving checkpoint to out-shakespeare-char
iter 582000: loss 5.7834, time 2906.04ms
iter 582010: loss 6.1713, time 125.08ms
iter 582020: loss 5.7667, time 124.96ms
iter 582030: loss 5.4064, time 125.01ms
iter 582040: loss 5.9514, time 124.88ms
iter 582050: loss 5.8213, time 125.20ms
iter 582060: loss 6.0370, time 125.00ms
iter 582070: loss 5.8228, time 125.32ms
iter 582080: loss 5.8367, time 124.72ms
iter 582090: loss 6.3972, time 126.60ms
iter 582100: loss 6.3726, time 125.18ms
iter 582110: loss 6.5119, time 126.56ms
iter 582120: loss 6.1289, time 125.14ms
iter 582130: loss 6.3164, time 125.76ms
iter 582140: loss 5.7776, time 125.28ms
iter 582150: loss 5.6847, time 126.56ms
iter 582160: loss 6.1576, time 124.60ms
iter 582170: loss 6.4183, time 125.08ms
iter 582180: loss 5.6711, time 128.41ms
iter 582190: loss 5.4682, time 126.34ms
iter 582200: loss 6.1791, time 124.79ms
iter 582210: loss 6.2080, time 125.55ms
iter 582220: loss 5.9114, time 124.65ms
iter 582230: loss 6.3715, time 124.69ms
iter 582240: loss 6.1205, time 124.96ms
step 582250: train loss 5.5023, val loss 5.5087
saving checkpoint to out-shakespeare-char
iter 582250: loss 5.9189, time 2885.88ms
iter 582260: loss 5.6735, time 125.28ms
iter 582270: loss 6.8336, time 125.29ms
iter 582280: loss 6.1550, time 124.69ms
iter 582290: loss 6.4602, time 126.92ms
iter 582300: loss 6.2949, time 125.21ms
iter 582310: loss 6.6531, time 127.15ms
iter 582320: loss 5.8882, time 125.86ms
iter 582330: loss 6.2194, time 125.65ms
iter 582340: loss 5.4873, time 126.01ms
iter 582350: loss 6.5691, time 125.15ms
iter 582360: loss 6.1104, time 125.74ms
iter 582370: loss 6.0280, time 125.52ms
iter 582380: loss 5.7699, time 125.49ms
iter 582390: loss 5.9175, time 125.38ms
iter 582400: loss 5.6621, time 127.11ms
iter 582410: loss 5.8770, time 126.45ms
iter 582420: loss 6.9563, time 125.62ms
iter 582430: loss 6.8291, time 125.84ms
iter 582440: loss 5.4812, time 125.79ms
iter 582450: loss 5.9752, time 128.06ms
iter 582460: loss 6.4565, time 125.39ms
iter 582470: loss 6.1041, time 126.08ms
iter 582480: loss 7.3129, time 125.71ms
iter 582490: loss 6.3591, time 126.68ms
step 582500: train loss 5.5048, val loss 5.5402
saving checkpoint to out-shakespeare-char
iter 582500: loss 6.0825, time 2922.77ms
iter 582510: loss 6.0285, time 125.53ms
iter 582520: loss 6.5214, time 128.01ms
iter 582530: loss 6.1678, time 125.72ms
iter 582540: loss 5.1417, time 127.90ms
iter 582550: loss 5.8756, time 125.74ms
iter 582560: loss 5.7643, time 128.42ms
iter 582570: loss 6.0218, time 125.53ms
iter 582580: loss 5.8405, time 128.09ms
iter 582590: loss 6.1529, time 123.66ms
iter 582600: loss 6.2195, time 127.94ms
iter 582610: loss 6.2232, time 124.83ms
iter 582620: loss 5.2781, time 130.67ms
iter 582630: loss 5.4295, time 125.80ms
iter 582640: loss 5.7169, time 124.96ms
iter 582650: loss 6.7533, time 124.35ms
iter 582660: loss 5.8638, time 124.91ms
iter 582670: loss 6.4132, time 123.50ms
iter 582680: loss 6.4698, time 124.39ms
iter 582690: loss 6.3279, time 124.94ms
iter 582700: loss 6.1264, time 125.03ms
iter 582710: loss 6.0153, time 124.98ms
iter 582720: loss 6.1148, time 124.86ms
iter 582730: loss 6.4953, time 125.52ms
iter 582740: loss 5.5651, time 124.89ms
step 582750: train loss 5.5132, val loss 5.5580
saving checkpoint to out-shakespeare-char
iter 582750: loss 5.7900, time 2858.44ms
iter 582760: loss 6.4374, time 125.59ms
iter 582770: loss 5.7241, time 125.17ms
iter 582780: loss 5.9299, time 126.04ms
iter 582790: loss 6.3887, time 124.98ms
iter 582800: loss 5.6812, time 125.66ms
iter 582810: loss 5.6134, time 125.10ms
iter 582820: loss 6.1450, time 125.10ms
iter 582830: loss 5.9347, time 124.82ms
iter 582840: loss 5.4944, time 125.63ms
iter 582850: loss 6.2865, time 125.07ms
iter 582860: loss 6.4847, time 125.82ms
iter 582870: loss 5.2106, time 125.77ms
iter 582880: loss 5.8416, time 125.37ms
iter 582890: loss 5.6107, time 125.67ms
iter 582900: loss 6.0356, time 125.66ms
iter 582910: loss 6.8563, time 124.88ms
iter 582920: loss 6.0994, time 125.46ms
iter 582930: loss 5.7920, time 125.56ms
iter 582940: loss 6.0710, time 125.41ms
iter 582950: loss 5.9511, time 126.02ms
iter 582960: loss 6.3364, time 125.45ms
iter 582970: loss 5.9519, time 125.56ms
iter 582980: loss 6.1855, time 126.34ms
iter 582990: loss 5.5757, time 125.02ms
step 583000: train loss 5.5362, val loss 5.4613
saving checkpoint to out-shakespeare-char
iter 583000: loss 6.0424, time 2882.65ms
iter 583010: loss 6.1147, time 124.90ms
iter 583020: loss 5.7042, time 125.14ms
iter 583030: loss 5.8923, time 125.62ms
iter 583040: loss 6.1470, time 124.88ms
iter 583050: loss 5.7536, time 124.88ms
iter 583060: loss 6.2311, time 124.75ms
iter 583070: loss 6.1367, time 124.95ms
iter 583080: loss 5.7100, time 125.16ms
iter 583090: loss 6.3616, time 125.21ms
iter 583100: loss 6.8529, time 124.79ms
iter 583110: loss 6.0893, time 125.04ms
iter 583120: loss 5.1414, time 123.71ms
iter 583130: loss 5.5613, time 125.01ms
iter 583140: loss 5.9031, time 124.98ms
iter 583150: loss 5.5887, time 124.18ms
iter 583160: loss 5.8621, time 125.18ms
iter 583170: loss 6.1608, time 125.11ms
iter 583180: loss 5.8758, time 124.89ms
iter 583190: loss 5.9912, time 125.24ms
iter 583200: loss 6.0151, time 123.82ms
iter 583210: loss 6.5474, time 124.91ms
iter 583220: loss 5.6222, time 124.85ms
iter 583230: loss 6.0263, time 124.73ms
iter 583240: loss 6.0643, time 125.27ms
step 583250: train loss 5.5040, val loss 5.6123
saving checkpoint to out-shakespeare-char
iter 583250: loss 6.3229, time 2874.65ms
iter 583260: loss 6.4286, time 125.14ms
iter 583270: loss 6.4918, time 124.99ms
iter 583280: loss 5.7973, time 124.82ms
iter 583290: loss 6.3032, time 124.07ms
iter 583300: loss 5.3095, time 124.64ms
iter 583310: loss 5.7570, time 124.97ms
iter 583320: loss 6.4048, time 125.07ms
iter 583330: loss 5.7415, time 124.89ms
iter 583340: loss 5.9111, time 125.00ms
iter 583350: loss 6.6476, time 125.19ms
iter 583360: loss 5.9915, time 124.06ms
iter 583370: loss 5.1125, time 125.22ms
iter 583380: loss 6.4218, time 125.14ms
iter 583390: loss 5.7782, time 125.07ms
iter 583400: loss 6.2388, time 125.25ms
iter 583410: loss 5.6319, time 124.86ms
iter 583420: loss 5.3857, time 124.93ms
iter 583430: loss 6.3721, time 126.43ms
iter 583440: loss 5.6753, time 124.92ms
iter 583450: loss 6.3696, time 126.18ms
iter 583460: loss 5.5018, time 125.48ms
iter 583470: loss 5.8978, time 126.17ms
iter 583480: loss 5.8618, time 125.79ms
iter 583490: loss 6.2249, time 124.78ms
step 583500: train loss 5.5369, val loss 5.5124
saving checkpoint to out-shakespeare-char
iter 583500: loss 5.7815, time 2892.51ms
iter 583510: loss 6.3539, time 124.45ms
iter 583520: loss 5.6662, time 125.22ms
iter 583530: loss 6.4800, time 124.56ms
iter 583540: loss 5.8897, time 125.28ms
iter 583550: loss 6.6326, time 124.51ms
iter 583560: loss 6.3941, time 126.08ms
iter 583570: loss 6.5247, time 124.87ms
iter 583580: loss 5.4361, time 126.07ms
iter 583590: loss 5.5523, time 125.04ms
iter 583600: loss 5.5354, time 125.43ms
iter 583610: loss 6.2601, time 124.30ms
iter 583620: loss 5.2292, time 125.89ms
iter 583630: loss 5.7004, time 125.37ms
iter 583640: loss 5.9661, time 125.95ms
iter 583650: loss 6.4279, time 126.03ms
iter 583660: loss 6.0093, time 126.33ms
iter 583670: loss 5.6616, time 126.43ms
iter 583680: loss 7.2885, time 125.72ms
iter 583690: loss 5.8723, time 128.51ms
iter 583700: loss 6.1634, time 125.72ms
iter 583710: loss 5.7713, time 128.20ms
iter 583720: loss 6.8842, time 125.43ms
iter 583730: loss 5.7521, time 128.53ms
iter 583740: loss 6.1176, time 125.12ms
step 583750: train loss 5.5501, val loss 5.4926
saving checkpoint to out-shakespeare-char
iter 583750: loss 6.0737, time 2898.23ms
iter 583760: loss 6.3021, time 125.89ms
iter 583770: loss 5.6474, time 126.68ms
iter 583780: loss 5.3814, time 126.46ms
iter 583790: loss 6.6167, time 125.85ms
iter 583800: loss 6.5197, time 126.06ms
iter 583810: loss 6.0200, time 125.57ms
iter 583820: loss 5.9341, time 125.60ms
iter 583830: loss 5.8608, time 127.10ms
iter 583840: loss 5.3126, time 126.05ms
iter 583850: loss 6.0461, time 125.42ms
iter 583860: loss 6.2609, time 124.95ms
iter 583870: loss 5.8008, time 125.34ms
iter 583880: loss 5.7594, time 125.51ms
iter 583890: loss 5.6677, time 125.79ms
iter 583900: loss 5.8222, time 125.16ms
iter 583910: loss 6.1459, time 124.98ms
iter 583920: loss 6.5048, time 125.45ms
iter 583930: loss 6.4987, time 125.33ms
iter 583940: loss 6.1368, time 125.59ms
iter 583950: loss 5.7360, time 125.33ms
iter 583960: loss 6.1878, time 125.37ms
iter 583970: loss 6.5973, time 125.37ms
iter 583980: loss 5.7314, time 125.15ms
iter 583990: loss 6.4010, time 125.21ms
step 584000: train loss 5.5381, val loss 5.5305
saving checkpoint to out-shakespeare-char
iter 584000: loss 6.1530, time 2884.00ms
iter 584010: loss 6.2366, time 124.97ms
iter 584020: loss 5.9049, time 125.15ms
iter 584030: loss 5.8305, time 124.62ms
iter 584040: loss 5.6403, time 125.50ms
iter 584050: loss 5.9221, time 125.55ms
iter 584060: loss 5.4521, time 125.59ms
iter 584070: loss 5.2732, time 125.81ms
iter 584080: loss 5.1616, time 126.00ms
iter 584090: loss 6.1576, time 126.60ms
iter 584100: loss 6.0082, time 125.34ms
iter 584110: loss 6.5774, time 125.91ms
iter 584120: loss 5.3568, time 124.68ms
iter 584130: loss 5.7101, time 124.07ms
iter 584140: loss 5.8122, time 124.74ms
iter 584150: loss 6.0373, time 123.63ms
iter 584160: loss 6.4836, time 124.87ms
iter 584170: loss 5.8168, time 124.17ms
iter 584180: loss 6.0976, time 124.25ms
iter 584190: loss 6.3147, time 124.50ms
iter 584200: loss 5.4695, time 124.86ms
iter 584210: loss 6.0983, time 124.03ms
iter 584220: loss 6.1420, time 124.63ms
iter 584230: loss 5.9456, time 125.17ms
iter 584240: loss 6.0529, time 125.17ms
step 584250: train loss 5.5161, val loss 5.5138
saving checkpoint to out-shakespeare-char
iter 584250: loss 5.4529, time 2880.22ms
iter 584260: loss 6.2706, time 123.12ms
iter 584270: loss 6.4791, time 121.43ms
iter 584280: loss 5.5883, time 121.34ms
iter 584290: loss 6.2396, time 124.48ms
iter 584300: loss 5.8248, time 121.63ms
iter 584310: loss 6.0227, time 121.43ms
iter 584320: loss 4.9567, time 121.37ms
iter 584330: loss 5.8354, time 119.81ms
iter 584340: loss 6.7028, time 120.06ms
iter 584350: loss 5.5639, time 120.68ms
iter 584360: loss 5.5302, time 121.07ms
iter 584370: loss 5.8199, time 119.50ms
iter 584380: loss 5.5231, time 120.91ms
iter 584390: loss 6.4280, time 119.50ms
iter 584400: loss 5.9000, time 120.98ms
iter 584410: loss 5.8473, time 119.54ms
iter 584420: loss 7.1399, time 119.62ms
iter 584430: loss 6.4397, time 119.61ms
iter 584440: loss 6.8218, time 120.52ms
iter 584450: loss 5.8068, time 121.53ms
iter 584460: loss 6.2953, time 120.56ms
iter 584470: loss 6.3664, time 119.48ms
iter 584480: loss 5.7366, time 119.51ms
iter 584490: loss 5.3009, time 121.06ms
step 584500: train loss 5.5343, val loss 5.5016
saving checkpoint to out-shakespeare-char
iter 584500: loss 6.2440, time 2885.11ms
iter 584510: loss 6.0247, time 124.11ms
iter 584520: loss 6.6870, time 121.37ms
iter 584530: loss 5.7222, time 122.67ms
iter 584540: loss 5.6302, time 121.77ms
iter 584550: loss 5.9683, time 122.41ms
iter 584560: loss 6.2080, time 122.77ms
iter 584570: loss 5.4181, time 121.74ms
iter 584580: loss 5.7388, time 121.45ms
iter 584590: loss 6.5701, time 123.44ms
iter 584600: loss 6.0907, time 121.10ms
iter 584610: loss 5.2588, time 122.58ms
iter 584620: loss 6.0780, time 124.10ms
iter 584630: loss 5.4531, time 122.71ms
iter 584640: loss 6.1087, time 121.38ms
iter 584650: loss 6.2574, time 121.33ms
iter 584660: loss 5.6692, time 121.96ms
iter 584670: loss 6.3671, time 121.95ms
iter 584680: loss 5.6925, time 121.44ms
iter 584690: loss 5.9905, time 121.46ms
iter 584700: loss 5.5358, time 121.57ms
iter 584710: loss 5.5920, time 121.41ms
iter 584720: loss 5.7659, time 121.53ms
iter 584730: loss 6.2047, time 122.84ms
iter 584740: loss 6.4467, time 121.40ms
step 584750: train loss 5.4967, val loss 5.5480
saving checkpoint to out-shakespeare-char
iter 584750: loss 5.5905, time 2879.78ms
iter 584760: loss 6.5418, time 121.57ms
iter 584770: loss 5.8634, time 121.48ms
iter 584780: loss 5.2757, time 121.20ms
iter 584790: loss 6.7380, time 122.60ms
iter 584800: loss 5.5970, time 124.21ms
iter 584810: loss 5.4173, time 122.20ms
iter 584820: loss 5.7490, time 121.68ms
iter 584830: loss 6.0389, time 122.86ms
iter 584840: loss 5.9493, time 121.46ms
iter 584850: loss 5.9579, time 120.80ms
iter 584860: loss 5.6745, time 121.85ms
iter 584870: loss 5.2990, time 121.60ms
iter 584880: loss 6.3559, time 122.63ms
iter 584890: loss 6.0524, time 121.51ms
iter 584900: loss 5.8991, time 121.54ms
iter 584910: loss 6.5497, time 122.95ms
iter 584920: loss 5.7932, time 121.32ms
iter 584930: loss 6.4112, time 121.35ms
iter 584940: loss 6.1055, time 124.39ms
iter 584950: loss 6.5326, time 121.92ms
iter 584960: loss 5.7595, time 121.52ms
iter 584970: loss 5.8295, time 121.88ms
iter 584980: loss 6.0422, time 122.53ms
iter 584990: loss 6.6146, time 122.05ms
step 585000: train loss 5.5153, val loss 5.5066
saving checkpoint to out-shakespeare-char
iter 585000: loss 5.8058, time 2889.35ms
iter 585010: loss 5.9119, time 124.08ms
iter 585020: loss 5.8813, time 121.49ms
iter 585030: loss 5.7337, time 121.35ms
iter 585040: loss 5.5755, time 121.65ms
iter 585050: loss 5.9974, time 122.40ms
iter 585060: loss 5.7629, time 121.69ms
iter 585070: loss 6.1487, time 121.38ms
iter 585080: loss 5.7276, time 121.50ms
iter 585090: loss 5.1677, time 122.65ms
iter 585100: loss 6.2847, time 120.85ms
iter 585110: loss 6.8462, time 121.28ms
iter 585120: loss 6.4054, time 121.33ms
iter 585130: loss 6.3983, time 122.04ms
iter 585140: loss 5.9087, time 121.07ms
iter 585150: loss 5.1969, time 121.33ms
iter 585160: loss 6.3524, time 121.05ms
iter 585170: loss 6.2658, time 122.45ms
iter 585180: loss 5.9857, time 121.62ms
iter 585190: loss 6.4491, time 122.23ms
iter 585200: loss 6.5684, time 125.15ms
iter 585210: loss 5.9849, time 121.71ms
iter 585220: loss 5.9314, time 121.59ms
iter 585230: loss 6.0529, time 121.20ms
iter 585240: loss 5.8963, time 121.27ms
step 585250: train loss 5.5309, val loss 5.5766
saving checkpoint to out-shakespeare-char
iter 585250: loss 6.2735, time 2884.91ms
iter 585260: loss 5.4967, time 121.64ms
iter 585270: loss 5.6337, time 123.99ms
iter 585280: loss 5.9195, time 121.15ms
iter 585290: loss 5.8368, time 121.31ms
iter 585300: loss 5.2456, time 121.67ms
iter 585310: loss 5.7892, time 121.27ms
iter 585320: loss 6.2291, time 121.45ms
iter 585330: loss 5.6464, time 121.28ms
iter 585340: loss 6.2053, time 122.38ms
iter 585350: loss 5.4475, time 122.19ms
iter 585360: loss 5.8232, time 121.26ms
iter 585370: loss 6.4046, time 121.44ms
iter 585380: loss 5.3151, time 121.15ms
iter 585390: loss 6.2107, time 121.19ms
iter 585400: loss 5.6465, time 121.02ms
iter 585410: loss 5.9454, time 122.26ms
iter 585420: loss 6.2487, time 121.13ms
iter 585430: loss 6.5876, time 121.21ms
iter 585440: loss 6.3789, time 122.35ms
iter 585450: loss 6.0221, time 121.20ms
iter 585460: loss 6.1764, time 121.47ms
iter 585470: loss 5.9459, time 124.44ms
iter 585480: loss 5.6979, time 121.71ms
iter 585490: loss 5.9176, time 121.14ms
step 585500: train loss 5.5308, val loss 5.5465
saving checkpoint to out-shakespeare-char
iter 585500: loss 5.7951, time 2902.81ms
iter 585510: loss 6.0192, time 125.62ms
iter 585520: loss 5.7530, time 125.86ms
iter 585530: loss 5.8796, time 125.56ms
iter 585540: loss 6.2343, time 125.51ms
iter 585550: loss 5.6025, time 126.12ms
iter 585560: loss 5.8219, time 126.03ms
iter 585570: loss 5.9278, time 125.00ms
iter 585580: loss 6.1300, time 124.76ms
iter 585590: loss 5.9632, time 124.85ms
iter 585600: loss 5.6979, time 124.54ms
iter 585610: loss 6.1505, time 124.61ms
iter 585620: loss 5.7850, time 125.19ms
iter 585630: loss 6.3607, time 125.72ms
iter 585640: loss 6.1346, time 124.94ms
iter 585650: loss 6.5930, time 124.65ms
iter 585660: loss 5.3718, time 124.89ms
iter 585670: loss 6.3736, time 125.10ms
iter 585680: loss 5.6547, time 124.90ms
iter 585690: loss 6.4086, time 125.23ms
iter 585700: loss 6.7700, time 125.38ms
iter 585710: loss 5.9480, time 125.33ms
iter 585720: loss 6.0585, time 125.43ms
iter 585730: loss 4.8720, time 125.16ms
iter 585740: loss 5.7442, time 125.20ms
step 585750: train loss 5.5166, val loss 5.5467
saving checkpoint to out-shakespeare-char
iter 585750: loss 5.4888, time 2901.81ms
iter 585760: loss 5.8599, time 125.47ms
iter 585770: loss 5.9386, time 125.32ms
iter 585780: loss 6.3336, time 125.62ms
iter 585790: loss 6.4101, time 125.64ms
iter 585800: loss 6.4610, time 125.92ms
iter 585810: loss 5.8101, time 125.04ms
iter 585820: loss 5.5641, time 126.27ms
iter 585830: loss 6.1050, time 125.24ms
iter 585840: loss 6.1346, time 125.05ms
iter 585850: loss 5.8473, time 125.15ms
iter 585860: loss 5.6551, time 125.61ms
iter 585870: loss 6.4577, time 125.18ms
iter 585880: loss 6.2250, time 125.20ms
iter 585890: loss 5.2952, time 124.23ms
iter 585900: loss 6.6574, time 123.98ms
iter 585910: loss 6.0144, time 124.61ms
iter 585920: loss 6.2649, time 124.14ms
iter 585930: loss 5.2478, time 125.21ms
iter 585940: loss 6.2648, time 125.38ms
iter 585950: loss 5.7735, time 124.01ms
iter 585960: loss 5.9887, time 125.87ms
iter 585970: loss 6.1684, time 125.38ms
iter 585980: loss 6.3979, time 125.39ms
iter 585990: loss 6.3468, time 125.39ms
step 586000: train loss 5.5149, val loss 5.5231
saving checkpoint to out-shakespeare-char
iter 586000: loss 6.0440, time 2880.33ms
iter 586010: loss 5.7601, time 125.61ms
iter 586020: loss 6.0594, time 125.82ms
iter 586030: loss 6.2773, time 125.45ms
iter 586040: loss 5.9584, time 125.59ms
iter 586050: loss 5.8752, time 125.30ms
iter 586060: loss 5.8149, time 125.68ms
iter 586070: loss 5.9214, time 125.57ms
iter 586080: loss 5.7999, time 125.80ms
iter 586090: loss 5.1785, time 125.65ms
iter 586100: loss 6.3095, time 126.01ms
iter 586110: loss 5.6739, time 125.96ms
iter 586120: loss 6.5118, time 125.88ms
iter 586130: loss 5.8747, time 125.63ms
iter 586140: loss 6.1737, time 125.73ms
iter 586150: loss 5.7472, time 125.13ms
iter 586160: loss 5.5580, time 124.94ms
iter 586170: loss 6.0487, time 125.64ms
iter 586180: loss 5.8351, time 125.39ms
iter 586190: loss 6.5852, time 125.27ms
iter 586200: loss 5.3057, time 126.03ms
iter 586210: loss 6.0153, time 125.65ms
iter 586220: loss 5.8634, time 125.70ms
iter 586230: loss 5.6907, time 124.57ms
iter 586240: loss 5.2017, time 125.53ms
step 586250: train loss 5.5527, val loss 5.5414
saving checkpoint to out-shakespeare-char
iter 586250: loss 5.7489, time 2883.81ms
iter 586260: loss 5.5376, time 125.59ms
iter 586270: loss 5.6867, time 125.45ms
iter 586280: loss 6.0628, time 125.48ms
iter 586290: loss 6.0681, time 125.75ms
iter 586300: loss 5.6433, time 125.19ms
iter 586310: loss 5.3195, time 129.15ms
iter 586320: loss 5.6917, time 127.31ms
iter 586330: loss 6.0609, time 126.34ms
iter 586340: loss 6.1902, time 126.47ms
iter 586350: loss 6.2920, time 125.63ms
iter 586360: loss 5.6237, time 125.73ms
iter 586370: loss 6.4129, time 125.45ms
iter 586380: loss 6.2037, time 125.72ms
iter 586390: loss 6.2842, time 126.08ms
iter 586400: loss 6.4989, time 125.32ms
iter 586410: loss 5.9589, time 125.88ms
iter 586420: loss 5.6166, time 125.76ms
iter 586430: loss 5.5035, time 126.02ms
iter 586440: loss 6.0695, time 125.60ms
iter 586450: loss 6.6833, time 124.48ms
iter 586460: loss 5.1720, time 125.76ms
iter 586470: loss 5.6188, time 125.73ms
iter 586480: loss 5.8865, time 125.88ms
iter 586490: loss 6.0420, time 125.58ms
step 586500: train loss 5.4787, val loss 5.5456
saving checkpoint to out-shakespeare-char
iter 586500: loss 5.6947, time 2899.71ms
iter 586510: loss 5.8485, time 125.27ms
iter 586520: loss 6.3270, time 126.03ms
iter 586530: loss 5.8843, time 124.67ms
iter 586540: loss 6.2879, time 125.74ms
iter 586550: loss 5.9178, time 124.99ms
iter 586560: loss 5.3981, time 124.98ms
iter 586570: loss 6.0379, time 125.53ms
iter 586580: loss 6.4334, time 125.50ms
iter 586590: loss 6.3458, time 125.95ms
iter 586600: loss 5.7171, time 126.23ms
iter 586610: loss 5.0138, time 127.20ms
iter 586620: loss 6.1221, time 125.84ms
iter 586630: loss 6.1550, time 125.21ms
iter 586640: loss 5.3916, time 126.25ms
iter 586650: loss 5.9966, time 125.42ms
iter 586660: loss 6.3614, time 124.65ms
iter 586670: loss 5.9796, time 125.51ms
iter 586680: loss 5.6606, time 125.26ms
iter 586690: loss 5.9262, time 124.86ms
iter 586700: loss 6.1401, time 125.44ms
iter 586710: loss 6.2615, time 125.14ms
iter 586720: loss 5.2829, time 125.41ms
iter 586730: loss 5.9781, time 125.13ms
iter 586740: loss 6.7375, time 125.28ms
step 586750: train loss 5.5106, val loss 5.5347
saving checkpoint to out-shakespeare-char
iter 586750: loss 5.8049, time 2884.42ms
iter 586760: loss 5.9838, time 125.37ms
iter 586770: loss 5.5928, time 125.46ms
iter 586780: loss 5.9981, time 125.38ms
iter 586790: loss 5.2820, time 125.02ms
iter 586800: loss 6.6025, time 125.89ms
iter 586810: loss 5.1191, time 125.49ms
iter 586820: loss 5.9742, time 125.57ms
iter 586830: loss 6.3014, time 125.37ms
iter 586840: loss 5.3060, time 125.65ms
iter 586850: loss 5.4992, time 125.47ms
iter 586860: loss 5.9955, time 125.62ms
iter 586870: loss 6.0706, time 125.82ms
iter 586880: loss 6.2412, time 125.83ms
iter 586890: loss 6.0305, time 125.39ms
iter 586900: loss 6.1741, time 125.74ms
iter 586910: loss 6.5764, time 125.66ms
iter 586920: loss 6.2271, time 125.68ms
iter 586930: loss 5.9970, time 125.41ms
iter 586940: loss 6.2439, time 123.19ms
iter 586950: loss 4.7542, time 121.52ms
iter 586960: loss 5.8444, time 121.45ms
iter 586970: loss 6.2809, time 124.08ms
iter 586980: loss 6.6303, time 121.54ms
iter 586990: loss 5.2502, time 121.37ms
step 587000: train loss 5.5704, val loss 5.5162
saving checkpoint to out-shakespeare-char
iter 587000: loss 5.4143, time 2896.75ms
iter 587010: loss 5.4401, time 122.68ms
iter 587020: loss 5.7233, time 121.37ms
iter 587030: loss 6.3952, time 121.75ms
iter 587040: loss 5.5938, time 124.20ms
iter 587050: loss 6.3802, time 121.86ms
iter 587060: loss 6.5718, time 121.46ms
iter 587070: loss 5.9241, time 122.22ms
iter 587080: loss 5.6828, time 121.35ms
iter 587090: loss 6.2267, time 121.53ms
iter 587100: loss 5.7458, time 121.53ms
iter 587110: loss 5.6082, time 122.52ms
iter 587120: loss 5.3956, time 121.34ms
iter 587130: loss 6.0661, time 121.58ms
iter 587140: loss 5.9227, time 125.53ms
iter 587150: loss 6.7218, time 121.32ms
iter 587160: loss 5.5460, time 121.59ms
iter 587170: loss 6.1562, time 121.42ms
iter 587180: loss 6.4745, time 120.75ms
iter 587190: loss 5.8922, time 121.58ms
iter 587200: loss 5.8788, time 121.50ms
iter 587210: loss 6.7105, time 122.47ms
iter 587220: loss 5.6727, time 121.40ms
iter 587230: loss 5.9396, time 121.55ms
iter 587240: loss 6.1401, time 123.99ms
step 587250: train loss 5.4827, val loss 5.5575
saving checkpoint to out-shakespeare-char
iter 587250: loss 5.8792, time 2891.83ms
iter 587260: loss 5.7379, time 121.53ms
iter 587270: loss 6.0374, time 120.97ms
iter 587280: loss 5.9504, time 124.87ms
iter 587290: loss 5.3284, time 121.10ms
iter 587300: loss 6.3312, time 121.38ms
iter 587310: loss 6.0398, time 121.38ms
iter 587320: loss 5.6192, time 121.47ms
iter 587330: loss 6.0540, time 121.55ms
iter 587340: loss 6.3454, time 121.33ms
iter 587350: loss 5.7762, time 122.94ms
iter 587360: loss 6.3252, time 121.53ms
iter 587370: loss 5.4385, time 121.58ms
iter 587380: loss 5.9898, time 122.52ms
iter 587390: loss 5.8933, time 126.50ms
iter 587400: loss 4.9023, time 125.47ms
iter 587410: loss 6.4948, time 125.54ms
iter 587420: loss 5.5582, time 125.51ms
iter 587430: loss 6.3933, time 125.87ms
iter 587440: loss 5.5817, time 125.49ms
iter 587450: loss 5.8129, time 125.40ms
iter 587460: loss 6.1526, time 125.67ms
iter 587470: loss 5.5855, time 125.07ms
iter 587480: loss 5.5236, time 125.89ms
iter 587490: loss 5.8594, time 125.59ms
step 587500: train loss 5.5273, val loss 5.5203
saving checkpoint to out-shakespeare-char
iter 587500: loss 6.5662, time 2886.37ms
iter 587510: loss 5.7000, time 124.94ms
iter 587520: loss 5.9229, time 124.32ms
iter 587530: loss 5.5507, time 125.23ms
iter 587540: loss 5.5942, time 125.52ms
iter 587550: loss 5.5977, time 125.24ms
iter 587560: loss 6.1061, time 124.70ms
iter 587570: loss 6.3523, time 124.00ms
iter 587580: loss 5.8725, time 125.14ms
iter 587590: loss 5.8554, time 124.23ms
iter 587600: loss 6.0094, time 125.06ms
iter 587610: loss 6.2906, time 123.29ms
iter 587620: loss 6.2399, time 124.97ms
iter 587630: loss 6.3547, time 124.74ms
iter 587640: loss 5.9329, time 124.83ms
iter 587650: loss 5.9340, time 124.62ms
iter 587660: loss 5.4495, time 124.84ms
iter 587670: loss 6.2202, time 124.33ms
iter 587680: loss 6.8047, time 124.91ms
iter 587690: loss 5.9825, time 123.50ms
iter 587700: loss 5.6106, time 125.13ms
iter 587710: loss 5.6353, time 124.63ms
iter 587720: loss 6.0948, time 124.83ms
iter 587730: loss 5.7373, time 124.72ms
iter 587740: loss 6.5787, time 124.99ms
step 587750: train loss 5.4907, val loss 5.5227
saving checkpoint to out-shakespeare-char
iter 587750: loss 5.9847, time 2906.02ms
iter 587760: loss 6.2285, time 125.44ms
iter 587770: loss 6.5312, time 127.55ms
iter 587780: loss 6.3338, time 125.64ms
iter 587790: loss 5.7208, time 128.48ms
iter 587800: loss 5.8439, time 126.02ms
iter 587810: loss 6.2961, time 128.10ms
iter 587820: loss 6.5794, time 125.94ms
iter 587830: loss 6.4686, time 128.29ms
iter 587840: loss 5.5781, time 126.16ms
iter 587850: loss 4.8407, time 128.24ms
iter 587860: loss 5.8347, time 125.49ms
iter 587870: loss 6.4254, time 128.44ms
iter 587880: loss 6.3780, time 125.65ms
iter 587890: loss 6.9327, time 128.12ms
iter 587900: loss 6.0259, time 124.96ms
iter 587910: loss 5.2570, time 127.38ms
iter 587920: loss 6.0148, time 125.79ms
iter 587930: loss 5.5609, time 128.60ms
iter 587940: loss 6.6008, time 125.49ms
iter 587950: loss 6.0590, time 128.31ms
iter 587960: loss 6.1010, time 125.59ms
iter 587970: loss 6.2004, time 128.35ms
iter 587980: loss 6.2423, time 125.80ms
iter 587990: loss 6.3832, time 128.44ms
step 588000: train loss 5.5348, val loss 5.5408
saving checkpoint to out-shakespeare-char
iter 588000: loss 5.4713, time 2890.44ms
iter 588010: loss 5.3309, time 125.64ms
iter 588020: loss 5.9088, time 125.63ms
iter 588030: loss 5.9192, time 125.69ms
iter 588040: loss 5.3464, time 125.76ms
iter 588050: loss 5.7879, time 128.26ms
iter 588060: loss 5.6771, time 126.46ms
iter 588070: loss 5.7137, time 128.27ms
iter 588080: loss 5.8423, time 125.69ms
iter 588090: loss 6.0566, time 128.50ms
iter 588100: loss 5.7420, time 125.66ms
iter 588110: loss 6.2444, time 128.38ms
iter 588120: loss 6.0963, time 125.66ms
iter 588130: loss 5.5114, time 128.33ms
iter 588140: loss 6.2910, time 126.04ms
iter 588150: loss 6.3827, time 128.17ms
iter 588160: loss 6.3653, time 125.49ms
iter 588170: loss 6.0624, time 128.19ms
iter 588180: loss 6.1155, time 125.85ms
iter 588190: loss 6.5917, time 125.58ms
iter 588200: loss 6.0097, time 125.81ms
iter 588210: loss 5.6341, time 125.92ms
iter 588220: loss 5.2021, time 125.51ms
iter 588230: loss 5.6756, time 123.88ms
iter 588240: loss 5.1760, time 125.64ms
step 588250: train loss 5.4650, val loss 5.5598
saving checkpoint to out-shakespeare-char
iter 588250: loss 5.5716, time 2878.89ms
iter 588260: loss 5.8569, time 125.48ms
iter 588270: loss 6.1525, time 126.21ms
iter 588280: loss 5.5626, time 125.94ms
iter 588290: loss 6.0152, time 126.02ms
iter 588300: loss 6.0316, time 125.53ms
iter 588310: loss 6.2371, time 125.82ms
iter 588320: loss 6.4560, time 125.61ms
iter 588330: loss 5.8573, time 126.08ms
iter 588340: loss 5.4722, time 125.67ms
iter 588350: loss 6.5282, time 126.08ms
iter 588360: loss 5.7264, time 126.00ms
iter 588370: loss 6.4444, time 125.59ms
iter 588380: loss 5.8887, time 125.59ms
iter 588390: loss 5.4200, time 125.84ms
iter 588400: loss 6.2575, time 125.72ms
iter 588410: loss 6.0120, time 125.85ms
iter 588420: loss 6.7367, time 125.73ms
iter 588430: loss 5.8382, time 125.55ms
iter 588440: loss 5.4353, time 125.86ms
iter 588450: loss 5.7752, time 126.23ms
iter 588460: loss 5.9979, time 125.85ms
iter 588470: loss 6.2090, time 125.51ms
iter 588480: loss 5.7770, time 124.93ms
iter 588490: loss 6.3962, time 124.86ms
step 588500: train loss 5.5405, val loss 5.5585
saving checkpoint to out-shakespeare-char
iter 588500: loss 5.7651, time 2890.15ms
iter 588510: loss 6.3438, time 125.92ms
iter 588520: loss 5.6598, time 125.70ms
iter 588530: loss 5.6893, time 125.80ms
iter 588540: loss 5.9983, time 125.68ms
iter 588550: loss 5.9731, time 125.97ms
iter 588560: loss 6.1251, time 125.71ms
iter 588570: loss 5.3745, time 125.66ms
iter 588580: loss 5.8113, time 125.58ms
iter 588590: loss 5.6766, time 125.95ms
iter 588600: loss 5.9494, time 125.95ms
iter 588610: loss 5.9785, time 125.71ms
iter 588620: loss 5.7111, time 125.74ms
iter 588630: loss 6.0736, time 125.53ms
iter 588640: loss 5.9579, time 125.57ms
iter 588650: loss 5.6406, time 125.82ms
iter 588660: loss 6.2329, time 125.80ms
iter 588670: loss 5.7323, time 125.74ms
iter 588680: loss 5.5602, time 125.99ms
iter 588690: loss 5.9369, time 124.98ms
iter 588700: loss 6.0507, time 125.82ms
iter 588710: loss 6.4072, time 125.94ms
iter 588720: loss 6.0343, time 125.74ms
iter 588730: loss 5.6343, time 125.71ms
iter 588740: loss 5.7072, time 128.50ms
step 588750: train loss 5.5522, val loss 5.5372
saving checkpoint to out-shakespeare-char
iter 588750: loss 5.6806, time 2889.12ms
iter 588760: loss 5.4768, time 124.49ms
iter 588770: loss 6.2571, time 125.64ms
iter 588780: loss 6.3740, time 125.91ms
iter 588790: loss 5.7793, time 125.82ms
iter 588800: loss 6.4553, time 125.35ms
iter 588810: loss 5.7559, time 125.67ms
iter 588820: loss 5.0775, time 125.50ms
iter 588830: loss 5.6159, time 125.77ms
iter 588840: loss 6.0683, time 125.37ms
iter 588850: loss 5.3378, time 125.99ms
iter 588860: loss 5.7451, time 125.06ms
iter 588870: loss 5.6195, time 125.63ms
iter 588880: loss 6.4753, time 125.61ms
iter 588890: loss 5.6615, time 125.30ms
iter 588900: loss 6.0188, time 125.32ms
iter 588910: loss 5.9722, time 126.20ms
iter 588920: loss 5.3823, time 125.29ms
iter 588930: loss 6.4142, time 125.48ms
iter 588940: loss 5.6690, time 125.19ms
iter 588950: loss 6.0056, time 125.67ms
iter 588960: loss 5.9244, time 125.79ms
iter 588970: loss 5.8857, time 125.24ms
iter 588980: loss 5.7253, time 125.57ms
iter 588990: loss 6.0265, time 125.39ms
step 589000: train loss 5.5417, val loss 5.5278
saving checkpoint to out-shakespeare-char
iter 589000: loss 5.7793, time 2896.91ms
iter 589010: loss 5.6995, time 125.78ms
iter 589020: loss 5.7399, time 125.61ms
iter 589030: loss 5.8289, time 125.63ms
iter 589040: loss 5.9331, time 125.36ms
iter 589050: loss 6.7804, time 125.67ms
iter 589060: loss 6.3238, time 125.37ms
iter 589070: loss 5.7128, time 125.26ms
iter 589080: loss 6.3536, time 124.32ms
iter 589090: loss 6.2435, time 126.23ms
iter 589100: loss 5.5644, time 125.82ms
iter 589110: loss 6.3654, time 125.06ms
iter 589120: loss 6.0133, time 125.79ms
iter 589130: loss 4.9333, time 125.55ms
iter 589140: loss 5.6811, time 125.95ms
iter 589150: loss 6.0286, time 126.03ms
iter 589160: loss 6.3484, time 125.65ms
iter 589170: loss 7.1304, time 125.38ms
iter 589180: loss 6.1871, time 125.61ms
iter 589190: loss 6.2024, time 125.02ms
iter 589200: loss 6.1113, time 125.69ms
iter 589210: loss 5.4304, time 125.06ms
iter 589220: loss 5.6510, time 125.31ms
iter 589230: loss 6.6700, time 125.31ms
iter 589240: loss 6.0606, time 126.17ms
step 589250: train loss 5.5348, val loss 5.5520
saving checkpoint to out-shakespeare-char
iter 589250: loss 5.6128, time 2892.65ms
iter 589260: loss 6.0431, time 125.69ms
iter 589270: loss 5.8421, time 125.33ms
iter 589280: loss 5.4691, time 125.72ms
iter 589290: loss 4.9890, time 121.50ms
iter 589300: loss 5.3690, time 121.97ms
iter 589310: loss 6.6377, time 121.12ms
iter 589320: loss 6.6411, time 121.82ms
iter 589330: loss 5.7795, time 121.55ms
iter 589340: loss 6.4915, time 121.68ms
iter 589350: loss 5.9728, time 122.06ms
iter 589360: loss 6.5862, time 121.66ms
iter 589370: loss 5.5468, time 121.62ms
iter 589380: loss 5.6763, time 121.06ms
iter 589390: loss 5.6311, time 122.02ms
iter 589400: loss 5.1057, time 121.81ms
iter 589410: loss 5.8197, time 121.68ms
iter 589420: loss 5.9332, time 122.33ms
iter 589430: loss 6.1021, time 121.82ms
iter 589440: loss 6.4008, time 122.27ms
iter 589450: loss 5.7342, time 123.13ms
iter 589460: loss 5.7198, time 121.74ms
iter 589470: loss 6.4082, time 121.73ms
iter 589480: loss 6.0521, time 124.19ms
iter 589490: loss 6.4997, time 121.74ms
step 589500: train loss 5.5487, val loss 5.5001
saving checkpoint to out-shakespeare-char
iter 589500: loss 6.5825, time 2898.97ms
iter 589510: loss 6.3637, time 121.09ms
iter 589520: loss 5.5988, time 124.31ms
iter 589530: loss 6.0296, time 121.94ms
iter 589540: loss 5.6292, time 121.66ms
iter 589550: loss 5.1816, time 121.64ms
iter 589560: loss 5.5785, time 122.21ms
iter 589570: loss 5.3100, time 121.69ms
iter 589580: loss 5.9519, time 121.78ms
iter 589590: loss 6.2056, time 123.17ms
iter 589600: loss 6.1570, time 121.79ms
iter 589610: loss 5.8309, time 122.73ms
iter 589620: loss 5.4899, time 124.40ms
iter 589630: loss 5.2168, time 121.91ms
iter 589640: loss 5.1821, time 121.64ms
iter 589650: loss 5.9855, time 121.67ms
iter 589660: loss 7.0607, time 121.95ms
iter 589670: loss 6.4687, time 123.07ms
iter 589680: loss 5.5089, time 121.70ms
iter 589690: loss 6.1499, time 122.09ms
iter 589700: loss 5.9995, time 121.78ms
iter 589710: loss 6.3792, time 121.77ms
iter 589720: loss 5.5093, time 124.41ms
iter 589730: loss 6.1977, time 121.83ms
iter 589740: loss 5.6555, time 121.68ms
step 589750: train loss 5.5460, val loss 5.5132
saving checkpoint to out-shakespeare-char
iter 589750: loss 5.7222, time 2904.08ms
iter 589760: loss 5.8592, time 121.99ms
iter 589770: loss 5.6843, time 121.71ms
iter 589780: loss 5.4577, time 121.80ms
iter 589790: loss 5.5929, time 122.59ms
iter 589800: loss 6.0330, time 121.17ms
iter 589810: loss 5.8167, time 121.34ms
iter 589820: loss 5.8080, time 124.35ms
iter 589830: loss 5.9275, time 121.47ms
iter 589840: loss 6.7589, time 121.87ms
iter 589850: loss 5.8024, time 121.60ms
iter 589860: loss 5.9693, time 122.92ms
iter 589870: loss 6.2076, time 121.11ms
iter 589880: loss 6.0405, time 122.67ms
iter 589890: loss 5.5863, time 123.19ms
iter 589900: loss 6.4042, time 121.41ms
iter 589910: loss 5.8636, time 121.83ms
iter 589920: loss 5.4149, time 124.16ms
iter 589930: loss 5.8146, time 121.82ms
iter 589940: loss 5.5870, time 121.93ms
iter 589950: loss 5.8059, time 121.56ms
iter 589960: loss 6.7554, time 120.78ms
iter 589970: loss 5.8870, time 121.50ms
iter 589980: loss 6.5162, time 121.67ms
iter 589990: loss 5.9399, time 122.83ms
step 590000: train loss 5.5236, val loss 5.5746
saving checkpoint to out-shakespeare-char
iter 590000: loss 5.0410, time 2893.21ms
iter 590010: loss 5.3662, time 122.16ms
iter 590020: loss 5.5600, time 122.69ms
iter 590030: loss 5.8864, time 121.68ms
iter 590040: loss 5.7310, time 121.42ms
iter 590050: loss 5.8531, time 121.04ms
iter 590060: loss 5.2146, time 122.99ms
iter 590070: loss 5.6865, time 121.58ms
iter 590080: loss 6.0078, time 123.36ms
iter 590090: loss 5.9331, time 121.39ms
iter 590100: loss 5.9056, time 121.70ms
iter 590110: loss 7.1165, time 121.52ms
iter 590120: loss 5.0064, time 121.40ms
iter 590130: loss 5.9088, time 122.65ms
iter 590140: loss 5.9943, time 121.34ms
iter 590150: loss 5.8678, time 121.62ms
iter 590160: loss 6.6313, time 124.17ms
iter 590170: loss 5.7825, time 121.51ms
iter 590180: loss 5.5169, time 121.82ms
iter 590190: loss 5.9140, time 121.50ms
iter 590200: loss 6.3493, time 121.23ms
iter 590210: loss 6.1592, time 121.86ms
iter 590220: loss 6.0484, time 121.32ms
iter 590230: loss 6.1660, time 122.51ms
iter 590240: loss 6.1014, time 121.45ms
step 590250: train loss 5.4766, val loss 5.5245
saving checkpoint to out-shakespeare-char
iter 590250: loss 6.0499, time 2909.52ms
iter 590260: loss 5.9607, time 121.35ms
iter 590270: loss 6.4245, time 121.54ms
iter 590280: loss 6.1786, time 121.25ms
iter 590290: loss 6.5336, time 121.60ms
iter 590300: loss 5.6104, time 122.56ms
iter 590310: loss 5.9490, time 121.79ms
iter 590320: loss 6.0456, time 121.30ms
iter 590330: loss 5.8590, time 122.52ms
iter 590340: loss 5.5250, time 121.43ms
iter 590350: loss 6.0982, time 121.48ms
iter 590360: loss 6.2716, time 124.04ms
iter 590370: loss 6.0175, time 121.46ms
iter 590380: loss 6.1169, time 121.20ms
iter 590390: loss 5.4889, time 121.26ms
iter 590400: loss 6.4171, time 121.59ms
iter 590410: loss 5.7126, time 121.26ms
iter 590420: loss 6.6297, time 121.22ms
iter 590430: loss 6.2852, time 122.53ms
iter 590440: loss 6.2845, time 121.36ms
iter 590450: loss 5.6668, time 121.24ms
iter 590460: loss 5.4449, time 122.65ms
iter 590470: loss 5.5738, time 121.30ms
iter 590480: loss 5.7419, time 121.11ms
iter 590490: loss 5.9114, time 124.33ms
step 590500: train loss 5.4423, val loss 5.4992
saving checkpoint to out-shakespeare-char
iter 590500: loss 5.3373, time 2908.62ms
iter 590510: loss 6.4302, time 121.42ms
iter 590520: loss 6.3114, time 121.40ms
iter 590530: loss 5.9972, time 124.59ms
iter 590540: loss 5.9069, time 121.27ms
iter 590550: loss 5.3911, time 121.16ms
iter 590560: loss 5.9168, time 121.78ms
iter 590570: loss 5.8595, time 121.62ms
iter 590580: loss 5.9660, time 121.18ms
iter 590590: loss 6.3740, time 121.60ms
iter 590600: loss 6.1989, time 122.55ms
iter 590610: loss 5.9689, time 121.31ms
iter 590620: loss 5.8875, time 121.28ms
iter 590630: loss 6.9087, time 122.50ms
iter 590640: loss 5.7827, time 121.26ms
iter 590650: loss 5.8710, time 121.19ms
iter 590660: loss 5.9868, time 124.06ms
iter 590670: loss 5.9738, time 121.37ms
iter 590680: loss 5.9544, time 121.30ms
iter 590690: loss 6.4766, time 121.24ms
iter 590700: loss 6.1565, time 121.48ms
iter 590710: loss 6.0480, time 121.11ms
iter 590720: loss 5.7168, time 121.27ms
iter 590730: loss 6.7254, time 122.87ms
iter 590740: loss 4.9816, time 121.48ms
step 590750: train loss 5.4683, val loss 5.4991
saving checkpoint to out-shakespeare-char
iter 590750: loss 5.9238, time 2895.86ms
iter 590760: loss 5.5770, time 124.21ms
iter 590770: loss 6.0598, time 124.58ms
iter 590780: loss 5.9406, time 124.35ms
iter 590790: loss 5.6932, time 125.12ms
iter 590800: loss 5.7076, time 124.86ms
iter 590810: loss 5.5889, time 125.14ms
iter 590820: loss 6.1347, time 124.52ms
iter 590830: loss 6.6727, time 125.11ms
iter 590840: loss 5.6421, time 124.37ms
iter 590850: loss 5.6312, time 125.17ms
iter 590860: loss 5.9256, time 124.52ms
iter 590870: loss 6.0319, time 125.86ms
iter 590880: loss 6.0791, time 125.68ms
iter 590890: loss 6.4043, time 125.86ms
iter 590900: loss 5.3056, time 125.01ms
iter 590910: loss 6.4442, time 125.62ms
iter 590920: loss 5.8337, time 125.02ms
iter 590930: loss 5.9758, time 125.78ms
iter 590940: loss 5.6855, time 126.01ms
iter 590950: loss 6.3097, time 125.85ms
iter 590960: loss 6.2465, time 125.00ms
iter 590970: loss 6.4394, time 125.84ms
iter 590980: loss 5.5304, time 125.52ms
iter 590990: loss 5.8409, time 125.15ms
step 591000: train loss 5.4937, val loss 5.4898
saving checkpoint to out-shakespeare-char
iter 591000: loss 5.9710, time 2903.50ms
iter 591010: loss 5.9455, time 125.58ms
iter 591020: loss 6.4990, time 128.63ms
iter 591030: loss 5.8291, time 125.77ms
iter 591040: loss 5.7644, time 127.26ms
iter 591050: loss 5.9732, time 124.74ms
iter 591060: loss 6.3469, time 128.20ms
iter 591070: loss 5.9798, time 125.51ms
iter 591080: loss 6.2204, time 127.92ms
iter 591090: loss 5.9952, time 124.97ms
iter 591100: loss 7.0856, time 125.90ms
iter 591110: loss 6.2472, time 124.34ms
iter 591120: loss 6.2264, time 126.94ms
iter 591130: loss 5.6954, time 124.44ms
iter 591140: loss 5.2776, time 126.11ms
iter 591150: loss 5.7746, time 124.39ms
iter 591160: loss 5.3443, time 126.87ms
iter 591170: loss 6.8167, time 124.58ms
iter 591180: loss 6.2354, time 127.30ms
iter 591190: loss 5.3797, time 125.28ms
iter 591200: loss 6.0490, time 127.41ms
iter 591210: loss 5.4080, time 124.78ms
iter 591220: loss 5.7057, time 128.04ms
iter 591230: loss 6.1740, time 125.56ms
iter 591240: loss 5.5690, time 128.07ms
step 591250: train loss 5.5159, val loss 5.4936
saving checkpoint to out-shakespeare-char
iter 591250: loss 5.3443, time 2892.34ms
iter 591260: loss 6.0555, time 127.48ms
iter 591270: loss 6.0608, time 123.91ms
iter 591280: loss 6.4613, time 127.39ms
iter 591290: loss 5.7467, time 125.09ms
iter 591300: loss 5.6778, time 127.26ms
iter 591310: loss 6.4384, time 124.39ms
iter 591320: loss 6.2999, time 127.37ms
iter 591330: loss 5.9628, time 124.91ms
iter 591340: loss 6.2356, time 127.37ms
iter 591350: loss 6.2126, time 124.03ms
iter 591360: loss 5.2561, time 127.59ms
iter 591370: loss 5.2879, time 124.84ms
iter 591380: loss 5.1745, time 126.86ms
iter 591390: loss 6.4390, time 124.23ms
iter 591400: loss 5.9645, time 126.56ms
iter 591410: loss 5.4083, time 124.84ms
iter 591420: loss 6.0229, time 126.97ms
iter 591430: loss 5.2423, time 124.37ms
iter 591440: loss 6.0880, time 124.13ms
iter 591450: loss 5.8464, time 124.88ms
iter 591460: loss 6.1484, time 124.76ms
iter 591470: loss 6.0801, time 124.19ms
iter 591480: loss 6.4437, time 124.11ms
iter 591490: loss 5.5981, time 124.78ms
step 591500: train loss 5.5602, val loss 5.5387
saving checkpoint to out-shakespeare-char
iter 591500: loss 6.0958, time 2916.64ms
iter 591510: loss 5.3640, time 123.92ms
iter 591520: loss 6.5136, time 124.78ms
iter 591530: loss 6.2940, time 124.78ms
iter 591540: loss 5.7768, time 124.96ms
iter 591550: loss 6.1548, time 124.62ms
iter 591560: loss 6.0154, time 125.46ms
iter 591570: loss 6.3410, time 124.50ms
iter 591580: loss 6.0212, time 124.74ms
iter 591590: loss 5.6735, time 124.91ms
iter 591600: loss 6.2171, time 124.04ms
iter 591610: loss 6.3360, time 125.35ms
iter 591620: loss 6.0842, time 126.15ms
iter 591630: loss 6.4124, time 124.64ms
iter 591640: loss 5.5349, time 125.58ms
iter 591650: loss 5.6114, time 125.25ms
iter 591660: loss 4.9041, time 124.35ms
iter 591670: loss 5.8620, time 124.24ms
iter 591680: loss 5.1102, time 124.44ms
iter 591690: loss 6.4855, time 125.37ms
iter 591700: loss 6.3704, time 124.50ms
iter 591710: loss 5.6296, time 125.36ms
iter 591720: loss 5.5175, time 125.64ms
iter 591730: loss 6.5139, time 125.42ms
iter 591740: loss 5.6424, time 124.87ms
step 591750: train loss 5.4960, val loss 5.5392
saving checkpoint to out-shakespeare-char
iter 591750: loss 6.2797, time 2912.67ms
iter 591760: loss 5.9996, time 124.81ms
iter 591770: loss 6.0324, time 124.61ms
iter 591780: loss 5.5266, time 125.43ms
iter 591790: loss 6.0041, time 124.47ms
iter 591800: loss 5.4305, time 125.55ms
iter 591810: loss 5.8926, time 124.29ms
iter 591820: loss 5.9169, time 126.47ms
iter 591830: loss 6.3349, time 125.55ms
iter 591840: loss 5.7578, time 126.71ms
iter 591850: loss 5.7586, time 125.47ms
iter 591860: loss 5.3171, time 125.47ms
iter 591870: loss 5.5975, time 124.86ms
iter 591880: loss 5.0826, time 125.74ms
iter 591890: loss 6.0813, time 125.53ms
iter 591900: loss 6.1218, time 125.38ms
iter 591910: loss 5.8092, time 124.82ms
iter 591920: loss 5.9607, time 125.90ms
iter 591930: loss 6.2154, time 125.84ms
iter 591940: loss 6.6563, time 125.94ms
iter 591950: loss 6.2776, time 124.70ms
iter 591960: loss 6.2611, time 125.62ms
iter 591970: loss 5.9862, time 125.72ms
iter 591980: loss 6.0676, time 125.42ms
iter 591990: loss 6.4143, time 125.50ms
step 592000: train loss 5.5151, val loss 5.5973
saving checkpoint to out-shakespeare-char
iter 592000: loss 6.3495, time 2904.93ms
iter 592010: loss 6.3256, time 125.54ms
iter 592020: loss 5.8371, time 125.46ms
iter 592030: loss 5.6743, time 125.41ms
iter 592040: loss 6.0771, time 125.44ms
iter 592050: loss 5.9170, time 125.61ms
iter 592060: loss 6.0819, time 125.46ms
iter 592070: loss 5.1669, time 125.39ms
iter 592080: loss 5.9588, time 125.40ms
iter 592090: loss 5.5036, time 125.35ms
iter 592100: loss 6.0203, time 125.93ms
iter 592110: loss 6.3422, time 125.84ms
iter 592120: loss 5.7215, time 126.02ms
iter 592130: loss 6.7715, time 125.97ms
iter 592140: loss 6.8500, time 125.55ms
iter 592150: loss 5.7133, time 125.73ms
iter 592160: loss 6.3754, time 124.54ms
iter 592170: loss 5.3149, time 125.72ms
iter 592180: loss 6.9555, time 125.78ms
iter 592190: loss 6.4292, time 125.25ms
iter 592200: loss 6.4347, time 125.00ms
iter 592210: loss 5.9543, time 125.56ms
iter 592220: loss 6.3569, time 125.35ms
iter 592230: loss 6.1402, time 125.37ms
iter 592240: loss 5.7796, time 125.43ms
step 592250: train loss 5.5561, val loss 5.5186
saving checkpoint to out-shakespeare-char
iter 592250: loss 5.7541, time 2902.36ms
iter 592260: loss 6.5098, time 125.69ms
iter 592270: loss 6.2721, time 125.49ms
iter 592280: loss 5.8491, time 125.59ms
iter 592290: loss 5.9579, time 125.79ms
iter 592300: loss 5.8556, time 124.91ms
iter 592310: loss 5.9346, time 125.43ms
iter 592320: loss 5.5779, time 125.46ms
iter 592330: loss 5.6717, time 126.31ms
iter 592340: loss 6.1833, time 125.93ms
iter 592350: loss 5.8861, time 125.47ms
iter 592360: loss 6.6311, time 126.74ms
iter 592370: loss 5.6613, time 125.39ms
iter 592380: loss 5.7456, time 125.43ms
iter 592390: loss 5.6588, time 125.80ms
iter 592400: loss 7.0745, time 125.67ms
iter 592410: loss 5.5392, time 125.81ms
iter 592420: loss 6.2465, time 125.77ms
iter 592430: loss 6.0877, time 125.87ms
iter 592440: loss 5.6327, time 126.80ms
iter 592450: loss 5.8285, time 125.85ms
iter 592460: loss 6.6456, time 125.80ms
iter 592470: loss 5.7679, time 125.92ms
iter 592480: loss 6.0470, time 126.01ms
iter 592490: loss 5.7810, time 126.34ms
step 592500: train loss 5.5361, val loss 5.5564
saving checkpoint to out-shakespeare-char
iter 592500: loss 6.1517, time 2890.68ms
iter 592510: loss 6.0185, time 125.93ms
iter 592520: loss 5.7802, time 125.64ms
iter 592530: loss 6.0824, time 124.66ms
iter 592540: loss 5.6149, time 126.75ms
iter 592550: loss 6.1199, time 125.61ms
iter 592560: loss 5.4613, time 125.54ms
iter 592570: loss 6.3810, time 125.61ms
iter 592580: loss 5.8477, time 125.13ms
iter 592590: loss 5.8394, time 125.89ms
iter 592600: loss 5.9974, time 125.85ms
iter 592610: loss 6.0432, time 126.18ms
iter 592620: loss 5.7843, time 126.09ms
iter 592630: loss 6.2308, time 125.49ms
iter 592640: loss 6.1837, time 125.97ms
iter 592650: loss 6.4484, time 126.22ms
iter 592660: loss 6.0128, time 126.45ms
iter 592670: loss 5.5765, time 125.44ms
iter 592680: loss 5.9092, time 124.88ms
iter 592690: loss 4.7812, time 125.56ms
iter 592700: loss 6.0420, time 124.79ms
iter 592710: loss 5.7048, time 125.27ms
iter 592720: loss 6.4261, time 124.39ms
iter 592730: loss 5.2561, time 125.08ms
iter 592740: loss 5.7619, time 125.60ms
step 592750: train loss 5.5607, val loss 5.5352
saving checkpoint to out-shakespeare-char
iter 592750: loss 5.7876, time 2888.80ms
iter 592760: loss 4.9963, time 125.77ms
iter 592770: loss 5.6434, time 126.04ms
iter 592780: loss 4.9460, time 125.02ms
iter 592790: loss 6.4251, time 124.99ms
iter 592800: loss 6.0815, time 125.05ms
iter 592810: loss 6.3682, time 124.64ms
iter 592820: loss 5.9919, time 124.97ms
iter 592830: loss 5.8723, time 125.15ms
iter 592840: loss 6.1939, time 124.86ms
iter 592850: loss 6.1872, time 124.78ms
iter 592860: loss 5.7482, time 125.36ms
iter 592870: loss 5.9682, time 124.90ms
iter 592880: loss 5.9680, time 125.86ms
iter 592890: loss 5.9980, time 124.73ms
iter 592900: loss 6.5905, time 125.00ms
iter 592910: loss 6.6185, time 124.01ms
iter 592920: loss 5.8756, time 124.90ms
iter 592930: loss 5.6071, time 125.79ms
iter 592940: loss 5.5417, time 125.80ms
iter 592950: loss 5.5644, time 125.54ms
iter 592960: loss 6.0061, time 125.70ms
iter 592970: loss 5.4657, time 125.41ms
iter 592980: loss 6.2095, time 125.78ms
iter 592990: loss 5.9683, time 124.99ms
step 593000: train loss 5.5483, val loss 5.5250
saving checkpoint to out-shakespeare-char
iter 593000: loss 5.9681, time 2888.52ms
iter 593010: loss 5.6503, time 125.12ms
iter 593020: loss 4.9730, time 125.32ms
iter 593030: loss 5.6638, time 124.87ms
iter 593040: loss 5.9728, time 125.23ms
iter 593050: loss 6.1962, time 125.83ms
iter 593060: loss 5.8555, time 125.77ms
iter 593070: loss 5.9613, time 126.00ms
iter 593080: loss 5.0700, time 125.82ms
iter 593090: loss 6.7955, time 126.11ms
iter 593100: loss 5.8493, time 125.98ms
iter 593110: loss 5.8893, time 125.94ms
iter 593120: loss 5.7550, time 126.02ms
iter 593130: loss 6.2205, time 126.05ms
iter 593140: loss 5.1605, time 126.65ms
iter 593150: loss 6.2581, time 124.81ms
iter 593160: loss 6.5437, time 125.58ms
iter 593170: loss 5.7441, time 125.27ms
iter 593180: loss 6.0899, time 124.84ms
iter 593190: loss 5.6562, time 125.48ms
iter 593200: loss 5.6029, time 125.22ms
iter 593210: loss 6.2301, time 125.42ms
iter 593220: loss 5.9829, time 124.62ms
iter 593230: loss 6.1301, time 125.09ms
iter 593240: loss 5.6692, time 125.71ms
step 593250: train loss 5.4672, val loss 5.5081
saving checkpoint to out-shakespeare-char
iter 593250: loss 5.3561, time 2865.61ms
iter 593260: loss 5.8135, time 124.64ms
iter 593270: loss 6.1579, time 125.51ms
iter 593280: loss 5.8981, time 125.62ms
iter 593290: loss 5.8046, time 125.25ms
iter 593300: loss 5.8951, time 125.27ms
iter 593310: loss 5.7317, time 124.10ms
iter 593320: loss 5.2432, time 125.78ms
iter 593330: loss 5.7881, time 125.62ms
iter 593340: loss 5.5582, time 125.37ms
iter 593350: loss 6.2485, time 125.22ms
iter 593360: loss 5.6601, time 126.31ms
iter 593370: loss 6.4416, time 125.87ms
iter 593380: loss 6.6487, time 125.76ms
iter 593390: loss 5.8122, time 125.91ms
iter 593400: loss 5.4664, time 125.87ms
iter 593410: loss 6.3408, time 125.81ms
iter 593420: loss 6.1281, time 125.82ms
iter 593430: loss 6.1765, time 125.91ms
iter 593440: loss 6.1717, time 125.80ms
iter 593450: loss 6.9217, time 125.97ms
iter 593460: loss 5.7074, time 125.86ms
iter 593470: loss 5.9438, time 126.00ms
iter 593480: loss 5.8262, time 128.30ms
iter 593490: loss 6.1429, time 125.75ms
step 593500: train loss 5.5429, val loss 5.5406
saving checkpoint to out-shakespeare-char
iter 593500: loss 6.8002, time 2881.06ms
iter 593510: loss 5.9707, time 125.84ms
iter 593520: loss 5.4625, time 125.70ms
iter 593530: loss 6.2277, time 125.49ms
iter 593540: loss 6.4606, time 125.13ms
iter 593550: loss 6.2186, time 124.50ms
iter 593560: loss 5.6064, time 125.41ms
iter 593570: loss 5.8774, time 125.24ms
iter 593580: loss 6.0634, time 125.55ms
iter 593590: loss 6.7285, time 126.41ms
iter 593600: loss 6.2936, time 125.24ms
iter 593610: loss 5.1985, time 124.91ms
iter 593620: loss 7.2555, time 125.71ms
iter 593630: loss 5.4523, time 126.01ms
iter 593640: loss 5.4504, time 124.55ms
iter 593650: loss 6.1416, time 125.51ms
iter 593660: loss 6.1527, time 125.54ms
iter 593670: loss 6.4006, time 124.84ms
iter 593680: loss 5.5153, time 123.97ms
iter 593690: loss 5.7776, time 126.03ms
iter 593700: loss 6.2591, time 125.84ms
iter 593710: loss 6.1684, time 126.15ms
iter 593720: loss 6.6702, time 124.73ms
iter 593730: loss 5.6430, time 126.00ms
iter 593740: loss 5.2741, time 125.85ms
step 593750: train loss 5.5678, val loss 5.5291
saving checkpoint to out-shakespeare-char
iter 593750: loss 6.1521, time 2914.62ms
iter 593760: loss 5.9710, time 124.46ms
iter 593770: loss 5.3280, time 125.55ms
iter 593780: loss 5.6900, time 125.48ms
iter 593790: loss 6.1812, time 125.00ms
iter 593800: loss 5.9823, time 125.78ms
iter 593810: loss 6.3024, time 125.64ms
iter 593820: loss 5.6503, time 125.60ms
iter 593830: loss 6.0446, time 124.51ms
iter 593840: loss 5.9817, time 125.59ms
iter 593850: loss 5.7649, time 125.64ms
iter 593860: loss 5.9064, time 125.37ms
iter 593870: loss 6.0784, time 124.77ms
iter 593880: loss 6.2405, time 124.99ms
iter 593890: loss 5.8998, time 126.48ms
iter 593900: loss 5.8182, time 125.56ms
iter 593910: loss 5.9800, time 124.64ms
iter 593920: loss 6.0318, time 125.48ms
iter 593930: loss 5.3935, time 125.60ms
iter 593940: loss 5.9810, time 125.27ms
iter 593950: loss 5.6492, time 124.37ms
iter 593960: loss 5.6155, time 125.49ms
iter 593970: loss 6.0269, time 124.82ms
iter 593980: loss 6.2015, time 124.99ms
iter 593990: loss 6.0209, time 124.99ms
step 594000: train loss 5.5123, val loss 5.5145
saving checkpoint to out-shakespeare-char
iter 594000: loss 5.5283, time 2912.66ms
iter 594010: loss 6.1682, time 126.29ms
iter 594020: loss 6.2071, time 124.29ms
iter 594030: loss 6.8367, time 126.10ms
iter 594040: loss 5.5005, time 123.87ms
iter 594050: loss 6.4157, time 124.53ms
iter 594060: loss 5.8960, time 126.22ms
iter 594070: loss 5.7446, time 126.35ms
iter 594080: loss 6.0343, time 124.90ms
iter 594090: loss 6.2766, time 124.47ms
iter 594100: loss 6.3504, time 125.46ms
iter 594110: loss 4.9748, time 125.80ms
iter 594120: loss 6.5323, time 124.98ms
iter 594130: loss 6.7099, time 123.70ms
iter 594140: loss 5.9717, time 125.32ms
iter 594150: loss 6.1857, time 125.65ms
iter 594160: loss 5.7406, time 125.05ms
iter 594170: loss 5.9335, time 124.97ms
iter 594180: loss 6.4685, time 125.38ms
iter 594190: loss 6.4026, time 125.50ms
iter 594200: loss 6.5398, time 124.74ms
iter 594210: loss 5.5555, time 125.36ms
iter 594220: loss 6.2074, time 125.71ms
iter 594230: loss 6.4869, time 125.52ms
iter 594240: loss 5.6780, time 124.15ms
step 594250: train loss 5.5250, val loss 5.5139
saving checkpoint to out-shakespeare-char
iter 594250: loss 5.8226, time 2888.56ms
iter 594260: loss 5.5878, time 125.66ms
iter 594270: loss 4.9965, time 125.62ms
iter 594280: loss 6.5658, time 125.37ms
iter 594290: loss 5.9572, time 124.81ms
iter 594300: loss 5.7679, time 125.23ms
iter 594310: loss 6.2230, time 125.36ms
iter 594320: loss 5.8283, time 124.89ms
iter 594330: loss 6.0865, time 124.58ms
iter 594340: loss 6.5260, time 125.31ms
iter 594350: loss 6.1154, time 125.22ms
iter 594360: loss 6.1038, time 125.38ms
iter 594370: loss 6.0005, time 125.09ms
iter 594380: loss 5.4415, time 125.39ms
iter 594390: loss 5.4656, time 125.46ms
iter 594400: loss 5.8306, time 124.96ms
iter 594410: loss 6.8248, time 125.51ms
iter 594420: loss 5.9546, time 125.75ms
iter 594430: loss 5.8604, time 125.47ms
iter 594440: loss 5.7082, time 124.36ms
iter 594450: loss 6.5934, time 125.26ms
iter 594460: loss 6.3759, time 125.39ms
iter 594470: loss 5.2927, time 125.69ms
iter 594480: loss 5.8905, time 124.68ms
iter 594490: loss 5.9485, time 125.45ms
step 594500: train loss 5.4984, val loss 5.5664
saving checkpoint to out-shakespeare-char
iter 594500: loss 6.7939, time 2917.02ms
iter 594510: loss 6.1413, time 125.58ms
iter 594520: loss 6.4502, time 125.83ms
iter 594530: loss 5.5278, time 125.82ms
iter 594540: loss 5.9671, time 125.73ms
iter 594550: loss 6.2704, time 125.81ms
iter 594560: loss 5.9756, time 125.72ms
iter 594570: loss 6.0961, time 125.89ms
iter 594580: loss 6.0358, time 125.88ms
iter 594590: loss 5.4100, time 125.79ms
iter 594600: loss 5.9547, time 125.88ms
iter 594610: loss 6.0667, time 125.83ms
iter 594620: loss 6.0790, time 125.87ms
iter 594630: loss 5.7886, time 125.74ms
iter 594640: loss 6.3428, time 125.60ms
iter 594650: loss 5.1441, time 126.03ms
iter 594660: loss 6.2434, time 128.52ms
iter 594670: loss 5.8549, time 125.92ms
iter 594680: loss 6.2493, time 128.35ms
iter 594690: loss 5.4992, time 125.75ms
iter 594700: loss 6.0990, time 127.05ms
iter 594710: loss 5.9486, time 125.28ms
iter 594720: loss 6.1080, time 128.28ms
iter 594730: loss 5.1193, time 125.17ms
iter 594740: loss 5.8197, time 124.71ms
step 594750: train loss 5.5701, val loss 5.5323
saving checkpoint to out-shakespeare-char
iter 594750: loss 5.6435, time 2900.73ms
iter 594760: loss 6.6703, time 125.45ms
iter 594770: loss 5.6356, time 125.30ms
iter 594780: loss 6.3894, time 125.54ms
iter 594790: loss 6.0260, time 125.36ms
iter 594800: loss 5.4011, time 126.02ms
iter 594810: loss 6.0941, time 125.38ms
iter 594820: loss 6.0121, time 125.32ms
iter 594830: loss 6.4610, time 124.56ms
iter 594840: loss 6.6439, time 124.49ms
iter 594850: loss 5.8550, time 125.08ms
iter 594860: loss 5.7590, time 125.37ms
iter 594870: loss 5.9739, time 124.57ms
iter 594880: loss 6.3730, time 125.52ms
iter 594890: loss 5.6734, time 124.57ms
iter 594900: loss 5.8729, time 125.29ms
iter 594910: loss 5.8877, time 124.98ms
iter 594920: loss 6.1644, time 128.31ms
iter 594930: loss 6.1060, time 125.71ms
iter 594940: loss 5.9451, time 128.56ms
iter 594950: loss 6.6258, time 125.93ms
iter 594960: loss 6.6982, time 127.28ms
iter 594970: loss 5.7672, time 124.54ms
iter 594980: loss 5.3450, time 127.98ms
iter 594990: loss 6.5041, time 125.23ms
step 595000: train loss 5.5119, val loss 5.5575
saving checkpoint to out-shakespeare-char
iter 595000: loss 6.0194, time 2887.04ms
iter 595010: loss 6.0525, time 125.31ms
iter 595020: loss 5.8068, time 124.99ms
iter 595030: loss 5.7128, time 125.60ms
iter 595040: loss 6.1276, time 125.35ms
iter 595050: loss 5.3930, time 124.58ms
iter 595060: loss 6.3752, time 125.63ms
iter 595070: loss 6.2994, time 125.60ms
iter 595080: loss 6.4474, time 124.16ms
iter 595090: loss 5.7686, time 125.36ms
iter 595100: loss 5.9368, time 125.60ms
iter 595110: loss 6.4277, time 125.77ms
iter 595120: loss 6.0086, time 125.40ms
iter 595130: loss 5.8529, time 124.94ms
iter 595140: loss 5.3894, time 125.52ms
iter 595150: loss 5.8017, time 125.41ms
iter 595160: loss 5.6805, time 125.58ms
iter 595170: loss 5.9817, time 125.99ms
iter 595180: loss 5.5439, time 125.38ms
iter 595190: loss 6.0282, time 125.55ms
iter 595200: loss 5.9406, time 125.75ms
iter 595210: loss 6.4333, time 125.66ms
iter 595220: loss 6.0701, time 125.25ms
iter 595230: loss 6.0772, time 124.77ms
iter 595240: loss 5.9838, time 125.05ms
step 595250: train loss 5.5355, val loss 5.4919
saving checkpoint to out-shakespeare-char
iter 595250: loss 6.4248, time 2883.84ms
iter 595260: loss 6.5306, time 125.71ms
iter 595270: loss 6.0505, time 128.29ms
iter 595280: loss 6.4369, time 125.59ms
iter 595290: loss 6.1979, time 128.32ms
iter 595300: loss 6.0596, time 125.67ms
iter 595310: loss 6.1393, time 128.34ms
iter 595320: loss 6.3394, time 125.08ms
iter 595330: loss 6.0408, time 128.84ms
iter 595340: loss 6.0042, time 125.29ms
iter 595350: loss 6.0140, time 128.00ms
iter 595360: loss 5.8553, time 125.71ms
iter 595370: loss 5.9510, time 127.90ms
iter 595380: loss 5.4581, time 125.43ms
iter 595390: loss 5.8030, time 125.90ms
iter 595400: loss 6.2471, time 125.49ms
iter 595410: loss 5.7367, time 125.42ms
iter 595420: loss 5.4977, time 125.51ms
iter 595430: loss 5.2702, time 126.21ms
iter 595440: loss 5.9133, time 125.83ms
iter 595450: loss 6.1580, time 125.15ms
iter 595460: loss 5.2776, time 125.53ms
iter 595470: loss 4.9801, time 126.13ms
iter 595480: loss 6.2916, time 125.11ms
iter 595490: loss 5.1571, time 125.18ms
step 595500: train loss 5.5487, val loss 5.5034
saving checkpoint to out-shakespeare-char
iter 595500: loss 6.1979, time 2898.43ms
iter 595510: loss 5.5650, time 126.26ms
iter 595520: loss 5.7970, time 124.98ms
iter 595530: loss 5.7408, time 125.16ms
iter 595540: loss 5.4635, time 124.16ms
iter 595550: loss 6.8583, time 125.60ms
iter 595560: loss 5.7271, time 125.12ms
iter 595570: loss 6.6357, time 124.91ms
iter 595580: loss 5.8098, time 124.96ms
iter 595590: loss 5.6190, time 125.29ms
iter 595600: loss 5.6600, time 124.95ms
iter 595610: loss 5.8290, time 125.45ms
iter 595620: loss 5.9334, time 125.25ms
iter 595630: loss 5.4460, time 125.64ms
iter 595640: loss 6.0732, time 125.55ms
iter 595650: loss 5.8203, time 125.35ms
iter 595660: loss 5.7261, time 124.54ms
iter 595670: loss 6.0988, time 125.36ms
iter 595680: loss 6.3946, time 124.60ms
iter 595690: loss 5.6459, time 124.92ms
iter 595700: loss 5.8016, time 124.17ms
iter 595710: loss 5.8413, time 125.57ms
iter 595720: loss 6.2618, time 125.35ms
iter 595730: loss 5.7327, time 124.95ms
iter 595740: loss 6.9962, time 124.44ms
step 595750: train loss 5.5286, val loss 5.5045
saving checkpoint to out-shakespeare-char
iter 595750: loss 6.3342, time 2896.08ms
iter 595760: loss 5.7619, time 121.68ms
iter 595770: loss 6.3079, time 121.90ms
iter 595780: loss 6.1856, time 121.00ms
iter 595790: loss 6.3824, time 121.44ms
iter 595800: loss 5.3381, time 121.47ms
iter 595810: loss 6.1854, time 124.25ms
iter 595820: loss 5.9972, time 121.46ms
iter 595830: loss 5.2384, time 121.44ms
iter 595840: loss 6.0140, time 122.22ms
iter 595850: loss 6.1259, time 121.89ms
iter 595860: loss 6.2764, time 121.48ms
iter 595870: loss 5.9275, time 121.43ms
iter 595880: loss 6.2654, time 121.28ms
iter 595890: loss 6.6848, time 121.56ms
iter 595900: loss 5.9094, time 121.65ms
iter 595910: loss 5.6223, time 123.01ms
iter 595920: loss 6.6728, time 121.04ms
iter 595930: loss 5.3060, time 121.48ms
iter 595940: loss 5.7821, time 124.39ms
iter 595950: loss 5.6949, time 120.94ms
iter 595960: loss 5.9188, time 121.49ms
iter 595970: loss 6.6396, time 121.56ms
iter 595980: loss 6.1678, time 121.90ms
iter 595990: loss 6.0124, time 121.56ms
step 596000: train loss 5.5823, val loss 5.5417
saving checkpoint to out-shakespeare-char
iter 596000: loss 6.2004, time 2919.14ms
iter 596010: loss 5.7133, time 124.15ms
iter 596020: loss 5.8584, time 125.61ms
iter 596030: loss 6.1795, time 124.92ms
iter 596040: loss 6.1701, time 125.20ms
iter 596050: loss 5.8383, time 124.90ms
iter 596060: loss 5.4766, time 124.98ms
iter 596070: loss 5.9019, time 124.84ms
iter 596080: loss 5.4232, time 125.04ms
iter 596090: loss 5.3393, time 127.70ms
iter 596100: loss 6.3441, time 125.90ms
iter 596110: loss 6.3341, time 127.89ms
iter 596120: loss 6.0562, time 125.16ms
iter 596130: loss 6.2303, time 127.73ms
iter 596140: loss 5.7375, time 124.57ms
iter 596150: loss 5.4618, time 124.93ms
iter 596160: loss 6.1598, time 124.84ms
iter 596170: loss 5.4755, time 125.27ms
iter 596180: loss 5.3586, time 125.31ms
iter 596190: loss 6.3730, time 124.97ms
iter 596200: loss 6.4503, time 124.59ms
iter 596210: loss 5.5059, time 125.13ms
iter 596220: loss 5.3655, time 125.13ms
iter 596230: loss 5.4012, time 127.92ms
iter 596240: loss 5.9839, time 125.55ms
step 596250: train loss 5.4777, val loss 5.5273
saving checkpoint to out-shakespeare-char
iter 596250: loss 5.8444, time 2875.12ms
iter 596260: loss 5.5559, time 126.09ms
iter 596270: loss 6.3600, time 127.18ms
iter 596280: loss 6.1579, time 125.99ms
iter 596290: loss 5.5342, time 126.46ms
iter 596300: loss 5.7555, time 125.97ms
iter 596310: loss 5.6689, time 124.81ms
iter 596320: loss 5.4376, time 125.02ms
iter 596330: loss 5.5776, time 125.91ms
iter 596340: loss 6.7255, time 125.99ms
iter 596350: loss 5.5894, time 125.89ms
iter 596360: loss 5.5261, time 125.74ms
iter 596370: loss 5.1085, time 126.31ms
iter 596380: loss 6.3533, time 128.63ms
iter 596390: loss 5.9423, time 125.74ms
iter 596400: loss 6.8527, time 128.57ms
iter 596410: loss 6.4934, time 125.66ms
iter 596420: loss 5.8502, time 128.84ms
iter 596430: loss 6.2840, time 125.58ms
iter 596440: loss 6.6754, time 127.93ms
iter 596450: loss 5.8448, time 124.00ms
iter 596460: loss 5.9358, time 127.24ms
iter 596470: loss 4.9972, time 124.97ms
iter 596480: loss 5.9111, time 124.23ms
iter 596490: loss 5.7363, time 124.84ms
step 596500: train loss 5.4620, val loss 5.5314
saving checkpoint to out-shakespeare-char
iter 596500: loss 5.9669, time 2909.10ms
iter 596510: loss 5.8653, time 125.81ms
iter 596520: loss 6.1438, time 128.18ms
iter 596530: loss 6.6090, time 126.39ms
iter 596540: loss 6.1532, time 128.34ms
iter 596550: loss 6.3431, time 125.87ms
iter 596560: loss 6.0513, time 127.66ms
iter 596570: loss 6.1661, time 125.41ms
iter 596580: loss 5.5131, time 127.81ms
iter 596590: loss 5.9102, time 125.32ms
iter 596600: loss 5.1675, time 127.95ms
iter 596610: loss 5.3592, time 125.42ms
iter 596620: loss 5.9450, time 127.87ms
iter 596630: loss 6.1628, time 124.24ms
iter 596640: loss 5.5460, time 127.68ms
iter 596650: loss 5.3970, time 124.90ms
iter 596660: loss 6.7188, time 127.47ms
iter 596670: loss 5.3684, time 125.14ms
iter 596680: loss 5.7887, time 125.20ms
iter 596690: loss 5.5723, time 125.35ms
iter 596700: loss 5.7559, time 125.36ms
iter 596710: loss 6.2716, time 125.28ms
iter 596720: loss 6.0396, time 124.90ms
iter 596730: loss 6.4064, time 125.56ms
iter 596740: loss 6.5109, time 125.51ms
step 596750: train loss 5.5391, val loss 5.5261
saving checkpoint to out-shakespeare-char
iter 596750: loss 6.2617, time 2894.15ms
iter 596760: loss 5.2459, time 125.69ms
iter 596770: loss 5.4958, time 125.57ms
iter 596780: loss 6.3178, time 125.64ms
iter 596790: loss 6.0465, time 125.58ms
iter 596800: loss 5.3902, time 125.51ms
iter 596810: loss 5.9881, time 125.85ms
iter 596820: loss 5.7921, time 125.58ms
iter 596830: loss 6.6462, time 125.82ms
iter 596840: loss 5.2834, time 125.70ms
iter 596850: loss 6.3110, time 125.66ms
iter 596860: loss 5.9530, time 125.68ms
iter 596870: loss 5.7724, time 125.13ms
iter 596880: loss 5.3569, time 125.64ms
iter 596890: loss 5.6518, time 125.64ms
iter 596900: loss 6.4955, time 125.47ms
iter 596910: loss 5.2254, time 125.83ms
iter 596920: loss 5.1028, time 125.43ms
iter 596930: loss 6.5914, time 125.54ms
iter 596940: loss 5.3570, time 125.10ms
iter 596950: loss 6.7203, time 125.58ms
iter 596960: loss 5.6625, time 125.57ms
iter 596970: loss 6.4769, time 125.68ms
iter 596980: loss 6.5085, time 125.61ms
iter 596990: loss 5.2957, time 125.56ms
step 597000: train loss 5.5320, val loss 5.5841
saving checkpoint to out-shakespeare-char
iter 597000: loss 5.8481, time 2893.63ms
iter 597010: loss 5.3803, time 125.44ms
iter 597020: loss 6.0974, time 125.32ms
iter 597030: loss 6.4629, time 125.31ms
iter 597040: loss 5.5567, time 125.29ms
iter 597050: loss 6.2426, time 125.87ms
iter 597060: loss 5.9355, time 124.71ms
iter 597070: loss 5.5614, time 125.26ms
iter 597080: loss 6.0933, time 125.24ms
iter 597090: loss 5.4151, time 125.44ms
iter 597100: loss 6.1802, time 125.65ms
iter 597110: loss 5.4976, time 125.18ms
iter 597120: loss 6.3484, time 125.57ms
iter 597130: loss 6.1017, time 125.51ms
iter 597140: loss 5.7677, time 125.42ms
iter 597150: loss 5.6251, time 125.75ms
iter 597160: loss 5.9329, time 125.44ms
iter 597170: loss 6.2394, time 125.81ms
iter 597180: loss 6.9735, time 125.27ms
iter 597190: loss 5.7064, time 125.49ms
iter 597200: loss 6.1501, time 125.40ms
iter 597210: loss 6.2127, time 125.47ms
iter 597220: loss 5.2484, time 125.70ms
iter 597230: loss 5.1335, time 125.99ms
iter 597240: loss 5.7416, time 125.71ms
step 597250: train loss 5.5829, val loss 5.5354
saving checkpoint to out-shakespeare-char
iter 597250: loss 6.0926, time 2903.19ms
iter 597260: loss 6.5200, time 125.62ms
iter 597270: loss 6.4355, time 125.20ms
iter 597280: loss 5.8496, time 125.91ms
iter 597290: loss 6.0031, time 125.79ms
iter 597300: loss 5.3499, time 125.24ms
iter 597310: loss 5.8470, time 125.32ms
iter 597320: loss 6.0225, time 124.92ms
iter 597330: loss 5.5029, time 126.40ms
iter 597340: loss 5.9721, time 125.56ms
iter 597350: loss 5.8402, time 125.75ms
iter 597360: loss 6.6200, time 126.21ms
iter 597370: loss 5.6926, time 125.67ms
iter 597380: loss 5.6578, time 124.96ms
iter 597390: loss 5.8852, time 125.40ms
iter 597400: loss 6.3111, time 125.63ms
iter 597410: loss 5.4464, time 125.67ms
iter 597420: loss 5.9579, time 125.85ms
iter 597430: loss 5.4713, time 125.79ms
iter 597440: loss 5.9344, time 125.81ms
iter 597450: loss 5.6695, time 125.97ms
iter 597460: loss 5.7341, time 125.59ms
iter 597470: loss 5.9728, time 125.83ms
iter 597480: loss 5.7930, time 126.10ms
iter 597490: loss 5.5157, time 125.86ms
step 597500: train loss 5.5098, val loss 5.5347
saving checkpoint to out-shakespeare-char
iter 597500: loss 5.7751, time 2895.11ms
iter 597510: loss 5.8153, time 125.58ms
iter 597520: loss 5.2014, time 125.54ms
iter 597530: loss 6.0725, time 125.53ms
iter 597540: loss 5.4801, time 125.55ms
iter 597550: loss 6.4377, time 125.55ms
iter 597560: loss 6.3727, time 126.26ms
iter 597570: loss 6.2042, time 125.66ms
iter 597580: loss 6.1584, time 126.00ms
iter 597590: loss 6.0536, time 125.62ms
iter 597600: loss 6.2628, time 126.35ms
iter 597610: loss 6.0776, time 124.88ms
iter 597620: loss 6.8842, time 126.09ms
iter 597630: loss 5.9073, time 125.21ms
iter 597640: loss 5.4573, time 125.92ms
iter 597650: loss 5.5806, time 126.69ms
iter 597660: loss 6.1583, time 125.49ms
iter 597670: loss 6.0327, time 126.52ms
iter 597680: loss 5.6880, time 125.43ms
iter 597690: loss 5.3952, time 125.52ms
iter 597700: loss 5.7299, time 125.43ms
iter 597710: loss 6.1319, time 125.54ms
iter 597720: loss 5.6628, time 125.89ms
iter 597730: loss 6.0196, time 125.49ms
iter 597740: loss 6.0426, time 125.48ms
step 597750: train loss 5.5213, val loss 5.4578
saving checkpoint to out-shakespeare-char
iter 597750: loss 6.3564, time 2885.87ms
iter 597760: loss 5.9044, time 125.58ms
iter 597770: loss 6.1500, time 125.41ms
iter 597780: loss 5.4221, time 125.16ms
iter 597790: loss 6.1494, time 125.61ms
iter 597800: loss 6.1122, time 125.23ms
iter 597810: loss 6.1288, time 125.86ms
iter 597820: loss 5.8968, time 125.74ms
iter 597830: loss 5.7341, time 125.25ms
iter 597840: loss 5.9038, time 125.74ms
iter 597850: loss 6.3446, time 126.19ms
iter 597860: loss 5.5938, time 125.60ms
iter 597870: loss 5.6270, time 125.56ms
iter 597880: loss 6.0054, time 125.58ms
iter 597890: loss 6.1075, time 125.55ms
iter 597900: loss 6.5606, time 126.06ms
iter 597910: loss 6.4836, time 125.71ms
iter 597920: loss 5.5557, time 125.44ms
iter 597930: loss 5.7188, time 125.27ms
iter 597940: loss 6.1416, time 124.17ms
iter 597950: loss 6.7052, time 125.31ms
iter 597960: loss 5.4644, time 124.79ms
iter 597970: loss 6.1511, time 125.35ms
iter 597980: loss 5.8113, time 124.98ms
iter 597990: loss 5.9744, time 123.62ms
step 598000: train loss 5.5156, val loss 5.4975
saving checkpoint to out-shakespeare-char
iter 598000: loss 5.9172, time 2873.86ms
iter 598010: loss 6.4589, time 125.13ms
iter 598020: loss 5.8690, time 124.84ms
iter 598030: loss 5.9716, time 124.96ms
iter 598040: loss 5.8190, time 125.31ms
iter 598050: loss 6.0084, time 124.97ms
iter 598060: loss 6.6917, time 124.81ms
iter 598070: loss 5.5782, time 124.44ms
iter 598080: loss 5.3302, time 123.87ms
iter 598090: loss 6.2990, time 125.12ms
iter 598100: loss 6.1733, time 124.59ms
iter 598110: loss 5.8482, time 125.04ms
iter 598120: loss 5.1462, time 125.10ms
iter 598130: loss 5.9808, time 124.93ms
iter 598140: loss 6.3360, time 124.77ms
iter 598150: loss 6.7732, time 125.16ms
iter 598160: loss 6.6512, time 124.63ms
iter 598170: loss 5.8822, time 124.45ms
iter 598180: loss 6.1609, time 124.36ms
iter 598190: loss 5.4513, time 124.86ms
iter 598200: loss 5.8913, time 124.89ms
iter 598210: loss 4.7575, time 124.48ms
iter 598220: loss 5.7990, time 123.44ms
iter 598230: loss 6.3911, time 124.16ms
iter 598240: loss 6.0850, time 124.21ms
step 598250: train loss 5.5027, val loss 5.5123
saving checkpoint to out-shakespeare-char
iter 598250: loss 5.7046, time 2856.42ms
iter 598260: loss 6.2494, time 125.66ms
iter 598270: loss 6.3904, time 124.98ms
iter 598280: loss 5.8182, time 125.22ms
iter 598290: loss 6.3054, time 124.50ms
iter 598300: loss 5.8746, time 125.52ms
iter 598310: loss 6.4857, time 124.89ms
iter 598320: loss 6.3709, time 125.38ms
iter 598330: loss 5.9592, time 124.49ms
iter 598340: loss 6.2252, time 125.24ms
iter 598350: loss 6.3480, time 125.39ms
iter 598360: loss 5.9567, time 124.98ms
iter 598370: loss 5.8563, time 125.42ms
iter 598380: loss 6.4306, time 125.18ms
iter 598390: loss 6.3307, time 125.35ms
iter 598400: loss 6.2311, time 125.52ms
iter 598410: loss 5.6979, time 125.56ms
iter 598420: loss 6.5065, time 125.87ms
iter 598430: loss 5.2891, time 125.16ms
iter 598440: loss 5.6879, time 125.72ms
iter 598450: loss 5.5767, time 125.62ms
iter 598460: loss 6.1311, time 125.15ms
iter 598470: loss 5.7411, time 125.29ms
iter 598480: loss 5.7835, time 125.82ms
iter 598490: loss 5.3838, time 125.38ms
step 598500: train loss 5.5103, val loss 5.4684
saving checkpoint to out-shakespeare-char
iter 598500: loss 6.5952, time 2892.19ms
iter 598510: loss 6.3831, time 121.19ms
iter 598520: loss 6.0603, time 122.58ms
iter 598530: loss 4.9070, time 121.13ms
iter 598540: loss 5.8678, time 120.45ms
iter 598550: loss 6.0422, time 122.91ms
iter 598560: loss 4.9519, time 121.19ms
iter 598570: loss 5.5345, time 121.16ms
iter 598580: loss 5.8333, time 121.15ms
iter 598590: loss 6.0623, time 121.31ms
iter 598600: loss 6.9667, time 121.27ms
iter 598610: loss 5.6096, time 120.34ms
iter 598620: loss 6.4902, time 122.60ms
iter 598630: loss 5.4411, time 121.19ms
iter 598640: loss 5.8270, time 121.09ms
iter 598650: loss 5.7910, time 123.51ms
iter 598660: loss 6.0896, time 121.60ms
iter 598670: loss 5.5408, time 121.90ms
iter 598680: loss 6.0975, time 121.06ms
iter 598690: loss 5.8618, time 121.70ms
iter 598700: loss 5.6005, time 121.22ms
iter 598710: loss 5.9971, time 121.18ms
iter 598720: loss 5.8288, time 122.43ms
iter 598730: loss 5.5191, time 121.12ms
iter 598740: loss 5.8732, time 121.33ms
step 598750: train loss 5.5411, val loss 5.5580
saving checkpoint to out-shakespeare-char
iter 598750: loss 5.1232, time 2911.28ms
iter 598760: loss 5.9836, time 121.59ms
iter 598770: loss 5.9001, time 121.03ms
iter 598780: loss 5.9905, time 121.11ms
iter 598790: loss 6.0185, time 121.70ms
iter 598800: loss 5.9610, time 121.52ms
iter 598810: loss 5.5517, time 122.77ms
iter 598820: loss 6.1650, time 121.71ms
iter 598830: loss 6.5107, time 121.67ms
iter 598840: loss 5.3760, time 124.34ms
iter 598850: loss 6.0323, time 121.75ms
iter 598860: loss 5.3852, time 121.65ms
iter 598870: loss 5.7320, time 121.84ms
iter 598880: loss 5.2228, time 121.60ms
iter 598890: loss 6.3459, time 122.16ms
iter 598900: loss 6.1612, time 121.05ms
iter 598910: loss 6.3555, time 123.00ms
iter 598920: loss 5.8247, time 121.65ms
iter 598930: loss 6.2159, time 121.85ms
iter 598940: loss 5.9428, time 123.37ms
iter 598950: loss 5.2911, time 121.73ms
iter 598960: loss 5.6525, time 121.50ms
iter 598970: loss 6.5069, time 123.81ms
iter 598980: loss 6.1109, time 121.85ms
iter 598990: loss 5.6677, time 121.91ms
step 599000: train loss 5.5482, val loss 5.5003
saving checkpoint to out-shakespeare-char
iter 599000: loss 5.6547, time 2912.55ms
iter 599010: loss 5.7176, time 122.67ms
iter 599020: loss 5.8675, time 121.30ms
iter 599030: loss 6.6789, time 121.25ms
iter 599040: loss 5.8857, time 124.12ms
iter 599050: loss 5.7180, time 121.21ms
iter 599060: loss 6.2539, time 121.44ms
iter 599070: loss 5.6312, time 121.20ms
iter 599080: loss 5.5187, time 121.46ms
iter 599090: loss 5.7799, time 121.09ms
iter 599100: loss 6.2185, time 121.39ms
iter 599110: loss 6.3995, time 122.50ms
iter 599120: loss 6.2910, time 121.08ms
iter 599130: loss 5.8134, time 121.17ms
iter 599140: loss 6.1610, time 122.55ms
iter 599150: loss 6.0655, time 121.96ms
iter 599160: loss 6.0769, time 121.21ms
iter 599170: loss 6.9625, time 124.04ms
iter 599180: loss 5.6144, time 121.21ms
iter 599190: loss 6.0628, time 121.16ms
iter 599200: loss 6.1094, time 121.36ms
iter 599210: loss 5.7080, time 121.66ms
iter 599220: loss 6.6335, time 121.22ms
iter 599230: loss 6.6106, time 121.22ms
iter 599240: loss 5.7456, time 122.50ms
step 599250: train loss 5.5497, val loss 5.5131
saving checkpoint to out-shakespeare-char
iter 599250: loss 6.0661, time 2912.61ms
iter 599260: loss 6.4101, time 121.53ms
iter 599270: loss 5.7124, time 124.60ms
iter 599280: loss 6.3664, time 121.31ms
iter 599290: loss 5.6246, time 121.22ms
iter 599300: loss 5.6812, time 121.25ms
iter 599310: loss 5.9778, time 122.53ms
iter 599320: loss 6.4166, time 121.37ms
iter 599330: loss 5.2742, time 121.38ms
iter 599340: loss 5.8312, time 122.59ms
iter 599350: loss 5.5910, time 121.52ms
iter 599360: loss 6.5789, time 121.42ms
iter 599370: loss 6.4768, time 121.35ms
iter 599380: loss 5.5645, time 121.69ms
iter 599390: loss 5.7038, time 121.33ms
iter 599400: loss 6.5123, time 121.31ms
iter 599410: loss 5.9196, time 122.65ms
iter 599420: loss 6.1794, time 121.02ms
iter 599430: loss 5.0712, time 121.30ms
iter 599440: loss 5.1866, time 122.64ms
iter 599450: loss 5.8613, time 121.86ms
iter 599460: loss 5.6001, time 121.40ms
iter 599470: loss 6.0019, time 124.02ms
iter 599480: loss 6.6800, time 121.65ms
iter 599490: loss 6.0929, time 121.30ms
step 599500: train loss 5.5693, val loss 5.5305
saving checkpoint to out-shakespeare-char
iter 599500: loss 6.0089, time 2898.22ms
iter 599510: loss 6.0753, time 125.53ms
iter 599520: loss 6.5487, time 124.16ms
iter 599530: loss 5.6077, time 125.02ms
iter 599540: loss 5.5516, time 125.29ms
iter 599550: loss 5.9939, time 125.04ms
iter 599560: loss 6.2564, time 124.57ms
iter 599570: loss 5.9580, time 125.03ms
iter 599580: loss 6.0144, time 125.05ms
iter 599590: loss 5.4171, time 125.45ms
iter 599600: loss 5.1820, time 124.26ms
iter 599610: loss 6.2743, time 125.68ms
iter 599620: loss 6.0093, time 125.17ms
iter 599630: loss 6.0138, time 125.90ms
iter 599640: loss 5.6359, time 125.72ms
iter 599650: loss 6.0458, time 125.72ms
iter 599660: loss 5.4566, time 125.57ms
iter 599670: loss 6.1462, time 125.83ms
iter 599680: loss 6.3598, time 125.41ms
iter 599690: loss 5.8624, time 125.27ms
iter 599700: loss 5.6986, time 125.66ms
iter 599710: loss 6.0884, time 125.60ms
iter 599720: loss 6.7295, time 126.13ms
iter 599730: loss 6.3139, time 123.70ms
iter 599740: loss 5.2564, time 128.61ms
step 599750: train loss 5.5389, val loss 5.5130
saving checkpoint to out-shakespeare-char
iter 599750: loss 6.0899, time 2894.12ms
iter 599760: loss 6.0534, time 128.61ms
iter 599770: loss 6.0256, time 124.71ms
iter 599780: loss 5.8440, time 128.19ms
iter 599790: loss 6.1246, time 125.68ms
iter 599800: loss 5.8630, time 128.06ms
iter 599810: loss 6.3489, time 125.49ms
iter 599820: loss 5.4242, time 127.89ms
iter 599830: loss 5.8692, time 125.57ms
iter 599840: loss 7.0643, time 128.14ms
iter 599850: loss 6.0727, time 125.48ms
iter 599860: loss 5.9032, time 128.32ms
iter 599870: loss 5.9392, time 125.37ms
iter 599880: loss 5.5014, time 128.03ms
iter 599890: loss 6.5526, time 125.12ms
iter 599900: loss 6.4590, time 128.49ms
iter 599910: loss 5.0770, time 126.67ms
iter 599920: loss 6.3795, time 128.19ms
iter 599930: loss 6.1854, time 125.11ms
iter 599940: loss 6.2726, time 127.61ms
iter 599950: loss 6.3117, time 125.06ms
iter 599960: loss 6.0601, time 127.98ms
iter 599970: loss 5.4923, time 125.62ms
iter 599980: loss 6.0411, time 125.51ms
iter 599990: loss 5.7502, time 125.51ms
step 600000: train loss 5.5006, val loss 5.5206
saving checkpoint to out-shakespeare-char
iter 600000: loss 5.9555, time 2891.01ms
