
=== Start adding workers ===
=> Add worker SGDMWorker(index=0, momentum=0.9)
=> Add worker SGDMWorker(index=1, momentum=0.9)
=> Add worker SGDMWorker(index=2, momentum=0.9)
=> Add worker SGDMWorker(index=3, momentum=0.9)
=> Add worker SGDMWorker(index=4, momentum=0.9)
=> Add worker SGDMWorker(index=5, momentum=0.9)
=> Add worker SGDMWorker(index=6, momentum=0.9)
=> Add worker SGDMWorker(index=7, momentum=0.9)
=> Add worker SGDMWorker(index=8, momentum=0.9)
=> Add worker SGDMWorker(index=9, momentum=0.9)
=> Add worker ByzantineWorker(index=10)
=> Add worker ByzantineWorker(index=11)
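The roster above mixes ten honest momentum-SGD workers (momentum=0.9) with two Byzantine workers. A minimal sketch of both update rules, with loudly-labeled assumptions: the heavy-ball form matches `momentum=0.9` from the log, but `lr=0.1` and the scaled sign-flip attack are illustrative guesses, since neither the learning rate nor the `ByzantineWorker` attack is shown in this log.

```python
def sgdm_step(param, grad, buf, lr=0.1, momentum=0.9):
    # Heavy-ball momentum update, matching momentum=0.9 above.
    # lr=0.1 is an assumed value; the log does not record the learning rate.
    buf = momentum * buf + grad
    return param - lr * buf, buf

def byzantine_grad(grad, scale=-10.0):
    # One common Byzantine behaviour (a scaled sign flip); the actual
    # ByzantineWorker attack used in this run is not shown in the log.
    return scale * grad
```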

=== Start adding graph ===
<codes.graph_utils.RandomSmallWorldGraph object at 0x7feb84b9a400>
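The communication topology is a `RandomSmallWorldGraph`; its internals are not visible in this log, but a Watts-Strogatz-style construction (ring lattice plus random rewiring) is the standard way to get such a graph. A dependency-free sketch for the 12 workers above, where `k` and `p` are illustrative parameters, not values taken from the run:

```python
import random

def small_world_edges(n=12, k=4, p=0.3, seed=0):
    # Ring lattice: each node linked to its k nearest neighbours, then each
    # edge rewired to a random endpoint with probability p (Watts-Strogatz
    # style). Only illustrative; RandomSmallWorldGraph may differ.
    rng = random.Random(seed)
    edges = set()
    for i in range(n):
        for j in range(1, k // 2 + 1):
            u, v = i, (i + j) % n
            if rng.random() < p:
                w = rng.randrange(n)
                while w == u or (min(u, w), max(u, w)) in edges:
                    w = rng.randrange(n)
                v = w
            if u != v:
                edges.add((min(u, v), max(u, v)))
    return sorted(edges)
```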

Train epoch 1
[E 1B0  |    384/60000 (  1%) ] Loss: 2.3054 top1= 10.0000

=== Peeking data label distribution E1B0 ===
Worker 0 has targets: tensor([0, 0, 0, 0, 0], device='cuda:0')
Worker 1 has targets: tensor([1, 1, 1, 1, 1], device='cuda:0')
Worker 2 has targets: tensor([1, 2, 2, 2, 2], device='cuda:0')
Worker 3 has targets: tensor([2, 3, 3, 3, 3], device='cuda:0')
Worker 4 has targets: tensor([3, 4, 4, 4, 4], device='cuda:0')
Worker 5 has targets: tensor([4, 5, 5, 5, 5], device='cuda:0')
Worker 6 has targets: tensor([6, 6, 6, 6, 6], device='cuda:0')
Worker 7 has targets: tensor([7, 7, 7, 7, 7], device='cuda:0')
Worker 8 has targets: tensor([7, 8, 8, 8, 8], device='cuda:0')
Worker 9 has targets: tensor([8, 9, 9, 9, 9], device='cuda:0')
Worker 10 has targets: tensor([4, 8, 8, 6, 9], device='cuda:0')
Worker 11 has targets: tensor([5, 3, 6, 0, 9], device='cuda:0')
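The peek above shows a strongly non-IID split: honest worker i holds almost exclusively digit i, while the Byzantine workers draw mixed labels. One plausible way to produce that pattern is to sort sample indices by label and hand out contiguous shards; the repo's actual partitioner is not shown in the log, so this is only a sketch.

```python
def label_sorted_shards(labels, n_workers):
    # Sort sample indices by label, then cut into contiguous shards,
    # one per worker -- yields the "worker i sees mostly digit i"
    # pattern visible in the peek above.
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    shard = len(order) // n_workers
    return [order[w * shard:(w + 1) * shard] for w in range(n_workers)]
```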
=== Log global consensus distance @ E1B0 ===
consensus_distance=0.710
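Consensus distance tracks how far the workers' models have drifted apart. A sketch under the assumption that it is the average Euclidean distance of each worker's flattened parameter vector from the mean vector; the repo's exact definition is not visible in this log and may normalize differently.

```python
def consensus_distance(models):
    # models: one flat parameter list per worker. Average Euclidean
    # distance from each worker's parameters to the mean parameters.
    n, d = len(models), len(models[0])
    mean = [sum(m[j] for m in models) / n for j in range(d)]
    return sum(
        sum((m[j] - mean[j]) ** 2 for j in range(d)) ** 0.5 for m in models
    ) / n
```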
=== Log average shortest path distance for small world @ E1B0 ===
2.7777777777777777
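The value above can be reproduced from the graph's adjacency with plain BFS. The sketch below uses one standard definition (mean distance over ordered pairs of distinct nodes, as in networkx's `average_shortest_path_length`); the repo's averaging convention may differ slightly.

```python
from collections import deque

def avg_shortest_path(adj):
    # adj: dict node -> list of neighbours (undirected, connected graph).
    # BFS from every node; average d(u, v) over ordered pairs u != v.
    n = len(adj)
    total = 0
    for s in adj:
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(dist.values())
    return total / (n * (n - 1))
```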


[E 1B10 |   4224/60000 (  7%) ] Loss: 0.6599 top1= 75.0000
[E 1B20 |   8064/60000 ( 13%) ] Loss: 0.3207 top1= 89.6875
[E 1B30 |  11904/60000 ( 20%) ] Loss: 0.4218 top1= 95.0000
[E 1B40 |  15744/60000 ( 26%) ] Loss: 0.3488 top1= 97.5000
[E 1B50 |  19584/60000 ( 33%) ] Loss: 0.3306 top1= 96.2500

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=2.0514 top1= 24.9499

Train epoch 2
[E 2B0  |    384/60000 (  1%) ] Loss: 0.4382 top1= 92.5000

=== Log global consensus distance @ E2B0 ===
consensus_distance=8.210


[E 2B10 |   4224/60000 (  7%) ] Loss: 0.3008 top1= 96.5625
[E 2B20 |   8064/60000 ( 13%) ] Loss: 0.2721 top1= 98.7500
[E 2B30 |  11904/60000 ( 20%) ] Loss: 0.2582 top1= 98.7500
[E 2B40 |  15744/60000 ( 26%) ] Loss: 0.3136 top1= 97.5000
[E 2B50 |  19584/60000 ( 33%) ] Loss: 0.2893 top1= 97.8125

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.9681 top1= 34.5353

Train epoch 3
[E 3B0  |    384/60000 (  1%) ] Loss: 0.3785 top1= 94.6875

=== Log global consensus distance @ E3B0 ===
consensus_distance=8.322


[E 3B10 |   4224/60000 (  7%) ] Loss: 0.2819 top1= 98.1250
[E 3B20 |   8064/60000 ( 13%) ] Loss: 0.2613 top1= 98.7500
[E 3B30 |  11904/60000 ( 20%) ] Loss: 0.2483 top1= 99.3750
[E 3B40 |  15744/60000 ( 26%) ] Loss: 0.2642 top1= 99.3750
[E 3B50 |  19584/60000 ( 33%) ] Loss: 0.3126 top1= 96.5625

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.9458 top1= 33.3734

Train epoch 4
[E 4B0  |    384/60000 (  1%) ] Loss: 0.3326 top1= 95.6250

=== Log global consensus distance @ E4B0 ===
consensus_distance=8.399


[E 4B10 |   4224/60000 (  7%) ] Loss: 0.0512 top1= 99.3750
[E 4B20 |   8064/60000 ( 13%) ] Loss: 0.1184 top1= 95.3125
[E 4B30 |  11904/60000 ( 20%) ] Loss: 0.0791 top1= 98.4375
[E 4B40 |  15744/60000 ( 26%) ] Loss: 0.1112 top1= 96.8750
[E 4B50 |  19584/60000 ( 33%) ] Loss: 0.2571 top1= 98.4375

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.9252 top1= 34.3049

Train epoch 5
[E 5B0  |    384/60000 (  1%) ] Loss: 0.3266 top1= 95.9375

=== Log global consensus distance @ E5B0 ===
consensus_distance=8.463


[E 5B10 |   4224/60000 (  7%) ] Loss: 0.2667 top1= 98.1250
[E 5B20 |   8064/60000 ( 13%) ] Loss: 0.0909 top1= 98.7500
[E 5B30 |  11904/60000 ( 20%) ] Loss: 0.0782 top1= 96.8750
[E 5B40 |  15744/60000 ( 26%) ] Loss: 0.2725 top1= 90.0000
[E 5B50 |  19584/60000 ( 33%) ] Loss: 0.2661 top1= 98.4375

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.8841 top1= 37.0994

Train epoch 6
[E 6B0  |    384/60000 (  1%) ] Loss: 0.2981 top1= 96.5625

=== Log global consensus distance @ E6B0 ===
consensus_distance=8.513


[E 6B10 |   4224/60000 (  7%) ] Loss: 0.0281 top1= 99.6875
[E 6B20 |   8064/60000 ( 13%) ] Loss: 0.0562 top1= 97.8125
[E 6B30 |  11904/60000 ( 20%) ] Loss: 0.2393 top1= 90.0000
[E 6B40 |  15744/60000 ( 26%) ] Loss: 0.2393 top1= 91.2500
[E 6B50 |  19584/60000 ( 33%) ] Loss: 0.0943 top1= 98.4375

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.8629 top1= 34.0946

Train epoch 7
[E 7B0  |    384/60000 (  1%) ] Loss: 0.3140 top1= 94.3750

=== Log global consensus distance @ E7B0 ===
consensus_distance=6.998


[E 7B10 |   4224/60000 (  7%) ] Loss: 0.0492 top1= 98.4375
[E 7B20 |   8064/60000 ( 13%) ] Loss: 0.0858 top1= 96.2500
[E 7B30 |  11904/60000 ( 20%) ] Loss: 0.1981 top1= 93.4375
[E 7B40 |  15744/60000 ( 26%) ] Loss: 0.0226 top1= 99.6875
[E 7B50 |  19584/60000 ( 33%) ] Loss: 0.0816 top1= 98.4375

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.6935 top1= 35.4968

Train epoch 8
[E 8B0  |    384/60000 (  1%) ] Loss: 0.0892 top1= 95.6250

=== Log global consensus distance @ E8B0 ===
consensus_distance=0.848


[E 8B10 |   4224/60000 (  7%) ] Loss: 0.0513 top1= 98.1250
[E 8B20 |   8064/60000 ( 13%) ] Loss: 0.0581 top1= 97.8125
[E 8B30 |  11904/60000 ( 20%) ] Loss: 0.0715 top1= 98.1250
[E 8B40 |  15744/60000 ( 26%) ] Loss: 0.0526 top1= 97.5000
[E 8B50 |  19584/60000 ( 33%) ] Loss: 0.1059 top1= 97.1875

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.6776 top1= 43.4796

Train epoch 9
[E 9B0  |    384/60000 (  1%) ] Loss: 0.1298 top1= 96.8750

=== Log global consensus distance @ E9B0 ===
consensus_distance=0.847


[E 9B10 |   4224/60000 (  7%) ] Loss: 0.0395 top1= 99.6875
[E 9B20 |   8064/60000 ( 13%) ] Loss: 0.0634 top1= 98.4375
[E 9B30 |  11904/60000 ( 20%) ] Loss: 0.0794 top1= 97.8125
[E 9B40 |  15744/60000 ( 26%) ] Loss: 0.0226 top1= 99.0625
[E 9B50 |  19584/60000 ( 33%) ] Loss: 0.0673 top1= 99.0625

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.6732 top1= 44.1006

Train epoch 10
[E10B0  |    384/60000 (  1%) ] Loss: 0.1384 top1= 96.8750

=== Log global consensus distance @ E10B0 ===
consensus_distance=0.881


[E10B10 |   4224/60000 (  7%) ] Loss: 0.0572 top1= 97.8125
[E10B20 |   8064/60000 ( 13%) ] Loss: 0.0609 top1= 99.0625
[E10B30 |  11904/60000 ( 20%) ] Loss: 0.1020 top1= 97.5000
[E10B40 |  15744/60000 ( 26%) ] Loss: 0.0318 top1= 98.7500
[E10B50 |  19584/60000 ( 33%) ] Loss: 0.0813 top1= 98.4375

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.7153 top1= 41.0757

Train epoch 11
[E11B0  |    384/60000 (  1%) ] Loss: 0.1320 top1= 97.1875

=== Log global consensus distance @ E11B0 ===
consensus_distance=0.918


[E11B10 |   4224/60000 (  7%) ] Loss: 0.0852 top1= 96.8750
[E11B20 |   8064/60000 ( 13%) ] Loss: 0.0700 top1= 98.4375
[E11B30 |  11904/60000 ( 20%) ] Loss: 0.0561 top1= 99.3750
[E11B40 |  15744/60000 ( 26%) ] Loss: 0.0149 top1=100.0000
[E11B50 |  19584/60000 ( 33%) ] Loss: 0.0844 top1= 98.4375

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.6880 top1= 43.2192

Train epoch 12
[E12B0  |    384/60000 (  1%) ] Loss: 0.1430 top1= 97.8125

=== Log global consensus distance @ E12B0 ===
consensus_distance=0.948


[E12B10 |   4224/60000 (  7%) ] Loss: 0.0376 top1= 99.6875
[E12B20 |   8064/60000 ( 13%) ] Loss: 0.0632 top1= 98.4375
[E12B30 |  11904/60000 ( 20%) ] Loss: 0.1071 top1= 97.5000
[E12B40 |  15744/60000 ( 26%) ] Loss: 0.0621 top1= 96.8750
[E12B50 |  19584/60000 ( 33%) ] Loss: 0.0670 top1= 99.0625

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.6884 top1= 43.1691

Train epoch 13
[E13B0  |    384/60000 (  1%) ] Loss: 0.1462 top1= 96.2500

=== Log global consensus distance @ E13B0 ===
consensus_distance=0.981


[E13B10 |   4224/60000 (  7%) ] Loss: 0.0481 top1= 99.3750
[E13B20 |   8064/60000 ( 13%) ] Loss: 0.0429 top1= 99.3750
[E13B30 |  11904/60000 ( 20%) ] Loss: 0.0761 top1= 99.0625
[E13B40 |  15744/60000 ( 26%) ] Loss: 0.0179 top1= 99.6875
[E13B50 |  19584/60000 ( 33%) ] Loss: 0.0840 top1= 98.7500

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.7080 top1= 41.6567

Train epoch 14
[E14B0  |    384/60000 (  1%) ] Loss: 0.1243 top1= 97.1875

=== Log global consensus distance @ E14B0 ===
consensus_distance=1.010


[E14B10 |   4224/60000 (  7%) ] Loss: 0.0360 top1= 99.3750
[E14B20 |   8064/60000 ( 13%) ] Loss: 0.0799 top1= 97.1875
[E14B30 |  11904/60000 ( 20%) ] Loss: 0.0834 top1= 98.4375
[E14B40 |  15744/60000 ( 26%) ] Loss: 0.0259 top1= 99.0625
[E14B50 |  19584/60000 ( 33%) ] Loss: 0.0879 top1= 98.7500

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.6868 top1= 43.5597

Train epoch 15
[E15B0  |    384/60000 (  1%) ] Loss: 0.1112 top1= 98.1250

=== Log global consensus distance @ E15B0 ===
consensus_distance=1.034


[E15B10 |   4224/60000 (  7%) ] Loss: 0.0751 top1= 98.4375
[E15B20 |   8064/60000 ( 13%) ] Loss: 0.0631 top1= 98.7500
[E15B30 |  11904/60000 ( 20%) ] Loss: 0.0658 top1= 98.1250
[E15B40 |  15744/60000 ( 26%) ] Loss: 0.0139 top1=100.0000
[E15B50 |  19584/60000 ( 33%) ] Loss: 0.0666 top1= 99.0625

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.6554 top1= 43.8802

Train epoch 16
[E16B0  |    384/60000 (  1%) ] Loss: 0.1404 top1= 97.5000

=== Log global consensus distance @ E16B0 ===
consensus_distance=1.064


[E16B10 |   4224/60000 (  7%) ] Loss: 0.0532 top1= 98.7500
[E16B20 |   8064/60000 ( 13%) ] Loss: 0.0914 top1= 97.1875
[E16B30 |  11904/60000 ( 20%) ] Loss: 0.0637 top1= 98.4375
[E16B40 |  15744/60000 ( 26%) ] Loss: 0.0275 top1= 99.3750
[E16B50 |  19584/60000 ( 33%) ] Loss: 0.0913 top1= 98.7500

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.6734 top1= 43.4696

Train epoch 17
[E17B0  |    384/60000 (  1%) ] Loss: 0.0917 top1= 97.8125

=== Log global consensus distance @ E17B0 ===
consensus_distance=1.091


[E17B10 |   4224/60000 (  7%) ] Loss: 0.0451 top1= 99.3750
[E17B20 |   8064/60000 ( 13%) ] Loss: 0.0590 top1= 98.1250
[E17B30 |  11904/60000 ( 20%) ] Loss: 0.0850 top1= 97.5000
[E17B40 |  15744/60000 ( 26%) ] Loss: 0.0156 top1=100.0000
[E17B50 |  19584/60000 ( 33%) ] Loss: 0.0745 top1= 98.7500

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.6845 top1= 43.1390

Train epoch 18
[E18B0  |    384/60000 (  1%) ] Loss: 0.1337 top1= 97.5000

=== Log global consensus distance @ E18B0 ===
consensus_distance=1.114


[E18B10 |   4224/60000 (  7%) ] Loss: 0.0562 top1= 98.4375
[E18B20 |   8064/60000 ( 13%) ] Loss: 0.0752 top1= 98.4375
[E18B30 |  11904/60000 ( 20%) ] Loss: 0.1244 top1= 96.8750
[E18B40 |  15744/60000 ( 26%) ] Loss: 0.0267 top1= 98.7500
[E18B50 |  19584/60000 ( 33%) ] Loss: 0.1055 top1= 98.1250

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.6568 top1= 44.7416

Train epoch 19
[E19B0  |    384/60000 (  1%) ] Loss: 0.1199 top1= 97.1875

=== Log global consensus distance @ E19B0 ===
consensus_distance=1.145


[E19B10 |   4224/60000 (  7%) ] Loss: 0.0659 top1= 98.4375
[E19B20 |   8064/60000 ( 13%) ] Loss: 0.0932 top1= 96.8750
[E19B30 |  11904/60000 ( 20%) ] Loss: 0.0974 top1= 97.5000
[E19B40 |  15744/60000 ( 26%) ] Loss: 0.0372 top1= 98.4375
[E19B50 |  19584/60000 ( 33%) ] Loss: 0.0680 top1= 99.3750

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.6731 top1= 44.1206

Train epoch 20
[E20B0  |    384/60000 (  1%) ] Loss: 0.1278 top1= 97.8125

=== Log global consensus distance @ E20B0 ===
consensus_distance=1.170


[E20B10 |   4224/60000 (  7%) ] Loss: 0.0648 top1= 97.8125
[E20B20 |   8064/60000 ( 13%) ] Loss: 0.0512 top1= 99.0625
[E20B30 |  11904/60000 ( 20%) ] Loss: 0.2419 top1= 92.8125
[E20B40 |  15744/60000 ( 26%) ] Loss: 0.0972 top1= 97.5000
[E20B50 |  19584/60000 ( 33%) ] Loss: 0.1351 top1= 95.9375

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.3570 top1= 60.1062

Train epoch 21
[E21B0  |    384/60000 (  1%) ] Loss: 0.1752 top1= 96.5625

=== Log global consensus distance @ E21B0 ===
consensus_distance=1.109


[E21B10 |   4224/60000 (  7%) ] Loss: 0.1292 top1= 95.9375
[E21B20 |   8064/60000 ( 13%) ] Loss: 0.2284 top1= 93.7500
[E21B30 |  11904/60000 ( 20%) ] Loss: 0.1323 top1= 97.1875
[E21B40 |  15744/60000 ( 26%) ] Loss: 0.1067 top1= 96.5625
[E21B50 |  19584/60000 ( 33%) ] Loss: 0.1029 top1= 98.1250

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.3573 top1= 57.4319

Train epoch 22
[E22B0  |    384/60000 (  1%) ] Loss: 0.1690 top1= 96.5625

=== Log global consensus distance @ E22B0 ===
consensus_distance=1.135


[E22B10 |   4224/60000 (  7%) ] Loss: 0.0637 top1= 98.1250
[E22B20 |   8064/60000 ( 13%) ] Loss: 0.1370 top1= 96.2500
[E22B30 |  11904/60000 ( 20%) ] Loss: 0.0769 top1= 99.0625
[E22B40 |  15744/60000 ( 26%) ] Loss: 0.0552 top1= 98.7500
[E22B50 |  19584/60000 ( 33%) ] Loss: 0.1212 top1= 97.5000

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.3700 top1= 55.1983

Train epoch 23
[E23B0  |    384/60000 (  1%) ] Loss: 0.1475 top1= 97.1875

=== Log global consensus distance @ E23B0 ===
consensus_distance=1.155


[E23B10 |   4224/60000 (  7%) ] Loss: 0.0941 top1= 97.1875
[E23B20 |   8064/60000 ( 13%) ] Loss: 0.1139 top1= 97.8125
[E23B30 |  11904/60000 ( 20%) ] Loss: 0.1322 top1= 96.2500
[E23B40 |  15744/60000 ( 26%) ] Loss: 0.0599 top1= 97.8125
[E23B50 |  19584/60000 ( 33%) ] Loss: 0.0812 top1= 98.1250

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.3833 top1= 55.5288

Train epoch 24
[E24B0  |    384/60000 (  1%) ] Loss: 0.2038 top1= 96.2500

=== Log global consensus distance @ E24B0 ===
consensus_distance=1.181


[E24B10 |   4224/60000 (  7%) ] Loss: 0.0804 top1= 96.8750
[E24B20 |   8064/60000 ( 13%) ] Loss: 0.1101 top1= 97.5000
[E24B30 |  11904/60000 ( 20%) ] Loss: 0.1032 top1= 97.8125
[E24B40 |  15744/60000 ( 26%) ] Loss: 0.0608 top1= 98.1250
[E24B50 |  19584/60000 ( 33%) ] Loss: 0.1105 top1= 98.4375

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.3716 top1= 55.4788

Train epoch 25
[E25B0  |    384/60000 (  1%) ] Loss: 0.1473 top1= 97.5000

=== Log global consensus distance @ E25B0 ===
consensus_distance=1.196


[E25B10 |   4224/60000 (  7%) ] Loss: 0.1218 top1= 95.6250
[E25B20 |   8064/60000 ( 13%) ] Loss: 0.1000 top1= 98.4375
[E25B30 |  11904/60000 ( 20%) ] Loss: 0.0966 top1= 98.4375
[E25B40 |  15744/60000 ( 26%) ] Loss: 0.0552 top1= 98.7500
[E25B50 |  19584/60000 ( 33%) ] Loss: 0.0990 top1= 98.4375

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.3705 top1= 55.6490

Train epoch 26
[E26B0  |    384/60000 (  1%) ] Loss: 0.1485 top1= 96.8750

=== Log global consensus distance @ E26B0 ===
consensus_distance=1.218


[E26B10 |   4224/60000 (  7%) ] Loss: 0.0556 top1= 98.4375
[E26B20 |   8064/60000 ( 13%) ] Loss: 0.0907 top1= 99.0625
[E26B30 |  11904/60000 ( 20%) ] Loss: 0.0691 top1= 99.0625
[E26B40 |  15744/60000 ( 26%) ] Loss: 0.0720 top1= 97.1875
[E26B50 |  19584/60000 ( 33%) ] Loss: 0.0863 top1= 99.3750

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.3456 top1= 57.7224

Train epoch 27
[E27B0  |    384/60000 (  1%) ] Loss: 0.1506 top1= 96.5625

=== Log global consensus distance @ E27B0 ===
consensus_distance=1.166


[E27B10 |   4224/60000 (  7%) ] Loss: 0.0699 top1= 97.5000
[E27B20 |   8064/60000 ( 13%) ] Loss: 0.1207 top1= 98.1250
[E27B30 |  11904/60000 ( 20%) ] Loss: 0.0528 top1= 98.4375
[E27B40 |  15744/60000 ( 26%) ] Loss: 0.0572 top1= 98.1250
[E27B50 |  19584/60000 ( 33%) ] Loss: 0.1550 top1= 98.7500

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.3174 top1= 59.5553

Train epoch 28
[E28B0  |    384/60000 (  1%) ] Loss: 0.1516 top1= 97.1875

=== Log global consensus distance @ E28B0 ===
consensus_distance=1.184


[E28B10 |   4224/60000 (  7%) ] Loss: 0.0779 top1= 97.5000
[E28B20 |   8064/60000 ( 13%) ] Loss: 0.0768 top1= 98.1250
[E28B30 |  11904/60000 ( 20%) ] Loss: 0.0751 top1= 98.4375
[E28B40 |  15744/60000 ( 26%) ] Loss: 0.0486 top1= 98.7500
[E28B50 |  19584/60000 ( 33%) ] Loss: 0.1241 top1= 99.3750

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.3095 top1= 59.7556

Train epoch 29
[E29B0  |    384/60000 (  1%) ] Loss: 0.1908 top1= 95.3125

=== Log global consensus distance @ E29B0 ===
consensus_distance=1.207


[E29B10 |   4224/60000 (  7%) ] Loss: 0.0785 top1= 97.8125
[E29B20 |   8064/60000 ( 13%) ] Loss: 0.0963 top1= 98.7500
[E29B30 |  11904/60000 ( 20%) ] Loss: 0.0715 top1= 98.1250
[E29B40 |  15744/60000 ( 26%) ] Loss: 0.0788 top1= 98.1250
[E29B50 |  19584/60000 ( 33%) ] Loss: 0.0809 top1= 98.7500

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.3374 top1= 58.1230

Train epoch 30
[E30B0  |    384/60000 (  1%) ] Loss: 0.1555 top1= 96.5625

=== Log global consensus distance @ E30B0 ===
consensus_distance=1.219


[E30B10 |   4224/60000 (  7%) ] Loss: 0.1124 top1= 95.9375
[E30B20 |   8064/60000 ( 13%) ] Loss: 0.0887 top1= 98.4375
[E30B30 |  11904/60000 ( 20%) ] Loss: 0.0527 top1= 99.3750
[E30B40 |  15744/60000 ( 26%) ] Loss: 0.0803 top1= 98.1250
[E30B50 |  19584/60000 ( 33%) ] Loss: 0.0902 top1= 99.0625

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.2663 top1= 59.6054

