
=== Start adding workers ===
=> Add worker SGDMWorker(index=0, momentum=0.9)
=> Add worker SGDMWorker(index=1, momentum=0.9)
=> Add worker SGDMWorker(index=2, momentum=0.9)
=> Add worker SGDMWorker(index=3, momentum=0.9)
=> Add worker SGDMWorker(index=4, momentum=0.9)
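Each `SGDMWorker` presumably runs plain SGD with heavy-ball momentum on its local shard between communication rounds. A minimal sketch of that local update; the class name and `momentum=0.9` come from the log, while the learning rate and the exact update rule (`v <- m*v + g; p <- p - lr*v`, as in PyTorch's SGD) are assumptions:

```python
class SGDMWorker:
    """Sketch of a worker doing local SGD with momentum (hypothetical lr)."""

    def __init__(self, index, momentum=0.9, lr=0.1):
        self.index = index
        self.momentum = momentum
        self.lr = lr
        self.buf = {}  # per-parameter momentum buffers

    def local_step(self, params, grads):
        """One local step: v <- momentum*v + g; p <- p - lr*v."""
        for name, g in grads.items():
            v = self.momentum * self.buf.get(name, 0.0) + g
            self.buf[name] = v
            params[name] = params[name] - self.lr * v
        return params
```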

=== Start adding graph ===
Ring(n=5)

Train epoch 1
[E 1B0  |    160/60000 (  0%) ] Loss: 2.3148 top1=  7.5000

=== Peeking data label distribution E1B0 ===
Worker 0 has targets: tensor([9, 4, 7, 0, 3], device='cuda:0')
Worker 1 has targets: tensor([3, 9, 4, 6, 1], device='cuda:0')
Worker 2 has targets: tensor([5, 6, 8, 8, 3], device='cuda:0')
Worker 3 has targets: tensor([4, 8, 9, 9, 2], device='cuda:0')
Worker 4 has targets: tensor([7, 8, 4, 9, 5], device='cuda:0')
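The peek above is a sanity check that the data shards are IID: every worker's first batch contains a mix of digit classes rather than a single class. A dependency-free sketch of such a peek, assuming each worker holds an iterable of `(data, targets)` batches:

```python
def peek_targets(loaders, k=5):
    """Show the first k labels each worker sees; returns them for checking."""
    out = []
    for i, loader in enumerate(loaders):
        _, targets = next(iter(loader))  # first batch of worker i
        head = list(targets)[:k]
        out.append(head)
        print(f"Worker {i} has targets: {head}")
    return out
```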
=== Log mixing matrix @ E1B0 ===
[[0.333 0.333 0.    0.    0.333]
 [0.333 0.333 0.333 0.    0.   ]
 [0.    0.333 0.333 0.333 0.   ]
 [0.    0.    0.333 0.333 0.333]
 [0.333 0.    0.    0.333 0.333]]
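The printed matrix is the standard uniform mixing matrix for a ring: each worker gives weight 1/3 to itself and to each of its two neighbours. A sketch of how it can be constructed and sanity-checked (the function name is illustrative):

```python
import numpy as np

def ring_mixing_matrix(n):
    """Uniform mixing matrix for Ring(n): weight 1/3 on self and on
    each of the two ring neighbours, zero elsewhere."""
    W = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i, j % n] = 1.0 / 3.0
    return W

W = ring_mixing_matrix(5)
# Doubly stochastic (rows and columns sum to 1), so repeated gossip
# steps preserve and converge to the global average.
assert np.allclose(W.sum(axis=0), 1.0) and np.allclose(W.sum(axis=1), 1.0)
```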
[E 1B10 |   1760/60000 (  3%) ] Loss: 2.3023 top1= 14.3750
[E 1B20 |   3360/60000 (  6%) ] Loss: 2.3035 top1=  9.3750

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=2.3019 top1= 11.3482
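The "Averaged model" line evaluates a single model whose parameters are the mean of all five workers' parameters. A minimal sketch, assuming each worker exposes a PyTorch-style `state_dict`; plain numbers stand in for tensors to keep it dependency-free:

```python
def averaged_model(state_dicts):
    """Parameter-wise mean across workers. In the real run the values
    would be torch tensors; the arithmetic is identical."""
    n = len(state_dicts)
    return {
        name: sum(sd[name] for sd in state_dicts) / n
        for name in state_dicts[0]
    }
```

The averaged model is only an evaluation device here: each worker keeps its own parameters and continues training from them.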

Train epoch 2
[E 2B0  |    160/60000 (  0%) ] Loss: 2.3019 top1= 10.0000
[E 2B10 |   1760/60000 (  3%) ] Loss: 2.2993 top1= 11.2500
[E 2B20 |   3360/60000 (  6%) ] Loss: 2.3042 top1= 10.6250

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=2.3014 top1= 11.3482

Train epoch 3
[E 3B0  |    160/60000 (  0%) ] Loss: 2.3018 top1= 10.0000
[E 3B10 |   1760/60000 (  3%) ] Loss: 2.2971 top1= 11.2500
[E 3B20 |   3360/60000 (  6%) ] Loss: 2.3050 top1= 10.6250

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=2.3012 top1= 11.3482

Train epoch 4
[E 4B0  |    160/60000 (  0%) ] Loss: 2.3019 top1= 10.0000
[E 4B10 |   1760/60000 (  3%) ] Loss: 2.2956 top1= 11.2500
[E 4B20 |   3360/60000 (  6%) ] Loss: 2.3056 top1= 10.6250

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=2.3011 top1= 11.3482

Train epoch 5
[E 5B0  |    160/60000 (  0%) ] Loss: 2.3020 top1= 10.0000
[E 5B10 |   1760/60000 (  3%) ] Loss: 2.2946 top1= 11.2500
[E 5B20 |   3360/60000 (  6%) ] Loss: 2.3061 top1= 10.6250

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=2.3010 top1= 11.3482

Train epoch 6
[E 6B0  |    160/60000 (  0%) ] Loss: 2.3020 top1= 10.0000
[E 6B10 |   1760/60000 (  3%) ] Loss: 2.2938 top1= 11.2500
[E 6B20 |   3360/60000 (  6%) ] Loss: 2.3063 top1= 10.6250

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=2.3008 top1= 11.3482

Train epoch 7
[E 7B0  |    160/60000 (  0%) ] Loss: 2.3018 top1= 10.0000
[E 7B10 |   1760/60000 (  3%) ] Loss: 2.2930 top1= 11.2500
[E 7B20 |   3360/60000 (  6%) ] Loss: 2.3061 top1= 10.6250

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=2.3002 top1= 11.3482

Train epoch 8
[E 8B0  |    160/60000 (  0%) ] Loss: 2.3010 top1= 10.0000
[E 8B10 |   1760/60000 (  3%) ] Loss: 2.2917 top1= 11.2500
[E 8B20 |   3360/60000 (  6%) ] Loss: 2.3048 top1= 10.6250

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=2.2982 top1= 11.3482

Train epoch 9
[E 9B0  |    160/60000 (  0%) ] Loss: 2.2984 top1= 10.0000
[E 9B10 |   1760/60000 (  3%) ] Loss: 2.2875 top1= 11.2500
[E 9B20 |   3360/60000 (  6%) ] Loss: 2.2969 top1= 10.6250

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=2.2821 top1= 12.3998

Train epoch 10
[E10B0  |    160/60000 (  0%) ] Loss: 2.2778 top1= 13.7500
[E10B10 |   1760/60000 (  3%) ] Loss: 2.2313 top1= 25.0000
[E10B20 |   3360/60000 (  6%) ] Loss: 2.0411 top1= 32.5000

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=1.1729 top1= 68.4295

Train epoch 11
[E11B0  |    160/60000 (  0%) ] Loss: 1.2339 top1= 58.7500
[E11B10 |   1760/60000 (  3%) ] Loss: 0.7051 top1= 73.7500
[E11B20 |   3360/60000 (  6%) ] Loss: 0.6673 top1= 80.0000

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=0.4481 top1= 87.1595

Train epoch 12
[E12B0  |    160/60000 (  0%) ] Loss: 0.7441 top1= 78.1250
[E12B10 |   1760/60000 (  3%) ] Loss: 0.4729 top1= 82.5000
[E12B20 |   3360/60000 (  6%) ] Loss: 0.3877 top1= 87.5000

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=0.3629 top1= 89.0425

Train epoch 13
[E13B0  |    160/60000 (  0%) ] Loss: 0.5294 top1= 83.7500
[E13B10 |   1760/60000 (  3%) ] Loss: 0.3130 top1= 93.1250
[E13B20 |   3360/60000 (  6%) ] Loss: 0.3630 top1= 86.2500

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=0.3517 top1= 89.5433

Train epoch 14
[E14B0  |    160/60000 (  0%) ] Loss: 0.4560 top1= 87.5000
[E14B10 |   1760/60000 (  3%) ] Loss: 0.2766 top1= 90.6250
[E14B20 |   3360/60000 (  6%) ] Loss: 0.2267 top1= 93.1250

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=0.3168 top1= 90.7752

Train epoch 15
[E15B0  |    160/60000 (  0%) ] Loss: 0.3573 top1= 88.7500
[E15B10 |   1760/60000 (  3%) ] Loss: 0.2142 top1= 93.7500
[E15B20 |   3360/60000 (  6%) ] Loss: 0.1924 top1= 93.7500

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=0.2893 top1= 91.5465

Train epoch 16
[E16B0  |    160/60000 (  0%) ] Loss: 0.2524 top1= 91.8750
[E16B10 |   1760/60000 (  3%) ] Loss: 0.2125 top1= 93.7500
[E16B20 |   3360/60000 (  6%) ] Loss: 0.2028 top1= 91.8750

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=0.3074 top1= 90.4948

Train epoch 17
[E17B0  |    160/60000 (  0%) ] Loss: 0.2443 top1= 92.5000
[E17B10 |   1760/60000 (  3%) ] Loss: 0.1293 top1= 93.1250
[E17B20 |   3360/60000 (  6%) ] Loss: 0.0712 top1= 96.8750

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=0.2787 top1= 91.9571

Train epoch 18
[E18B0  |    160/60000 (  0%) ] Loss: 0.1970 top1= 93.7500
[E18B10 |   1760/60000 (  3%) ] Loss: 0.0788 top1= 96.2500
[E18B20 |   3360/60000 (  6%) ] Loss: 0.0639 top1= 97.5000

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=0.2752 top1= 92.3878

Train epoch 19
[E19B0  |    160/60000 (  0%) ] Loss: 0.0698 top1= 98.1250
[E19B10 |   1760/60000 (  3%) ] Loss: 0.0332 top1= 98.7500
[E19B20 |   3360/60000 (  6%) ] Loss: 0.0558 top1= 98.1250

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=0.3048 top1= 92.4379

Train epoch 20
[E20B0  |    160/60000 (  0%) ] Loss: 0.0434 top1= 98.7500
[E20B10 |   1760/60000 (  3%) ] Loss: 0.0482 top1= 98.1250
[E20B20 |   3360/60000 (  6%) ] Loss: 0.0660 top1= 98.1250

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=0.3199 top1= 92.0473
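The communication pattern behind the whole run can be sketched as repeated gossip steps: after local updates, each worker replaces its parameters with the mixing-matrix-weighted average of its neighbours' parameters. With the Ring(5) matrix printed in the log (doubly stochastic, with self-loops), repeated rounds provably drive all workers toward the global mean, which is exactly what the periodic averaged-model evaluation measures. Shapes and iteration count below are illustrative:

```python
import numpy as np

n = 5
W = np.zeros((n, n))  # the uniform Ring(5) mixing matrix from the log
for i in range(n):
    for j in (i - 1, i, i + 1):
        W[i, j % n] = 1 / 3

rng = np.random.default_rng(0)
params = rng.standard_normal((n, 10))  # one flattened model per worker

mixed = W @ params  # one communication round: x_i <- sum_j W[i,j] * x_j
for _ in range(200):  # repeated rounds converge to the global average
    mixed = W @ mixed
assert np.allclose(mixed, params.mean(axis=0), atol=1e-6)
```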

