
=== Start adding workers ===
=> Add worker SGDMWorker(index=0, momentum=0.9)
=> Add worker SGDMWorker(index=1, momentum=0.9)
=> Add worker SGDMWorker(index=2, momentum=0.9)
=> Add worker SGDMWorker(index=3, momentum=0.9)
=> Add worker SGDMWorker(index=4, momentum=0.9)

=== Start adding graph ===
Ring(n=5)

Train epoch 1
[E 1B0  |    160/60000 (  0%) ] Loss: 2.3148 top1=  7.5000

=== Peeking data label distribution E1B0 ===
Worker 0 has targets: tensor([9, 4, 7, 0, 3], device='cuda:0')
Worker 1 has targets: tensor([3, 9, 4, 6, 1], device='cuda:0')
Worker 2 has targets: tensor([5, 6, 8, 8, 3], device='cuda:0')
Worker 3 has targets: tensor([4, 8, 9, 9, 2], device='cuda:0')
Worker 4 has targets: tensor([7, 8, 4, 9, 5], device='cuda:0')

=== Log mixing matrix @ E1B0 ===
[[0.333 0.333 0.    0.    0.333]
 [0.333 0.333 0.333 0.    0.   ]
 [0.    0.333 0.333 0.333 0.   ]
 [0.    0.    0.333 0.333 0.333]
 [0.333 0.    0.    0.333 0.333]]

[E 1B10 |   1760/60000 (  3%) ] Loss: 1.9207 top1= 45.0000
[E 1B20 |   3360/60000 (  6%) ] Loss: 0.8124 top1= 70.0000

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=0.4654 top1= 85.8574

Train epoch 2
[E 2B0  |    160/60000 (  0%) ] Loss: 0.7616 top1= 76.8750
[E 2B10 |   1760/60000 (  3%) ] Loss: 0.5259 top1= 81.8750
[E 2B20 |   3360/60000 (  6%) ] Loss: 0.3701 top1= 87.5000

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=0.3128 top1= 90.6651

Train epoch 3
[E 3B0  |    160/60000 (  0%) ] Loss: 0.3797 top1= 89.3750
[E 3B10 |   1760/60000 (  3%) ] Loss: 0.1803 top1= 95.0000
[E 3B20 |   3360/60000 (  6%) ] Loss: 0.1505 top1= 93.7500

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=0.2695 top1= 91.6867

Train epoch 4
[E 4B0  |    160/60000 (  0%) ] Loss: 0.1579 top1= 95.6250
[E 4B10 |   1760/60000 (  3%) ] Loss: 0.0746 top1= 98.1250
[E 4B20 |   3360/60000 (  6%) ] Loss: 0.0549 top1= 99.3750

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=0.2721 top1= 92.1775

Train epoch 5
[E 5B0  |    160/60000 (  0%) ] Loss: 0.0615 top1= 98.7500
[E 5B10 |   1760/60000 (  3%) ] Loss: 0.0212 top1=100.0000
[E 5B20 |   3360/60000 (  6%) ] Loss: 0.0322 top1= 99.3750

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=0.3029 top1= 91.9872

Train epoch 6
[E 6B0  |    160/60000 (  0%) ] Loss: 0.0336 top1= 99.3750
[E 6B10 |   1760/60000 (  3%) ] Loss: 0.0526 top1= 98.7500
[E 6B20 |   3360/60000 (  6%) ] Loss: 0.0171 top1=100.0000

=> Averaged model (Global Average Validation Accuracy) | Eval Loss=0.3113 top1= 92.2776

Train epoch 7
[E 7B0  |    160/60000 (  0%) ] Loss: 0.0192 top1= 99.3750
[E 7B10 |   1760/60000 (  3%) ] Loss: 0.0090 top1=100.0000
[E 7B20 |   3360/60000 (  6%) ] Loss: 0.0158 top1= 99.3750
