Dataset: MNIST
Device: cuda
Batch Size: 32
Optimizer Parameters: lr=1e-05, betas=(0.9, 0.98), eps=1e-06, weight_decay=0.01
Classes: ['0 - zero', '1 - one', '2 - two', '3 - three', '4 - four', '5 - five', '6 - six', '7 - seven', '8 - eight', '9 - nine']
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to data/MNIST/raw/train-images-idx3-ubyte.gz
Extracting data/MNIST/raw/train-images-idx3-ubyte.gz to data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to data/MNIST/raw/train-labels-idx1-ubyte.gz
Extracting data/MNIST/raw/train-labels-idx1-ubyte.gz to data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz
Extracting data/MNIST/raw/t10k-images-idx3-ubyte.gz to data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting data/MNIST/raw/t10k-labels-idx1-ubyte.gz to data/MNIST/raw

Epsilon:  0.5
Delta:  8.333333333333334e-06
Clip Param C:  0.1
DP-SGD with sampling rate = 0.0533% and noise_multiplier = 1.2805631372100243 iterated over 56250 steps satisfies differential privacy with eps = 0.5 and delta = 8.333333333333334e-06.
Noise Scale:  1.2805631372100243
**********
Num Epochs: 30
tensor(2.1055, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
****the 0^th epoch *****
**** on training set *****
Accuracy Rate: 0.7850833535194397
*************************
**** on testing set *****
Accuracy Rate: 0.7811501622200012
*************************
tensor(1.9141, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.7559, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.5244, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.6396, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.5977, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.4316, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.5039, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.5088, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3369, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.4395, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.4258, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3887, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.7070, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3398, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3203, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.5078, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.4092, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.4961, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3398, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
****the 10^th epoch *****
**** on training set *****
Accuracy Rate: 0.9401833415031433
*************************
**** on testing set *****
Accuracy Rate: 0.9434903860092163
*************************
tensor(1.4424, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.5918, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3027, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3184, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.4902, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.4746, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.5078, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3359, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3105, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.5938, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3477, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.4980, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.2754, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.4092, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.2666, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3789, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.5391, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.2617, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3047, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
****the 20^th epoch *****
**** on training set *****
Accuracy Rate: 0.9298833608627319
*************************
**** on testing set *****
Accuracy Rate: 0.9306110143661499
*************************
tensor(1.6523, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.4473, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3545, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.4082, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.4141, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.4512, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.4365, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3105, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3398, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.4795, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3945, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.4756, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3564, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.5332, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3340, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.3330, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
tensor(1.2646, device='cuda:0', dtype=torch.float16, grad_fn=<DivBackward0>)
Training Time:  8666.702280759811
Accuracy Rate: 0.9619608521461487
------------------------------------
