# Distributed Unlearning with Lossy Compression



Code repository for 'Distributed Unlearning with Lossy Compression'.

This repository is a modified clone of the code for 'Fast-FedUL: A Training-Free Federated Unlearning with Provable Skew Resilience' [![arXiv](https://img.shields.io/badge/arXiv-2405.18040-b31b1b.svg)](https://arxiv.org/abs/2405.18040)

Original Fast-FedUL repository: https://github.com/thanhtrunghuynh93/fastFedUL

## Introduction
In this work, we study lossy compression schemes for facilitating distributed server-side unlearning
with a limited memory footprint. We identify suitable lossy compression mechanisms based on random
lattice coding and sparsification. For a family of stochastic compression schemes encompassing
probabilistic and subtractive dithered quantization, we rigorously show how one can guarantee a finite
bound on the difference between a desired model that is trained from scratch and a model unlearned from
lossy compressed stored updates. Our numerical study shows that suitable lossy compression can enable distributed
unlearning with a notably reduced memory footprint while having the unlearned model achieve performance similar
to that of one trained from scratch.
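
To make the two quantization families named above concrete, here is a minimal pure-Python sketch of scalar (one-dimensional lattice) subtractive dithered quantization and probabilistic quantization. It is an illustration under stated assumptions, not the code used in this repository: the function names, the `step` parameter, and the seed-based dither sharing are all hypothetical, and the sketch ignores the finite codebook size.

```python
import math
import random


def dither_sequence(seed, n, step):
    """Pseudo-random dither, uniform over one quantization cell.

    Assumption: encoder and decoder share `seed`, so the dither itself
    never has to be stored alongside the compressed update.
    """
    rng = random.Random(seed)
    return [rng.uniform(-step / 2, step / 2) for _ in range(n)]


def sdq_encode(x, step, seed):
    """Add dither, then round to the nearest multiple of `step`.

    Returns integer indices, which are what would be stored.
    """
    dither = dither_sequence(seed, len(x), step)
    return [round((xi + di) / step) for xi, di in zip(x, dither)]


def sdq_decode(indices, step, seed):
    """Map indices back to lattice points and subtract the same dither."""
    dither = dither_sequence(seed, len(indices), step)
    return [q * step - di for q, di in zip(indices, dither)]


def probabilistic_quantize(x, step, rng):
    """Round each entry up or down at random so the result is unbiased."""
    out = []
    for xi in x:
        low = math.floor(xi / step)
        p_up = xi / step - low  # probability of rounding up
        out.append((low + (rng.random() < p_up)) * step)
    return out
```

For example, `sdq_decode(sdq_encode(update, 0.5, seed=7), 0.5, seed=7)` reconstructs each entry of `update` to within half a quantization step, while `probabilistic_quantize` stays within one full step of each entry.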

## Usage
This code has been tested on Python 3.12.4, PyTorch 2.4.1 and CUDA 11.8.

### Prerequisites
1. PyTorch 2.4.1 + torchvision
2. numpy
3. PyYAML
4. ujson


## Data loading and preparation

1. Extract the **ARDIS** dataset, which is used to backdoor the MNIST model

```bash
# "x" extracts the archive with full paths into the destination directory
unrar x project_path/customdata/ARDIS_DATASET_IV/ARDIS_DATASET_IV.rar project_path/customdata/ARDIS_DATASET_IV/
```
2. Generate a new federated task

```bash
python generate_fedtask.py --dataset mnist --dist 0 --skew 0 --num_clients 25
```

This command creates a federated task on the MNIST dataset with 25 clients in the i.i.d. case.

3. Train the FL model and run the unlearning algorithm with all 6 compression settings presented in the paper.

```bash
bash run_exp.sh
```

Inside this file you can control the following:
- Number of codewords (controls the quantization rate)
- Value of K (for the top-K and rand-K compression schemes)
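
As a rough guide to what these two settings mean, the sketch below shows one common way to define top-K and rand-K sparsification, and how the number of codewords translates into bits per stored entry. It is a pure-Python illustration under assumptions, not the implementation in this repository: the function names are hypothetical, and the `d / K` rescaling in `rand_k` is a frequently used convention for unbiasedness that may differ from what `run_exp.sh` actually invokes.

```python
import math
import random


def top_k(x, k):
    """Keep the k entries with the largest magnitude; zero out the rest."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [xi if i in keep else 0.0 for i, xi in enumerate(x)]


def rand_k(x, k, rng, rescale=True):
    """Keep k entries chosen uniformly at random; zero out the rest.

    Assumption: with rescale=True the kept entries are multiplied by
    len(x) / k so that the sparsified vector is unbiased in expectation.
    """
    keep = set(rng.sample(range(len(x)), k))
    scale = len(x) / k if rescale else 1.0
    return [xi * scale if i in keep else 0.0 for i, xi in enumerate(x)]


def bits_per_entry(num_codewords):
    """Quantization rate in bits for a codebook with this many codewords."""
    return math.log2(num_codewords)
```

Under these definitions, a smaller K or fewer codewords reduces the memory needed to store each client update, at the cost of a larger compression error.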
