# Energy-efficient Reinforcement Learning by Discovering Neural Pathways


---


## Install dependencies
* Create the environment: `conda env create -f environment.yaml`
* Activate the environment: `conda activate DAPD`

---
# Online Single task
* `Directory: Online\`

* **DAPD**
  * `python train_pruned.py seed=0 env="HalfCheetah-v2" keep_ratio=0.05 iterative_pruning=True continual_pruning=False ips_threshold=8000 mask_update_mavg=1`
  * **NOTE**: To reproduce the results, use the hyper-parameters reported in the paper.
````
Important Hyper-parameters for DAPD:
keep_ratio=0.05           # fraction (between 0 and 1) of parameters to keep for training, e.g. keep_ratio=0.05 prunes 95% of the network and trains the remaining 5%
mask_update_mavg=1        # length of the moving average, K
iterative_pruning=True    # allows periodic mask updates
continual_pruning=False   # if True: keep pruning throughout training; if False: stop pruning once the threshold episodic return is reached
ips_threshold=8000        # iterative pruning stopping (ips): stop once this threshold (TH) reward is reached
````
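
The interaction between these hyper-parameters can be illustrated with a minimal sketch. It is not the code in `train_pruned.py`: the `MaskScheduler` class, the `top_k_mask` helper, and the use of generic per-parameter importance scores are assumptions made for illustration, and the choice to keep the mask frozen once the threshold has been reached is also an assumption.

```python
from collections import deque


def top_k_mask(scores, keep_ratio):
    """Return a boolean mask keeping the top `keep_ratio` fraction of scores."""
    n_keep = max(1, round(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:n_keep])
    return [i in keep for i in range(len(scores))]


class MaskScheduler:
    """Hypothetical helper mirroring the flags listed above."""

    def __init__(self, keep_ratio=0.05, mask_update_mavg=1,
                 iterative_pruning=True, continual_pruning=False,
                 ips_threshold=8000):
        self.keep_ratio = keep_ratio
        self.iterative_pruning = iterative_pruning
        self.continual_pruning = continual_pruning
        self.ips_threshold = ips_threshold
        # Holds the last K score vectors for the moving average.
        self.history = deque(maxlen=mask_update_mavg)
        self.stopped = False
        self.mask = None

    def update(self, scores, episodic_return):
        """Return the current mask, refreshing it when the flags allow."""
        if self.mask is not None:
            if not self.iterative_pruning:
                return self.mask  # one-shot pruning: never refresh
            if (not self.continual_pruning
                    and episodic_return >= self.ips_threshold):
                self.stopped = True  # assumed to be permanent
            if self.stopped:
                return self.mask
        self.history.append(list(scores))
        # Element-wise mean over the stored score vectors.
        avg = [sum(col) / len(self.history) for col in zip(*self.history)]
        self.mask = top_k_mask(avg, self.keep_ratio)
        return self.mask
```

With `keep_ratio=0.05` and 100 parameters, the mask keeps 5 entries. With `continual_pruning=False`, the mask stops changing once an episodic return of at least `ips_threshold` has been observed.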

### Baselines:
We use the default hyper-parameters from https://arxiv.org/abs/2205.15043.
* **RigL**
  * `python train_pruned_rlx2.py seed=0 env="HalfCheetah-v2" keep_ratio=0.05 agent.pruning_algo=rigl`
* **RLx2**
  * `python train_pruned_rlx2.py seed=0 env="HalfCheetah-v2" keep_ratio=0.05 use_dynamic_buffer=True agent.pruning_algo=rlx2`
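
To compare the methods over several seeds, the commands in this section can be generated programmatically. The sketch below only builds the command lines, using the scripts and overrides shown in this README. The `build_commands` helper, the `METHODS` table, and the choice of seeds are assumptions for illustration.

```python
import shlex

# Script and extra overrides per method, copied from the commands in this README.
METHODS = {
    "dapd": ("train_pruned.py", [
        "iterative_pruning=True", "continual_pruning=False",
        "ips_threshold=8000", "mask_update_mavg=1",
    ]),
    "rigl": ("train_pruned_rlx2.py", ["agent.pruning_algo=rigl"]),
    "rlx2": ("train_pruned_rlx2.py", [
        "use_dynamic_buffer=True", "agent.pruning_algo=rlx2",
    ]),
}


def build_commands(method, seeds, env="HalfCheetah-v2", keep_ratio=0.05):
    """Return one argv list per seed for the chosen method."""
    script, extra = METHODS[method]
    return [
        ["python", script, f"seed={seed}", f"env={env}",
         f"keep_ratio={keep_ratio}", *extra]
        for seed in seeds
    ]


if __name__ == "__main__":
    # Print the commands; run them from the `Online` directory.
    for argv in build_commands("rigl", seeds=range(3)):
        print(shlex.join(argv))
```

Each returned list can be passed to `subprocess.run` directly, or printed with `shlex.join` and pasted into a shell.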


