# This code repo is for review purposes only. Do not distribute.

# Installation with Conda
We recommend using Conda to install the necessary dependencies. To do so, run the following commands:
```
conda env create -n ppicker -f environment.yml
conda activate ppicker
```

# Model Checkpoint
The model checkpoint can be downloaded from [Google Drive](https://drive.google.com/file/d/1QQcpvFzgMZcWF_Yuba8O3tJJ4pmpu6EB/view?usp=share_link).
After downloading, place the checkpoint in the current directory.

# Downloading Datasets
Run the following command to download the TomoTwin data as well as the SHREC 2021 dataset. These datasets are used for training and evaluation.
```
bash download.sh
```
Paths to the datasets are set in `data_paths.py`.
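
Before starting a long evaluation or training run, it can help to confirm that the configured dataset paths actually exist. The following is a minimal sketch, not part of the repository: the helper `missing_paths` and the names `TOMOTWIN_DIR` and `SHREC_DIR` are hypothetical, so substitute the variables actually defined in `data_paths.py`.

```python
from pathlib import Path
from typing import Dict, List


def missing_paths(paths: Dict[str, str]) -> List[str]:
    """Return the names of all entries whose path does not exist on disk."""
    return [
        name
        for name, location in paths.items()
        if not Path(location).expanduser().exists()
    ]


if __name__ == "__main__":
    # Hypothetical names and locations; replace them with the values
    # that data_paths.py defines in your checkout.
    configured = {
        "TOMOTWIN_DIR": "./data/tomotwin",
        "SHREC_DIR": "./data/shrec2021",
    }
    for name in missing_paths(configured):
        print(f"Dataset path not found: {name} -> {configured[name]}")
```

If the sketch reports a missing path, re-run `bash download.sh` or correct the entry in `data_paths.py`.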

# Evaluating ProPicker
To evaluate ProPicker on the TomoTwin validation data, run:
```
python evaluate_on_tomotwin_runs.py
```
All parameters for evaluation are set in `eval_cfg.py`. If the checkpoint has been downloaded successfully, the evaluation script computes the best-case F1 score for each protein in the TomoTwin validation dataset and writes the scores to `./eval`. These results were used to produce Figure 2 in the manuscript.
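
To illustrate what a best-case F1 score can mean, the sketch below sweeps over every possible confidence threshold and keeps the highest F1. It is an illustration under assumptions, not the repository's evaluation routine: the function `best_case_f1` and its arguments are hypothetical, and it assumes each candidate pick has already been matched against the ground truth, with each ground-truth particle matched at most once.

```python
from typing import Sequence, Tuple


def best_case_f1(
    scores: Sequence[float],
    is_true_positive: Sequence[bool],
    num_ground_truth: int,
) -> Tuple[float, float]:
    """Return (best F1, threshold) over all thresholds on the pick scores.

    scores[i] is the confidence of candidate pick i, and is_true_positive[i]
    states whether that pick matches a ground-truth particle.
    """
    if len(scores) != len(is_true_positive):
        raise ValueError("scores and is_true_positive must have equal length")

    # Visit candidates from most to least confident, so that the first
    # `rank` candidates are exactly the picks kept at threshold scores[i].
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    best_f1, best_threshold = 0.0, float("inf")
    true_positives = 0
    for rank, i in enumerate(order, start=1):
        true_positives += bool(is_true_positive[i])
        # Candidates with equal scores are kept or dropped together, so
        # only evaluate F1 once the whole tie group has been counted.
        if rank < len(order) and scores[order[rank]] == scores[i]:
            continue
        false_positives = rank - true_positives
        false_negatives = num_ground_truth - true_positives
        denominator = 2 * true_positives + false_positives + false_negatives
        f1 = 2 * true_positives / denominator if denominator else 0.0
        if f1 > best_f1:
            best_f1, best_threshold = f1, scores[i]
    return best_f1, best_threshold
```

For example, `best_case_f1([0.9, 0.8, 0.7], [True, True, False], 2)` keeps the two correct picks at threshold `0.8` and rejects the false positive, giving an F1 of `1.0`.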

# Training ProPicker

To train ProPicker, run:
```
python train.py
```
All parameters for training are set in `train_cfg.py`.


# Note
This repository contains code copied from the following projects:
- [DeePiCt](https://github.com/ZauggGroup/DeePiCt): In `clustering_and_picking/clustering_and_cleaning.py`
- [TomoTwin](https://github.com/MPI-Dortmund/tomotwin-cryoet): In `clustering_and_picking/tomotwin_evaluation_routine.py` and `my_tomotwin`. The latter (`my_tomotwin`) is a modified version of the original TomoTwin codebase. The only substantial modification in `my_tomotwin` is to the `(my_)tomotwin.modules.networks.SiameseNet3D.SiameseNet3D` model class, which was adapted so that it can be turned into a 3D U-Net by adding our own decoder (see `train_cfg.py`). The rest of the code is mostly unchanged.