# 🧠 TopoHKS: Topological Heat Kernel Signatures

This repository contains the code to reproduce the results for **TopoHKS**.  
For the conference supplementary material, we currently provide the full setup for the **PROTEIN** dataset.  
Upon publication, we will release the complete, polished codebase supporting all datasets and configurations.

---

## 📦 Environment Setup

To set up the Conda environment:

```bash
conda env create -f env.yml
conda activate TopoHKS
```


## 🧪 Data Generation

Data preprocessing is handled in the `HKSGoesTopological-data` folder.

To generate the protein dataset Laplacians:

Adjust the file paths in `gen_laplacians_protein.py` to point to your data directories.

Run the script:

```bash
python gen_laplacians_protein.py
```
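
For orientation, the sketch below shows what a graph Laplacian and the classical heat kernel signature, HKS(x, t) = Σᵢ exp(-λᵢ t) φᵢ(x)², look like on a tiny graph. It is an illustration under stated assumptions, not the code in `gen_laplacians_protein.py`: the function names `graph_laplacian` and `heat_kernel_signature` and the 3-node path graph are hypothetical, and the Laplacians the repository actually builds may be constructed differently.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    adjacency = np.asarray(adjacency, dtype=float)
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def heat_kernel_signature(laplacian, times):
    """Return an (n_nodes, n_times) array with HKS(x, t) for each node x."""
    # eigh is appropriate because the Laplacian is symmetric.
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    times = np.asarray(times, dtype=float)
    # Rows of eigvecs index nodes, columns index eigenvectors.
    return (eigvecs ** 2) @ np.exp(-np.outer(eigvals, times))


# Hypothetical toy input: a path graph on three nodes (0 - 1 - 2).
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
L = graph_laplacian(A)
hks = heat_kernel_signature(L, times=[0.1, 1.0, 10.0])
print(hks.shape)
```

Each row of `hks` is a per-node descriptor sampled at the chosen diffusion times; the two end nodes of the path receive identical rows because the graph is symmetric.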

## 🚀 Model Training

Model training is done in the `HKSGoesTopological-model` folder.

To train the transformer model from scratch on the PROTEIN dataset:

Adjust the data path in `train_transformer_scratch_proteins.py`.

Run:

```bash
python train_transformer_scratch_proteins.py
```
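
Because the data path is edited by hand, a small pre-flight check can catch a wrong path before a long training run starts. The sketch below is a suggestion and not part of the repository: the helper `check_paths` and its arguments are hypothetical, and you would call it with whatever directories you set in the script.

```python
from pathlib import Path


def check_paths(input_dir, output_dir):
    """Fail early if the input directory is missing; create the output directory."""
    input_dir = Path(input_dir).expanduser()
    output_dir = Path(output_dir).expanduser()
    if not input_dir.is_dir():
        raise FileNotFoundError(f"Input directory not found: {input_dir}")
    # Create the output directory (and any parents) if it does not exist yet.
    output_dir.mkdir(parents=True, exist_ok=True)
    return input_dir, output_dir
```

Calling this once at the top of a script turns a mistyped path into an immediate, readable error rather than a failure partway through data loading.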

## 📌 Notes

This is a partial release for reproducibility purposes.

Full documentation, multi-dataset support, and pretrained models will follow post-publication.

