Towards A Universally Transferable Acceleration Method for Density Functional Theory
================================

# Contents

During the review process, we provide the following materials:

* A small sample of the SCFbench dataset.
* The data pipeline for the SCFbench dataset.
* The PyTorch `nn.Module` of the species-wise linear layer for the prediction of the electron density coefficients.
* The NequIP model architecture with the species-wise linear layer.
* Example code for computing the density coefficients from a density matrix.

Upon acceptance of the paper, we will release all of the above materials, as well as:

* The full SCFbench dataset.
* The training code for models.
* The full evaluation code.


# Requirements

* torch
* e3nn
* pyscf
* lmdb
* numpy>1.26


# Dataset Usage

The sample dataset contains the `main` dataset (the dataset for training, validation and in-distribution testing) and the `ood-test` dataset.

Each dataset contains several `parts`, each of which corresponds to a specific piece of information. The parts are:

* `base`: the basic information of the molecule, including atomic numbers, coordinates, etc.
* `dm`: the density matrix of the molecule.
* `fock`: the Hamiltonian (Fock) matrix of the molecule.
* `auxdensity.denfit`: the density coefficients on def2-universal-jfit.
* `auxdensity.denfit.etb2.0`: the density coefficients on the ETB basis of def2-svp with $\beta=2.0$.
* `auxdensity.denfit.etb1.5`: the density coefficients on the ETB basis of def2-svp with $\beta=1.5$.
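
For orientation, the relationship between the `dm` part and the `auxdensity.*` parts can be sketched as a standard density-fitting solve. This is a minimal illustration, not the repository's implementation: it assumes a Coulomb-metric fit, the function name `fit_density_coefficients` is hypothetical, and random arrays stand in for the real integrals (in PySCF these would typically be the `int3c2e` and `int2c2e` integrals between the orbital and auxiliary bases).

```python
import numpy as np


def fit_density_coefficients(dm, int3c, metric):
    """Fit a density matrix onto an auxiliary basis (hypothetical helper).

    dm:     (nao, nao) density matrix D
    int3c:  (nao, nao, naux) three-center integrals (mu nu | P)
    metric: (naux, naux) two-center integrals (P | Q)
    Returns the (naux,) coefficient vector c solving sum_Q (P|Q) c_Q = b_P.
    """
    # Right-hand side: b_P = sum_{mu,nu} (mu nu | P) D_{mu nu}
    rhs = np.einsum("ijp,ij->p", int3c, dm)
    # Solve the linear system instead of forming the inverse explicitly
    return np.linalg.solve(metric, rhs)


# Synthetic stand-in data (assumption: real integrals come from a QC package)
rng = np.random.default_rng(0)
nao, naux = 4, 6
a = rng.normal(size=(naux, naux))
metric = a @ a.T + naux * np.eye(naux)            # symmetric positive definite
int3c = rng.normal(size=(nao, nao, naux))
int3c = 0.5 * (int3c + int3c.transpose(1, 0, 2))  # symmetric in (mu, nu)
dm = rng.normal(size=(nao, nao))
dm = 0.5 * (dm + dm.T)                            # symmetric density matrix

coeffs = fit_density_coefficients(dm, int3c, metric)
```

The result has one coefficient per auxiliary basis function, which is why the coefficient parts are stored separately for each auxiliary basis.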

Example:

```python
from dataset import SCFBenchDataset

# Load the base info (atomic numbers, coordinates, etc.), the density matrix, the Hamiltonian (Fock) matrix, and the density coefficients on def2-universal-jfit.
parts_to_load = ['base', 'dm', 'fock', 'auxdensity.denfit']
dataset = SCFBenchDataset(data_root='dataset/main', parts_to_load=parts_to_load)
dataset[0].keys()

# Load the base info and the density coefficients on the ETB basis of def2-svp with beta = 1.5.
parts_to_load = ['base', 'auxdensity.denfit.etb1.5']
dataset = SCFBenchDataset(data_root='dataset/ood-test', parts_to_load=parts_to_load, auxbasis='etb:def2-svp:1.5')
dataset[0].keys()

# For the raw data, use the underlying dataset.
dataset.dataset[0].keys()
```
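
The role of $\beta$ in the ETB (even-tempered basis) parts can be illustrated with a small sketch. In an even-tempered basis the Gaussian exponents form a geometric progression with ratio $\beta$, so a smaller $\beta$ gives a denser and larger auxiliary basis. The helper names and the exponent range below are hypothetical; the rule that derives the actual range and per-shell counts from def2-svp is not reproduced here.

```python
import math

import numpy as np


def even_tempered_exponents(alpha_min, beta, n):
    """Return n exponents alpha_min * beta**i for i = 0..n-1 (hypothetical helper)."""
    return alpha_min * beta ** np.arange(n)


def count_to_span(alpha_min, alpha_max, beta):
    """Number of even-tempered exponents needed to reach at least alpha_max."""
    return math.ceil(math.log(alpha_max / alpha_min) / math.log(beta)) + 1


# Illustrative range only (assumption, not taken from def2-svp)
alpha_min, alpha_max = 0.1, 2.0
n_beta20 = count_to_span(alpha_min, alpha_max, 2.0)
n_beta15 = count_to_span(alpha_min, alpha_max, 1.5)

exps_beta20 = even_tempered_exponents(alpha_min, 2.0, n_beta20)
exps_beta15 = even_tempered_exponents(alpha_min, 1.5, n_beta15)
print(n_beta20, n_beta15)
```

Under this sketch, the $\beta=1.5$ basis needs more functions than the $\beta=2.0$ basis to cover the same exponent range, so the two `etb` parts generally hold coefficient vectors of different lengths.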
