Lifelong Machine Learning Potentials
====================================

Introduction
------------

A lifelong machine learning potential (lMLP) represents the potential energy surface of arbitrary
systems, provides uncertainty quantification, and can be fine-tuned and extended in a rolling
fashion. It thereby unites accuracy, efficiency, and flexibility.

This software enables lMLP training and prediction.

Installation
------------

The module ``lmlp`` can be installed with pip (pip3) once the repository has been cloned.

.. code-block:: bash

   git clone <lmlp-repository>
   pip install ./lmlp

Users without superuser rights can install the package in a virtual environment or with the
``--user`` flag. If pip fails because the device holding the temporary directory has no space
left, prefix the pip command with ``TMPDIR=<PATH>``, where ``<PATH>`` is a directory with more
space for temporary files.

For higher performance, Intel SVML and TBB can be used.

.. code-block:: bash

   pip install icc-rt tbb

If they are installed via pip, make sure that the respective library path of pip is added to the
environment variable ``LD_LIBRARY_PATH``. The pip installation location is shown by ``pip -V``.
Only the leading part ``/.../lib/`` of this path needs to be added to ``LD_LIBRARY_PATH``. We
recommend adding the export statement to your ``~/.bashrc`` file as well.

.. code-block:: bash

   export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/.../lib/   # replace ... by actual library path
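
The leading ``/.../lib/`` part can also be extracted programmatically. The following is a minimal
sketch, assuming the usual ``pip -V`` output format ``pip <version> from <path> (python <X.Y>)``.
The helper name ``lib_dir_from_pip_version`` and the example path are hypothetical and not part
of lmlp.

.. code-block:: python

   import re


   def lib_dir_from_pip_version(pip_v_output):
       """Return the leading '/.../lib/' part of the path printed by ``pip -V``."""
       # Assumed format: "pip <version> from <path> (python <X.Y>)"
       match = re.search(r" from (.+?) \(python", pip_v_output)
       if match is None:
           raise ValueError("unexpected `pip -V` output")
       # Keep everything up to and including the first '/lib/' component
       head, sep, _ = match.group(1).partition("/lib/")
       if not sep:
           raise ValueError("no '/lib/' component in the pip path")
       return head + sep


   # Hypothetical example output of `pip -V` inside a virtual environment
   example = "pip 23.0.1 from /home/user/venv/lib/python3.10/site-packages/pip (python 3.10)"
   lib_dir = lib_dir_from_pip_version(example)
   print(f"export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:{lib_dir}")

In practice, the string passed to the helper would be the actual output of ``pip -V`` on your
system.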

Usage
-----

**Training**

1. Prepare the episodic memory file, descriptor file, and supplemental potential file (see the
   examples and tools in GitLab).

2. Adjust the settings in ``input_lmlp.py`` (see the examples).

3. Run ``input_lmlp.py``.

.. code-block:: bash

   python3 -u input_lmlp.py > output.dat

In the generalization setting file, several lMLPs can be combined for an ensemble prediction by
adding the names of the respective generalization files to the ensemble list and their test RMSEs
to the ``test_rmse`` list. All other properties in the generalization setting file have to match
for all lMLPs in the ensemble.

**Prediction**

.. code-block:: python

   import numpy as np
   import lmlp

   # Required input
   generalization_setting_file = '...'
   elements = np.array([...], dtype=str)   # shape: (n_atoms)
   positions = np.array([...])   # shape: (n_atoms, 3), unit: Angstrom

   # Optional input
   lattice = np.array([...])   # shape: (3, 3), unit: Angstrom
   atomic_classes = np.array([...], dtype=int)   # shape: (n_atoms), values: 1 -> QM atom, 2 -> MM atom
   atomic_charges = np.array([...])   # shape: (n_atoms), unit: elementary charge

   # Initialize lMLP calculator
   lMLP = lmlp.lMLP_calculator(
       generalization_setting_file, uncertainty_scaling=2.0, active_learning_file=None,
       active_learning_thresholds=(3.0, 3.0, 3.0))

   # Simple energy and forces prediction requires elements and positions
   energy, forces = lMLP.predict(elements, positions)   # energy unit: eV, forces unit: eV/Ang

   # In addition lattice can be provided for periodic systems,
   # atomic_classes and atomic_charges are required for QM/MM predictions,
   # name will be assigned to the structure in the active learning output file,
   # calc_forces enables forces calculation (only available properties are returned),
   # calc_uncertainty enables uncertainty quantification (only available properties are returned)
   energy, forces, energy_uncertainty, forces_uncertainty = lMLP.predict(
       elements, positions, lattice=lattice, atomic_classes=atomic_classes,
       atomic_charges=atomic_charges, name=None, calc_forces=True, calc_uncertainty=True)
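
Before calling ``predict``, it can help to verify that the input arrays have the shapes listed in
the comments above. The following is a minimal sketch under stated assumptions: the helper
``check_prediction_inputs`` is hypothetical and not part of lmlp, and the water geometry is only
an approximate, illustrative example.

.. code-block:: python

   import numpy as np


   def check_prediction_inputs(elements, positions, lattice=None,
                               atomic_classes=None, atomic_charges=None):
       """Hypothetical helper (not part of lmlp): check the documented array shapes."""
       n_atoms = len(elements)
       if positions.shape != (n_atoms, 3):
           raise ValueError(f"positions must have shape ({n_atoms}, 3)")
       if lattice is not None and lattice.shape != (3, 3):
           raise ValueError("lattice must have shape (3, 3)")
       for label, array in (("atomic_classes", atomic_classes),
                            ("atomic_charges", atomic_charges)):
           if array is not None and array.shape != (n_atoms,):
               raise ValueError(f"{label} must have shape ({n_atoms},)")
       if atomic_classes is not None and not np.isin(atomic_classes, [1, 2]).all():
           raise ValueError("atomic_classes may only contain 1 (QM atom) or 2 (MM atom)")
       return n_atoms


   # Illustrative, approximate geometry of a single water molecule in Angstrom
   elements = np.array(['O', 'H', 'H'], dtype=str)
   positions = np.array([[0.000, 0.000, 0.117],
                         [0.000, 0.757, -0.469],
                         [0.000, -0.757, -0.469]])
   n_atoms = check_prediction_inputs(elements, positions)

The validated ``elements`` and ``positions`` arrays can then be passed to ``lMLP.predict`` as
shown above.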

The number of threads used by Numba during prediction can be specified by the environment variable
``NUMBA_NUM_THREADS``. The default is no parallelization.

.. code-block:: bash

   NUMBA_NUM_THREADS=4 python3 prediction_lmlp.py
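
The variable can also be set from within the prediction script. The following is a minimal
sketch, relying on the fact that Numba reads its environment variables when it is first imported,
so the assignment has to precede the import of lmlp. The value ``4`` is only illustrative.

.. code-block:: python

   import os

   # Must be set before lmlp (and thereby Numba) is imported for the first time
   os.environ["NUMBA_NUM_THREADS"] = "4"   # illustrative value

   # import lmlp   # import only after the variable has been set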

Numba and PyTorch perform just-in-time compilation the first time the respective code is executed.
The compiled Numba functions are cached for future use. Just-in-time compilation can be turned off
by setting ``NUMBA_JIT=0`` and ``PYTORCH_JIT=0``.

License and Copyright Information
---------------------------------

The module lmlp is distributed under the BSD 3-Clause "New" or "Revised" License.
For more license and copyright information, see the file ``LICENSE.txt``.

How to Cite
-----------

When publishing results obtained with lmlp, please cite the respective release
as archived on Zenodo (DOI: 10.5281/zenodo.7912832, 10.5281/zenodo.8192949).

In addition, we kindly request you to cite
M. Eckhoff, M. Reiher, `Lifelong Machine Learning Potentials
<https://doi.org/10.1021/acs.jctc.3c00279>`_, J. Chem. Theory Comput. 19, 3509-3525 (2023)
when working with lMLPs and
M. Eckhoff, M. Reiher, `CoRe optimizer: an all-in-one solution for machine learning
<https://doi.org/10.1088/2632-2153/ad1f76>`_, Mach. Learn.: Sci. Technol. 5, 015018 (2024)
when working with the CoRe optimizer.

Support and Contact
-------------------

In case you encounter any problems or bugs, please write a message to lifelong_ml@phys.chem.ethz.ch.
