A Fast Selected Inversion Algorithm for Green's Function Calculation in Many-Body Quantum Monte Carlo Simulations
Abstract: The Hubbard Hamiltonian provides a theoretical framework for describing electron interactions in quantum many-body systems in condensed matter physics. Determinant Quantum Monte Carlo (DQMC) simulations of the Hubbard Hamiltonian have contributed greatly to understanding important properties of materials. Physical measurements, such as superconductivity and magnetic susceptibility, are based on selected entries of a large set of Green's functions. Computing these Green's functions is equivalent to computing selected blocks of the inverses of large block p-cyclic matrices. The state-of-the-art algorithm for computing Green's functions performs at around 100 Gflops on a 12-core Intel "Ivy Bridge" processor. In this paper, we describe a fast selected inversion (FSI) algorithm for computing selected entries of Green's functions and present a parallel implementation using hybrid MPI/OpenMP programming. The FSI algorithm rests on three ideas: (1) applying a block cyclic reduction as a structure-preserving reduction, (2) computing the inverse of the reduced block p-cyclic matrix by a structured orthogonal factorization, and (3) using the block entries of the inverse of the reduced block p-cyclic matrix as seeds to rapidly form the selected inversion in parallel. Performance results of the new FSI algorithm on Edison, the Cray XC30 supercomputer at the National Energy Research Scientific Computing Center (NERSC), show an 80% improvement, to 180 Gflops, on the 12-core Intel "Ivy Bridge" processor. Parallel applications of the FSI algorithm for computing selected entries of multiple Green's functions reach 20 to 30 Tflops on 100 compute nodes with 2400 cores. Preliminary results show that the FSI algorithm speeds up a full DQMC simulation of the Hubbard Hamiltonian by a factor of five, reducing the run time from three and a half hours to forty minutes on the 12-core processor.
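The first and third ideas can be illustrated on a small dense example. The following is a minimal NumPy sketch, not the paper's implementation. It assumes the normalized block p-cyclic form commonly used for DQMC Hubbard matrices (identity diagonal blocks, B_1 in the upper-right corner, -B_i on the block subdiagonal), which the abstract does not spell out. The function names (`build_pcyclic`, `cluster`, `max_seed_error`) and the cluster size `c` are hypothetical, and a plain dense inverse stands in for the structured orthogonal factorization. The sketch checks numerically that multiplying every `c` consecutive blocks yields a smaller matrix of the same structure whose inverse blocks coincide with selected blocks of the full inverse, which is what makes them usable as seeds.

```python
import numpy as np


def build_pcyclic(blocks):
    """Assemble the dense matrix M for blocks B_1..B_L (assumed normalized form).

    M has identity diagonal blocks, +B_1 in the top-right block position,
    and -B_i on the block subdiagonal for i = 2..L.
    """
    L = len(blocks)
    n = blocks[0].shape[0]
    M = np.eye(L * n)
    # "+=" keeps the degenerate case L == 1 correct (M = I + B_1).
    M[0:n, (L - 1) * n:L * n] += blocks[0]
    for i in range(1, L):
        M[i * n:(i + 1) * n, (i - 1) * n:i * n] -= blocks[i]
    return M


def cluster(blocks, c):
    """Multiply each run of c consecutive blocks: B_{jc} ... B_{(j-1)c+1}."""
    n = blocks[0].shape[0]
    reduced = []
    for start in range(0, len(blocks), c):
        prod = np.eye(n)
        for B in blocks[start:start + c]:
            prod = B @ prod  # later blocks multiply from the left
        reduced.append(prod)
    return reduced


def get_block(A, i, j, n):
    """Return the (i, j) block of A, using zero-based block indices."""
    return A[i * n:(i + 1) * n, j * n:(j + 1) * n]


def max_seed_error(n=3, L=8, c=2, seed=0):
    """Largest mismatch between reduced-inverse blocks and full-inverse blocks."""
    rng = np.random.default_rng(seed)
    blocks = []
    for _ in range(L):
        B = rng.standard_normal((n, n))
        # Scale to spectral norm 0.5 so the test problem stays well conditioned.
        blocks.append(B / (2.0 * np.linalg.norm(B, 2)))

    # Dense inverses, used here only as a reference computation.
    G_full = np.linalg.inv(build_pcyclic(blocks))
    G_red = np.linalg.inv(build_pcyclic(cluster(blocks, c)))

    err = 0.0
    b = L // c
    for i in range(b):
        for j in range(b):
            # One-based block (i+1, j+1) of the reduced inverse should match
            # one-based block ((i+1)c, (j+1)c) of the full inverse.
            seed_blk = get_block(G_red, i, j, n)
            full_blk = get_block(G_full, (i + 1) * c - 1, (j + 1) * c - 1, n)
            err = max(err, float(np.max(np.abs(seed_blk - full_blk))))
    return err


if __name__ == "__main__":
    print("max seed mismatch:", max_seed_error())
```

In this toy setting the reduced matrix has L/c block rows instead of L, so inverting it is much cheaper than inverting the full matrix. The sketch requires `c` to divide `L`, and it does not show how the remaining selected blocks are generated from the seeds.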