Density fitting for crystalline calculations

Modules: pyscf.pbc.df

Introduction

In addition to molecular calculations, density fitting (DF) is also useful in periodic calculations, as it reduces the computational cost of manipulating the electron repulsion integrals (ERIs). Currently, PySCF provides three periodic DF methods that differ in their employed auxiliary basis functions.

  • Gaussian density fitting (GDF) uses Gaussian-type orbitals (GTOs) as the auxiliary basis functions. [64] This parallels the molecular DF class.

  • Fast Fourier transform density fitting (FFTDF) uses plane waves (PWs) as the auxiliary basis functions. FFTDF is also known as the Gaussian and plane wave (GPW) approach in the literature and other software packages. [65, 66]

  • Mixed density fitting (MDF) uses both PWs and GTOs as the auxiliary basis functions. [64]

This section of the user document covers the basic usage of these DF methods in periodic calculations. The recommended choice of DF methods for different applications is also provided. See Density fitting (DF) for the use of DF in molecular calculations.

Using DF in periodic calculations

Unlike for molecular calculations, DF is used by default in periodic SCF calculations. For example, initializing a KRHF (i.e., RHF with \(k\)-point sampling) object uses FFTDF by default:

import numpy as np
from pyscf.pbc import gto, scf
cell = gto.M(atom="He 0 0 0", a=np.eye(3)*2, basis="6-31g")
kpts = cell.make_kpts([2,2,2])
mf = scf.KRHF(cell, kpts)
print(mf.with_df)  # <pyscf.pbc.df.fft.FFTDF at 0x7feb78373c88>

The periodic SCF classes also provide methods for using GDF:

mf = scf.KRHF(cell, kpts).density_fit()

and MDF:

mf = scf.KRHF(cell, kpts).mix_density_fit()

In addition to using the APIs above, the user can also initialize a DF object first and overwrite the with_df attribute of the SCF object. For example:

from pyscf.pbc import df
mydf = df.GDF(cell, kpts).build()
mf.with_df = mydf

Post-SCF calculations will automatically use the same DF method to handle ERIs. For example, an MP2 calculation using GDF can be performed as follows:

from pyscf.pbc import scf, mp
mf = scf.KRHF(mf, kpts).density_fit()
mf.kernel()         # perform HF with GDF
mmp = mp.KMP2(mf)
mmp.kernel()        # perform MP2 with GDF

Controlling DF error

The DF error is introduced by the incompleteness of the finite auxiliary basis used to expand the atomic orbital pair densities. The DF error can often be reduced by increasing the number of auxiliary basis functions being used.

FFTDF

FFTDF uses plane waves (PWs) as the auxiliary basis, whose size is determined by FFTDF.mesh, which is set to Cell.mesh upon initialization. Cell.mesh is a 1d array-like object of three integer numbers, [nx, ny, nz], that defines the number of PWs (or the real-space grid points in the unit cell) in the \(x\), \(y\) and \(z\) directions, respectively. The total number of PWs being used for FFTDF is therefore nx * ny * nz. PySCF determines Cell.mesh from Cell.ke_cutoff - the kinetic energy cutoff. By default, Cell.ke_cutoff is determined by Cell.precision and the most compact atomic orbital in the basis set.

To use a PW basis of a different size, the user can either overwrite FFTDF.mesh directly or change it by specifying Cell.ke_cutoff. An example is provided as follows:

import numpy as np
from pyscf.pbc import gto, df

def print_mesh(mesh):
    print("mesh = [%d, %d, %d]  (%d PWs)" % (*mesh, np.prod(mesh)))

cell = gto.M(atom="He 0 0 0", a=np.eye(3)*2, basis="gth-dzvp", pseudo="gth-pade")
kpts = cell.make_kpts([2,2,2])
mydf = df.FFTDF(cell, kpts)
print_mesh(mydf.mesh)
# output: mesh = [42, 42, 42]  (74088 PWs)
mydf.mesh = [17,17,17]
print_mesh(mydf.mesh)
# output: mesh = [17, 17, 17]  (4913 PWs)
cell.ke_cutoff = 60   # unit: Hartree
cell.build()          # rebuild cell to update cell.mesh
mydf = df.FFTDF(cell, kpts)
print_mesh(mydf.mesh)
# output: mesh = [14, 14, 14]  (2744 PWs)

Note that PySCF’s default for Cell.precision is relatively conservative (\(10^{-8}\)). This often leads to a Cell.ke_cutoff that is higher than the default used by other packages using FFTDF. [66] For a more cost-effective choice, the user can lower the ke_cutoff, but should confirm convergence through testing.

GDF

GDF uses Gaussian-type orbitals (GTOs) as the auxiliary basis and parallels the df module for molecular calculations. We guide the readers to Choice of auxiliary basis for more details on how to specify the auxiliary basis sets for GDF. A PBC example can be found in examples/pbc/35-gaussian_density_fit.py.

MDF

MDF uses mixed GTOs and PWs as the fitting basis. The GTO part of the auxiliary basis can be set in the same way as for GDF (again, see Choice of auxiliary basis), while the PW part is similar to FFTDF, i.e., setting either MDF.mesh or Cell.ke_cutoff. The default size of the PW basis is again relatively conservative and the user is recommended to test convergence, as mentioned in FFTDF.

Saving and reusing DF tensors

While FFTDF is implemented in the so-called integral-direct manner and needs no “initialization”, both GDF and MDF pre-compute the Cholesky decomposed electron repulsion integrals (CDERIs) and save the 3-index tensor to disk for later use. The APIs for saving and reusing the CDERIs in GDF and MDF are the same as in the molecular df module; we guide the user to Saving and reusing DF tensors for a detailed description. A PBC example is provided in examples/pbc/35-gaussian_density_fit.py.

Choice of DF method

The choice of DF method depends on the type of calculation, the required accuracy, and the available computational resources.

Type of calculation

  • All-electron versus pseudopotential:

    • For all-electron calculations, only GDF and MDF can be used because FFTDF would require an impractically large PW basis to describe the core orbitals accurately (hydrogen and helium are two exceptions since they don’t have core orbitals).

    • For pseudopotential-based calculations, all three DF methods can be used.

  • Dimensionality:

    • For calculations on low-dimensional systems (0D, 1D, and 2D), only GDF and MDF can be used. The user needs to specify the dimension by setting Cell.dimension.

  • Treatment of exact exchange:

    • For HF or DFT calculations using a hybrid functional, the exact exchange integral has a divergence in reciprocal space that needs special treatment. [67] Different treatments can be used by setting the exxdiv attribute upon initializing a periodic SCF object. Currently, GDF and MDF only support exxdiv = None and exxdiv = "ewald", while FFTDF also supports exxdiv = "vcut_sph" and exxdiv = "vcut_ws". See SCF and DFT methods for more details.

Required accuracy

  • FFTDF can be considered “exact” for pseudopotential-based calculations within the given AO basis if a sufficiently large PW basis is used.

  • GDF has a typical error of \(10^{-5} \sim 10^{-4} E_h\) per atom in the converged SCF energy when tested on simple 3D solids of first- and second-row elements and using default auxiliary basis sets. [64, 68] This error can be reduced by using a larger auxiliary basis set. [68]

  • MDF is in general more accurate than GDF and comparable to FFTDF if a sufficiently large PW basis is used. The typical error of MDF is \(10^{-6} E_h\) per atom or lower in the SCF energy with the default parameters. [64]

Computational resources

  • FFTDF uses very little disk space but requires \(O(N_k n_{\mathrm{AO}}^2 N_{\mathrm{PW}})\) memory, where \(N_{\mathrm{PW}}\) is the size of the PW basis. Despite the modest, linear dependence on \(N_k\), the memory requirement could be high for systems that require a relatively large PW basis.

  • GDF requires enough disk space to hold the pre-computed CDERIs. The size of these integrals grows quickly with the system size and scales as \(O(N_k^2 n_{\mathrm{AO}}^2 n_{\mathrm{aux}})\), where \(N_k\) is the number of k-points, \(n_{\mathrm{AO}}\) is the number of AOs per unit cell, and \(n_{\mathrm{aux}}\) is the number of auxiliary basis functions per unit cell. Note that for DFT calculations using pure exchange-correlation functionals (LDA and GGA), the storage requirement is reduced to \(O(N_k n_{\mathrm{AO}}^2 n_{\mathrm{aux}})\), which is much more modest.

  • MDF requires both \(O(N_k^2 n_{\mathrm{AO}}^2 n_{\mathrm{aux}})\) disk space to store pre-computed integrals of the GTO part of the auxiliary basis and \(O(N_k n_{\mathrm{AO}}^2 N_{\mathrm{PW}})\) memory for the PW part. However, both \(n_{\mathrm{aux}}\) and \(N_{\mathrm{PW}}\) here are smaller than that required by GDF and FFTDF, respectively.