Localized orbitals

Modules: pyscf.lo

Introduction

A molecular orbital is usually delocalized, i.e. it has non-negligible amplitude over the whole system rather than only around some atom(s) or bond(s). However, one can choose a unitary rotation $U$

ϕ = ψ U

such that the resulting orbitals $ϕ$ are as spatially localized as possible. This is typically achieved by one of two classes of methods. The first is to project the orbitals onto a predefined set of local orbitals, e.g. atomic orbitals or pseudo-atomic orbitals. The second is to optimize a cost function $f$, which measures the locality of the molecular orbitals. Because there is no unambiguous choice for the localization criterion, several criteria have been suggested. Boys localization minimizes the total spread of the orbitals

f (U) = \sum_{i} \left( ⟨ ϕ_{i} | r^{2} | ϕ_{i} ⟩ - ⟨ ϕ_{i} | r | ϕ_{i} ⟩^{2} \right)

Boys localized orbitals [70] in periodic systems are typically termed maximally localized Wannier functions (MLWFs) [71].
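The Boys criterion can be illustrated with a toy model that does not use PySCF at all. The sketch below assumes a one-dimensional grid and two delocalized orbitals built as the even and odd combinations of two Gaussians; the names `rotation`, `boys_cost`, and the grid parameters are hypothetical and chosen only for illustration. It scans the rotation angle of a 2x2 rotation and reports the angle with the smallest total spread.

```python
import numpy as np

# 1D grid (toy model, not a PySCF calculation)
x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]

def normalize(f):
    """Scale f so that its squared norm on the grid is one."""
    return f / np.sqrt(np.sum(f * f) * dx)

# Two Gaussians centred at -2 and +2
left = np.exp(-(x + 2.0) ** 2)
right = np.exp(-(x - 2.0) ** 2)

# Delocalized "canonical" orbitals: even and odd combinations,
# which are orthogonal by symmetry
psi = np.column_stack([normalize(left + right), normalize(left - right)])

def rotation(theta):
    """2x2 rotation matrix U(theta)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def boys_cost(phi):
    """Sum over orbitals of <x^2> - <x>^2 evaluated on the grid."""
    x2 = np.einsum('xi,x,xi->i', phi, x ** 2, phi) * dx
    x1 = np.einsum('xi,x,xi->i', phi, x, phi) * dx
    return np.sum(x2 - x1 ** 2)

# Scan the rotation angle and keep the one with the smallest spread
thetas = np.linspace(0.0, np.pi / 2, 91)
costs = np.array([boys_cost(psi @ rotation(t)) for t in thetas])
theta_best = thetas[np.argmin(costs)]
print(theta_best, costs[0], costs.min())
```

In this model the sum of the $⟨ x^{2} ⟩$ terms is invariant under the rotation, so minimizing the spread is equivalent to pushing the orbital centres apart, and the minimum is found at the angle that recovers the two separate Gaussians.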

Pipek-Mezey (PM) localization [72] maximizes the population charges on the atoms

f (U) = \sum_{I}^{atoms} \sum_{i} {| q_{i}^{I} |}^{2}

Note that PM localization depends on the choice of atomic orbitals used for the population analysis. Several choices of populations are available, e.g. Mulliken populations or populations based on (meta-)Löwdin orbitals. Intrinsic bond orbitals (IBOs) can be viewed as a special case of PM localization that uses intrinsic atomic orbitals (IAOs) as the population method. See Ref. [73] for a summary of choices of orbitals. Note that PM localization preserves the separation between $σ$ and $π$ orbitals.
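As a rough illustration of the PM cost function, the following sketch evaluates the sum of squared Mulliken populations for a minimal two-atom, two-orbital model. The helper `pm_cost`, the overlap value `s = 0.4`, and the AO-to-atom map `ao_atom` are assumptions made for this example and are not part of the PySCF API.

```python
import numpy as np

def pm_cost(C, S, ao_atom):
    """Sum over atoms I and orbitals i of (q_i^I)^2 with Mulliken populations.

    C       : (nao, nmo) orbital coefficients
    S       : (nao, nao) AO overlap matrix
    ao_atom : sequence mapping each AO index to its atom index
    """
    ao_atom = np.asarray(ao_atom)
    SC = S @ C
    total = 0.0
    for atom in range(ao_atom.max() + 1):
        mask = ao_atom == atom
        # Mulliken population of every orbital on this atom
        q = np.sum(C[mask] * SC[mask], axis=0)
        total += np.sum(q ** 2)
    return total

# Toy model: one AO on each of two atoms, with overlap s (assumed value)
s = 0.4
S = np.array([[1.0, s], [s, 1.0]])
ao_atom = [0, 1]

# Delocalized bonding and antibonding combinations, normalized with respect to S
bond = np.array([1.0, 1.0]) / np.sqrt(2 * (1 + s))
anti = np.array([1.0, -1.0]) / np.sqrt(2 * (1 - s))
C_canon = np.column_stack([bond, anti])

# A 45-degree rotation mixes them into orbitals centred on one atom each
U = np.array([[1.0, -1.0], [1.0, 1.0]]) / np.sqrt(2)
C_local = C_canon @ U

print(pm_cost(C_canon, S, ao_atom), pm_cost(C_local, S, ao_atom))
```

The delocalized pair splits each orbital evenly between the two atoms, whereas the rotated pair concentrates each orbital on a single atom and therefore gives a larger value of the cost function.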

Edmiston-Ruedenberg (ER) localization [74] maximizes the orbital Coulomb self-repulsion,

f (U) = \sum_{i} (i i | i i)

ER localization, however, is computationally more expensive than the Boys or PM approaches.

Localized orbitals can also be calculated via the pivoted Cholesky factorization of a density-like matrix $D = C C^{†}$ [75]. Since $C$ is generally a rectangular matrix containing only the subset of $N$ orbitals intended for localization, the matrix $D$ is positive-semidefinite. It can be factored using a Cholesky decomposition with full column pivoting,

P^{†} D P = L L^{†},

where $L$ is a lower triangular matrix and $P$ is a permutation matrix. In the end, the $N$ leftmost columns of $P L$ are taken as the localized orbitals. While Cholesky orbitals are usually not as localized as, for example, PM or Boys orbitals, the procedure is non-iterative and produces a unique result, except possibly for the impact of degeneracies. Cholesky orbitals can serve as an excellent initial guess for iterative localization procedures.
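The factorization described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the implementation used in pyscf.lo: the function name `pivoted_cholesky`, the tolerance, and the random test matrix are all hypothetical. The routine returns the non-zero columns of $P L$ directly, so no explicit permutation matrix is formed.

```python
import numpy as np

def pivoted_cholesky(D, tol=1e-10):
    """Pivoted Cholesky factorization of a positive-semidefinite matrix D.

    Returns (L, piv) where the columns of L satisfy L @ L.T ~= D and
    piv lists the pivot indices in the order they were chosen.
    """
    n = D.shape[0]
    resid = np.diag(D).copy()        # diagonal of the remaining residual
    L = np.zeros((n, n))
    piv = []
    for k in range(n):
        p = int(np.argmax(resid))    # largest remaining diagonal element
        if resid[p] <= tol:          # residual is numerically zero: stop
            break
        col = D[:, p] - L[:, :k] @ L[p, :k]
        L[:, k] = col / np.sqrt(resid[p])
        resid -= L[:, k] ** 2
        piv.append(p)
    return L[:, :len(piv)], piv

# Toy input: N = 3 orthonormal "orbitals" in a 6-function orthonormal basis
rng = np.random.default_rng(0)
C, _ = np.linalg.qr(rng.standard_normal((6, 3)))
D = C @ C.T

L, piv = pivoted_cholesky(D)
print(L.shape, piv)
```

Because $L L^{†} = C C^{†}$, the columns of `L` span the same space as the columns of `C` and are related to them by a unitary rotation, so orthonormality is preserved.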

A summary of the functionality of the lo module is given below:

Method	Optimization	Cost function	PBC	Ref
(meta-)Löwdin	no		yes	[76, 77]
Natural atomic orbitals	no		gamma	[78]
Intrinsic atomic orbitals	no		yes	[79]
Cholesky orbitals	no		no	[75]
Boys	yes	dipole	no	[70]
Pipek-Mezey	yes	local charges	gamma	[72]
Intrinsic bond orbitals	yes	IAO charges	gamma	[79]
Edmiston-Ruedenberg	yes	Coulomb integral	gamma	[74]

For example, to obtain the natural atomic orbital coefficients (in terms of the original atomic orbitals):

from pyscf import gto, scf, lo

# Methane: carbon at the origin, hydrogens on alternating cube corners
# (coordinates in Angstrom, the default unit of gto.M)
x = .63
mol = gto.M(atom=[['C', ( 0,  0,  0)],
                  ['H', ( x,  x,  x)],
                  ['H', (-x, -x,  x)],
                  ['H', (-x,  x, -x)],
                  ['H', ( x, -x, -x)]],
            basis='ccpvtz')
mf = scf.RHF(mol).run()

# C matrix stores the AO to localized orbital coefficients
C = lo.orth_ao(mf, 'nao')