nao.examples.qmd_C60 — NAO Examples: QMD C60#

A demonstration of the Nao module used to calculate the polarizability of C60 at rooms temperature using Quantum Molecular Dynamic (QMD). The relaxation and the ground state calculations are done using the Siesta program. This example will calculate the geometries of 5000 steps of C60 using Siesta and the polarizability will be obtained by calculating 200 steps. This example MUST be done on a server because it will perform 200 calculations. This example has been done for the Torque resources manager and will need to be adapted to your case before to run.

To start the calculations, we first need to generate the 5000 geometries using Molecular Dynamic (MD) with temperature controlled by means of a Nosé thermostat. For this we use the following siesta input file,

SystemLabel                siesta
PAO.EnergyShift            50 meV
PAO.BasisSize              DZP
PAO.SoftDefault            .true.
PAO.BasisType              split


XC.functional     GGA 
XC.authors        PBE 

WriteIonPlotFiles .false.
WriteCoorXmol    .true.
WriteMDXmol      .true.

%include "C60_org.fdf"

AtomCoorFormatOut  Ang

DM.UseSaveDM               .true.
DM.Tolerance               0.0001
DM.MixingWeight            0.25
MaxSCFIterations           400
DM.NumberPulay             4

MD.TypeOfRun               Nose 
MD.InitialTemperature      300 K
MD.TargetTemperature       300 K
MD.NumCGsteps              5000
MD.FinalTimeStep           5000
UseSaveData               .true.

MeshCutOff                250 Ry
MD.UseSaveXV             .true.

SaveRho                  .false.
COOP.Write               .false.
WriteDenchar             .false.

The file containing the position of the atoms can be downloaded with the following link.

You run this calculation with the normal siesta command (don’t forget the C.psf file)::: siesta < siesta_C60_thermal.fdf > siesta.out

or with a bash script if you run on a server. Once, this first calculation finish you will be left with a siesta.ANI file containing the geometry at each time step. We will first generate the xyz files corresponding to the geometries for comomdity,

#!/bin/bash

fname=${1:-siesta.ANI}
suffi=${2:-0}
echo ".ANI file with MM geometries            : " $fname
echo "Suffix to choose geometries for QM part : " $suffi

mkdir -p xyz
echo "cp $fname xyz/"
cp $fname xyz/
echo "cd xyz"
cd xyz
LIT=`head -n1 $fname` 
NLINES_PER_FILE=$(expr $LIT + 2)
echo "rm x*"
rm x*
echo "split -l $NLINES_PER_FILE -d -a 6 $fname"
split -l $NLINES_PER_FILE -d -a 6 $fname
echo "xyz2fdf.py x0*$suffi > xyz2fdf.py.out"
xyz2fdf.py x0*$suffi &> xyz2fdf.py.out
cd ..
ls xyz/*.fdf

This will create a folder xyz containing the 5000 generated geometries. We need then to generate the fdf geometries file for siesta:

cd cyz
python xyz2fdf.py

xyz2fdf.py is a small python script that will generate the geometries for 200 calculations from the 5000 previous geometries because 5000 is a bit too much. The script look like,

from __future__ import division
import numpy as np
import ase.io as io
import subprocess

def write_geofdf(atoms, fname="geo.fdf"):
    from ase.data import chemical_symbols

    f=open(fname, "w")
    f.write("NumberOfAtoms   {0}\n".format(len(atoms)))
    f.write("NumberOfSpecies {0}\n".format(len(set(atoms.numbers))))
    f.write("\n")
    f.write('%block ChemicalSpeciesLabel\n')
    species_label = {}
    for i, nb in enumerate(set(atoms.numbers)):
        species_label[chemical_symbols[nb]] = i+1
        f.write('  {0}  {1}  '.format(i+1, nb)+ chemical_symbols[nb]+'\n')
    f.write("%endblock ChemicalSpeciesLabel\n")
    f.write("\n")

    f.write("AtomicCoordinatesFormat  Ang\n")
    f.write("%block AtomicCoordinatesAndAtomicSpecies\n")
    for ia, atm in enumerate(atoms):
        pos = atm.position
        f.write("{0:.6f}  {1:.6f}  {2:.6f}  {3} {4}  ".format(pos[0], pos[1], pos[2],
            species_label[atm.symbol], ia+1) + atm.symbol + "\n")

    f.write("%endblock AtomicCoordinatesAndAtomicSpecies")

    f.close()

xyz_range = np.arange(0, 5000, 25)

for i, xyz in enumerate(xyz_range):
    if xyz < 10:
        num = "00000{0}".format(xyz)
    elif xyz < 100:
        num = "0000{0}".format(xyz)
    elif xyz < 1000:
        num = "000{0}".format(xyz)
    elif xyz < 10000:
        num = "00{0}".format(xyz)
    else:
        raise ValueError("xyz too large?? {0}".format(xyz))

    path = "calc_" + num
    atoms = io.read("x"+num, format="xyz")
    subprocess.call("mkdir " + path, shell=True)
    write_geofdf(atoms, fname=path+"/geo.fdf")
    subprocess.call("cp siesta_C60.fdf " + path, shell=True)
    subprocess.call("cp C.psf " + path, shell=True)

The script will generate the siesta input files in different trajectory, don’t forget to add in the xyz directory the siesta input

SystemLabel                siesta
PAO.EnergyShift            50 meV
PAO.BasisSize              DZP
PAO.SoftDefault            .true.
PAO.BasisType              split


XC.functional     GGA 
XC.authors        PBE 

%include "geo.fdf"

AtomCoorFormatOut  Ang

DM.MixingWeight            0.01
DM.Tolerance               0.0001
DM.MixingWeight            0.25
DM.NumberPulay             4
MaxSCFIterations           400

MD.TypeOfRun               CG 
MD.NumCGsteps              0
MD.MaxForceTol     0.02   eV/Ang

SolutionMethod     Diagon

MeshCutOff                250 Ry

WriteCoorXmol     .True.
WriteDenchar      .True.
COOP.Write        .True.
XML.Write         .True.

After running the xyz2fdf.py script you should have 200 folders containing the 3 input files,

siesta_C60.fdf
geo.fdf
C.psf

I advise you to create a new folder, and to copy all the calculations input into this new folder:

cd ..
mkdir calculations
cp -r xyz/calc_* calculations
cd calculations

Now we will need script to perform the siesta and nao calculations. Here, I will perform in total 50 jobs, each running 4 calculations. Each jobs will be using 6 cpus. I’m assuming that you are using the Torque managing system. We will use two python script to perform the calculations,

submit_jobs.py

calc_polarizability.py

The first one is the script that you call in order to submit the 200 jobs and it call the second script calc_polarizability.py. This second script will perform the siesta and nao calculations for each configurations.

submit_jobs.py:

from __future__ import division
import numpy as np
import subprocess

# The bash script for Torque
script = """#!/bin/bash  
#PBS -q parallel
#PBS -l nodes=1:ppn=6
#PBS -l mem=5gb
#PBS -l cput=400:00:00
#PBS -N calc_C60


ulimit -s unlimited

export NPROCS=`wc -l < $PBS_NODEFILE`
export OMP_NUM_THREADS=${NPROCS}
export PATH="/Path/To/Python/binary:$PATH" 
export LD_LIBRARY_PATH="/Path/To/Needed/Library:$LD_LIBRARY_PATH" # libxc, libcint, libxcfun
export PYTHONPATH="/Path/To/Pyscf:$PYTHONPATH"

# ASE necessay for using ase.units
#ASE
export ASE_HOME=/Path/To/ASE
export PYTHONPATH="${ASE_HOME}:$PYTHONPATH"
export PATH="${ASE_HOME}/tools:$PATH"

# you may need to change the job directory
export LSCRATCH_DIR=/scratch/$USER/jobs/$PBS_JOBID
mkdir -p $LSCRATCH_DIR
cd $PBS_O_WORKDIR

# load the right module depending your pyscfi/siesta compilation
# you better to use the same compiler for both program to avoid problems
ml purge
ml load intel/2015b FFTW/3.3.4-intel-2015b
"""

# the range of our 200 calculations
xyz_range = np.arange(0, 5000, 25)

start = 0
step = 4
end = 0

while end < xyz_range.size:
    
    end = start + step
    if end > xyz_range.size:
        end = xyz_range.size + 1

    calcs = xyz_range[start:end]
    include = ["calc_polarizability.py"]
    for i, xyz in enumerate(calcs[0:calcs.shape[0]]):
        if xyz < 10:
            num = "00000{0}".format(xyz)
        elif xyz < 100:
            num = "0000{0}".format(xyz)
        elif xyz < 1000:
            num = "000{0}".format(xyz)
        elif xyz < 10000:
            num = "00{0}".format(xyz)
        else:
            raise ValueError("xyz too large?? {0}".format(xyz))
        include.append("calc_"+num)

    fcalc = calcs[0]
    ecalc = calcs[calcs.shape[0]-1]+25

    lines = script
    for files in include:
        lines += "cp -r " + files + " $LSCRATCH_DIR\n"
        
    lines += "cd $LSCRATCH_DIR\n"
    lines += "./calc_polarizability.py --np ${NPROCS} " + "--start {0} --end {1} >& calc_{0}to{1}.out\n".format(fcalc, ecalc)

    lines += "export RESULTS_DIR=$PBS_O_WORKDIR\n"
    lines += "mkdir -p $RESULTS_DIR\n"
    lines += "cp -r * $RESULTS_DIR\n"
    fname = "run.calc_C60_{0}to_{1}.sh".format(fcalc, ecalc)
    f = open(fname, "w")
    f.write(lines) # write bash script
    f.close()
    start = end
    
    # submit job to Torque
    subprocess.call("qsub " + fname, shell=True)

calc_polarizability.py

#!/Path/To/Python/binary
from __future__ import print_function, division
import os,numpy as np
from pyscf.nao import tddft_iter
import sys
import argparse
from ase.units import Ry, eV, Ha
import subprocess
from timeit import default_timer as timer

def run_tddft_iter():
  t1 = timer()
  td = tddft_iter(label="siesta", iter_broadening=0.0035/Ha, xc_code='LDA,PZ', level=0, tddft_iter_tol=1e-3, tol_loc=1e-4, tol_biloc=1e-6)
  t2 = timer()
  print("time tddft_iter = ", t2-t1)
  omegas = np.arange(2.0, 8.0, 2.5E-3)/Ha + 1j*td.eps
  t3 = timer()
  pxx = -td.comp_polariz_nonin_ave(omegas)
  t4 = timer()
  print("time chi0 = ", t4-t3)

  data = np.zeros((omegas.shape[0], 3))
  data[:, 0] = omegas.real*Ha
  data[:, 1] = pxx.real
  data[:, 2] = pxx.imag
  np.savetxt('polarizability_nonin_siesta.avg.txt', data)

  t5 = timer()
  pxx = -td.comp_polariz_inter_ave(omegas)
  t6 = timer()
  print("time chi = ", t6-t5)
  data = np.zeros((omegas.shape[0], 3))
  data[:, 0] = omegas.real*Ha
  data[:, 1] = pxx.real
  data[:, 2] = pxx.imag
  np.savetxt('polarizability_inter_siesta.avg.txt', data)
  print("nb call:")
  print("rf0_ncalls = {0}, matvec_ncalls ={1}".format(td.rf0_ncalls, td.matvec_ncalls))
  t7 = timer()
  print("total time = ", t7-t1)

parser = argparse.ArgumentParser()
parser.add_argument('--np', type=int, default=1, help="number of processor to use")
parser.add_argument('--start', type=int, default=0, help="starting calc")
parser.add_argument('--end', type=int, default=25, help="end calculation")

args_par = parser.parse_args()


xyz_range = np.arange(args_par.start, args_par.end, 25)
mpi_exe="/scratch/mbarbry/intel/intelpython2/bin/mpirun"
siesta_path="/scratch/software/SIESTA/4.0b-485-intel-2015b/siesta"

siesta_exe = mpi_exe + " -np {0} ".format(args_par.np) + siesta_path + " < siesta.fdf > siesta.out"
for i, xyz in enumerate(xyz_range):
    if xyz < 10:
        num = "00000{0}".format(xyz)
    elif xyz < 100:
        num = "0000{0}".format(xyz)
    elif xyz < 1000:
        num = "000{0}".format(xyz)
    elif xyz < 10000:
        num = "00{0}".format(xyz)
    else:
        raise ValueError("xyz too large?? {0}".format(xyz))
    path = "calc_" + num
    os.chdir(path)
    # Run siesta
    subprocess.call("export OMP_NUM_THREADS=1", shell=True)
    subprocess.call(siesta_exe, shell=True)

    # Run pyscf.nao
    subprocess.call("export OMP_NUM_THREADS={0}".format(args_par.np), shell=True)
    run_tddft_iter()
    os.chdir("../")