A script to generate NWChem input files for use in performance and scalability experiments
This script was created while I was working on the paper cited below and was hardened significantly while I was working at Argonne.
J. R. Hammond, N. Govind, K. Kowalski, J. Autschbach and S. S. Xantheas, J. Chem. Phys. 131, 214103 (2009). Accurate dipole polarizabilities for water clusters N=2-12 at the coupled-cluster level of theory and benchmarking of various density functionals
Please note that this script is not particularly Pythonic, because the author's approach to Python is pragmatic, which is a polite way of saying "lazy".
If you run the script without arguments, it prints a helpful message, but don't start generating input files yet.
Usage: ./make_nwinput.py <cluster> <method> <basis> <task>
<cluster> can be: w1 w2 w3 w4 w5 w6cage w6book w6prism w6cyclic w7 w8s4 w8d2d w9
w10 w11i434 w11i4412 w11i443 w11i515 w11i551 w12 w13 w14 w15 w16
w17int w17surf w18 w19 w20dode w20fused w20face w20edge w21
rubrene or wN for any N not already listed.
<method> can be pbe0, b3lyp or other supported functional represented by a single string
d-hfx - direct SCF using the DFT module
d-scf - direct SCF using the SCF module
sd-scf - semidirect SCF using the SCF module
d-mp2 - direct MP2
sd-mp2 - semidirect MP2
ri-mp2 - resolution-of-identity MP2 (must use Dunning basis)
"rccsd(t)" - partial-direct RHF-CCSD(T) (does not use point-group symmetry)
"ccsd(t)" - canonical ROHF-CCSD(T) via TCE (exploits D2h and subgroups thereof)
<r>ccsd-t - (same as above but avoids parentheses in file names)
<tce> - any method with cc, mbpt or cis in the name will be treated as a TCE method
<basis> can be 6-31[1][++]G[**] and [aug-]cc-p[c]v[*]z
(where s = * and p = + because otherwise input files are difficult to deal with)
Note: There is no pre-defined RI basis for Pople basis sets,
so that will not be configured automatically, whereas
it will be for Dunning basis sets.
<task> can be energy, optimize, frequency, etc.
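The `s`/`p` convention for Pople basis names can be illustrated with a small sketch. The helper name `pople_to_library_name` and the regular expression are assumptions made for illustration, not the script's actual code. The sketch maps a shell-friendly spelling such as `6-311ppGss` to the conventional `6-311++G**` and leaves every other name untouched.

```python
import re

# Hypothetical helper, not taken from the script: convert a shell-friendly
# Pople basis name (p stands for +, s stands for *) to the usual spelling.
_POPLE = re.compile(r"^(6-311?)(p{0,2})g(s{0,2})$", re.IGNORECASE)

def pople_to_library_name(name):
    """Return e.g. '6-311++G**' for '6-311ppGss'; leave other names alone."""
    match = _POPLE.match(name)
    if match is None:
        # Dunning names such as aug-cc-pvtz need no translation.
        return name
    stem, diffuse, polar = match.groups()
    return stem + "+" * len(diffuse) + "G" + "*" * len(polar)

print(pople_to_library_name("6-311ppGss"))
print(pople_to_library_name("aug-cc-pvtz"))
```

Restricting the substitution to names that match the Pople pattern matters, because a blind replacement of `p` and `s` would mangle names like `cc-pvtz`.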
At the top of the script are some parameters that you may need to change for each machine. Alternatively, you can leave them as they are and edit the generated input files. The latter allows finer tuning and is essential if you want to run different versions of the same input, because jobs that share a prefix must run in different subdirectories.
################################################
# #
# MACHINE-DEPENDENT CONFIGURATION INFORMATION #
# #
################################################
# this is probably reasonable on a system with 4 GB per MPI process
# (assuming running 1 MPI per core, which is not always optimal)
stack_mem=1500
heap_mem=100
global_mem=1500
# Do not store semidirect CCSD integrals on disk.
# This is appropriate if your CPU is much faster than your filesystem.
nodisk = False
# Use OpenMP support in semidirect CCSD(T).
# You must compile your binary with USE_OPENMP for this to be effective.
openmp = True
# these are the paths where your job will write files
# this is the directory where the RTDB and MOVECS files will be written.
# in many cases, it is reasonable to have this path be in your home directory.
# the filesystem on which this directory is located must be shared (e.g. NFS, GPFS, Lustre)
permanent_dir = '.'
# the scratch disk is treated like local disk.
# on almost all machines, it should be the local scratch disk on the node.
# exceptions to this rule are Blue Gene and Cray systems, which either have
# no local disk or have a local disk (on Cray, /tmp) that should not be used
# since it (1) is small, (2) is slow and (3) will kill the node if it fills up.
scratch_dir = '/tmp'
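To show how these settings might end up in an input file, here is a minimal sketch that renders them as NWChem `memory`, `permanent_dir` and `scratch_dir` directives. The function name `render_header` is hypothetical, and the assumption that the memory values are megabytes is inferred from the "4 GB per MPI process" comment; the script's real output may differ.

```python
def render_header(stack_mem=1500, heap_mem=100, global_mem=1500,
                  permanent_dir=".", scratch_dir="/tmp"):
    """Build the top-level resource directives of an NWChem input (a sketch)."""
    lines = [
        # Assumption: the three values are megabytes per process.
        f"memory stack {stack_mem} mb heap {heap_mem} mb global {global_mem} mb",
        f"permanent_dir {permanent_dir}",
        f"scratch_dir {scratch_dir}",
    ]
    return "\n".join(lines)

print(render_header())
```

With the defaults above, the three memory pools add up to 3100 MB, which leaves some headroom on a node with 4 GB per process.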
These jobs will probably run on a laptop and definitely on a workstation.
DFT is a relatively inexpensive method.
./make_nwchem_input.py w5 b3lyp cc-pvtz energy
Semidirect CCSD(T) is limited in functionality but is more efficient than the TCE for molecules without symmetry and uses a lot less memory since the full set of two-electron integrals is not stored.
./make_nwchem_input.py w3 "rccsd(t)" 6-31G energy
TCE supports symmetry and a wide range of methods, but since CCSD(T) is usually what is used for benchmarking, that is all our script tries to support.
The TCE input files generated by this script use a pretty good set of options, certainly better than what most users guess. Please read the documentation for more information.
./make_nwchem_input.py w3 "ccsd(t)" 6-31G energy
There are two parameters that make jobs do more computation: the molecule considered and the basis set. This script focuses on water clusters, which provide a set of molecules that range from trivial to heroic to study with various methods. For example, a CCSD(T) calculation on 24 water molecules with a triple-zeta basis (modified cc-pVTZ) was a Gordon Bell Prize finalist a few years ago.
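A scaling study typically sweeps over both parameters. The sketch below only builds the command lines for such a sweep and does not run anything. The function name `sweep_commands` is hypothetical, the cluster and basis lists are arbitrary picks from the usage message, and the script path is passed in as a parameter.

```python
import itertools

def sweep_commands(script, clusters, bases, method="b3lyp", task="energy"):
    """Return one argument list per (cluster, basis) pair, in the order given."""
    return [[script, cluster, method, basis, task]
            for cluster, basis in itertools.product(clusters, bases)]

commands = sweep_commands("./make_nwchem_input.py",
                          ["w1", "w2", "w4", "w8s4"],
                          ["cc-pvdz", "cc-pvtz"])
for argv in commands:
    print(" ".join(argv))
```

Each argument list could then be handed to `subprocess.run(argv, check=True)` to generate the corresponding input file.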
You need to have some experience with quantum chemistry methods to understand the precise scaling, but here is a rough guide:
- One water molecule has 8 valence electrons and 10 total. For DFT and SCF methods, the latter matters. For MP2 and CCSD(T) methods, the former matters.
- You can determine the number of basis functions per water molecule for a given basis by running SCF on a single water.
SCF and DFT scale as O(R^4) for small systems and O(R^2) to O(N^3) for large systems, where N is the total number of electrons and R is the total number of basis functions.
MP2 scales as O(NR^4). RI-MP2 has a smaller prefactor than semidirect MP2 because the bottleneck kernel is DGEMM rather than atomic integral evaluation.
CCSD(T) scales as O(N^3 R^4). Semidirect CCSD(T) requires O(N^2 R^2) global memory, whereas TCE requires O(R^4). The triples evaluation uses O(T^6) local memory, where T is the tilesize, which is a user-controllable parameter. A tile size of more than 24 will cause CCSD(T) to segfault in most cases.
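The scaling rules above can be turned into a rough estimate of how much more expensive a job becomes as the cluster or the basis grows. The sketch below is an illustration only: it uses the leading-order exponents quoted above, ignores prefactors, and takes the number of basis functions per water as an input (24 for cc-pVDZ and 58 for cc-pVTZ, assuming spherical functions). The names `SCALING`, `cost` and `relative_cost` are hypothetical.

```python
# Exponents (power of N, power of R) taken from the rough guide above.
SCALING = {
    "scf": (0, 4),       # small-system limit, O(R^4)
    "mp2": (1, 4),       # O(N R^4)
    "ccsd(t)": (3, 4),   # O(N^3 R^4)
}

def cost(method, waters, basis_per_water):
    """Leading-order cost in arbitrary units; prefactors are ignored."""
    n_exp, r_exp = SCALING[method]
    # Correlated methods count the 8 valence electrons per water, SCF all 10.
    electrons = (10 if method == "scf" else 8) * waters
    functions = basis_per_water * waters
    return electrons ** n_exp * functions ** r_exp

def relative_cost(method, job, reference):
    """Cost of `job` relative to `reference`; both are (waters, basis_per_water)."""
    return cost(method, *job) / cost(method, *reference)

# Doubling the cluster at fixed basis gives factors of 2^4, 2^5 and 2^7.
for method in SCALING:
    print(method, relative_cost(method, (4, 24), (2, 24)))
```

The same function can compare basis sets at fixed cluster size, for example `relative_cost("ccsd(t)", (4, 58), (4, 24))` for cc-pVTZ versus cc-pVDZ.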