ChIMES Active Learning Driver Configuration File Options
Optional config.py Variables:
Assorted General Options
| Input variable | Variable type | Required | Default | Value/Options/Notes |
|---|---|---|---|---|
| | str | N | “” | E-mail address for the driver to send status updates to. If blank (“”), no e-mails are sent. |
| | int | N | 1 | Seed for the random number generator. Only used when active learning strategies are selected. |
| | list of str | Y | None | List of atom types in the system of interest, e.g. ["C","H","O"]. |
| | int | Y | None | Number of different state points at which to conduct iterative learning. |
| | list of str | N | [“”] | List of species to track in molanal output, e.g. ["C1 O1 1(O-C)", "C1 O2 2(O-C)"]. |
| | int | N | 0 | Cycle at which to start including stress tensors from ALC-generated configurations. |
| | str | N | “ALL” | How stress tensors should be included in the fit. Options are “DIAG” or “ALL”. |
| | int | N | float | Thermal smearing temperature in K; if "None", different values are used for each case, set in the ALL_BASE_FILES traj_list.dat. |
| | str | Y | None | Location of the source directory for ALD. |
| | str | Y | None | Location of the ALD job being run. |
General HPC Options
| Input variable | Variable type | Required | Default | Value/Options/Notes |
|---|---|---|---|---|
| | int | Y | 36 | Number of processors per node on HPC platform. |
| | str | Y | None | Charge bank/account name on HPC platform. |
| | str | N | slurm | HPC platform type; options are slurm, TACC, or qsub. |
| | str | Y | None | Full path to python2.X executable on HPC platform. |
| | bool | N | True | Controls whether driver status updates are e-mailed to user. |
ChIMES LSQ Options
| Input variable | Variable type | Required | Default | Value/Options/Notes |
|---|---|---|---|---|
| | str | N | | Path to LSQ base files required by the driver (e.g. traj_list.dat, fm_setup.in, etc.). Note: On greatlakes, all paths provided must be absolute paths as resolved by the “realpath” command, not just the current working directory reported by “pwd”. |
| | str | Y | | Absolute path to ChIMES_lsq executable. |
| | str | Y | “cmake/3.21.1 + mkl + intel-classic/2021.6.0-magic + mvapich2/2.3.7” | System-specific modules needed to run ChIMES-LSQ jobs. |
| | str | N | | Absolute path to ChIMES_lsq.py (formerly lsq2.py). |
| | str | N | | Absolute path to post_proc_lsq2.py. |
| | str | Y | “” | Path to directory containing the ChIMES_LSQ source code. |
| | int | Y | 4 | Number of nodes to use when running chimes_lsq. |
| | str | Y | pbatch | Queue to submit chimes_lsq job to. |
| | str | Y | “04:00:00” | Walltime for chimes_lsq job. |
| | int | Y | 8 | Number of nodes to use when running dlasso. |
| | int | Y | | Number of procs per node to use when running dlasso. |
| | str | Y | pbatch | Queue to submit the dlasso job to. |
| | str | Y | “04:00:00” | Walltime for dlasso job. |
| | int | N | 1 | Number of unique fm_setup.in files; allows fitting, e.g., multiple overlapping models to the same data. |
| | str | N | dlasso | Regression algorithm to use for fitting; only dlasso is supported for now. |
| | float | N | 1e-5 | Regression regularization variable. |
| | bool | N | True | Controls whether the A-matrix is normalized prior to solution. |
| | bool | N | False | Should ALC-0 (or 1 if no clustering) weights be read directly from a user-specified file? |
| | str | N | None | Set if |
| | special | N | 1.0 | Weights to apply to full-frame forces - many options, see note below. |
| | special | N | 5.0 | Weights to apply to gas phase forces - many options, see note below. |
| | special | N | 0.1 | Weights to apply to full-frame energies - many options, see note below. |
| | special | N | 0.1 | Weights to apply to gas phase energies - many options, see note below. |
| | special | N | 250.0 | Weights to apply to full-frame stress tensor components - many options, see note below. |
Note
There are numerous options available for weighting, and weights are applied separately to full-frame forces, gas phase forces, full-frame energies, gas phase energies, and full-frame stress.
If a WEIGHTS_* option is set to a single floating point value, that value is applied to all candidate data of that type, e.g., if WEIGHTS_FORCE = 1.0, all full-frame forces will be assigned a weight of 1.0.
Additional weighting styles can be selected by letter:
A: w = a0
B: w = a0*(this_cycle-1)^a1 # NOTE: treats this_cycle = 0 as this_cycle = 1
C: w = a0*exp(a1*|X|/a2)
D: w = a0*exp(a1*(X-a2)/a3)
E: w = n_atoms^a0
F: w = a0*exp(a1*(X/n_atoms-a2)/a3)
G: w = a0*exp(a1*(|X|-a2)/a3)
where “X” is the value being weighted.
WEIGHTS_FORCE = [["B"],[1.0,-1.0]] would select weighting style B and apply a weight of 1.0 to each full-frame force component in the first ALD cycle; weighting would decrease by a factor (this_cycle)^(-1.0) each cycle.
Multiple weighting schemes can be combined as well. For example WEIGHTS_FORCE = [ ["A","B"], [[100.0 ],[1.0,-1.0]]] would add an additional multiplicative factor of 100 to the previous example.
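The weighting rules above can be sketched in a few lines of Python. This is only an illustration of the formulas as written in this note, not the driver's own code: the function names `style_weight` and `combined_weight` are hypothetical, and style B is left out because it depends on how the driver tracks the cycle counter. The sketch assumes that combined styles are multiplied together, as the factor-of-100 example suggests.

```python
import math

def style_weight(style, params, x=0.0, n_atoms=1):
    """Weight for one data point under a single lettered style.

    Hypothetical helper covering styles A, C, D, E, F and G only;
    x is the value being weighted ("X" in the note above).
    """
    a = [float(p) for p in params]
    if style == "A":
        return a[0]
    if style == "C":
        return a[0] * math.exp(a[1] * abs(x) / a[2])
    if style == "D":
        return a[0] * math.exp(a[1] * (x - a[2]) / a[3])
    if style == "E":
        return float(n_atoms) ** a[0]
    if style == "F":
        return a[0] * math.exp(a[1] * (x / float(n_atoms) - a[2]) / a[3])
    if style == "G":
        return a[0] * math.exp(a[1] * (abs(x) - a[2]) / a[3])
    raise ValueError("style not covered by this sketch: %r" % style)

def combined_weight(spec, x=0.0, n_atoms=1):
    """Evaluate a WEIGHTS_* style setting for one data point.

    spec is either a single number or [[styles], [params]], where params
    is a flat list for one style and a list of lists for several styles.
    ASSUMPTION: combined styles are multiplied together.
    """
    if isinstance(spec, (int, float)):
        return float(spec)
    styles, params = spec
    if len(styles) == 1 and not isinstance(params[0], (list, tuple)):
        params = [params]  # single style given with a flat parameter list
    weight = 1.0
    for style, pars in zip(styles, params):
        weight *= style_weight(style, pars, x, n_atoms)
    return weight

# A constant weight, and style A combined with style E for a 3-atom frame.
print(combined_weight(1.0))
print(combined_weight([["A", "E"], [[100.0], [2.0]]], n_atoms=3))
```

Evaluating a setting this way before launching a run is a cheap check that the nested list has the shape intended.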
Molecular Dynamics Options
| Input variable | Variable type | Required | Default | Value/Options/Notes |
|---|---|---|---|---|
| | str | Y | None | Iterative MD method. Options are “CHIMES” (used for ChIMES model development) or “DFTB” (used when generating ChIMES corrections to DFTB). |
| | str | N | None | Only used when |
| | str | N | | Only used when |
| | str | N | | Used when |
| | list of int | N | [4] * | Number of nodes to use for MD jobs at each case. Number can be different for each case (e.g., [2,2,4,8] for four cases). |
| | list of str | N | [“pbatch”] * | Queue type to use for MD jobs at each case. Can be different for each case. |
| | list of str | N | [“4:00:00”] * | Walltime to use for MD jobs at each case. Can be different for each case. |
| | str | N | | Absolute path to MD input files like case-0.indep-0.run_md.in |
| | str | Y | None | MPI-compatible MD executable absolute path (either path to "lmp_mpi_chimes" or "chimes_md-mpi"). |
| | str | N | | Serial MD executable absolute path (either LAMMPS path or CHIMES_MD_SER). |
| | str | N | cmake/3.21.1 + intel-classic/2021.6.0-magic + mvapich2/2.3.7 + mkl | System-specific modules needed to run ChIMES MD jobs. |
| | float | N | 1.0E6 | ChIMES penalty function prefactor. |
| | float | N | 0.02 | ChIMES penalty function kick-in distance. |
| | str | N | None | Absolute path to molanal executable. |
| | int | N | | Path to input files if using it as a reference ("QM") method. |
| | int | N | 1 | Number of nodes to use for LAMMPS jobs. |
| | str | N | | Path to lmp2xyz.py |
| | int | N | 1 | Number of procs per node to use for LAMMPS jobs. |
| | str | N | [“00:30:00”] | Walltime for LAMMPS calculations (HH:MM:SS). |
| | str | N | “pdebug” | Queue to submit LAMMPS jobs to. |
| | str | N | None | Absolute path to LAMMPS executable. |
| | str | N | None | System-specific modules needed to run LAMMPS. |
| | str | N | “” | Memory requirements for running LAMMPS jobs. |
| | str | N | | Units in which LAMMPS input/output is expected. |
CHIMES_MD_SER is used for the old I/O-based ChIMES/DFTB linking. An update is required, but it needs bad_cfg printing in DFTB+ (which requires a change to the interface).
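The per-case list options in the table above (nodes, queue, and walltime for MD jobs) take one entry per case. A minimal sketch of a sanity check follows, assuming that each list must have exactly as many entries as there are cases. The helper `check_per_case` and the variable names `n_cases`, `md_nodes`, and `md_queue` are hypothetical and are not driver options.

```python
def check_per_case(name, values, n_cases):
    """Return values as a list, or raise if there is not one entry per case.

    ASSUMPTION: a per-case option needs exactly n_cases entries.
    """
    if len(values) != n_cases:
        raise ValueError(
            "%s has %d entries but there are %d cases"
            % (name, len(values), n_cases)
        )
    return list(values)

n_cases = 4  # hypothetical number of state points

# Different node counts per case, as in the [2,2,4,8] example above.
md_nodes = check_per_case("md_nodes", [2, 2, 4, 8], n_cases)

# The same queue for every case, built by list repetition.
md_queue = check_per_case("md_queue", ["pbatch"] * n_cases, n_cases)
```

List repetition (`["pbatch"] * n_cases`) keeps the lists in step with the number of cases when every case uses the same value.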
Correction Fitting Options
| Input variable | Variable type | Required | Default | Value/Options/Notes |
|---|---|---|---|---|
| | bool | N | False | Is this ChIMES model being fit as a correction to another method? |
| | str | N | None | Method type being corrected. Currently only “DFTB” is supported. |
| | list of str | N | None | List of parameter files needed to run simulations/single points with the method to be corrected. |
| | str | N | None | Executable to use when subtracting existing forces/energies/stresses from the method to be corrected. |
| | bool | N | False | Should electron temperatures for the correction calculation be set to the values in traj_list.dat (False) or read from a specified file location (True)? Only needed if the correction method is QM-based. See notes below. |
Note
Note: If corrections are used, ChIMES_MD_{NODES,QUEUE,TIME} are all used to specify DFTB runs. These should be renamed to simulation_{...} for the generalized MD block (which should become SIM block).
Note: If CORRECTED_TEMPS_BY_FILE is set to True, temperatures in traj_list.dat are ignored by the correction FES subtraction. Instead, each training trajectory file in ALL_BASE_FILES/ALC-0_BASEFILES needs a corresponding .temps file that gives the temperature for each frame.
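Before enabling CORRECTED_TEMPS_BY_FILE, it can help to confirm that every training trajectory has its companion file. The sketch below is only an illustration: the helper name `missing_temps_files` is hypothetical, and it assumes the companion file is named by appending `.temps` to the trajectory file name, which should be checked against the driver source.

```python
import os

def missing_temps_files(basefile_dir, traj_names, suffix=".temps"):
    """Return the trajectory names in basefile_dir with no companion temps file.

    ASSUMPTION: the companion file is <trajectory file name> + ".temps".
    """
    missing = []
    for name in traj_names:
        companion = os.path.join(basefile_dir, name + suffix)
        if not os.path.isfile(companion):
            missing.append(name)
    return missing
```

An empty return value means every listed trajectory has a matching file under the assumed naming rule.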
Hierarchical Fitting Options
| Input variable | Variable type | Default | Value/Options/Notes |
|---|---|---|---|
| | bool | False | Is this a hierarchical fit (i.e., building on existing parameters)? |
| | list of str | None | List of parameter files to build on, which should be in ALL_BASE_FILES/HIERARCH_PARAMS |
| | str | None | MD method to use for subtracting existing parameter contributions - current options are CHIMES or LMP |
| | str | None | Executable to use when subtracting existing parameter contributions |
Reference QM Method Options
| Input variable | Variable type | Default | Value/Options/Notes |
|---|---|---|---|
| | str | | Absolute path to QM input files generic to all QM methods. Can be specified separately if multiple methods are being used (see code-specific options below). |
| | str | VASP | Specifies which nominal QM code to use for bulk configurations; options are “VASP” or “DFTB+”. |
| | str | VASP | Specifies which nominal QM code to use for gas configurations; options are “VASP”, “DFTB+”, and “Gaussian”. |
VASP-Specific Options
| Input variable | Variable type | Default | Value/Options/Notes |
|---|---|---|---|
| | int | 6 | Number of nodes to use for VASP jobs |
| | int | | Number of processors to use per node for VASP jobs |
| | str | “04:00:00” | Walltime for VASP calculations (HH:MM:SS) |
| | str | “pbatch” | Queue to submit VASP jobs to |
| | str | None | A path to a VASP executable must be specified if |
| | str | “mkl” | Modules to load during VASP run |
| | str | | Absolute path to vasp2yzf.py |
| | str | “” | Memory requirements for running VASP jobs |
DFTB+ -Specific Options
| Input variable | Variable type | Default | Value/Options/Notes |
|---|---|---|---|
| | str | | Absolute path to DFTB+ input files. |
| | int | 1 | Number of nodes to use for DFTB+ jobs |
| | int | 1 | Number of processors to use per node for DFTB+ jobs |
| | str | “04:00:00” | Walltime for DFTB+ calculations (HH:MM:SS) |
| | str | “pbatch” | Queue to submit DFTB+ jobs to |
| | str | None | A path to a DFTB+ executable must be specified if |
| | str | “mkl” | Modules to load during DFTB+ run |
| | str | “” | Memory requirements for running DFTB+ jobs |
| | str | | Absolute path to dftgen_to_xyz.py |
CP2K-Specific Options
| Input variable | Variable type | Default | Value/Options/Notes |
|---|---|---|---|
| | int | 6 | Number of nodes to use for CP2K jobs |
| | int | | Number of processors to use per node for CP2K jobs |
| | str | “04:00:00” | Walltime for CP2K calculations (HH:MM:SS) |
| | str | “pbatch” | Queue to submit CP2K jobs to |
| | str | None | A path to a CP2K executable must be specified if |
| | str | “mkl” | Modules to load during CP2K run |
| | str | | Absolute path to CP2K2yzf.py |
| | str | “” | Memory requirements for running CP2K jobs |
| | str | None | Path to the directory containing potential and functional files for CP2K |
Gaussian-Specific Options
| Input variable | Variable type | Default | Value/Options/Notes |
|---|---|---|---|
| | int | 4 | Number of nodes to use for Gaussian jobs |
| | int | | Number of processors to use per node for Gaussian jobs |
| | str | “04:00:00” | Walltime for Gaussian calculations (HH:MM:SS) |
| | str | “pbatch” | Queue to submit Gaussian jobs to |
| | str | None | A path to a Gaussian executable must be specified if |
| | str | None | Absolute path to Gaussian scratch directory |
| | str | None | Name of file containing single atom energies from Gaussian and target planewave method |
| | str | “” | Memory requirements for running Gaussian jobs |
Note
The file specified for GAUS_REF is structured like:
<chemical symbol> <Gaussian energy> <planewave code energy>
<chemical symbol> <Gaussian energy> <planewave code energy>
<chemical symbol> <Gaussian energy> <planewave code energy>
...
<chemical symbol> <Gaussian energy> <planewave code energy>
Energies are expected in kcal/mol and there should be an entry for each atom type of interest.
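The three-column layout described above is easy to check before a run. The following is a minimal sketch, not the driver's own reader: the helper names `parse_gaus_ref` and `missing_atom_types` are hypothetical, and the energies in the usage lines are made-up placeholder numbers.

```python
def parse_gaus_ref(text):
    """Parse '<symbol> <Gaussian energy> <planewave energy>' lines.

    Returns {symbol: (gaussian_energy, planewave_energy)} with energies
    as floats (kcal/mol, per the note above). Blank lines are skipped.
    """
    energies = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError("expected 3 fields, got: %r" % line)
        symbol, e_gaus, e_pw = fields
        energies[symbol] = (float(e_gaus), float(e_pw))
    return energies

def missing_atom_types(energies, atom_types):
    """Return the atom types that have no entry in the parsed file."""
    return [t for t in atom_types if t not in energies]

# Made-up placeholder content, for illustration only.
example = "C -10.0 -12.5\nH -1.0 -1.5\n"
ref = parse_gaus_ref(example)
print(missing_atom_types(ref, ["C", "H", "O"]))
```

Comparing the parsed symbols against the list of atom types catches a missing entry before any Gaussian jobs are submitted.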