How to use VASP

| Software | Version | Cluster |
| --- | --- | --- |
| VASP | 6.4.2-vanilla | Dardel |
This is a vanilla version of VASP 6.4.2, i.e. no extensions have been added to the VASP source code.
For a list of new features in VASP6, see the VASP wiki.
General observations

- VASP is not helped by hyperthreading.
- Running on fewer than 128 tasks per node allocates more memory to each MPI task. This can in some cases improve performance, and it is necessary if your job crashes with an out-of-memory (OOM) error. Further information is available on the VASP wiki. You can check the example job script for using 64 MPI tasks x 2 OpenMP threads per node under Running VASP; a minimal sketch of the relevant Slurm lines follows this list.
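As a minimal sketch (assuming the Dardel layout of 128 cores per node described above), halving --ntasks-per-node is the only change needed to roughly double the memory available to each rank:

#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=64   # half-populate each 128-core node: ~2x memory per MPI rank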
How to choose the number of cores

Rule of thumb:

- 1 atom per core = good
- 0.5 atom per core = could work (but bad efficiency and time wasted)
- < 0.5 atom per core = don't do it

Explanation of the above: the number of bands is more important than the number of atoms, but typically you have about 4 bands per atom in VASP.

To choose a good number of cores, you can use this checklist (a worked example follows the list):

- Check how many bands you have in the calculation. Let's call this "NB".
- Cores = NB is the best you can do.
- For better efficiency, typically 90%+, aim for at least 4 bands per core, i.e. Cores = NB / 4.
- If you can use k-point parallelization ("KPAR"), use it! It improves scaling a lot. You can run on up to Cores = #kpts * NB / 4.
- You have now determined the number of cores.
- Look at this number. Does it look "strange"? Try to adjust the number of bands to make the number of cores more even, e.g. we don't want a prime number. Good numbers are multiples of 4, 8, 12, 16, etc. For example, 512 is better than 501.
- Calculate the number of nodes necessary, e.g. 512 cores / (128 cores/node) = 4 compute nodes.
- For a wide calculation with fewer than 4 bands per core, try decreasing the number of cores per node to 64, or even 32. You may also have to do this to make enough memory available for each MPI rank.
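Here is the checklist arithmetic as a small shell sketch. The band and k-point counts are made-up numbers for illustration; substitute the values from your own calculation:

#!/bin/bash
# Hypothetical example of the checklist above; NB and KPTS are assumptions.
NB=512                          # number of bands in the calculation
KPTS=4                          # number of k-points
CORES=$(( KPTS * NB / 4 ))      # upper limit with KPAR: 4 * 512 / 4 = 512 cores
NODES=$(( CORES / 128 ))        # Dardel has 128 cores per node -> 4 nodes
echo "cores=$CORES nodes=$NODES"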
Parallelization settings
Parallelization over k-points is recommended when it is possible to do so. In practice, KPAR should be set to be equal to the number of nodes. Please also make sure that the k-points can be evenly distributed over nodes. For example, a calculation with 15 k-points can run on 15 nodes with KPAR=15.
NCORE determines the number of cores that work on an individual orbital. The recommended value for NCORE is 16.
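For instance, a two-node job could use the INCAR settings below. KPAR = 2 follows the rule of one k-point group per node and assumes your k-point count divides evenly by 2; adjust both numbers to your own job:

#!/bin/bash
# Append parallelization settings to INCAR for a hypothetical 2-node job.
cat >> INCAR << 'EOF'
KPAR = 2     ! one k-point group per node (assumes k-points divide evenly by 2)
NCORE = 16   ! 16 cores work together on each orbital
EOF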
VASP filenames

- vasp : the regular VASP version, for calculations using more than one k-point.
- vasp_gamma : the gamma-point-only version of VASP. Use this one if you only have the gamma point; it is much faster and uses less memory.
- vasp_noncollinear : VASP for noncollinear and spin-orbit coupling calculations.
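For example, to use one of these binaries you would only change the srun line in the job scripts below:

srun vasp_gamma           # gamma-point-only calculations
# srun vasp_noncollinear  # noncollinear / spin-orbit coupling calculations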
Potential files and vdW kernel

Projector augmented wave (PAW) potentials can be found at:

/pdc/software/23.12/other/vasp/potpaw-64/

To use one of the nonlocal vdW functionals, one needs to put the file vdw_kernel.bindat into the run directory (along with INCAR, POSCAR, POTCAR and KPOINTS). This file can be found at:

/pdc/software/23.12/other/vasp/vdw_kernel/vdw_kernel.bindat
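As a sketch of a typical run-directory setup: only the two paths above come from this page, while the PBE/Si subpath below is a hypothetical example of the usual potpaw directory layout; check the actual contents of potpaw-64 before use:

#!/bin/bash
# Copy the vdW kernel into the run directory (path from this page).
cp /pdc/software/23.12/other/vasp/vdw_kernel/vdw_kernel.bindat .

# Build POTCAR by concatenating PAW potentials in POSCAR element order.
# The PBE/Si subpath is an assumed example of the potpaw layout.
cat /pdc/software/23.12/other/vasp/potpaw-64/PBE/Si/POTCAR > POTCAR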
Running VASP
Here is an example of a job script requesting 128 MPI processes per node:
#!/bin/bash
#SBATCH -A naissYYYY-X-XX
#SBATCH -J my_vasp_job
#SBATCH -t 01:00:00
#SBATCH -p main
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=128
module load PDC/23.12
module load vasp/6.4.2-vanilla
export OMP_NUM_THREADS=1
srun vasp
Since OpenMP is supported by this module, you can also submit a job requesting 64 MPI processes per node and 2 OpenMP threads per MPI process, using the job script below. Please note that in this case you need to specify --cpus-per-task, OMP_NUM_THREADS, and OMP_PLACES, and that the value of --cpus-per-task is equal to 2x OMP_NUM_THREADS, because AMD's simultaneous multithreading (SMT) is enabled. Please also note that it is necessary to set the SRUN_CPUS_PER_TASK environment variable in the job script so that srun works as expected (see https://slurm.schedmd.com/srun.html).
#!/bin/bash
#SBATCH -A naissYYYY-X-XX
#SBATCH -J my_vasp_job
#SBATCH -t 01:00:00
#SBATCH -p main
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=64
#SBATCH --cpus-per-task=4
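# 4 = 2 x OMP_NUM_THREADS, because SMT exposes 2 hardware threads per core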
module load PDC/23.12
module load vasp/6.4.2-vanilla
export OMP_NUM_THREADS=2
export OMP_PLACES=cores
export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK
srun vasp
Disclaimer
PDC takes no responsibility for the correctness of results produced with the binaries. Always evaluate the binaries against known results for the systems and properties you are investigating before using the binaries for production jobs.