Running ChemShell on HPC clusters
Python launcher
When ChemShell is installed on a supercomputer via a platform file, the launcher can normally preload all necessary modules, so you will not have to load them yourself. By default, ChemShell launches an interactive Python/ChemShell environment, as on a Linux workstation. This enables small tasks to be completed before submitting the heavy computation to the scheduler. For example:
YOUR_CHEMSH_ROOT/bin/intel/chemsh -np 2 --debug 5 YOUR_CHEMSH_JOB.py
Warning
The command above will run ChemShell on the login nodes, which is only suitable for running very small pre-processing calculations and only if permitted by the system administrators. Any significant computations should normally be submitted to the HPC queue.
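For reference, YOUR_CHEMSH_JOB.py is an ordinary Python script using the ChemShell Python interface. The following is an illustrative sketch only, assuming a hypothetical structure file water.xyz and the Fragment/NWChem/SP classes described in the ChemShell documentation; the interfaces and options available will depend on your installation:
# Minimal ChemShell input script (illustrative sketch only)
from chemsh import *

# Load a molecular structure from a hypothetical coordinates file
water = Fragment(coords='water.xyz')

# Set up a QM calculation via an external code interface (options are assumptions)
qm = NWChem(frag=water, method='dft', functional='b3lyp', basis='3-21g')

# Run a single-point calculation
sp = SP(theory=qm, gradients=True)
sp.run()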
To submit your job to the scheduling system on the supercomputer, provide the --submit flag. You will also need to set some of the parameters that control your job (such as wall time). For example:
YOUR_CHEMSH_ROOT/bin/intel/chemsh --submit -A YOUR_ACCOUNT -J YOUR_JOBNAME -wt 24:00:00 -np 1024 -nwg 64 -npmm 8 --debug 3 YOUR_CHEMSH_JOB.py
This requests 1024 MPI processes from the job scheduler, divided into 64 task-farm workgroups (16 processes per workgroup), with 8 processes used for the MM calculation within each workgroup, and a walltime of 24 hours.
Note
The following arguments to the chemsh script may be useful, although the exact options that are enabled will depend on the platform file used on the HPC system:
-A      Budget code
-J      Job name
-wt     Wall time
-q      Quality of Service (QoS)
-p      Partition
-np     Number of processes
-npmm   Number of processes used for MM
-nwg    Number of workgroups (for task-farmed calculations)
Submission script
Some users may prefer to use a submission script, especially when a platform file for the HPC system they are using is not available. The submission script will depend somewhat on the HPC system, but this example shows the required elements:
#!/bin/bash
# Set job information
#SBATCH --job-name=my_chemsh
#SBATCH --nodes=8
#SBATCH --ntasks-per-node=128
#SBATCH --cpus-per-task=1
#SBATCH --time=00:20:00
#SBATCH --account=c01-chem
#SBATCH --partition=standard
#SBATCH --qos=standard
# Load the modules used to compile ChemShell
module load intel
module load intel-mpi
module load SciPy-bundle
# Run the job using the chemsh.x executable
srun YOUR_CHEMSH_ROOT/bin/intel/chemsh.x -np 128 -npmm 4 my_job.py
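Assuming the script above is saved as submit_chemsh.slurm (the filename here is only an illustration), it can be submitted and monitored with the usual Slurm commands:
sbatch submit_chemsh.slurm    # submit the job to the queue
squeue -u $USER               # check the state of your queued and running jobs
The same commands apply to the Python interpreter script in the next section.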
Python interpreter
Finally, it is also possible to use Python directly. In this case, however, ChemShell itself will run in serial. This can still make sense on an HPC system if ChemShell is driving standalone external codes, as it can launch those codes in parallel even when ChemShell itself runs as a serial Python task:
#!/bin/bash
# Set job information
#SBATCH --job-name=my_chemsh
#SBATCH --nodes=8
#SBATCH --ntasks-per-node=128
#SBATCH --cpus-per-task=1
#SBATCH --time=00:20:00
#SBATCH --account=c01-chem
#SBATCH --partition=standard
#SBATCH --qos=standard
# Load the modules used to compile ChemShell
module load intel
module load intel-mpi
module load SciPy-bundle
# Point python to the ChemShell install directory
export PYTHONPATH=YOUR_CHEMSH_ROOT
export CHEMSH_ARCH=intel
# Run the job using the Python interpreter
srun python3 my_job.py
Note
If the PYTHONPATH and CHEMSH_ARCH environment variables are missing, Python will not be able to find the chemsh module.
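Before submitting, it can be helpful to check that these variables are set correctly by importing the module directly. A quick sketch, assuming the same paths as in the script above and run wherever small tests are permitted:
export PYTHONPATH=YOUR_CHEMSH_ROOT
export CHEMSH_ARCH=intel
python3 -c "import chemsh; print(chemsh.__file__)"   # should print the location of the chemsh package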
Warning
The task-farming parallelisation features of ChemShell cannot be used when using the Python interpreter to run ChemShell.