MD analysis

ChemShell currently provides several basic MD analysis tools.

Plotting trajectories

ChemShell is able to visualise the MD trajectory profile if matplotlib is available to the user. (When this is not the case, ChemShell will install it locally if there is access to the internet at the compiletime.) After an MD simulation is carried out using the MD driver, profile data containing change of energies (including pre-MD energy minimisation) and temperature during the trajectory will be written in the format of NumPy compressed array with the extension .npz. The default filename is _profile_md.npz but can be changed as a parameter in the function MD.run() and vary in different workflows such as the Protein Solvation. The command to plot such a file’s data is:

$ chemsh --plot _profile_md.npz

Note

This is to be run in a Unix shell session (e.g., bash) of your operating system but not in a Python script or Python shell

To change the units in which the data are displayed, use the --units command-line argument and provide one or more unit names. For example, to use kcal/mol and nanosecond:

$ chemsh --plot _profile_md.npz --units kcal/mol ns

Trajectory analysis

We normally would like to validate an MD simulation’s equilabrium state after it is finished. In ChemShell we can do so by analysing the change of the structure’s RMSD during an MD trajectory. For a protein system, it makes more sense to do the analysis on only the backbone atoms of amino acids. This can be done via the Fragment.select() function. The resulting snapshot structures and data file _profile_analysis.npz in the format of NumPy compressed array with the extension .npz are saved in directory _analysis. Here is an example script:

# create a fragment by reading in the PDB used by the MD simulation
my_protein = Fragment(coords='my_filename.pdb')

# get a list of atom indices of only the protein's backbone
indices_backbone = my_protein.selectByBackbone()

# load the DCD information onto the fragment and analyse the RMSD
my_protein.analyse(dcd='my_filename.dcd',
                   nsnapshots=10,
                   select=indices_backbone,
                   timestep=2.0,
                   nsteps=1000,
                   units=('angstrom', 'ps'))

Similarly to the MD profile, _profile_analysis.npz can also be visualised by ChemShell’s plotting feature. The default units are Å and ps and do not accept the --units argument.

$ chemsh --plot _analysis/_profile_analysis.npz

Note

Unlike the above Python function Fragment.analyse(), this command is to be run in a Unix shell session (e.g., bash) of your operating system but not in a Python script or Python shell

Taking snapshots

Snapshot structures, in both PDB and PQR formats, will be automatically taken out of the second half of a successful trajectory (.DCD file) when an MD Simulations simulation is done. But the user can also take MD snapshots anytime from an existing trajectory by running the tools.takeSnapshots() function of the ChemShell tools module, for example

from chemsh import *

my_frag = Fragment(coords='my_filename.psf')

tools.takeSnapshots(my_frag,
                    xst='my_filename.xst',
                    dcd='my_filename.dcd',
                    pbc_wrap='non-solvent',
                    nsnapshots=20,
                   )

In the above example piece of input script, the XST file is also needed for a periodic simulation to perfrom PBC wrapping. PSF is preferred to PDB for creating the fragment object if the snapshots are taken for doing QM/MM calcualtions, because PSF contains charge information, though no coordinates, that can be saved in the PQR-formatted snapshots for QM/MM calculations to start with. If the snapshots are taken using a PDB-based fragment, subsequent QM/MM calculations must load charges separately.