Fragment objects

class Fragment

Fragment objects are used to hold geometry data of molecules and structures with periodic boundary conditions.

Fragments can be defined by direct setting of the object attributes, or by reading in from XYZ ('.xyz'), ChemShell punch ('.pun') or chemical JSON ('.json') files as follows:

h2o = Fragment(coords='water.xyz')

The resulting h2o fragment can then be passed to calculations with the argument frag=h2o where appropriate.
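For reference, an input file such as water.xyz can be produced with a few lines of plain Python. The sketch below assumes the usual XYZ layout (atom count, a comment line, then one "symbol x y z" line per atom, conventionally in Ångström). The helper write_xyz and the geometry values are illustrative and are not part of ChemShell.

```python
def write_xyz(filename, atoms, comment=""):
    """Write atoms [(symbol, x, y, z), ...] in the usual XYZ layout."""
    lines = [str(len(atoms)), comment]
    for symbol, x, y, z in atoms:
        lines.append(f"{symbol:<2s} {x:12.6f} {y:12.6f} {z:12.6f}")
    with open(filename, "w") as handle:
        handle.write("\n".join(lines) + "\n")


# Illustrative water geometry (approximate values, in Angstrom)
water = [
    ("O", 0.000000, 0.000000, 0.117790),
    ("H", 0.000000, 0.755453, -0.471161),
    ("H", 0.000000, -0.755453, -0.471161),
]
write_xyz("water.xyz", water, comment="water molecule")

# Inside a ChemShell script the file could then be read as in the text:
# h2o = Fragment(coords='water.xyz')
```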

Fragment properties

totalcharge

Total charge of the system

connect_mode

Rules for determining atomic connectivity.

Allowed values:

  • 'covalent':

  • 'ionic':

cell

Object of <class Cell> holding unit cell information for periodic systems

natoms

Number of atoms in the fragment

nbqs

Number of point charges in the fragment

nshells

Number of shells in the fragment

names

Array of atom labels

coords

Array of atom coordinates in a.u. (natoms, 3)

znums

Array of atomic numbers (natoms)

charges

Array of atom charges (natoms)

masses

Array of atom masses (natoms)

bqs

Object holding point charge information

shells

Object holding shell information

Fragment enquiry

Fragment.centroid()

Return the centroid of the fragment

Fragment.radius()

Return the radius of the smallest sphere which, when centred at the centroid of the fragment, encloses all the atoms
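The two quantities above can be illustrated in plain Python. This is a minimal sketch that assumes the centroid is the unweighted mean of the atomic positions (not mass-weighted); the function names are hypothetical and the code is not ChemShell's implementation.

```python
import math


def centroid(coords):
    """Unweighted mean position of a list of (x, y, z) points."""
    n = len(coords)
    return tuple(sum(p[axis] for p in coords) / n for axis in range(3))


def enclosing_radius(coords):
    """Largest distance from the centroid to any point."""
    centre = centroid(coords)
    return max(math.dist(centre, p) for p in coords)


coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 3.0, 0.0)]
print(centroid(coords))          # (1.0, 1.0, 0.0)
print(enclosing_radius(coords))  # 2.0
```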

Fragment.angle(i, j, k)

Return the angle formed by atoms i-j-k in degrees

Fragment.dihedral(i, j, k, l)

Return the torsion angle of atoms i-j-k-l in degrees
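As an illustration of what these two enquiries compute, the following sketch evaluates a bond angle and a torsion angle from raw coordinates using the standard vector formulas. The function names are hypothetical, and the sign convention of the torsion angle is an assumption that may differ from the one ChemShell uses.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def angle(pi, pj, pk):
    """Angle i-j-k in degrees, with j at the vertex."""
    u, v = _sub(pi, pj), _sub(pk, pj)
    cosine = _dot(u, v) / math.sqrt(_dot(u, u) * _dot(v, v))
    # Clamp to [-1, 1] to guard against rounding just outside the domain
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def dihedral(pi, pj, pk, pl):
    """Signed torsion angle i-j-k-l in degrees (sign convention assumed)."""
    b1, b2, b3 = _sub(pj, pi), _sub(pk, pj), _sub(pl, pk)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of planes ijk and jkl
    y = _dot(_cross(n1, n2), b2) / math.sqrt(_dot(b2, b2))
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


right_angle = angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
cis = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 1.0))
```

In this example right_angle evaluates to 90 degrees and cis to 0 degrees, since the first and last points lie on the same side of the central bond.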

Fragment.hascharges()

Return True if any atom charges have been set

Fragment functions

Fragment.save(filename=None, format='json')

Write out fragment data to disk using specified file format ('json', 'punch' or 'xyz').

Examples:

Fragment.connect()

Compute connectivity table

Fragment.connIJ(i, j)

Add connection between atoms i and j

Fragment.delete(todelete)

Delete an atom or range of atoms

Fragment.insert(toinsert, pos=0)

Insert an atom or range of atoms at position pos

Fragment.merge(frag2)

Merge fragments

Fragment.addCharges(dictionary_of_charges)

Add charges to labelled species as appropriate.

Fragment.addShells(...)

Add shells to core species with charges as specified.

Atoms selecting methods

The ChemShell Fragment objects have a range of methods for selecting particular atoms.

Fragment.select(radius=0.0, atoms='', indices=[], regions=[], invert=False, union=False, return_masks=False, **kwargs)
Parameters
  • radius (float, optional) – Select atoms that lie within the radius distance (in a.u.) from the centre. See: Fragment.selectByRadius()

  • atoms (str or list, optional) – See: Fragment.selectByAtoms()

  • indices (list, optional) – Select atoms by using one or more Python sequences. See: Fragment.selectByIndices()

  • regions (list, optional) – See: Fragment.selectByRegions()

  • invert (bool, optional) – Invert the selection

  • union (bool, optional) – Return the union of the results of all provided criteria. By default (union=False) the intersection of all criteria is returned

  • return_masks (bool, optional) – Return an array of masks (True and False) instead of indices

  • **kwargs – Other keyword options, especially the following ones for biomolecular fragments (typically created from PDB/PQR/PSF):

  • chainIDs (str or list) – One chain ID or a list of multiple chain IDs, defaults to []. Please note chainIDs are string-type while resIDs and segIDs are integers.

  • resIDs (int or list) – One residue ID or a list of multiple residue IDs, defaults to []. Please note chainIDs are string-type while resIDs and segIDs are integers.

  • segIDs (int or list) – One segment ID or a list of multiple segment IDs, defaults to []. Please note chainIDs are string-type while resIDs and segIDs are integers.

  • residues (str or list) – One residue name or a list of multiple residue names, defaults to []

  • segments (str or list) – One segment name or a list of multiple segment names, defaults to []

  • truncate_by (str) – Truncating method for side_chain, defaults to '' (no truncating). ChemShell currently supports 'alpha-beta' which selects the side chain atoms of amino acids if side_chain=True or the backbone atoms of amino acids if side_chain=False

  • side_chain (bool) – Select only the side chain if truncate_by is not '', for example 'alpha-beta', defaults to True. side_chain=False selects the backbone atoms if truncate_by='alpha-beta'. Currently only amino acids are supported

  • ff (str) – Name of the forcefield scheme, defaults to 'charmm'

Returns

Return a sorted, non-repeating NumPy array of selected atom indices according to the provided criterion (criteria). If no valid criterion is provided, an empty array is returned (if return_masks=False)

Return type

NumPy array of integers (default) or booleans (return_masks=True)

Example

my_cluster.select(regions=[2,3], atoms=['Mg','O'], strict=False, radius=10.0)

returns the indices of atoms in region 2 or 3, with names starting with “O” or “Mg”, and lying within 10.0 Bohr of the centre

my_protein.select(residues=['ALA','CYS'], segments='PRT2', truncate_by='alpha-beta')

returns the indices of alanine or cysteine’s side chain atoms on segment “PRT2”

my_protein.select(residues=['ALA'], truncate_by='alpha-beta', indices=[999,1000,1001], union=True)

returns the indices of alanines’ side chain atoms and atoms 999, 1000, and 1001 (please note in Python the indexing starts from 0)
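The difference between the default intersection and union=True, and between index and mask output, can be sketched with plain Python sets. The helper names and the per-criterion results below are hypothetical; lists stand in for the NumPy arrays that ChemShell returns.

```python
def combine(results, union=False):
    """Combine per-criterion index lists into one sorted, non-repeating list."""
    sets = [set(r) for r in results]
    if not sets:
        return []
    combined = set.union(*sets) if union else set.intersection(*sets)
    return sorted(combined)


def to_mask(indices, natoms):
    """Boolean mask of length natoms that is True at the selected indices."""
    chosen = set(indices)
    return [i in chosen for i in range(natoms)]


by_region = [0, 1, 2, 5]   # hypothetical result of a region criterion
by_name = [1, 2, 3]        # hypothetical result of an atom-name criterion

print(combine([by_region, by_name]))              # [1, 2]
print(combine([by_region, by_name], union=True))  # [0, 1, 2, 3, 5]
print(to_mask([1, 2], natoms=4))                  # [False, True, True, False]
```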

Fragment.selectByAtoms(*args, strict=True, return_masks=False)
Parameters
  • *args (int or str or bytes) – One or more arguments of the allowed types. If more than one argument is provided, the union of the results of all criteria is returned. Nested lists are not allowed. Provide int values for atomic numbers (for example, 1 matches all hydrogen atoms), and str or bytes for atom names

  • strict (bool, optional) – Perform strict matching for atom names. strict=False allows loose matching of atom names that start with the given keywords. For example, in this case 'H' matches 'H11' and 'H12'

  • return_masks (bool, optional) – Return an array of masks (True and False) instead of indices

Returns

A sorted NumPy array of selected atom indices according to the provided matching criterion (criteria). If no valid parameter is provided (for example, no matching), an empty array is returned

Return type

NumPy array of integers (default) or booleans (return_masks=True)

Example

my_frag.selectByAtoms('OM', 'CLA', 7, 'FE')

returns the indices of all atoms named as 'OM', 'CLA', or 'FE', as well as all nitrogen atoms (atomic number 7)
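The matching rules described above (integers compared with atomic numbers, strings compared with atom names either exactly or by prefix) can be sketched as follows. This is an illustration under those stated rules only: the function select_by_atoms and its explicit names and znums arguments are hypothetical, and bytes arguments are not handled.

```python
def select_by_atoms(names, znums, *args, strict=True):
    """Sorted indices of atoms matching any of the given keys."""
    selected = set()
    for key in args:
        for index, (name, znum) in enumerate(zip(names, znums)):
            if isinstance(key, int):
                hit = znum == key            # int: match the atomic number
            elif strict:
                hit = name == key            # strict: exact name match
            else:
                hit = name.startswith(key)   # loose: name starts with key
            if hit:
                selected.add(index)
    return sorted(selected)


names = ["H11", "H12", "OM", "N", "FE"]
znums = [1, 1, 8, 7, 26]

print(select_by_atoms(names, znums, "OM", 7, "FE"))     # [2, 3, 4]
print(select_by_atoms(names, znums, "H"))               # []
print(select_by_atoms(names, znums, "H", strict=False)) # [0, 1]
```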

Fragment.selectByBackbone()
Returns

Return a sorted, non-repeating NumPy array of the indices of backbone atoms of proteins (yet to be implemented for nucleotides). Currently only the CHARMM forcefield scheme is supported

Return type

NumPy array of integers

Note

This method is only valid for biomolecular fragments (typically created from PDB/PQR/PSF)

Fragment.selectByCommonNames(names, ff='charmm', side_chain=True, truncate_by='alpha-beta', invert=False, return_masks=False)
Parameters
  • names (str or list) – A single name or a list of multiple names. If more than one name is given, the union of the results is returned. Currently accepted names include: 'amino acids', 'ions', 'non-solvent', 'protein', 'solvent' (incl. ions), and 'water'. Residue and segment names can also be used in this method

  • ff (str) – Name of the forcefield scheme, defaults to 'charmm'

  • side_chain (bool) – Select only the side chain if truncate_by is not '', for example 'alpha-beta', defaults to True. side_chain=False selects the backbone atoms if truncate_by='alpha-beta'. Currently only amino acids are supported. See Fragment.select()

  • truncate_by (str) – Truncating method for side_chain, defaults to 'alpha-beta' in this method ('' means no truncating). ChemShell currently supports 'alpha-beta', which selects the side chain atoms of amino acids if side_chain=True or the backbone atoms of amino acids if side_chain=False. See Fragment.select()

  • invert (bool, optional) – Invert the selection

  • return_masks (bool, optional) – Return an array of masks (True and False) instead of indices

Returns

A sorted NumPy array of selected atom indices according to the given name(s). If no valid parameter is provided (for example, no matching), an empty array is returned

Return type

NumPy array of integers (default) or booleans (return_masks=True)

Example

my_protein.selectByCommonNames(['amino-acid', 'ions'])

returns the indices of atoms on all amino acids’ side chains (because by default side_chain=True and truncate_by='alpha-beta') and indices of ionic atoms (see also Fragment.selectByIons()).

Note

This method is only valid for biomolecular fragments (typically created from PDB/PQR/PSF)

Fragment.selectByIndices(*args, return_masks=False)
Parameters
  • *args (int or list or range) – One or more arguments of the allowed types. Nested lists of any depth can be comprehended and flattened. See the example below

  • return_masks (bool, optional) – Return an array of masks (True and False) instead of indices

Returns

Return a sorted, non-repeating NumPy array of selected atom indices according to the provided indexing. If no valid parameter is provided (for example, out of the range of the fragment’s atoms), an empty array is returned

Return type

NumPy array of integers (default) or booleans (return_masks=True)

Example

my_frag.selectByIndices(99, range(2), [[range(10,12),11,[[1000,1003]]]])

returns [0 1 10 11 99 1000 1003]

Note

Please keep in mind in Python the indexing starts from 0

Note

tuple cannot be used as an argument of Fragment.selectByIndices()
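The flattening behaviour can be reproduced in a few lines of plain Python. This is a sketch only: the helper names are hypothetical, the atom count is passed in explicitly, and silently dropping out-of-range indices is an assumption based on the description above.

```python
def flatten(item):
    """Yield the integers contained in arbitrarily nested lists and ranges."""
    if isinstance(item, (list, range)):
        for sub in item:
            yield from flatten(sub)
    else:
        yield item


def select_by_indices(natoms, *args):
    """Sorted, non-repeating indices that fall inside 0 .. natoms-1."""
    found = {i for arg in args for i in flatten(arg)}
    return sorted(i for i in found if 0 <= i < natoms)


result = select_by_indices(2000, 99, range(2), [[range(10, 12), 11, [[1000, 1003]]]])
print(result)  # [0, 1, 10, 11, 99, 1000, 1003]
```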

Fragment.selectByIons(ff='charmm', invert=False, return_masks=False)
Parameters
  • ff (str) – Name of the forcefield scheme, defaults to 'charmm'

  • invert (bool, optional) – Invert the selection

  • return_masks (bool, optional) – Return an array of masks (True and False) instead of indices

Returns

Return a sorted, non-repeating NumPy array of the indices of all ionic atoms (any atom with a name among “LIT”, “SOD”, “POT”, “MG”, “CAL”, “RUB”, “CES”, “ZN2”, “CD2”, “CLA”, “BAR”, “OH”, and “SO4” for the CHARMM forcefield)

Return type

NumPy array of integers (default) or booleans (return_masks=True)

Note

This method is only valid for biomolecular fragments (typically created from PDB/PQR/PSF)
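A simple way to picture this selection, including the effect of invert, is a membership test against the list of CHARMM ion names quoted above. The function select_ions is a hypothetical stand-in operating on a plain list of atom names, not ChemShell's implementation.

```python
# Ion names for the CHARMM scheme, as listed in the text above
CHARMM_ION_NAMES = {"LIT", "SOD", "POT", "MG", "CAL", "RUB", "CES",
                    "ZN2", "CD2", "CLA", "BAR", "OH", "SO4"}


def select_ions(names, invert=False):
    """Indices of atoms whose name is an ion name (or is not, if invert=True)."""
    return [i for i, name in enumerate(names)
            if (name in CHARMM_ION_NAMES) != invert]


names = ["CA", "SOD", "OH2", "CLA"]
print(select_ions(names))               # [1, 3]
print(select_ions(names, invert=True))  # [0, 2]
```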

Fragment.selectByRadius(radius, range=[], centre=None, unit='a.u.', boundary=None, return_masks=False)
Parameters
  • radius (float) – Radius distance (unit='a.u.' by default) from the centre

  • range (list or NumPy array, optional) – Constrain the selection to this range of atom indices. By default (range=[]) all atoms are considered

  • centre (int or list, optional) – By default (centre=None) the fragment’s centroid is taken as the centre. It is possible to assign either an atom’s index as the centre, for example, centre=100, or a list of specific 3D coordinates (in a.u.), for example, centre=[1.0,2.0,3.0]

  • boundary (None or str, optional) – Policy for molecules that straddle the selection boundary: select them whole (boundary='inclusive') or leave them out (boundary='exclusive'). For biomolecular fragments (created from PDB/PQR/PSF, for example), the default policy is 'inclusive'. For other types of fragments, it defaults to None, meaning that boundary molecules are not kept whole

  • unit (str, optional) – Unit of the radius distance

  • return_masks (bool, optional) – Return an array of masks (True and False) instead of indices

Returns

Return an array of indices of the atoms within the given radius from the centre (default centre is the fragment’s centroid). The returned array

Return type

NumPy array of integers (default) or booleans (return_masks=True)
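The geometric part of this selection can be sketched in plain Python as below. The function name, the index_range argument (standing in for range) and the use of "less than or equal" at the cut-off are assumptions; the boundary policy for whole molecules and unit conversion are not modelled.

```python
import math


def select_by_radius(coords, radius, centre=None, index_range=None):
    """Indices of points within radius of the centre.

    centre may be None (use the centroid), an atom index, or [x, y, z].
    """
    if centre is None:
        n = len(coords)
        centre = [sum(p[axis] for p in coords) / n for axis in range(3)]
    elif isinstance(centre, int):
        centre = coords[centre]
    candidates = index_range if index_range else range(len(coords))
    return sorted(i for i in candidates
                  if math.dist(coords[i], centre) <= radius)


coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(select_by_radius(coords, 2.0, centre=0))                # [0, 1]
print(select_by_radius(coords, 1.0, centre=[5.0, 0.0, 0.0]))  # [2]
print(select_by_radius(coords, 1.5))                          # [1]
```

In the last call the centre defaults to the centroid, which is (2.0, 0.0, 0.0) for these three points.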

Fragment.selectByRegions(*args, return_masks=False)
Parameters
  • *args (int) – One or more integers indicating the region suffix in atom names. For example, 1 selects all atoms with names ending with '1', such as 'Mg1' and 'O1'. The region suffixes are defined by ChemShell’s QM/MM finite cluster model

  • return_masks (bool, optional) – Return an array of masks (True and False) instead of indices

Returns

Return a sorted, non-repeating NumPy array of selected atom indices according to the given region number(s). The union of results by each region number is returned. If no valid parameter is provided, an empty array is returned

Return type

NumPy array of integers (default) or booleans (return_masks=True)

Example

my_frag.selectByRegions(1,2,3)

returns indices of all region 1, 2, and 3 atoms

Note

This method is for ChemShell’s QM/MM finite cluster model only
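Since region membership is encoded as a suffix of the atom name, the selection can be pictured as a suffix test. The sketch below uses a hypothetical function select_by_regions on a plain list of names and is not ChemShell's implementation.

```python
def select_by_regions(names, *regions):
    """Indices of atoms whose name ends with one of the region numbers."""
    suffixes = tuple(str(region) for region in regions)
    # str.endswith accepts a tuple and is True if any suffix matches
    return [i for i, name in enumerate(names) if name.endswith(suffixes)]


names = ["Mg1", "O1", "Mg2", "O3", "F"]
print(select_by_regions(names, 1, 2))  # [0, 1, 2]
print(select_by_regions(names, 3))     # [3]
```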

Fragment.selectByShell(around=[], convex=True, padding=5.0, unit='a.u.', boundary=None | 'inclusive', return_masks=False)
Parameters
  • around (list or NumPy array, optional) – Reference atoms around which the shell is to be cut out

  • convex (bool, optional) – Cut out a shell around the convex hull of the chosen reference atoms. This is the default and is faster, though the resulting shell differs slightly from that of convex=False, which uses all of the chosen reference atoms

  • padding (float, optional) – Padding distance from the reference atoms: only atoms within this distance will be selected

  • unit (str, optional) – Unit for the padding distance

  • boundary (None or str, optional) – Policy for molecules that straddle the selection boundary: select them whole (boundary='inclusive') or leave them out (boundary='exclusive'). For biomolecular fragments (created from PDB/PQR/PSF, for example), the default policy is 'inclusive'. For other types of fragments, it defaults to None, meaning that boundary molecules are not kept whole. See boundary of Fragment.selectByRadius()

  • return_masks (bool, optional) – Return an array of masks (True and False) instead of indices

Returns

Return a sorted, non-repeating NumPy array of selected atom indices according to the given conditions. If no valid parameter is provided (for example around is left undefined), the indices of the whole fragment are returned

Return type

NumPy array of integers (default) or booleans (return_masks=True)

Note

Here “shell” means a shell-shaped region around the selected reference atoms. It should not be mistaken for the “shell” model of polarisable forcefields

Note

The convex=True method has been rewritten as of v21.0.0. It now automatically includes all the interior atoms and runs much faster than convex=False. Note that the two methods result in slightly different selections, with the surface produced by the former being smoother
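To give a rough picture of the padding criterion, the sketch below selects every atom that lies within the padding distance of at least one reference atom, which corresponds loosely to the convex=False description. This is an assumption-laden illustration: the function name is hypothetical, the reference atoms themselves are included in the result, and neither the convex hull variant nor the boundary policy is modelled.

```python
import math


def select_by_padding(coords, around, padding):
    """Indices of points within padding of any reference point in around."""
    return [i for i, point in enumerate(coords)
            if any(math.dist(point, coords[j]) <= padding for j in around)]


coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
print(select_by_padding(coords, around=[0, 1], padding=3.5))  # [0, 1, 2]
```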

Cell object

The cell object within the Fragment contains information on the unit cell of a periodic system.

Fragment.dim

Dimensionality of periodic system. Can be '1D', '2D' or '3D'.

Fragment.cell.consts

Lattice constants as a list (a, b, c, alpha, beta, gamma)

Fragment.cell.vectors

Cartesian cell vectors (in a.u.)

Fragment.cell.fractional

Array of atom fractional coordinates (natoms, 3)
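The relationship between fractional coordinates and the Cartesian cell vectors can be sketched as a linear combination of the three vectors. The function below is a hypothetical illustration that assumes the cell vectors are stored row-wise as (a, b, c); it does not reflect how ChemShell stores or converts them internally.

```python
def fractional_to_cartesian(fractional, vectors):
    """Convert fractional coordinates to Cartesian ones.

    vectors is assumed to hold the cell vectors a, b, c as rows.
    """
    return [
        tuple(sum(frac[k] * vectors[k][axis] for k in range(3))
              for axis in range(3))
        for frac in fractional
    ]


vectors = [[10.0, 0.0, 0.0],
           [0.0, 10.0, 0.0],
           [0.0, 0.0, 20.0]]
print(fractional_to_cartesian([(0.5, 0.5, 0.25)], vectors))  # [(5.0, 5.0, 5.0)]
```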

BQs object

The BQs object within the Fragment contains information on any point charges in the system.

Fragment.bqs.names

Array of point charge labels (nbqs)

Fragment.bqs.coords

Array of point charge coordinates in a.u. (nbqs, 3)

Fragment.bqs.charges

Array of point charge values (nbqs)

Shells object

The Shells object within the Fragment contains information on any shells in the system.

Fragment.shells.coreatoms

Array of indices of parent atoms (nshells)

Fragment.shells.names

Array of shell labels (nshells)

Fragment.shells.coords

Array of absolute shell coordinates in a.u. (nshells, 3)

Fragment.shells.displace

Array of shell coordinates in a.u. relative to parent atom (nshells, 3)
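The absolute and relative shell coordinates are linked through the parent atom positions: each displacement is the shell position minus the position of its core atom. The sketch below illustrates this with plain lists; the function name and argument layout are hypothetical.

```python
def shell_displacements(atom_coords, coreatoms, shell_coords):
    """Shell positions relative to their parent (core) atoms."""
    return [
        tuple(shell[axis] - atom_coords[core][axis] for axis in range(3))
        for core, shell in zip(coreatoms, shell_coords)
    ]


atom_coords = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
coreatoms = [1]                      # the single shell belongs to atom 1
shell_coords = [(1.5, 2.0, 2.75)]    # absolute shell position
print(shell_displacements(atom_coords, coreatoms, shell_coords))
# [(0.5, 0.0, -0.25)]
```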

Fragment.shells.charges

Array of shell charges (nshells)