Fragment objects
- class Fragment
Fragment objects are used to hold geometry data of molecules and structures, along with their boundary conditions.
Fragments can be defined by setting the object attributes directly,
or by reading in from XYZ ('.xyz'), ChemShell punch ('.pun') or chemical JSON ('.cjson') files as follows:
h2o = Fragment(coords='water.xyz')
The h2o fragment defined in this way can then be used in calculations by passing the argument frag=h2o as appropriate.
Fragment properties
- totalcharge
Total charge of the system
- connect_mode
(Default: 'covalent') Rules for determining atomic connectivity (chemical bonds). Two atoms are considered to be connected if:
R_ij < scale*(rad_i + rad_j) + toler
where R_ij is the distance between the two atoms, the atomic radii (rad_i and rad_j) are controlled by the value of connect_mode, scale is connect_scale and toler is connect_toler.
Allowed values:
'covalent': Covalent radii will be used to calculate connectivity.
'ionic': Ionic radii will be used to calculate connectivity.
None: No connections will be calculated.
The connectivity is particularly important for MM forcefields where bonded interactions are defined.
Note
For connect_mode='ionic', the distance cut-off between ions of the same sign is further scaled down by a factor of 0.5 to exclude unintentional coordination.
- connect_scale
(Default: 1.0) Connectivity scale (see connect_mode).
- connect_toler
(Default: 0.5) Connectivity tolerance (see connect_mode).
- cell
Object of <class Cell> holding unit cell information for periodic systems
- natoms
Number of atoms in the fragment
- nbqs
Number of point charges in the fragment
- nshells
Number of shells in the fragment
- names
Array of atom labels
- coords
Array of atom coordinates in a.u. (natoms, 3)
- znums
Array of atomic numbers (natoms)
- charges
Array of atom charges (natoms)
- masses
Array of atom masses (natoms)
- bqs
Object holding point charge information
- shells
Object holding shell information
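The connectivity rule given under connect_mode can be illustrated with a few lines of plain Python. This is a minimal sketch and not ChemShell's implementation: the function name is_connected, the same_sign_ions flag and the numerical radii are hypothetical, and it assumes that the factor of 0.5 for same-sign ions multiplies the whole cut-off.

```python
def is_connected(r_ij, rad_i, rad_j, scale=1.0, toler=0.5, same_sign_ions=False):
    """Apply R_ij < scale*(rad_i + rad_j) + toler.

    All lengths must be given in the same unit. The defaults mirror the
    documented defaults of connect_scale (1.0) and connect_toler (0.5).
    """
    cutoff = scale * (rad_i + rad_j) + toler
    if same_sign_ions:
        # Assumption: the 0.5 factor scales the complete cut-off distance.
        cutoff *= 0.5
    return r_ij < cutoff


# Hypothetical radii and distance, chosen only to exercise both branches.
print(is_connected(1.8, 0.6, 1.2))                       # cut-off is about 2.3
print(is_connected(1.8, 0.6, 1.2, same_sign_ions=True))  # cut-off is about 1.15
```

Increasing scale or toler loosens the criterion, so more atom pairs are treated as bonded.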
Fragment enquiry
- Fragment.centroid()
Return the centroid of the fragment
- Fragment.radius()
Return the radius of the smallest sphere which, when centred at the centroid of the fragment, encloses all the atoms
- Fragment.angle(i, j, k)
Return the angle formed by atoms i-j-k in degrees
- Fragment.dihedral(i, j, k, l)
Return the torsion angle of atoms i-j-k-l in degrees
- Fragment.hascharges()
Return True if any atom charges have been set
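The geometric quantities returned by the enquiry methods can be reproduced by hand for small test cases. The sketch below uses only the Python standard library; the function names are hypothetical, the functions take coordinates rather than atom indices, and the centroid is assumed to be the unweighted mean of the atom positions.

```python
import math


def centroid(coords):
    """Unweighted mean position of a list of (x, y, z) points."""
    n = len(coords)
    return [sum(p[k] for p in coords) / n for k in range(3)]


def enclosing_radius(coords):
    """Largest distance from the centroid to any point."""
    c = centroid(coords)
    return max(math.dist(c, p) for p in coords)


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    u = [a[k] - b[k] for k in range(3)]
    v = [c[k] - b[k] for k in range(3)]
    cos_t = sum(x * y for x, y in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


square = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, -1.0, 0.0]]
print(centroid(square))
print(enclosing_radius(square))
print(angle_deg(square[0], [0.0, 0.0, 0.0], square[2]))
```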
Fragment functions
- Fragment.save(filename=None, format='cjson')
Write out fragment data to disk using the specified file format ('cjson', 'punch' or 'xyz').
- Fragment.connect()
Compute connectivity table
- Fragment.connIJ(i, j)
Add connection between atoms i and j
- Fragment.delete(todelete)
Delete an atom or range of atoms
- Fragment.insert(toinsert, pos=0)
Insert an atom or range of atoms at position pos
- Fragment.merge(frag2)
Merge fragments
- Fragment.addCharges(dictionary_of_charges)
Add charges to labelled species as appropriate.
- Fragment.addShells(...)
Add shells to core species with charges as specified.
Atoms selecting methods
ChemShell Fragment objects have a range of methods for selecting particular atoms.
- Fragment.select(radius=0.0, atoms='', indices=[], regions=[], invert=False, union=False, return_masks=False, **kwargs)
- Parameters
radius (float, optional) – Select atoms that lie within the radius distance (in a.u.) from the centre. See: Fragment.selectByRadius()
atoms (str or list, optional) – See: Fragment.selectByAtoms()
indices (list, optional) – Select atoms by using one or more Python sequences. See: Fragment.selectByIndices()
regions (list, optional) – See: Fragment.selectByRegions()
invert (bool, optional) – Invert the selection
union (bool, optional) – Return the union of the results of all provided criteria. By default (union=False) the intersection of all criteria is returned
return_masks (bool, optional) – Return an array of masks (True and False) instead of indices
**kwargs – Other keyword options, especially the following ones for biomolecular fragments (typically created from PDB/PQR/PSF):
chainIDs (str or list) – One chain ID or a list of multiple chain IDs, defaults to []. Please note that chainIDs are strings while resIDs and segIDs are integers.
resIDs (int or list) – One residue ID or a list of multiple residue IDs, defaults to []. Please note that chainIDs are strings while resIDs and segIDs are integers.
segIDs – One segment ID or a list of multiple segment IDs, defaults to []. Please note that chainIDs are strings while resIDs and segIDs are integers.
residues (str or list) – One residue name or a list of multiple residue names, defaults to []
segments (str or list) – One segment name or a list of multiple segment names, defaults to []
truncate_by (str) – Truncating method for side_chain, defaults to '' (no truncating). ChemShell currently supports 'alpha-beta', which selects the side chain atoms of amino acids if side_chain=True or the backbone atoms of amino acids if side_chain=False
side_chain (bool) – Select only the side chain if truncate_by is not '' (for example, 'alpha-beta'), defaults to True. side_chain=False selects the backbone atoms if truncate_by='alpha-beta'. Currently amino acids are supported
ff (str) – Name of the forcefield scheme, defaults to 'charmm'
- Returns
Return a sorted, non-repeating NumPy array of selected atom indices according to the provided criterion (criteria). If no valid criterion is provided, an empty array is returned (if return_masks=False)
- Return type
NumPy array of integers (default) or booleans (return_masks=True)
- Example
my_cluster.select(regions=[2,3], atoms=['Mg','O'], strict=False, radius=10.0)
returns the indices of atoms that are in region 2 or 3, have names starting with “O” or “Mg”, and lie within 10.0 Bohr of the centre
my_protein.select(residues=['ALA','CYS'], segments='PRT2', truncate_by='alpha-beta')
returns the indices of alanine or cysteine’s side chain atoms on segment “PRT2”
my_protein.select(residues=['ALA'], truncate_by='alpha-beta', indices=[999,1000,1001], union=True)
returns the indices of alanines’ side chain atoms and atoms 999, 1000, and 1001 (please note that Python indexing starts from 0)
See also
Fragment.selectByAtoms()
Fragment.selectByIndices()
Fragment.selectByRadius()
Fragment.selectByRegions()
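The effect of the union, invert and return_masks options of Fragment.select() can be pictured as logical operations on one boolean mask per criterion. The following is a minimal sketch in plain Python, not ChemShell's code: the name combine_masks is hypothetical, plain lists stand in for NumPy arrays, and it assumes that the inversion is applied after the criteria have been combined.

```python
def combine_masks(masks, union=False, invert=False, return_masks=False):
    """Combine equal-length boolean masks, one per selection criterion."""
    if not masks:
        # Simplification of this sketch: with no criteria, return an empty list.
        return []
    reduce_fn = any if union else all  # union -> logical OR, default -> logical AND
    combined = [reduce_fn(flags) for flags in zip(*masks)]
    if invert:
        # Assumption: invert acts on the combined result.
        combined = [not flag for flag in combined]
    if return_masks:
        return combined
    return [i for i, flag in enumerate(combined) if flag]


in_region = [True, True, False, False, True]
name_match = [True, False, True, False, True]
in_radius = [True, True, True, False, False]
criteria = [in_region, name_match, in_radius]

print(combine_masks(criteria))               # [0]
print(combine_masks(criteria, union=True))   # [0, 1, 2, 4]
print(combine_masks(criteria, invert=True))  # [1, 2, 3, 4]
```

With the default union=False, adding a criterion can only shrink the selection, while union=True can only enlarge it.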
- Fragment.selectByAtoms(*args, strict=True, return_masks=False)
- Parameters
*args (int or str or bytes) – One or more arguments of the allowed types. If more than one argument is provided, the union of the results of all criteria is returned. Nested lists are not allowed. Provide int values for atomic numbers (for example, 1 matches all hydrogen atoms), and str or bytes for atom names
strict (bool, optional) – Perform strict matching for atom names. strict=False allows loose matching of atom names that start with the given keywords. For example, in this case 'H' matches 'H11' and 'H12'
return_masks (bool, optional) – Return an array of masks (True and False) instead of indices
- Returns
A sorted NumPy array of selected atom indices according to the provided matching criterion (criteria). If no valid parameter is provided (for example, no matching), an empty array is returned
- Return type
NumPy array of integers (default) or booleans (return_masks=True)
- Example
my_frag.selectByAtoms('OM', 'CLA', 7, 'FE')
returns the indices of all atoms named 'OM', 'CLA', or 'FE', as well as all nitrogen atoms (atomic number 7)
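The difference between strict and loose name matching, and the mixing of atomic numbers with atom names, can be sketched in plain Python as follows. The function match_atoms is hypothetical and takes the names and atomic numbers as explicit lists; case-sensitive comparison is an assumption of this sketch.

```python
def match_atoms(names, znums, *args, strict=True):
    """Return sorted indices of atoms matching any of the given keys.

    int keys are compared with atomic numbers; str/bytes keys with names.
    """
    selected = set()
    for key in args:
        if isinstance(key, int):
            selected.update(i for i, z in enumerate(znums) if z == key)
        elif isinstance(key, (str, bytes)):
            text = key.decode() if isinstance(key, bytes) else key
            for i, name in enumerate(names):
                hit = (name == text) if strict else name.startswith(text)
                if hit:
                    selected.add(i)
        else:
            raise TypeError("keys must be int, str or bytes")
    return sorted(selected)


names = ['O', 'H11', 'H12', 'N', 'FE']
znums = [8, 1, 1, 7, 26]

print(match_atoms(names, znums, 'H'))                # [] (no atom is named exactly 'H')
print(match_atoms(names, znums, 'H', strict=False))  # [1, 2]
print(match_atoms(names, znums, 'FE', 7))            # [3, 4]
```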
- Fragment.selectByBackbone()
- Returns
Return a sorted, non-repeating NumPy array of the indices of backbone atoms of proteins (yet to be implemented for nucleotides). Currently only the CHARMM forcefield scheme is supported
- Return type
NumPy array of integers
Note
This method is only valid for biomolecular fragments (typically created from PDB/PQR/PSF)
- Fragment.selectByCommonNames(names, ff='charmm', side_chain=True, truncate_by='alpha-beta', invert=False, return_masks=False)
- Parameters
names (str or list) – A single name or a list of multiple names. If more than one name is given, the union of the results is returned. Currently accepted names include: 'amino acids', 'ions', 'non-solvent', 'protein', 'solvent' (incl. ions), and 'water'. Residue and segment names can also be used in this method
ff (str) – Name of the forcefield scheme, defaults to 'charmm'
side_chain (bool) – Select only the side chain if truncate_by is not '' (for example, 'alpha-beta'), defaults to True. side_chain=False selects the backbone atoms if truncate_by='alpha-beta'. Currently amino acids are supported. See Fragment.select()
truncate_by (str) – Truncating method for side_chain, defaults to '' (no truncating). ChemShell currently supports 'alpha-beta', which selects the side chain atoms of amino acids if side_chain=True or the backbone atoms of amino acids if side_chain=False. See Fragment.select()
invert (bool, optional) – Invert the selection
return_masks (bool, optional) – Return an array of masks (True and False) instead of indices
- Returns
A sorted NumPy array of selected atom indices according to the given name(s). If no valid parameter is provided (for example, no matching), an empty array is returned
- Return type
NumPy array of integers (default) or booleans (return_masks=True)
- Example
my_protein.selectByCommonNames(['amino-acid', 'ions'])
returns the indices of atoms on all amino acids’ side chains (because by default side_chain=True and truncate_by='alpha-beta') and the indices of ionic atoms (see also Fragment.selectByIons()).
Note
This method is only valid for biomolecular fragments (typically created from PDB/PQR/PSF)
- Fragment.selectByIndices(*args, return_masks=False)
- Parameters
- Returns
Return a sorted, non-repeating NumPy array of selected atom indices according to the provided indexing. If no valid parameter is provided (for example, out of the range of the fragment’s atoms), an empty array is returned
- Return type
NumPy array of integers (default) or booleans (return_masks=True)
- Example
my_frag.selectByIndices(99, range(2), [[range(10,12),11,[[1000,1003]]]])
returns [0 1 10 11 99 1000 1003]
Note
Please keep in mind that Python indexing starts from 0
Note
A tuple cannot be used as an argument of Fragment.selectByIndices()
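The way nested index arguments collapse into one sorted, non-repeating result can be sketched with a small recursive helper. This is an illustration in plain Python rather than ChemShell's implementation: flatten_indices and its natoms keyword are hypothetical, out-of-range indices are assumed to be dropped silently, and raising TypeError for tuples is likewise an assumption (the documentation only states that tuples cannot be used).

```python
def flatten_indices(*args, natoms):
    """Collect ints from arbitrarily nested lists and ranges."""
    found = set()

    def visit(item):
        if isinstance(item, tuple):
            # Assumption: tuples are rejected with an exception.
            raise TypeError("tuples are not accepted as index arguments")
        if isinstance(item, int):
            if 0 <= item < natoms:  # keep only indices that exist
                found.add(item)
        elif isinstance(item, (list, range)):
            for sub in item:
                visit(sub)
        else:
            raise TypeError("unsupported index argument: %r" % (item,))

    for arg in args:
        visit(arg)
    return sorted(found)


# Same arguments as in the documented example, for a fragment of 2000 atoms.
print(flatten_indices(99, range(2), [[range(10, 12), 11, [[1000, 1003]]]], natoms=2000))
# [0, 1, 10, 11, 99, 1000, 1003]
```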
- Fragment.selectByIons(ff='charmm', invert=False, return_masks=False)
- Parameters
- Returns
Return a sorted, non-repeating NumPy array of the indices of all ionic atoms (any atom with a name among “LIT”, “SOD”, “POT”, “MG”, “CAL”, “RUB”, “CES”, “ZN2”, “CD2”, “CLA”, “BAR”, “OH”, and “SO4” for the CHARMM forcefield)
- Return type
NumPy array of integers (default) or booleans (return_masks=True)
Note
This method is only valid for biomolecular fragments (typically created from PDB/PQR/PSF)
- Fragment.selectByRadius(radius, range=[], centre=None, unit='a.u.', boundary=None, return_masks=False)
- Parameters
radius (float) – Radius distance (unit='a.u.' by default) from the centre
range (list or NumPy array, optional) – Constrain the selection to this range of atom indices. The default (range=[]) is all atoms
centre (int or list, optional) – By default (centre=None) the fragment’s centroid is taken as the centre. It is possible to assign either an atom’s index as the centre, for example, centre=100, or a list of specific 3D coordinates (in a.u.), for example, centre=[1.0,2.0,3.0]
boundary (None or str, optional) – Policy for selecting (boundary='inclusive') or unselecting (boundary='exclusive') molecules that cross the selection boundary. For biomolecular fragments (created from PDB/PQR/PSF, for example), the default policy is 'inclusive'. For other types of fragments, it defaults to None, meaning that molecules crossing the boundary are not kept whole
unit (str, optional) – Unit of the radius distance
return_masks (bool, optional) – Return an array of masks (True and False) instead of indices
- Returns
Return an array of indices of the atoms within the given radius from the centre (default centre is the fragment’s centroid). The returned array
- Return type
NumPy array of integers (default) or booleans (return_masks=True)
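The three ways of specifying the centre (centroid, atom index, explicit coordinates) can be illustrated with a short distance test in plain Python. This sketch is not the ChemShell implementation: select_by_radius and index_range are hypothetical names, all values are taken to be in a.u., the boundary policy is ignored, atoms exactly at the radius are assumed to count as selected, and the centroid is assumed to be the unweighted mean.

```python
import math


def select_by_radius(coords, radius, index_range=None, centre=None):
    """Return indices of points within `radius` of the chosen centre."""
    if centre is None:
        n = len(coords)
        origin = [sum(p[k] for p in coords) / n for k in range(3)]  # centroid
    elif isinstance(centre, int):
        origin = coords[centre]  # centre on an existing atom
    else:
        origin = list(centre)    # explicit (x, y, z) coordinates
    # An empty or missing range means that all atoms are candidates.
    candidates = sorted(set(index_range)) if index_range else range(len(coords))
    return [i for i in candidates if math.dist(origin, coords[i]) <= radius]


coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0], [0.0, 3.0, 0.0]]

print(select_by_radius(coords, 2.0, centre=0))                         # [0, 1]
print(select_by_radius(coords, 1.0, centre=[5.0, 0.0, 0.0]))           # [2]
print(select_by_radius(coords, 4.0, index_range=[1, 2, 3], centre=0))  # [1, 3]
```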
- Fragment.selectByRegions(*args, return_masks=False)
- Parameters
*args (int) – One or more int values indicating the region suffix in atom names. Namely, 1 selects all atoms with names ending with '1', for example, 'Mg1' and 'O1'. The region suffixes are defined by ChemShell’s QM/MM finite cluster model
return_masks (bool, optional) – Return an array of masks (True and False) instead of indices
- Returns
Return a sorted, non-repeating NumPy array of selected atom indices according to the given region number(s). The union of results by each region number is returned. If no valid parameter is provided, an empty array is returned
- Return type
NumPy array of integers (default) or booleans (return_masks=True)
- Example
my_frag.selectByRegions(1,2,3)
returns indices of all region 1, 2, and 3 atoms
Note
This method is for ChemShell’s QM/MM finite cluster model only
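Selecting by region amounts to matching the numeric suffix of each atom name. A minimal sketch in plain Python, assuming a simple "name ends with the region number" test as described above (the helper select_by_regions is hypothetical and operates on a plain list of names):

```python
def select_by_regions(names, *regions):
    """Return indices of atoms whose names end with any given region number."""
    suffixes = tuple(str(region) for region in regions)
    if not suffixes:
        return []  # no region given: nothing is selected
    # str.endswith accepts a tuple and is True if any suffix matches.
    return [i for i, name in enumerate(names) if name.endswith(suffixes)]


names = ['Mg1', 'O1', 'Mg2', 'O3', 'F5']

print(select_by_regions(names, 1))     # [0, 1]
print(select_by_regions(names, 2, 3))  # [2, 3]
```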
- Fragment.selectByShell(around=[], convex=True, padding=5.0, unit='a.u.', boundary=None | 'inclusive', return_masks=False)
- Parameters
around (list or NumPy array, optional) – Reference atoms around which the shell is to be cut out
convex (bool, optional) – Cut out a shell around the convex hull of the chosen reference atoms. This is the default and is faster, though it results in a shell slightly different from that of convex=False, which works with all of the chosen reference atoms
padding (float, optional) – Padding distance from the reference atoms: only atoms within this distance will be selected
unit (str, optional) – Unit for the padding distance
boundary (None or str, optional) – Policy for selecting (boundary='inclusive') or unselecting (boundary='exclusive') molecules that cross the selection boundary. For biomolecular fragments (created from PDB/PQR/PSF, for example), the default policy is 'inclusive'. For other types of fragments, it defaults to None, meaning that molecules crossing the boundary are not kept whole. See boundary of Fragment.selectByRadius()
return_masks (bool, optional) – Return an array of masks (True and False) instead of indices
- Returns
Return a sorted, non-repeating NumPy array of selected atom indices according to the given conditions. If no valid parameter is provided (for example, around is left undefined), the indices of the whole fragment are returned
- Return type
NumPy array of integers (default) or booleans (return_masks=True)
Note
Here the “shell” means a shell shape around the selected reference atoms. It should not be mistaken for the “shell” model of polarisable forcefield
Note
We have rewritten the convex=True method since v21.0.0. It now automatically includes all the interior atoms and runs much faster than convex=False. Note that the two methods result in slightly different selections, with the surface produced by the former being smoother
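The role of the padding distance can be pictured with a simple nearest-reference-atom test. The sketch below is only an illustration of that idea in plain Python: it is neither the convex hull algorithm nor necessarily what convex=False does internally. The helper within_padding is hypothetical, all values are taken to be in a.u., the boundary policy is ignored, and the reference atoms themselves are assumed to be part of the selection.

```python
import math


def within_padding(coords, around, padding=5.0):
    """Return indices of points closer than `padding` to any reference atom."""
    refs = [coords[i] for i in around]
    if not refs:
        # Mirrors the documented behaviour: no reference atoms -> whole fragment.
        return list(range(len(coords)))
    return [
        i for i, point in enumerate(coords)
        if min(math.dist(point, ref) for ref in refs) <= padding
    ]


# Five atoms on a line; atoms 0 and 1 are the reference atoms.
coords = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [4.0, 0.0, 0.0],
          [10.0, 0.0, 0.0], [20.0, 0.0, 0.0]]

print(within_padding(coords, around=[0, 1], padding=3.0))  # [0, 1, 2]
print(within_padding(coords, around=[]))                   # [0, 1, 2, 3, 4]
```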
Cell object
The cell object within the Fragment contains information on the unit cell of a periodic system.
- Fragment.ndimensions
Number of dimensions of periodicity in the system. Can be 0, 1, 2 or 3.
- Fragment.cell.consts
Lattice constants as a list (a, b, c, alpha, beta, gamma)
- Fragment.cell.vectors
Cartesian cell vectors (in a.u.)
- Fragment.cell.fractional
Array of atom fractional coordinates (natoms)
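The relationship between fractional coordinates and the Cartesian cell vectors can be written out explicitly. The following sketch in plain Python assumes that the three cell vectors are stored as rows (a, b, c), which is an assumption about the layout rather than something stated above; frac_to_cart is a hypothetical helper.

```python
def frac_to_cart(frac, vectors):
    """Cartesian position = f_a * a + f_b * b + f_c * c (row-vector layout)."""
    return [sum(frac[k] * vectors[k][d] for k in range(3)) for d in range(3)]


# Hypothetical non-orthogonal cell, lengths in a.u.
vectors = [[4.0, 0.0, 0.0],
           [2.0, 4.0, 0.0],
           [0.0, 0.0, 5.0]]

print(frac_to_cart([0.5, 0.5, 0.0], vectors))  # [3.0, 2.0, 0.0]
```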
BQs object
The BQs object within the Fragment contains information on any point charges in the system.
- Fragment.bqs.names
Array of point charge labels (nbqs)
- Fragment.bqs.coords
Array of point charge coordinates in a.u. (nbqs, 3)
- Fragment.bqs.charges
Array of point charge charges (nbqs)
Shells object
The Shells object within the Fragment contains information on any shells in the system.
- Fragment.shells.coreatoms
Array of indices of parent atoms (nshells)
- Fragment.shells.names
Array of shell labels (nshells)
- Fragment.shells.coords
Array of absolute shell coordinates in a.u. (nshells, 3)
- Fragment.shells.displace
Array of shell coordinates in a.u. relative to parent atom (nshells, 3)
- Fragment.shells.charges
Array of shell charges (nshells)
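The coords and displace arrays of the Shells object describe the same positions in two ways, linked through coreatoms. A minimal sketch of that relationship in plain Python, assuming that coreatoms holds 0-based indices into the fragment's atom coordinates and that the absolute position equals the parent position plus the displacement (shell_coords is a hypothetical helper):

```python
def shell_coords(atom_coords, coreatoms, displace):
    """Absolute shell positions from parent atom positions and displacements."""
    return [
        [atom_coords[parent][k] + shift[k] for k in range(3)]
        for parent, shift in zip(coreatoms, displace)
    ]


atom_coords = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 4.0, 0.0]]
coreatoms = [1, 2]                                  # shells sit on atoms 1 and 2
displace = [[0.5, 0.0, 0.0], [0.0, -0.25, 0.0]]

print(shell_coords(atom_coords, coreatoms, displace))
# [[2.5, 0.0, 0.0], [0.0, 3.75, 0.0]]
```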