.. _p450_qmmm: ************************************* QM/MM calculations on the P450 enzyme ************************************* .. note:: The scripts for this step of the tutorial can be found in ``p450_enzyme/2_qmmm``. In this stage we perform QM/MM calculations on P450cam based on our non-periodic model of the solvated protein. A single-point QM/MM calculation ================================ We are now ready to run a QM/MM energy evaluation on our protein with solvent shell. The script to run this is ``p450_qmmm_sp.py``. First, read in the Fragment from the previous stage, again in PQR format to retain charges:: # Read in protein system with solvent shell in PQR format (including charges) p450_w_water_shell = Fragment(coords='p450_w_water_shell.pqr') We next need to define the QM region in the QM/MM calculation, by building up atom lists corresponding to the haem and its oxygen and cysteinate ligands. Note to reduce the computational cost of the calculation we do not include haem side chains, or the camphor molecule. The region can be defined either explicitly with atom numbers (counting from zero) or through ChemShell atom selection commands:: # Haem (without side chains) list_indices_haem_wo_sidechain = list(range(6384, 6413)) # Oxygen: atoms 6355, 6356 list_indices_o2 = p450_w_water_shell.select(residues='O2').tolist() # CYS347 (truncated as SCH3): atoms 5445, 5446, 5447, 5448 list_indices_cys347 = p450_w_water_shell.select(residues='CYS', resIDs=347, truncate_by='alpha-beta', visualise='vwd').tolist() .. note :: In a production calculation, we would usually include the camphor molecule in the QM region and also consider what further amino acid residues in the active site should be included to give an accurate description of reactivity. The lists can then be concatenated to define the full QM region:: # Concatenate the lists to form the QM region indices_qm_region = list_indices_haem_wo_sidechain + list_indices_o2 + list_indices_cys347 The charge of the QM region must next be defined, which is calculated based on the classical forcefield charges then internally rounded to the nearest integer:: # The given value for QM charge will be rounded to the nearest integer qm_charge = p450_w_water_shell.charges[indices_qm_region].sum() In the case of our system, the resulting charge will be -1 due to the cysteine ligand. For convenience, the QM region can then be saved to be checked in a visualiser:: # Save for viewing frag_qm_region = p450_w_water_shell.getSelected(indices_qm_region) frag_qm_region.save('qm_region.pdb') .. note :: ChemShell can also print a sequence of atoms that can be used in e.g. VMD to visualise the QM region:: print("\n To visualise the QM region with VMD:") vmd_select = 'index ' + ' or index '.join([ str(s) for s in indices_qm_region ]) print('\n' + vmd_select) Once the QM region has been fully defined, a QM Theory object should be specified. In the example script, we use GAMESS-UK, but other QM codes may be used. The commonly-used functional B3LYP is used, with the 6-31G* for all atoms except Fe, where LANL2 with corresponding ECP is used. The basis set and ECP definitions are kept in separate files. We supply the charge of the QM system based on the calculation earlier in the input, while we have to specify a spin multiplicity. In the case of the haem-O2 complex the lowest energy state is a triplet. The resulting specification is as follows.:: # Define QM method my_qm = GAMESS_UK(method='dft', functional='b3lyp', atom_names='symbol', basis='hcno_6-31gp_fe_lanl2dz_gamess-uk.bas', ecp='fe_lanl2dz_gamess-uk.ecp', charge=qm_charge, mult=3, scftype='uhf', maxcycles=2000, restart= False) .. note :: In this example the ``atom_names`` option is specified to use simple element names as used in the basis set and ECP files. We now define the forcefield for the MM part of the calculation. For the QM/MM stage we use DL_POLY for the MM calculation rather than NAMD. The DL_FIELD package can be used in this case to convert CHARMM forcefield data into DL_POLY format. In the first step, we convert the user-defined camphor forcefield files into DL_FIELD format:: # First, convert user-defined CHARMM files for camphor to DL_FIELD format udff = DL_FIELD.charmm2udff(rtf='../1_solvation/camphor.rtf', prm='../1_solvation/camphor.prm', version='CHARMM36_prot') .. note :: The above step is only necessary if there are user-defined parameters, as we have in this case for camphor. Standard CHARMM parameters will be defined automatically by DL_FIELD. We can now combine the user-defined camphor forcefield with the default CHARMM36_prot forcefield using DL_FIELD:: # Now use DL_FIELD to define the forcefield using built-in CHARMM36_prot # NB: cutoff distance defined in a.u. my_ff = DL_FIELD(ff=['CHARMM36_prot', udff], override_builtin=False, original_tip3p=False) with the options: * ``override_builtin`` specifies whether user-defined parameters should override those of the standard CHARMM forcefield (default: False) * ``original_tip3p`` specifies whether to apply the original TIP3P model (default: False) Once the forcefield is defined, the ``DL_POLY`` Theory object can be straightforwardly specified as:: # Define the MM method (DL_POLY) using the forcefield defined by DL_FIELD my_mm = DL_POLY(ff=my_ff, dl_field_charges=False) .. note :: To keep the PQR charges, ``dl_field_charges=False`` must be specified; otherwise they will be overridden by charges provided by DL_FIELD. Then we define the QM/MM calculation as in previous tutorials, and finally run the calculation itself:: # Define the QM/MM calculation my_qmmm = QMMM(frag=p450_w_water_shell, qm=my_qm, mm=my_mm, qm_region=indices_qm_region) # Run a single-point QM/MM energy evaluation my_sp = SP(theory=my_qmmm) my_sp.run() QM/MM geometry optimisation =========================== Geometry optimisations on the QM/MM system are run in a similar manner to single point calculations, see ``p450_qmmm_opt.py``. In order to run a geometry optimisation an active region must be defined, beyond which the protein/solvent system is fixed to ensure that it does not lose integrity. The active region can be defined as all atoms within a certain distance of the QM region, using another shell selection command:: # Define active region for geometry optimisation indices_active_region = p450_w_water_shell.selectByShell(around=indices_qm_region, padding=7.0, unit='angstrom', boundary='incl') # Save for viewing frag_active_region = p450_w_water_shell.getSelected(indices_active_region) frag_active_region.save('active_region.pdb') Again, the region can be checked later by visualising ``active_region.pdb``. The active region should then be fed into the ``Opt()`` task as follows:: # Run QM/MM geometry optimisation with DL-FIND my_opt = Opt(theory=my_qmmm, active=indices_active_region, maxcycle=1000, maxene=2000) my_opt.run() Porting from other workflows ============================ In case only a PDB but no PQR file is available, for example, when the system has been prepared by other workflows (such as AMBER), we can use the DL_FIELD charges alternatively. ``p450_qmmm_sp_from_pdb.py`` is an example script demonstrating this use case:: p450_w_water_shell = Fragment(coords='p450_w_water_shell.pdb') Then switch on the ``dl_field_charges`` option (default: ``True``) of the ``DL_POLY`` object:: my_mm = DL_POLY(ff=my_ff, dl_field_charges=True) The QM object's ``charge`` option (default: ``0``) can use the value ``'automatic'`` (or equally ``'dl_field'``) to let ChemShell determine it by summing up the formal charges assigned by DL_FIELD (Of course, you can use a number you want, too, such as ``-2``, ``-1``, ``0``, ``1``, ``-2``, or so):: my_qm = GAMESS_UK(method='dft', functional='b3lyp', atom_names='symbol', basis='hcno_6-31gp_fe_lanl2dz_gamess-uk.bas', ecp='fe_lanl2dz_gamess-uk.ecp', charge='automatic', mult=3, scftype='uhf', maxcycles=2000, restart= False) Alternatively, atom charges can also be explicitly imported from a list/an array:: list_charges = [ 1.0, 2.0, 3.0,... ] p450_w_water_shell = Fragment(coords='p450_w_water_shell.pdb', charges=list_charges) In this case it is necessary to set ``dl_field_charges=False`` in the ``DL_POLY`` object to prevent the charges being overwritten by DL_FIELD. To be continued =============== In future versions of this tutorial, we will show how to prepare the active oxidising species Compound I, and explore the reaction path for the hydroxylation of camphor.