to segfault when a system not listed in the MMFFBndk
table was found. The Herschbach-Laurie fallback described in
MMFF.V was implemented, and a relevant test was added
to testMMFFHelpers.cpp
- fixed a bug in Code/GraphMol/ForceFieldHelpers/MMFF/AtomTyper.cpp
which caused misassignment of atom types in CYGUAN01 upon shuffling
the order of atoms in the validation SDF files
- added checks that acos and asin function arguments lie within
the [-1, 1] range
Code/ForceField/MMFF/Params.h, Code/ForceField/UFF/Params.h,
Code/GraphMol/ForceFieldHelpers/MMFF/AtomTyper.cpp
and Code/GraphMol/ForceFieldHelpers/MMFF/AtomTyper.h (I realized
their uselessness thanks to a warning issued by Intel C++ compiler)
- refactored O3A code
- added the possibility to set weighted constraints on selected
atom pairs
- added an option to carry out local-only optimization
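A minimal sketch of the kind of acos/asin range check mentioned above. It assumes the check is done by clamping the argument to [-1, 1]; the log does not say how the C++ code handles out-of-range values, and the function names here are hypothetical.

```python
import math


def safe_acos(x):
    """Clamp x to [-1, 1] before calling acos (clamping is an assumption)."""
    return math.acos(max(-1.0, min(1.0, x)))


def safe_asin(x):
    """Clamp x to [-1, 1] before calling asin (clamping is an assumption)."""
    return math.asin(max(-1.0, min(1.0, x)))


def angle_between(u, v):
    """Angle in radians between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Rounding can push the cosine slightly outside [-1, 1] for
    # (anti)parallel vectors, which would make a bare acos() fail.
    return safe_acos(dot / (norm_u * norm_v))
```

Without the clamp, math.acos(1.0000000000000002) raises ValueError in Python; the unguarded C++ call would return NaN instead.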
whose absence might cause intermittent problems when parsing the
logs on Windows, because tellg/seekg do not handle CR/LF correctly
- Fixed the code for assigning the HOCN MMFF94 atom type
(thanks to Toby Wright for reporting this)
- Added a missing copyright notice in testMMFFForceField.h
This covers at least the specific instances from the bug report.
Still need to figure out a general way to identify these automatically
to make sure all are fixed.
removed when the heavy atom they are connected to is not
in its default valence state, while it has one of the
acceptable valence states (otherwise it still has to be
removed for sanitization purposes)
- updated the MMFF validation suite SMILES accordingly
Code/ForceField/MMFF/testMMFFForceField.h
- re-prepared the SDF files from their MOL2 counterparts present in the
original MMFF validation suites. For this purpose a C++ program was
written which only uses information in MOL2 files and in .fc files
provided by Halgren. The C++ program does not depend on RDKit.
- re-prepared the SMILES files from their SDF counterparts using
a Python script which calls MolToSmiles()
- found an issue which affects 2 files in the test suite, namely
ERULE_05 and PO02A (only the hypervalent notation). The issue is
connected with removal of hydrogens bonded to phosphorus and
appears to be fixed by the modifications in Code/GraphMol/AddHs.cpp
and Code/GraphMol/SmilesParse/SmilesWrite.cpp. This issue is
independent of the changes in the SDF files; indeed, it has
always been present, and had previously been addressed by
manual correction of the two SMILES strings
data races in the multithreaded test
- Removed a spurious #include in Code/ForceField/Wrap/ForceField.cpp
- Restored caching in Code/GraphMol/Descriptors/Crippen.cpp
to the current value) (C++/Python)
- added absolute/relative AngleConstraints (C++/Python)
- added absolute/relative TorsionConstraints (C++/Python)
- added PositionConstraints (C++/Python)
- exposed fixedPoints from Python
- added relevant C++/Python tests
- removed a number of redundant "this->" in member functions
- moved some getGrad() code into Utils::calcAngleBendGrad and
Utils::calcTorsionGrad to avoid repeating the same code
for constraints
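A rough sketch of what an absolute versus relative constraint could look like. It assumes a flat-bottomed harmonic penalty and assumes that "relative" means the bounds are offsets added to the current value; neither detail is spelled out in this log, and the function names are hypothetical rather than RDKit API.

```python
def resolve_bounds(current, lower, upper, relative):
    """Return absolute (lower, upper) bounds.

    If relative is True, lower/upper are treated as offsets from the
    current value of the coordinate (assumption, see lead-in).
    """
    if relative:
        return current + lower, current + upper
    return lower, upper


def constraint_energy(value, lower, upper, force_constant):
    """Flat-bottomed harmonic penalty: zero inside [lower, upper]."""
    if value < lower:
        delta = value - lower
    elif value > upper:
        delta = value - upper
    else:
        delta = 0.0
    return 0.5 * force_constant * delta * delta


# Keep a distance within +/- 0.1 of its current value of 1.5
lo, hi = resolve_bounds(1.5, -0.1, 0.1, relative=True)
penalty = constraint_energy(2.0, lo, hi, force_constant=100.0)
```

The same two helpers apply to distances, angles and torsions alike; only the way the current value is measured differs.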
constructor (which is overkill, if the molecule had already been
sanitized) with a call to MolOps::Kekulize(). Thus it is not
necessary to call Kekulize() either from Python or from C++,
and no changes are required to the scripts/source code
previously used for UFF
- removed the code which throws an exception asking to reload the
molecule with sanitize=false since it is not necessary:
only one test in the MMFF validation suite fails if the
molecule is aromatized and then re-kekulized (CIKSEU10), and
it represents a case where the position of double bonds in
a conjugated, non-aromatic system makes a difference for atom
type assignments, which in general makes little sense. This is not
due to a bug in the code, but rather depends on MMFF atom
typing rules. Hence, I kept the sanitize=false and the call to
sanitizeMMFFMol() in testMMFFForceField.cpp, but I would not
generalize this requirement to "normal" molecules, because it
is really not necessary, since you do not have a reference
kekulization to refer to in the real world.
- updated Docs/Book/GettingStartedInPython.rst accordingly
- updated tests accordingly
to a newly allocated MMFFMolProperties object) with a simple
constructor of the MMFFMolProperties object
- Replaced in a few MMFF-related functions the "ROMol *" argument
with a "ROMol &" argument for consistency with similar RDKit
functions
- Renamed the SetupMMFFForceField() function in Python to
GetMMFFMolProperties() for consistency
- Updated the MMFF tests according to the aforementioned changes
in the API
4-membered rings containing sp2 atoms. The hack consists of
altering the theta0 equilibrium angle on the fly, depending on
ring size and on where the two edges of the angle lie (i.e.,
both edges inside the ring, or one inside and one outside)
- Added a relevant test in
Code/GraphMol/ForceFieldHelpers/UFF/testUFFHelpers.cpp
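To illustrate the idea behind the theta0 adjustment, here is a purely geometric sketch. It assumes an idealized planar sp2 center whose in-ring angle is that of a regular polygon and whose remaining angular space is split equally between the two angles with one edge outside the ring. These values are an assumption for illustration and are not necessarily the ones used in the UFF code; the function name is hypothetical.

```python
def ring_theta0(ring_size, both_edges_in_ring):
    """Idealized equilibrium angle (degrees) at a planar sp2 ring atom.

    ring_size: 3 or 4 (the small rings the hack targets).
    both_edges_in_ring: True if both bonds forming the angle are ring
    bonds, False if one is in the ring and the other is exocyclic.
    """
    if ring_size not in (3, 4):
        raise ValueError("only 3- and 4-membered rings are handled here")
    # Internal angle of a regular polygon with ring_size vertices
    internal = 180.0 * (ring_size - 2) / ring_size
    if both_edges_in_ring:
        return internal
    # The two exocyclic angles share what is left of 360 degrees
    return (360.0 - internal) / 2.0
```

Under these assumptions, a cyclopropene-like sp2 atom gets 60 degrees inside the ring and 150 degrees for each angle involving the exocyclic substituent, instead of the default 120 degrees.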
to original Rappe' UFF equations are in Code/ForceField/UFF/AngleBend.cpp
and Code/ForceField/UFF/BondStretch.cpp; the changes in
Code/ForceField/UFF/TorsionAngle.cpp are purely cosmetic.
Tests modified according to the small differences in geometries
and energies following the implementation of those corrections:
- Code/ForceField/UFF/testUFFForceField.cpp,
- Code/GraphMol/DistGeomHelpers/Wrap/testDistGeom.py,
- Code/GraphMol/DistGeomHelpers/testDgeomHelpers.cpp,
- Code/GraphMol/MolChemicalFeatures/Wrap/testChemicalFeatures.py,
- rdkit/Chem/Pharm3D/UnitTestEmbed.py
Coordinate files modified according to the small differences in
geometries following the implementation of those corrections:
- Code/GraphMol/DistGeomHelpers/test_data/initCoords.random.sdf
- added #ifdef M_PI (...) #endif in all relevant places
- made the length() and sqLength() methods consistent
with respect to usage of pow(x, 2) vs x*x in
Code/Geometry/point.h
- removed gzip-related boost.iostreams dependency and
replaced with portable "cmake -E tar xzf" command
in Code/ForceField/MMFF/CMakeLists.txt
incorrect typing might arise when hydrogens were not added after
generating 3D coordinates from SMILES strings; now all 761 test molecules
are correctly typed no matter whether hydrogens are explicit or implicit
- MMFF test suite: I have cut the MMFF94/MMFF94s reference log files down
to the bare essentials, but their size could be reduced only by about
30%. It could have been reduced further by converting multiple spaces
into a tab, but the MMFF94 file (the larger one) would still be around
11 MB, and human readability would be greatly impaired. Hence I decided
to keep the spaces and gzip the reference logs, which reduces their
combined size to ~ 3.5 MB, which I think is fine; the test program checks
if the gunzipped files already exist, otherwise it gunzips them upfront.
While cutting, I also sorted the molecules in the same order as in the
SDF/SMILES files, so that it runs about 10 times faster than before.
Now the test runs on MMFF94 only (MMFF94s only concerns different OOP
parameters, there are no algorithmic differences, so as long as one does
not alter the original parameters it can be safely skipped), computing
every 4th molecule, and it runs in 12 seconds on my laptop. Running
all molecules takes ~ 50 seconds, but I think it is rather overkill,
and I would keep it as it is.
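The two test-suite tricks described above (gunzip the reference logs only if needed, and compute every 4th molecule) can be sketched as follows. The actual test program is C++; this is only an illustration, the helper names are hypothetical, and starting the subsampling at the first molecule is an assumption.

```python
import gzip
import os
import shutil


def ensure_gunzipped(gz_path):
    """Decompress gz_path next to itself unless the output already exists.

    Returns the path of the uncompressed file.
    """
    if not gz_path.endswith(".gz"):
        raise ValueError("expected a .gz file")
    out_path = gz_path[:-3]
    if not os.path.exists(out_path):
        with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return out_path


def every_nth(items, n=4):
    """Return every n-th item, starting from the first one."""
    return items[::n]
```

With 761 molecules in the suite, every_nth() with n=4 selects 191 of them.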
- I have added a test suite for MMFF ForceFieldHelpers (like the one
already existing for UFF); I have also complemented the Python wrapper
test suite for ForceFieldHelpers with a few tests for MMFF.
- I have written Python wrappers for the MMFF-related functionality;
while doing that I realized that many of the wrapper code relocations
that I made in my previous pull request were not necessary/appropriate,
so I reverted them. The only difference from the UFF Python API is that,
just like for the C++ API, in addition to the PyForceField object there
is a PyMMFFMolProperties object which is created before constructing the
force field itself; the PyMMFFMolProperties is necessary to set (e.g.,
dielectric constant, dielectric model) or get (e.g., atom type, formal
and partial charge) some MMFF properties which are not present in UFF,
while preserving binary compatibility of the libraries. You probably
remember that we discussed setting atom type and charge properties
with SetProp in addition to populating the MMFFMolProperties object, in
order to allow easy access for Python users. However, I think that the
solution I adopted is preferable since it is more consistent with the C++ API,
it enables faster access to properties and it allows tailoring the MMFF
environment (i.e., choosing MMFF94/MMFF94s, setting the verbosity level,
including/excluding terms from the MMFF equation, setting dielectric
constant/model) just as from C++.
The MMFF-related Python functions I implemented are:
* MMFFOptimizeMolecule(mol, mmffVariant = "MMFF94", maxIters = 200,
nonBondedThresh = 100.0, confId = -1, ignoreInterfragInteractions
= True)
uses MMFF to optimize a molecule's structure (just like
UFFOptimizeMolecule)
* SanitizeMMFFMol(mol)
sanitizes a molecule according to MMFF requirements
* SetupMMFFForceField(mol, mmffVariant = "MMFF94", mmffVerbosity = 0)
returns a PyMMFFMolProperties object for a molecule; the
PyMMFFMolProperties object is required by MMFFGetMoleculeForceField()
and can be used to get/set MMFF properties
* MMFFGetMoleculeForceField(mol, pyMMFFMolProperties,
nonBondedThresh = 100.0, confId = -1, ignoreInterfragInteractions
= True)
returns a MMFF force field for a molecule (just like
UFFGetMoleculeForceField)
* MMFFHasAllMoleculeParams(mol)
checks if MMFF parameters are available for all of a molecule's atoms
(just like UFFHasAllMoleculeParams)
There are also a few methods connected to the PyMMFFMolProperties class
which mirror those available from C++ for the MMFFMolProperties class:
* GetMMFFAtomType(idx)
Retrieves MMFF atom type for atom with index idx
* GetMMFFFormalCharge(idx)
Retrieves MMFF formal charge for atom with index idx
* GetMMFFPartialCharge(idx)
Retrieves MMFF partial charge for atom with index idx
* SetMMFFDielectricModel(dielModel = 1)
Sets the DielModel MMFF property (1: constant; 2: distance-dependent;
defaults to constant)
* SetMMFFDielectricConstant(dielConst = 1.0)
Sets the DielConst MMFF property (defaults to 1.0)
* SetMMFFBondTerm(state = True)
Sets the bond term to be included in the MMFF equation (defaults
to True)
* SetMMFFAngleTerm(state = True)
Sets the angle term to be included in the MMFF equation (defaults
to True)
* SetMMFFStretchBendTerm(state = True)
Sets the stretch-bend term to be included in the MMFF equation (defaults
to True)
* SetMMFFOopTerm(state = True)
Sets the out-of-plane bend term to be included in the MMFF equation
(defaults to True)
* SetMMFFTorsionTerm(state = True)
Sets the torsional term to be included in the MMFF equation (defaults
to True)
* SetMMFFVdWTerm(state = True)
Sets the Van der Waals term to be included in the MMFF equation
(defaults to True)
* SetMMFFEleTerm(state = True)
Sets the electrostatic term to be included in the MMFF equation
(defaults to True)
* SetMMFFVariant(mmffVariant = "MMFF94")
Sets the MMFF variant to be used ("MMFF94" or "MMFF94s"; defaults to
"MMFF94")
* SetMMFFVerbosity(verbosity = 0)
Sets the MMFF verbosity (0: none; 1: low; 2: high; defaults to 0)
Hence, most users will do something like this to optimize a molecule
structure obtained from a SMILES string:
from rdkit import Chem
from rdkit.Chem import AllChem
m = Chem.MolFromSmiles("O=C(C)c1cccnc1", False)
AllChem.SanitizeMMFFMol(m)
m2 = Chem.AddHs(m)
AllChem.EmbedMolecule(m2)
# Optimize the geometry with MMFF94 (the default variant)
AllChem.MMFFOptimizeMolecule(m2)
with open('structure_min.sdf', 'w') as f:
    f.write(Chem.MolToMolBlock(m2))
Those willing to play a bit more with MMFF properties may do the
following:
from rdkit import Chem
from rdkit.Chem import AllChem
m = Chem.MolFromSmiles("O=C(C)c1cccnc1", False)
AllChem.SanitizeMMFFMol(m)
m2 = Chem.AddHs(m)
AllChem.EmbedMolecule(m2)
pyMP = AllChem.SetupMMFFForceField(m2)
pyMP.SetMMFFVariant("MMFF94s")
pyMP.SetMMFFDielectricModel(2)
pyFF = AllChem.MMFFGetMoleculeForceField(m2, pyMP)
pyFF.Minimize()
with open('structure_min.sdf', 'w') as f:
    f.write(Chem.MolToMolBlock(m2))
print('Energy = {0:12.4f}'.format(pyFF.CalcEnergy()))
# One row per atom: 1-based index, MMFF atom type, formal and partial charge
for i in range(m2.GetNumAtoms()):
    print('{0:4d} {1:4d} {2:8.4f} {3:8.4f}'.format(i + 1,
        int(pyMP.GetMMFFAtomType(i)),
        float(pyMP.GetMMFFFormalCharge(i)),
        float(pyMP.GetMMFFPartialCharge(i))))
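For reference, the effect of the two dielectric models selectable through SetMMFFDielectricModel() and SetMMFFDielectricConstant() can be sketched with the buffered Coulomb term of the published MMFF94 functional form. This is a standalone illustration, not the RDKit implementation; the function name is hypothetical and the 0.75 scaling of 1-4 interactions is deliberately left out.

```python
COULOMB_CONST = 332.0716  # kcal*Angstrom/(mol*e^2), from the MMFF94 form
BUFFER = 0.05             # Angstrom, electrostatic buffering constant

CONSTANT = 1              # same codes as the DielModel property
DISTANCE_DEPENDENT = 2


def mmff_ele_energy(q_i, q_j, r_ij, diel_const=1.0, diel_model=CONSTANT):
    """Buffered Coulomb energy (kcal/mol) for one atom pair.

    q_i, q_j: MMFF partial charges; r_ij: distance in Angstrom.
    The distance-dependent model simply raises the buffered distance
    to the power 2 instead of 1.
    """
    if diel_model not in (CONSTANT, DISTANCE_DEPENDENT):
        raise ValueError("diel_model must be 1 or 2")
    exponent = 1 if diel_model == CONSTANT else 2
    return COULOMB_CONST * q_i * q_j / (diel_const * (r_ij + BUFFER) ** exponent)
```

For two unit charges of opposite sign at 1.95 Angstrom, the constant model gives about -166.04 kcal/mol and the distance-dependent model about -83.02 kcal/mol, which shows how strongly the second model damps longer-range interactions.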
- OOP backport to UFF. I added the inversion term to the UFF
implementation following the original UFF paper by Rappe'. I have already
modified the figures in a couple of test files to reflect the new energy
values.
- 2-bit neighbor matrix and graph-based angle enumeration now reflect
the MMFF implementation.
Code/ForceField and Code/GraphMol/ForceFieldHelpers to the
respective UFF subfolders, since from now on UFF will no longer
be the only available force field. I updated the
relevant CMakeLists.txt files accordingly.
Paolo