* very basics: actually parsing the new atom stereochem features
* add some input verification for the chiral permutations
* fix a typo
add quadruple bond SMILES/SMARTS extension
* add forgotten files
* patch from Roger
* add Roger's parsing examples
* typo
* new tests
* adjusted version of next PR from Roger:
- add SP2D hybridization for square planar (this may change)
- some modernizationof Chirality.cpp
- stop using < HybridizationType in Chirality.cpp (should probably do this elsewhere too)
- improved handling of hybridization assignment for new stereochem
- handle new stereo/hybridization in UFF
- tests for the above
* perception of non-tetrahedral stereo from 3D (from Roger S)
Basic testing of SP and TB based on opensmiles docs
* potential fixes for octahedral assignment
more tests
* docs update
need way more!
* map the TH tags directly to @ tags
* very basics of SMILES writing
this does not work with anything that changes the permutation order
like canonicalization or writing things in rings.
* start to support the getChiralAcross API
* more testing
* consistency
* add hasNonTetrahedralStereo() and getIdealAngleBetweenLigands()
* assignStereochemistry should only remove non-tetrahedral stereo
* re-simplify those tests
* cleanup matrix stream output
* initial pass at supporting nontet stereo in distgeom
* backup
* start on the reference docs
* TBP reference
* first pass at Oh finished
* update SP section
* more doc updates
* fix a typo
* add param to not remove Hs connected to non-tetrahedral atoms
* VERY basic coord generation for square planar
* TBP basics
* basic OH depiction
* start testing missing ligands
allow non-tet stereo in rings (ugly, but correct)
* add new TBP functions from Roger
* update depiction code for new API
* backup, the new tests work so far
* Finish the TB tests
* OH tests pass too
* cleanup
* first pass at getting correct SMILES with reordering
need way more testing than this
* ensure permutation 0 is correctly preserved
* some progress towards adding non-tetrahedral stereo to StereoInfo
* doc update
* add non-tet chiral classes to python wrappers
* make sure removeAllHs also gets neighbors of non-tetrahedral centers
more testing
* a bit of depictor cleanup
* make the assignment from 3D more tolerant
more testing
* improve the bulk testing
* cleanup
* remove a bit of redundant code
* ensure we don't write bogus permutation values to SMILES
* fix some rebase problems
* allow assignStereochemistryFrom3D() to be called without sanitization
* allow disabling the non-tetrahedral stereo when it's not explicit
* get that working on windows too
* add "needsHs()" query
* add warning for embedding without Hs
* add H checks to UFF and MMFF as well;
a small amount of cleanup in the MMFF atom typing
* remove include from headers
* update implementation files
* completely remove BOOST_FOREACH (#7)
* convert those changes to use auto
* get rid of all usage of BOOST_FOREACH
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
* first cleanup
* next round of changes. all tests pass
* Fixes#2909
* Fixes#2910
* further cleanup
* some cleanup/refactoring of the Dict class
* remove now extraneous calls to hasProp() before clearProp()
* minor refactoring of RDProps.h
* Switch from using our own version of round() to std::round()
* replace some boost::math stuff with the equivalents from std::
* cleanups in SmartsWrite
* refactor out a bunch of duplicated code
* fix an instance of undefined behavior
* changes in response to review
* run clang-tidy with readability-braces-around-statements
clang-format the results
clean up all the parts that clang-tidy-8 broke
* fix problem on windows
* - added two convenience functions to allow miniming all molecule confs with a pre-generated force-field (e.g., to use constraints)
- removed a lot of code duplication between Code/GraphMol/ForceFieldHelpers/MMFF/MMFF.h Code/GraphMol/ForceFieldHelpers/UFF/UFF.h
- homogenized the nonBondedThreshold to 100.0 across all MMFF functions
* - avoided code duplication
- added tests
- added PRECONDITION clauses
This removes some redundancy from some of the test code in order to bring
the runtime down. This does not affect test coverage and shouldn't do
anything bad to the overall test quality.
* do not use new on loggers
* del pointers in testDistGeom
* Update Dict hasNonPOD status on bulk update
* delete new Dicts in memtest1.cpp
* fixes in MolSuppliers and testFMCS
* PeriodicTable singleton as unique_ptr
* fix EEM_arrays leak
* fix leaks in testPBF
* fix ParamCollection leak in test UFF
* fix leaks in MMFF
* clear prop dict before read in in pickler
* fix leaks in testFreeSASA
* fix leaks in test3D
* modernize Dict.h & SmilesParse.cpp
* fix leaks in testQuery
* fix leaks in testCrystalFF
* fix leaks in cxsmilesTest
* fix leaks in Catalog & mol cat test
* fix leaks in ShapeUtils & tests
* fix leaks in testSubgraphs1
* fix leaks testFingerprintGenerators
* fix leaks in Catalog/FilterCatalog
* fix leaks in graphmolqueryTest
* these changes reduce bison parse leaks
* fixed leaks in testChirality.cpp
* fix leaks + 2 tests in testMolWriter
* fix 4m leaks in substructLibraryTest
* small improvements to molTautomerTest; still leaks
* fix leaks in testRGroupDecomp
* fix leaks in test; parser still leaks
* fix leaks in itertest
* fix 4m leaks in testDepictor
* fixes in smatest; still leaking due to parser
* fixes in testSLNParse; still leaking due to parser
* flex/bison: always add atoms with ownership; smarts error cleanup
* fix leaks in testReaction
* fix leaks in testSubstructMatch
* fix leaks in resMolSupplierTest
* fix leaks in testChemTransforms + bug in ChemTransforms
* fix leaks in testPickler
* fix leaks in testMolTransform
* fix leaks in testFragCatalog
* fix leak in testSLNParse. Still leaks due to Smiles
* fixed most leaks in testMolSupplier
* pre bison fix
* fix some atom & bond parse problems; others still fail
* bison smiles & smarts, atoms & bonds more or less fixed
* fix leaks in molopstest.cpp
* fix leaks in testFingerprints, MACCS.cpp & AtomPairs.cpp
* fix leaks in moldraw2Dtest1
* fix leaks in testDescriptors
* fix leaks in testInchi
* fix leaks in testUFFForceFieldHelpers
* fix leaks in hanoiTest & new_canon.h
* fix leaks in testMMFFForceField
* fix leaks in graphmolTest1
* fix leaks in testMMFFForceFieldHelpers
* fix leaks in testDistGeomHelpers
* fix leaks in testMolAlign
* initialize occupancy & temp facto with default values
* fix leak in TautomerTransform
* updated suppressions
* fix testStructChecker
* fix logging & py tests
* fix TautomerTransform class/struct issue
* remove misplaced delete in testSLNParse
* deinit in testAvalonLib1
* fix Avalon-triggered(?) bug in StructChecker/Pattern.cpp
* fix random testMolWriter/Supplier fails
- diversify output file names to avoid clashing.
- unify Writers close/destruct behavior.
- flushing/closing in tests.
* use reset in FFs Params.cpp
* comments on testMMFFForceField
* unrequired 'if's added to mol suppliers
* correct cast in FilterCatalog.h
* use unique_ptr in MACCS Patterns
* remove unrequred if in new_canon
* update & move suppressions
* boost::thread mostly gone... still need to get rid of once
everything compiles
* replace boost::call_once
* remove link-time dependency on boost::thread
* first pass at using async
* switch to using async everywhere
* Fixes atom documentation
* Fixes#1461
This is a complicated one. Basically URANGE_CHECK when
used on unsigned integers has a problem when the size of
the range it’s checking is 0. The standard operations is
to check
URANGE(num, size-1)
Which (for unsigned integers) obviously rolls over.
This fixes all usage cases to be
URANGE(num+1, size)
And fixes the bugs found. (addBond and the mmff tests)
* Fixes#1461 - Updates URANGE_CHECK to be 0<=x<hi
* - optimization to UFF and MMFF forcefields
* - further optimizations (memset, factoring unnecessary in-loop
initialization out of the loop, replacing if clause with pre-increment)
* - fixed a couple of stylistic glitches
* - the torsionSmarts parameter in addTorsions() is now a const std::string&
* - implemented the DefaultTorsionBondSmarts singleton using boost::call_once()
o rdkit gains a RDKit::common_properties namespace that contains common string value properties
o Dict.h and below gain getPropIfPresent that attempts to retrieve a property and returns
true/false on success or failure. This is used to optimize access.
o rdkit learns how to pass property keys by reference, not value.
A new namespace has been added to RDKit, common_properties
that contains the std::string values for commonly used
properties. This helps to avoid typos in string values
but also avoids a creation of std::strings from character
values. All accessors (has/get/clear and getPropIfPresent) now pass
the key by reference.
Additionally, getPropIfPresent removes the double lookup
of hasProp/getProp which can be a significant speedup
in the smiles and smarts parsers (10-20%)
(i.e., equilibrium distances/angles and force constants) from
UFF and MMFF in response to two requests recently appeared
on the RDKit-discuss mailing list:
http://sourceforge.net/p/rdkit/mailman/message/32953737/http://sourceforge.net/p/rdkit/mailman/message/32880156/
- did some clean up on the MMFF code
- NB there are two ABI changes:
1) StretchBendContrib(ForceField *owner,
const unsigned int idx1, const unsigned int idx2, const unsigned int idx3,
const MMFFStbn *mmffStbnParams, const MMFFAngle *mmffAngleParams,
const MMFFBond *mmffBondParams1, const MMFFBond *mmffBondParams2);
previously was:
StretchBendContrib(ForceField *owner,
const unsigned int idx1, const unsigned int idx2, const unsigned int idx3,
const std::pair<bool, const MMFFStbn *> mmffStbnParams,
const MMFFAngle *mmffAngleParams, const MMFFBond *mmffBondParams1,
const MMFFBond *mmffBondParams2);
2) std::pair<double, double> calcStbnForceConstants(const MMFFStbn *mmffStbnParams);
previously was:
std::pair<double, double> calcStbnForceConstants
(const std::pair<bool, const MMFFStbn *> mmffStbnParams);
The two changes are NOT mandatory - however, both the StretchBendContrib constructor
and calcStbnForceConstants(), though public, are basically "internal" method that
most likely no-one has ever invoked. Given that the current API is MUCH better
and cleaner, I would really advise for the new version.
4-membered rings containing sp2 atoms. The hack consists in
altering on-the-fly the theta0 equilibrium angle, depending on
ring size and collocation of the two edges of the angle (i.e.,
both edges inside the ring or one inside and one outside)
- Added a relevant test in
Code/GraphMol/ForceFieldHelpers/UFF/testUFFHelpers.cpp
incorrect typing might arise when hydrogens were not added after
generating 3D coordinates from SMILES strings; now all 761 test molecules
are correctly typed no matter whether hydrogens are explicit or implicit
- MMFF test suite: I have cut down to the bare essential the
MMFF94/MMFF94s reference log files, but their size could be reduced only
by about 30%. It could have been reduced more converting multiple spaces
into a tab, but the MMFF94 file (the larger one) would still be around
11 MB, and human readability would be greatly impaired. Hence I decided
to keep the spaces and gzip the reference logs, which reduces their
combined size to ~ 3.5 MB, which I think is fine; the test program checks
if the gunzipped files already exist, otherwise it gunzips them upfront.
While cutting, I also sorted the molecules in the same order as in the
SDF/SMILES files, so that it runs about 10 times faster than before.
Now the test runs on MMFF94 only (MMFF94s only concerns different OOP
parameters, there are no algorithmic differences, so as long as one does
not alter the original parameters it can be safely skipped), computing
every 4th molecule, and it runs in 12 seconds on my laptop. Running
all molecules takes ~ 50 seconds, but I think it is rather overkill,
and I would keep it as it is.
- I have added a test suite for MMFF ForceFieldHelpers (like the one
already existing for UFF); I have also complemented the Python wrapper
test suite for ForceFieldHelpers with a few tests for MMFF.
- I have written Python wrappers for the MMFF-related functionality;
while doing that I realized that many of the wrapper code relocations
that I made in my previous pull request were not necessary/appropriate,
so I reverted them. The only difference from the UFF Python API is that,
just like for the C++ API, in addition to the PyForceField object there
is a PyMMFFMolProperties object which is created before constructing the
force field itself; the PyMMFFMolProperties is necessary to set (e.g.,
dielectric constant, dielectric model) or get (e.g., atom type, formal
and partial charge) some MMFF properties which are not present in UFF,
while preserving binary compatibility of the libraries. Probably you
remember that we discussed about setting atom type and charge properties
with SetProp besides populating the MMFFMolProperties object, in order
to allow easy access to Python users. However, I think that the solution
I adopted is preferrable since it is more consistent with the C++ API,
it enables faster access to properties and it allows tailoring the MMFF
environment (i.e., choosing MMFF94/MMFF94s, setting the verbosity level,
including/excluding terms from the MMFF equation, setting dielectric
constant/model) just as from C++.
The MMFF-related Python functions I implemented are:
* MMFFOptimizeMolecule(mol, mmffVariant = "MMFF94", maxIters = 200,
nonBondedThresh = 100.0, confId = -1, ignoreInterfragInteractions
= true)
uses MMFF to optimize a molecule's structure (just like
UFFOptimizeMolecule)
* SanitizeMMFFMol(mol)
sanitizes a molecule according to MMFF requirements
* SetupMMFFForceField(mol, mmffVariant = "MMFF94", mmffVerbosity = 0)
returns a PyMMFFMolProperties object for a molecule; the
PyMMFFMolProperties object is required by MMFFGetMoleculeForceField()
and can be used to get/set MMFF properties
* MMFFGetMoleculeForceField(mol, pyMMFFMolProperties,
nonBondedThresh = 100.0, confId = -1, ignoreInterfragInteractions
= true)
returns a MMFF force field for a molecule (just like
UFFGetMoleculeForceField)
* MMFFHasAllMoleculeParams(mol)
checks if MMFF parameters are available for all of a molecule's atoms
(just like UFFHasAllMoleculeParams)
There are also a few methods connected to the PyMMFFMolProperties class
which mirror those available from C++ for the MMFFMolProperties class:
* GetMMFFAtomType(idx)
Retrieves MMFF atom type for atom with index idx
* GetMMFFFormalCharge(idx)
Retrieves MMFF formal charge for atom with index idx
* GetMMFFPartialCharge(idx)
Retrieves MMFF partial charge for atom with index idx
* SetMMFFDielectricModel(dielModel = 1)
sets the DielModel MMFF property (1: constant; 2: distance-dependent;
defaults to constant)
* SetMMFFDielectricConstant(dielConst = 1.0)
Sets the DielConst MMFF property (defaults to 1.0)
* SetMMFFBondTerm(state = True)
Sets the bond term to be included in the MMFF equation (defaults
to True)
* SetMMFFAngleTerm(state = True)
Sets the angle term to be included in the MMFF equation (defaults
to True)
* SetMMFFStretchBendTerm(state = True)
Sets the stretch-bend term to be included in the MMFF equation (defaults
to True)
* SetMMFFOopTerm(state = True)
Sets the out-of-plane bend term to be included in the MMFF equation
(defaults to True)
* SetMMFFTorsionTerm(state = True)
Sets the torsional term to be included in the MMFF equation (defaults
to True)
* SetMMFFVdWTerm(state = True)
Sets the Van der Waals term to be included in the MMFF equation
(defaults to True)
* SetMMFFEleTerm(state = True)
Sets the electrostatic term to be included in the MMFF equation
(defaults to True)
* SetMMFFVariant(mmffVariant = "MMFF94")
Sets the MMFF variant to be used ("MMFF94" or "MMFF94s"; defaults to
"MMFF94")
* SetMMFFVerbosity(verbosity = 0)
Sets the MMFF verbosity (0: none; 1: low; 2: high; defaults to 0)
Hence, most users will do something like this to optimize a molecule
structure obtained from a SMILES string:
from rdkit import Chem
from rdkit.Chem import AllChem
m = Chem.MolFromSmiles("O=C(C)c1cccnc1", False)
AllChem.SanitizeMMFFMol(m)
m2 = Chem.AddHs(m)
AllChem.EmbedMolecule(m2)
# Opt
AllChem.MMFFOptimizeMolecule(m2)
print >>file('structure_min.sdf','w'), Chem.MolToMolBlock(m2)
Those willing to play a bit more with MMFF properties may do the
following:
from rdkit import Chem
from rdkit.Chem import AllChem
m = Chem.MolFromSmiles("O=C(C)c1cccnc1", False)
AllChem.SanitizeMMFFMol(m)
m2 = Chem.AddHs(m)
AllChem.EmbedMolecule(m2)
pyMP = AllChem.SetupMMFFForceField(m2)
pyMP.SetMMFFVariant("MMFF94s")
pyMP.SetMMFFDielectricModel(2)
pyFF = AllChem.MMFFGetMoleculeForceField(m2, pyMP)
pyFF.Minimize()
print >>file('structure_min.sdf','w'), Chem.MolToMolBlock(m2)
print 'Energy = {0:12.4f}'.format(pyFF.CalcEnergy())
i = 0
for i in range(0, m2.GetNumAtoms()):
print '{0:4d} {1:4d} {2:8.4f} {3:8.4f}'.format(i + 1,
int(pyMP.GetMMFFAtomType(i)),
float(pyMP.GetMMFFFormalCharge(i)),
float(pyMP.GetMMFFPartialCharge(i)))
- OOP backport to UFF. I added the inversion term to the UFF
implementation following the original UFF paper by Rappe'. I have already
modified the figures in a couple of test files to reflect the new energy
values.
- 2-bit neighbor matrix and graph-based angle enumeration now reflect
the MMFF implementation.
Code/ForceField and Code/GraphMol/ForceFieldHelpers to the
respective UFF subfolders since from now on UFF will not be
the only available force field anymore. I updated the
relevant CMakeLists.txt files accordingly.
Paolo