Commit Graph

21 Commits

Author SHA1 Message Date
Paolo Tosco
72066affe9 - Optimization of UFF and MMFF forcefields (#1218)
* - optimization to UFF and MMFF forcefields

* - further optimizations (memset, factoring unnecessary in-loop
  initialization out of the loop, replacing if clause with pre-increment)

* - fixed a couple of stylistic glitches

* - the torsionSmarts parameter in addTorsions() is now a const std::string&

* - implemented the DefaultTorsionBondSmarts singleton using boost::call_once()
2017-01-04 08:33:34 +01:00
Greg Landrum
e04aed8ea8 another batch of warnings squashed 2016-03-30 13:44:21 +02:00
Greg Landrum
e08e0d16d8 first pass, using google style 2015-11-14 14:58:11 +01:00
Brian Kelley
54311dff9c Suppresses warnings in tests 2015-10-18 16:09:58 -04:00
Paolo Tosco
e776a11b7d - Forcefield tests now use RDKit::feq() instead of RDKit::round() 2015-05-09 18:33:03 +01:00
Paolo Tosco
623f816c33 - modified force field constraint tests to be more robust 2015-05-05 13:07:52 +01:00
Greg Landrum
07078af4ca support copying ForceField objects 2015-03-11 03:06:32 +01:00
Brian Kelley
95a92282d1 Dictionary access is sanitized and optimized.
o rdkit gains a RDKit::common_properties namespace that contains common string value properties

 o Dict.h and below gain getPropIfPresent that attempts to retrieve a property and returns
  true/false on success or failure.  This is used to optimize access.

 o rdkit learns how to pass property keys by reference, not value.

A new namespace has been added to RDKit, common_properties
that contains the std::string values for commonly used
properties.  This helps to avoid typos in string values
but also avoids creating std::strings from character
values.  All accessors (has/get/clear and getPropIfPresent) now pass
the key by reference.

Additionally, getPropIfPresent removes the double lookup
of hasProp/getProp which can be a significant speedup
in the smiles and smarts parsers (10-20%)
2015-01-15 12:23:29 -05:00
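Note on 95a92282d1: the double-lookup saving described above can be illustrated with a minimal Python sketch. The `PropStore` class and its method names are hypothetical stand-ins written for this note, not the RDKit `Dict` class; the sketch only counts how many times the underlying map is searched under each access pattern.

```python
# Illustrative only: PropStore is a hypothetical stand-in, not RDKit's Dict.
class PropStore:
    def __init__(self):
        self._data = {}
        self.lookups = 0  # number of searches of the underlying map

    def set_prop(self, key, value):
        self._data[key] = value

    def has_prop(self, key):
        self.lookups += 1
        return key in self._data

    def get_prop(self, key):
        self.lookups += 1
        return self._data[key]

    def get_prop_if_present(self, key):
        """Return (found, value) using a single search of the map."""
        self.lookups += 1
        sentinel = object()
        value = self._data.get(key, sentinel)
        if value is sentinel:
            return False, None
        return True, value


store = PropStore()
store.set_prop("_Name", "aspirin")

# Old pattern: hasProp followed by getProp searches the map twice.
if store.has_prop("_Name"):
    name_old = store.get_prop("_Name")
double_lookup_cost = store.lookups

# New pattern: one search answers both "is it there?" and "what is it?".
store.lookups = 0
found, name_new = store.get_prop_if_present("_Name")
single_lookup_cost = store.lookups
```

In code that reads many properties per atom, as the parsers mentioned in the commit do, halving the number of searches is where the reported speedup would plausibly come from.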
ptosco
df89d39f13 - fixed a bug in UFF/TorsionConstraint.cpp and MMFF/TorsionConstraint.cpp
which caused issues with negative angles
- also made UFF/AngleConstraint.cpp and MMFF/AngleConstraint.cpp more
  robust against angle ranges involving negative values
- added relevant C++ and Python tests
2014-12-02 22:47:18 +00:00
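Note on df89d39f13: torsion windows that include negative angles are easy to get wrong because the window may straddle the +/-180 degree seam. The sketch below shows one way to test whether a dihedral lies inside a window using modular arithmetic. It is an assumption-laden illustration, not the code from TorsionConstraint.cpp; `wrap_degrees` and `torsion_violation` are hypothetical names.

```python
def wrap_degrees(angle):
    """Map any angle in degrees onto the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def torsion_violation(angle, min_deg, max_deg):
    """Degrees by which `angle` lies outside the window that runs
    counter-clockwise from min_deg to max_deg; 0.0 if it lies inside."""
    if max_deg - min_deg >= 360.0:
        return 0.0  # the window covers the whole circle
    width = (max_deg - min_deg) % 360.0   # window size, seam-independent
    offset = (angle - min_deg) % 360.0    # position measured from min_deg
    if offset <= width:
        return 0.0
    past_max = offset - width             # overshoot beyond max_deg
    before_min = 360.0 - offset           # shortfall before min_deg
    return min(past_max, before_min)


# A window from 170 to -170 degrees spans the seam and is 20 degrees wide.
inside_seam = torsion_violation(180.0, 170.0, -170.0)
outside_seam = torsion_violation(0.0, 170.0, -170.0)
# A window lying entirely in the negative range.
overshoot = torsion_violation(-10.0, -60.0, -20.0)
```

Measuring everything as an offset from `min_deg` avoids comparing raw signed angles, which is the kind of comparison that tends to break for negative values.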
ptosco
ba4a48ce05 - fixed a bug in Code/ForceField/MMFF/testMMFFForceField.cpp
- fixed a bug in Code/GraphMol/ForceFieldHelpers/MMFF/AtomTyper.cpp
  which caused misassignment of atom types in CYGUAN01 upon shuffling
  the order of atoms in the validation SDF files
- added checks for acos and asin function parameters to be within
  a (-1, 1) range
2014-06-01 16:23:03 +01:00
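Note on ba4a48ce05: rounding error can push a computed cosine slightly outside the domain of acos, which is why such arguments are checked before the call. A minimal sketch of the idea follows, assuming a simple clamp; `safe_acos` and `angle_between` are hypothetical helpers and not the functions touched by the commit.

```python
import math


def safe_acos(x):
    """acos with the argument clamped to the valid domain [-1, 1]."""
    return math.acos(max(-1.0, min(1.0, x)))


def angle_between(u, v):
    """Angle in radians between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # dot / norm can drift marginally past +/-1 for (anti)parallel vectors.
    return safe_acos(dot / norm)


# Without the clamp, math.acos(1.0000000002) raises ValueError.
clamped_high = safe_acos(1.0000000002)
clamped_low = safe_acos(-1.0000000002)
right_angle = angle_between((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```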
ptosco
5b2f6763a5 - Added the std::fstream::binary flag in testMMFFForceField.cpp
whose absence might cause intermittent problems in parsing the
  logs on Windows due to tellg/seekg not correctly handling CR/LF
- Fixed the code for assigning the HOCN MMFF94 atom type
  (thanks to Toby Wright for reporting this)
- Added a missing copyright notice in testMMFFForceField.h
2014-04-14 23:57:22 +01:00
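Note on 5b2f6763a5: the tellg/seekg issue has a close analogue in Python, where byte offsets are only reliable for files opened in binary mode. The sketch below records line offsets in a CR/LF file and seeks back to one of them. The file name and helper names are made up for the illustration; it does not reproduce the C++ test code.

```python
import os
import tempfile


def index_line_offsets(path):
    """Return the byte offset at which each line of the file starts."""
    offsets = []
    with open(path, "rb") as fh:  # binary mode: offsets are plain byte counts
        while True:
            pos = fh.tell()
            line = fh.readline()
            if not line:
                break
            offsets.append(pos)
    return offsets


def read_line_at(path, offset):
    """Seek to a recorded offset and return that line without its CR/LF."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        return fh.readline().rstrip(b"\r\n").decode("ascii")


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "demo.log")
    with open(log_path, "wb") as fh:
        fh.write(b"first\r\nsecond\r\nthird\r\n")  # Windows-style line endings
    offsets = index_line_offsets(log_path)
    third = read_line_at(log_path, offsets[2])
```

Because no newline translation happens in binary mode, the two-byte CR/LF pairs are counted as they are stored and the recorded offsets stay valid for seeking.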
Greg Landrum
aa7095984e merge 2013-12-03 05:11:30 +01:00
ptosco
5b70cdbdc1 - added relative DistanceConstraints (i.e., +/- with respect
to the current value) (C++/Python)
- added absolute/relative AngleConstraints (C++/Python)
- added absolute/relative TorsionConstraints (C++/Python)
- added PositionConstraints (C++/Python)
- exposed fixedPoints from Python
- added relevant C++/Python tests
- removed a number of redundant "this->" in member functions
- moved some getGrad() code into Utils::calcAngleBendGrad and
  Utils::calcTorsionGrad to avoid repeating the same code
  for constraints
2013-12-02 19:58:29 +01:00
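Note on 5b70cdbdc1: the difference between absolute and relative constraints comes down to how the bounds are interpreted. The sketch below resolves relative bounds against the current value and then applies a flat-bottomed harmonic penalty. The penalty form and both function names are assumptions made for this illustration, not a transcription of the constraint classes added in the commit.

```python
def resolve_bounds(current, min_val, max_val, relative):
    """Turn constraint bounds into absolute values.

    With relative=True the bounds are offsets (+/-) from `current`;
    otherwise they are already absolute.
    """
    if relative:
        min_val += current
        max_val += current
    if min_val > max_val:
        raise ValueError("lower bound exceeds upper bound")
    return min_val, max_val


def flat_bottom_penalty(value, lo, hi, force_constant):
    """Assumed penalty: zero inside [lo, hi], harmonic outside."""
    if value < lo:
        delta = lo - value
    elif value > hi:
        delta = value - hi
    else:
        return 0.0
    return 0.5 * force_constant * delta * delta


# Keep a 1.5 A distance within +/- 0.1 A of where it is now.
lo, hi = resolve_bounds(1.5, -0.1, 0.1, relative=True)
inside = flat_bottom_penalty(1.55, lo, hi, force_constant=100.0)
outside = flat_bottom_penalty(1.7, lo, hi, force_constant=100.0)
```

The same resolution step would apply to angle and torsion constraints, with the current angle taking the place of the current distance.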
Greg Landrum
4ace62f019 remove more clang warnings 2013-12-02 05:10:23 +01:00
ptosco
1957baadb0 - replaced the call to sanitizeMMFFMol() in the MMFFMolProperties
constructor (which is overkill, if the molecule had already been
  sanitized) with a call to MolOps::Kekulize(). Thus it is not
  necessary to call Kekulize() either from Python or from C++,
  and no changes are required to the scripts/source codes
  previously used for UFF
- removed the code which throws an exception asking to reload the
  molecule with sanitize=false since it is not necessary:
  only one test in the MMFF validation suite fails if the
  molecule is aromatized and then re-kekulized (CIKSEU10), and
  it represents a case where the position of double bonds in
  a conjugated, non-aromatic system makes a difference for atom
  type assignments, which in general is nonsense. This is not
  due to a bug in the code, but rather depends on MMFF atom
  typing rules. Hence, I kept the sanitize=false and the call to
  sanitizeMMFFMol() in testMMFFForceField.cpp, but I would not
  generalize this requirement to "normal" molecules, because it
  is really not necessary, since you do not have a reference
  kekulization to refer to in the real world.
- updated Docs/Book/GettingStartedInPython.rst accordingly
- updated tests accordingly
2013-10-01 23:16:15 +02:00
ptosco
57147304bb - Replaced the setupMMFFForceField() function (which returned a pointer
to a newly allocated MMFFMolProperties object) with a simple
  constructor of the MMFFMolProperties object
- Replaced in a few MMFF-related functions the "ROMol *" argument
  with a "ROMol &" argument for consistency with similar RDKit
  functions
- Renamed the SetupMMFFForceField() function in Python into
  GetMMFFMolProperties() for consistency
- Updated the MMFF tests according to the aforementioned changes
  in the API
2013-09-28 18:45:51 +02:00
ptosco
ec8eb5a1bf - Changed all occurrences of RDKit::PI into M_PI
- added #ifdef M_PI (...) #endif in all relevant places
- made length() and sqLength() method consistent
  with respect to usage of pow(x, 2) vs x*x in
  Code/Geometry/point.h
- removed gzip-related boost.iostreams dependency and
  replaced with portable "cmake -E tar xzf" command
  in Code/ForceField/MMFF/CMakeLists.txt
2013-09-20 17:45:41 +02:00
ptosco
b1acab59b0 - I have made MMFF atom typing more robust since I realized that
incorrect typing might arise when hydrogens were not added after
  generating 3D coordinates from SMILES strings; now all 761 test molecules
  are correctly typed no matter whether hydrogens are explicit or implicit

- MMFF test suite: I have cut down to the bare essential the
  MMFF94/MMFF94s reference log files, but their size could be reduced only
  by about 30%. It could have been reduced more converting multiple spaces
  into a tab, but the MMFF94 file (the larger one) would still be around
  11 MB, and human readability would be greatly impaired. Hence I decided
  to keep the spaces and gzip the reference logs, which reduces their
  combined size to ~ 3.5 MB, which I think is fine; the test program checks
  if the gunzipped files already exist, otherwise it gunzips them upfront.
  While cutting, I also sorted the molecules in the same order as in the
  SDF/SMILES files, so that it runs about 10 times faster than before.
  Now the test runs on MMFF94 only (MMFF94s only concerns different OOP
  parameters, there are no algorithmic differences, so as long as one does
  not alter the original parameters it can be safely skipped), computing
  every 4th molecule, and it runs in 12 seconds on my laptop. Running
  all molecules takes ~ 50 seconds, but I think it is rather overkill,
  and I would keep it as it is.

- I have added a test suite for MMFF ForceFieldHelpers (like the one
  already existing for UFF); I have also complemented the Python wrapper
  test suite for ForceFieldHelpers with a few tests for MMFF.

- I have written Python wrappers for the MMFF-related functionality;
  while doing that I realized that many of the wrapper code relocations
  that I made in my previous pull request were not necessary/appropriate,
  so I reverted them. The only difference from the UFF Python API is that,
  just like for the C++ API, in addition to the PyForceField object there
  is a PyMMFFMolProperties object which is created before constructing the
  force field itself; the PyMMFFMolProperties is necessary to set (e.g.,
  dielectric constant, dielectric model) or get (e.g., atom type, formal
  and partial charge) some MMFF properties which are not present in UFF,
  while preserving binary compatibility of the libraries. You probably
  remember that we discussed setting atom type and charge properties
  with SetProp besides populating the MMFFMolProperties object, in order
  to allow easy access to Python users. However, I think that the solution
  I adopted is preferable since it is more consistent with the C++ API,
  it enables faster access to properties and it allows tailoring the MMFF
  environment (i.e., choosing MMFF94/MMFF94s, setting the verbosity level,
  including/excluding terms from the MMFF equation, setting dielectric
  constant/model) just as from C++.

  The MMFF-related Python functions I implemented are:

  * MMFFOptimizeMolecule(mol, mmffVariant = "MMFF94", maxIters = 200,
      nonBondedThresh = 100.0, confId = -1, ignoreInterfragInteractions
      = true)

    uses MMFF to optimize a molecule's structure (just like
    UFFOptimizeMolecule)

  * SanitizeMMFFMol(mol)

    sanitizes a molecule according to MMFF requirements

  * SetupMMFFForceField(mol, mmffVariant = "MMFF94", mmffVerbosity = 0)

    returns a PyMMFFMolProperties object for a molecule; the
    PyMMFFMolProperties object is required by MMFFGetMoleculeForceField()
    and can be used to get/set MMFF properties

  * MMFFGetMoleculeForceField(mol, pyMMFFMolProperties,
      nonBondedThresh = 100.0, confId = -1, ignoreInterfragInteractions
      = true)

    returns a MMFF force field for a molecule (just like
    UFFGetMoleculeForceField)

  * MMFFHasAllMoleculeParams(mol)

    checks if MMFF parameters are available for all of a molecule's atoms
    (just like UFFHasAllMoleculeParams)

  There are also a few methods connected to the PyMMFFMolProperties class
  which mirror those available from C++ for the MMFFMolProperties class:

  * GetMMFFAtomType(idx)

    Retrieves MMFF atom type for atom with index idx

  * GetMMFFFormalCharge(idx)

    Retrieves MMFF formal charge for atom with index idx

  * GetMMFFPartialCharge(idx)

    Retrieves MMFF partial charge for atom with index idx

  * SetMMFFDielectricModel(dielModel = 1)

    sets the DielModel MMFF property (1: constant; 2: distance-dependent;
    defaults to constant)

  * SetMMFFDielectricConstant(dielConst = 1.0)

    Sets the DielConst MMFF property (defaults to 1.0)

  * SetMMFFBondTerm(state = True)

    Sets the bond term to be included in the MMFF equation (defaults
    to True)

  * SetMMFFAngleTerm(state = True)

    Sets the angle term to be included in the MMFF equation (defaults
    to True)

  * SetMMFFStretchBendTerm(state = True)

    Sets the stretch-bend term to be included in the MMFF equation (defaults
    to True)

  * SetMMFFOopTerm(state = True)

    Sets the out-of-plane bend term to be included in the MMFF equation
    (defaults to True)

  * SetMMFFTorsionTerm(state = True)

    Sets the torsional term to be included in the MMFF equation (defaults
    to True)

  * SetMMFFVdWTerm(state = True)

    Sets the Van der Waals term to be included in the MMFF equation
    (defaults to True)

  * SetMMFFEleTerm(state = True)

    Sets the electrostatic term to be included in the MMFF equation
    (defaults to True)

  * SetMMFFVariant(mmffVariant = "MMFF94")

    Sets the MMFF variant to be used ("MMFF94" or "MMFF94s"; defaults to
    "MMFF94")

  * SetMMFFVerbosity(verbosity = 0)

    Sets the MMFF verbosity (0: none; 1: low; 2: high; defaults to 0)

  Hence, most users will do something like this to optimize a molecule
  structure obtained from a SMILES string:

  from rdkit import Chem
  from rdkit.Chem import AllChem

  m = Chem.MolFromSmiles("O=C(C)c1cccnc1", False)
  AllChem.SanitizeMMFFMol(m)
  m2 = Chem.AddHs(m)
  AllChem.EmbedMolecule(m2)
  # Opt
  AllChem.MMFFOptimizeMolecule(m2)
  print >>file('structure_min.sdf','w'), Chem.MolToMolBlock(m2)

  Those willing to play a bit more with MMFF properties may do the
  following:

  from rdkit import Chem
  from rdkit.Chem import AllChem

  m = Chem.MolFromSmiles("O=C(C)c1cccnc1", False)
  AllChem.SanitizeMMFFMol(m)
  m2 = Chem.AddHs(m)
  AllChem.EmbedMolecule(m2)
  pyMP = AllChem.SetupMMFFForceField(m2)
  pyMP.SetMMFFVariant("MMFF94s")
  pyMP.SetMMFFDielectricModel(2)
  pyFF = AllChem.MMFFGetMoleculeForceField(m2, pyMP)
  pyFF.Minimize()
  print >>file('structure_min.sdf','w'), Chem.MolToMolBlock(m2)
  print 'Energy = {0:12.4f}'.format(pyFF.CalcEnergy())
  i = 0
  for i in range(0, m2.GetNumAtoms()):
    print '{0:4d} {1:4d} {2:8.4f} {3:8.4f}'.format(i + 1,
      int(pyMP.GetMMFFAtomType(i)),
      float(pyMP.GetMMFFFormalCharge(i)),
      float(pyMP.GetMMFFPartialCharge(i)))

- OOP backport to UFF. I added the inversion term to the UFF
  implementation following the original UFF paper by Rappé. I have already
  modified the figures in a couple of test files to reflect the new energy
  values.

- 2-bit neighbor matrix and graph-based angle enumeration now reflect
  the MMFF implementation.
2013-09-16 12:08:02 +02:00
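Note on b1acab59b0: the test-suite handling described in that message (gunzip the reference logs only if the unpacked files are missing, then compute every 4th molecule) can be sketched as follows. This is a Python illustration under stated assumptions: the actual test program is C++, the helper names and file name are hypothetical, and whether the real subsampling starts at the first molecule is not stated in the commit.

```python
import gzip
import os
import shutil
import tempfile


def ensure_gunzipped(gz_path):
    """Unpack gz_path next to itself unless the unpacked file already exists."""
    out_path = gz_path[:-3] if gz_path.endswith(".gz") else gz_path + ".out"
    if not os.path.exists(out_path):
        with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return out_path


def every_nth(items, n=4):
    """Subsample a sequence, keeping items 0, n, 2n, ... (assumed start at 0)."""
    return items[::n]


with tempfile.TemporaryDirectory() as tmp:
    gz_path = os.path.join(tmp, "reference.log.gz")  # made-up file name
    with gzip.open(gz_path, "wb") as fh:
        fh.write(b"reference data\n")

    # First call unpacks the archive.
    log_path = ensure_gunzipped(gz_path)
    with open(log_path, "rb") as fh:
        first_read = fh.read()

    # Second call must leave an existing unpacked file untouched.
    with open(log_path, "wb") as fh:
        fh.write(b"already unpacked\n")
    ensure_gunzipped(gz_path)
    with open(log_path, "rb") as fh:
        second_read = fh.read()

subset = every_nth(list(range(10)))
```

Subsampling trades coverage for run time, which matches the 12 seconds versus roughly 50 seconds comparison given in the commit message.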
Greg Landrum
42c56194f8 get the tests running in a reasonable amount of time 2013-08-20 04:37:34 +02:00
Greg Landrum
0212d82468 compiles and passes all tests 2013-08-19 05:53:49 +02:00
ptosco
3f4297fa44 Created a new MMFF branch. I moved some files/folders from
Code/ForceField and Code/GraphMol/ForceFieldHelpers to the
respective UFF subfolders since from now on UFF will not be
the only available force field anymore. I updated the
relevant CMakeLists.txt files accordingly.

Paolo
2013-08-18 09:11:29 +02:00