Commit Graph

93 Commits

Author SHA1 Message Date
Nadine Schneider
0132cc72e9 Merge branch 'newCanon' into master3
Conflicts:
	Code/GraphMol/SmilesParse/test.cpp
2015-04-10 10:25:47 +02:00
Nadine Schneider
5d963846b8 merge 2015-04-10 09:44:18 +02:00
Greg Landrum
8c3d698472 remove some compiler warnings 2015-03-22 18:13:58 +01:00
Greg Landrum
3cd4001615 Support releasing the GIL during some of the longer operations we just sped up. 2015-03-15 05:20:30 +01:00
Greg Landrum
c1b809f255 add MMFF.h to the list of headers 2015-03-12 03:47:43 +01:00
Greg Landrum
17c89e0e70 support the same multi-threaded minimization with MMFF 2015-03-11 06:09:53 +01:00
Greg Landrum
f80532c0fd update the multithreaded UFF helper to not construct the force field for each conformer 2015-03-11 03:26:50 +01:00
Greg Landrum
1792cf23f9 expose the UFFOptimizeMoleculeConfs to python 2015-03-10 06:17:51 +01:00
Greg Landrum
d0d0bbf495 fix a problem with the multi-threading
add more tests.
2015-03-10 06:16:58 +01:00
Greg Landrum
7d115c5c3a initial pass at supporting multi-threaded minimization. This may not be where we want to land, but it does seem to work 2015-03-10 04:10:02 +01:00
Greg Landrum
ca5c77cc50 add a UFFOptimizeMoleculeConfs() method 2015-03-09 07:06:47 +01:00
Greg Landrum
f41ea05edf also return energy from UFFOptimizeMolecule() 2015-03-09 06:20:15 +01:00
Greg Landrum
35aa4d0de2 add UFFOptimizeMolecule() to C++ API 2015-03-09 06:09:59 +01:00
Greg Landrum
a0a02b4a99 merge on master 2015-02-03 16:41:46 +01:00
Paolo Tosco
30f1dc1bef - Fixed two trivial bugs that went unnoticed so far because that
part of code is never executed running the validation suite
  (and probably hardly executed at all), but still worth fixing!
2015-01-29 13:35:58 +00:00
Greg Landrum
2d9f809770 fix some broken tests; pyDistGeom is still failing 2015-01-23 06:43:12 -05:00
Paolo Tosco
eac634383e - fixed two minor bugs in MMFF code concerning the Badger-Herschbach-Laurie
equations.
2015-01-21 22:44:11 +00:00
Brian Kelley
95a92282d1 Dictionary access is saniztized and optimized.
o rdkit gains a RDKit::common_properties namespace that contains common string value properties

 o Dict.h and below gain getPropIfPresent that attempts to retrieve a property and returns
  true/false on success or failure.  This is used to optimize access.

 o rdkit learns how to pass property keys by reference, not value.

A new namespace has been added to RDKit, common_properties
that contains the std::string values for commonly used
properties.  This helps to avoid typos in string values
but also avoids a creation of std::strings from character
values.  All accessors (has/get/clear and getPropIfPresent) now pass
the key by reference.

Additionally, getPropIfPresent removes the double lookup
of hasProp/getProp which can be a significant speedup
in the smiles and smarts parsers (10-20%)
2015-01-15 12:23:29 -05:00
Greg Landrum
fd127bb10e add missed include file 2014-11-17 08:27:05 +01:00
Greg Landrum
6c373ef11f - added C++ and Python helpers to retrieve force-field parameters
(i.e., equilibrium distances/angles and force constants) from
  UFF and MMFF in response to two requests recently appeared
  on the RDKit-discuss mailing list:
  http://sourceforge.net/p/rdkit/mailman/message/32953737/
  http://sourceforge.net/p/rdkit/mailman/message/32880156/
- did some clean up on the MMFF code
- NB there are two ABI changes:
  1) StretchBendContrib(ForceField *owner,
       const unsigned int idx1, const unsigned int idx2, const unsigned int idx3,
       const MMFFStbn *mmffStbnParams, const MMFFAngle *mmffAngleParams,
       const MMFFBond *mmffBondParams1, const MMFFBond *mmffBondParams2);
     previously was:
     StretchBendContrib(ForceField *owner,
       const unsigned int idx1, const unsigned int idx2, const unsigned int idx3,
       const std::pair<bool, const MMFFStbn *> mmffStbnParams,
       const MMFFAngle *mmffAngleParams, const MMFFBond *mmffBondParams1,
       const MMFFBond *mmffBondParams2);
  2) std::pair<double, double> calcStbnForceConstants(const MMFFStbn *mmffStbnParams);
     previously was:
     std::pair<double, double> calcStbnForceConstants
       (const std::pair<bool, const MMFFStbn *> mmffStbnParams);
  The two changes are NOT mandatory - however, both the StretchBendContrib constructor
  and calcStbnForceConstants(), though public, are basically "internal" method that
  most likely no-one has ever invoked. Given that the current API is MUCH better
  and cleaner, I would really advise for the new version.
2014-11-17 05:51:20 +01:00
Greg Landrum
77bb8ee681 fix MMFF atom-type initialization problem with Hs 2014-08-25 10:15:14 +02:00
Paolo Tosco
1c3a03961d - fixed a bug which caused the MMFF parameterization code
to segfault when a system not listed in the MMFFBndk
  table was found. THe Herschbach-Laurie fallback according
  to MMFF.V was implemented and a relevant test was added
  in testMMFFHelpers.cpp
2014-08-18 16:11:03 +01:00
ptosco
ba4a48ce05 - fixed a bug in Code/ForceField/MMFF/testMMFFForceField.cpp
- fixed a bug in Code/GraphMol/ForceFieldHelpers/MMFF/AtomTyper.cpp
  which caused misassignment of atom types in CYGUAN01 upon shuffling
  the order of atoms in the validation SDF files
- added checks for acos and asin function parameters to be within
  a (-1, 1) range
2014-06-01 16:23:03 +01:00
ptosco
13bb831dd5 - removed a few spurious, though non harmful, "const" keywords from
Code/ForceField/MMFF/Params.h, Code/ForceField/UFF/Params.h,
  Code/GraphMol/ForceFieldHelpers/MMFF/AtomTyper.cpp
  and Code/GraphMol/ForceFieldHelpers/MMFF/AtomTyper.h (I realized
  their uselessness thanks to a warning issued by Intel C++ compiler)
- refactored O3A code
- added the possibility to set weighted constraints on selected
  atom pairs
- added an option to carry out local-only optimization
2014-04-15 22:13:54 +01:00
ptosco
5b2f6763a5 - Added the std::fstream::binary flag in testMMFFForceField.cpp
whose absence might cause intermittent problem in parsing the
  logs on Windows due to tellg/seekg not correctly handling CR/LF
- Fixed the code for assigning the HOCN MMFF94 atom type
  (thanks to Toby Wright for reporting this)
- Added a missing copyright notice in testMMFFForceField.h
2014-04-14 23:57:22 +01:00
Greg Landrum
f700cc7fb0 Fixes: #224 2014-03-01 05:58:31 +01:00
Greg Landrum
4ace62f019 remove more clang warnings 2013-12-02 05:10:23 +01:00
Greg Landrum
60a68669da remove more clang warnings 2013-12-02 04:46:46 +01:00
Greg Landrum
3e70352f7a Fixes #162 2013-11-28 07:49:22 +01:00
Greg Landrum
6f08c4e4d7 Test multi-thread safety for UFF and MMFF94. Both pass
Fixes #160
2013-11-20 06:42:27 +01:00
Greg Landrum
2f6c55f8cc O3A memory leak plugged
Part of #159
2013-11-20 04:29:14 +01:00
Greg Landrum
1c6fd07cfd small MMFF memory leaks plugged
Part of #159
2013-11-20 03:15:53 +01:00
Greg Landrum
1acae67cba Merge branch 'master' into PDB_29September2013 2013-10-08 05:51:57 +02:00
ptosco
d402867610 - fixed a bug in the MMFF stretch-bend gradient calculation
- put in better evidence the extents of the UFF hack so that it can be
  easily commented out if required
2013-10-07 17:53:20 +02:00
Greg Landrum
c95d57aa54 add in-place form of addHs() 2013-10-04 07:58:53 +02:00
ptosco
1957baadb0 - replaced the call to sanitizeMMFFMol() in the MMFFMolProperties
constructor (which is overkill, if the molecule had already been
  sanitized) with a call to MolOps::Kekulize(). Thus it is not
  necessary to call Kekulize() either from Python or from C++,
  and no changes are required to the scripts/source codes
  previously used for UFF
- removed the code which throws an exception asking to reload the
  molecule with sanitize=false since it is not necessary:
  only one test in the MMFF validation suite fails if the
  molecule is aromatized and then re-kekulized (CIKSEU10), and
  it represents a case where the position of double bonds in
  a conjugated, non-aromatic system makes a difference for atom
  type assignments, which in general is a nonsense. This is not
  due to a bug in the code, but rather depends on MMFF atom
  typing rules. Hence, I kept the sanitize=false and the call to
  sanitizeMMFFMol() in testMMFFForceField.cpp, but I would not
  generalize this requirement to "normal" molecules, because it
  is really not necessary, since you do not have a reference
  kekulization to refer to in the real world.
- updated Docs/Book/GettingStartedInPython.rst accordingly
- updated tests accordingly
2013-10-01 23:16:15 +02:00
Greg Landrum
d6faad6e40 more mmff api consistency 2013-09-29 07:58:34 +02:00
ptosco
57147304bb - Replaced the setupMMFFForceField() function (which returned a pointer
to a newly allocated MMFFMolProperties object) with a simple
  constructor of the MMFFMolProperties object
- Replaced in a few MMFF-related functions the "ROMol *" argument
  with a "ROMol &" argument for consistency with similar RDKit
  functions
- Renamed the SetupMMFFForceField() function in Python into
  GetMMFFMolProperties() for consistency
- Updated the MMFF tests according to the aforementioned changes
  in the API
2013-09-28 18:45:51 +02:00
ptosco
3646270a49 - Implemented a hack to improve the UFF-generated geometries of 3- and
4-membered rings containing sp2 atoms. The hack consists in
  altering on-the-fly the theta0 equilibrium angle, depending on
  ring size and collocation of the two edges of the angle (i.e.,
  both edges inside the ring or one inside and one outside)
- Added a relevant test in
  Code/GraphMol/ForceFieldHelpers/UFF/testUFFHelpers.cpp
2013-09-24 14:45:08 +02:00
Greg Landrum
2efd05a7e3 Fixes #105 2013-09-23 01:38:14 -04:00
ptosco
ec8eb5a1bf - Changed all occurrences of RDKit::PI into M_PI
- added #ifdef M_PI (...) #endif in all relevant places
- made length() and sqLength() method consistent
  with respect to usage of pow(x, 2) vs x*x in
  Code/Geometry/point.h
- removed gzip-related boost.iostreams dependency and
  replaced with portable "cmake -E tar xzf" command
  in Code/ForceField/MMFF/CMakeLists.txt
2013-09-20 17:45:41 +02:00
Greg Landrum
3a59557c91 fix a typo introduced in this morning's refactoring 2013-09-20 14:54:29 +02:00
Greg Landrum
f4efe253cb get windows builds working 2013-09-20 04:46:49 +01:00
Greg Landrum
219d6ea690 allow the MMFF optimization code to be called a second time on a molecule 2013-09-18 05:01:38 +02:00
Greg Landrum
8924ffd1ef modify sanitizeMMFFMol() so that it no longer throws when sanitization fails;
remove precondition test for sanitizeMMFFMol() failure from setupMMFFForceField() in favor of throwing the appropriate exception
2013-09-18 04:51:49 +02:00
ptosco
b1acab59b0 - I have made MMFF atom typing more robust since I realized that
incorrect typing might arise when hydrogens were not added after
  generating 3D coordinates from SMILES strings; now all 761 test molecules
  are correctly typed no matter whether hydrogens are explicit or implicit

- MMFF test suite: I have cut down to the bare essential the
  MMFF94/MMFF94s reference log files, but their size could be reduced only
  by about 30%. It could have been reduced more converting multiple spaces
  into a tab, but the MMFF94 file (the larger one) would still be around
  11 MB, and human readability would be greatly impaired. Hence I decided
  to keep the spaces and gzip the reference logs, which reduces their
  combined size to ~ 3.5 MB, which I think is fine; the test program checks
  if the gunzipped files already exist, otherwise it gunzips them upfront.
  While cutting, I also sorted the molecules in the same order as in the
  SDF/SMILES files, so that it runs about 10 times faster than before.
  Now the test runs on MMFF94 only (MMFF94s only concerns different OOP
  parameters, there are no algorithmic differences, so as long as one does
  not alter the original parameters it can be safely skipped), computing
  every 4th molecule, and it runs in 12 seconds on my laptop. Running
  all molecules takes ~ 50 seconds, but I think it is rather overkill,
  and I would keep it as it is.

- I have added a test suite for MMFF ForceFieldHelpers (like the one
  already existing for UFF); I have also complemented the Python wrapper
  test suite for ForceFieldHelpers with a few tests for MMFF.

- I have written Python wrappers for the MMFF-related functionality;
  while doing that I realized that many of the wrapper code relocations
  that I made in my previous pull request were not necessary/appropriate,
  so I reverted them. The only difference from the UFF Python API is that,
  just like for the C++ API, in addition to the PyForceField object there
  is a PyMMFFMolProperties object which is created before constructing the
  force field itself; the PyMMFFMolProperties is necessary to set (e.g.,
  dielectric constant, dielectric model) or get (e.g., atom type, formal
  and partial charge) some MMFF properties which are not present in UFF,
  while preserving binary compatibility of the libraries. Probably you
  remember that we discussed about setting atom type and charge properties
  with SetProp besides populating the MMFFMolProperties object, in order
  to allow easy access to Python users. However, I think that the solution
  I adopted is preferrable since it is more consistent with the C++ API,
  it enables faster access to properties and it allows tailoring the MMFF
  environment (i.e., choosing MMFF94/MMFF94s, setting the verbosity level,
  including/excluding terms from the MMFF equation, setting dielectric
  constant/model) just as from C++.

  The MMFF-related Python functions I implemented are:

  * MMFFOptimizeMolecule(mol, mmffVariant = "MMFF94", maxIters = 200,
      nonBondedThresh = 100.0, confId = -1, ignoreInterfragInteractions
      = true)

    uses MMFF to optimize a molecule's structure (just like
    UFFOptimizeMolecule)

  * SanitizeMMFFMol(mol)

    sanitizes a molecule according to MMFF requirements

  * SetupMMFFForceField(mol, mmffVariant = "MMFF94", mmffVerbosity = 0)

    returns a PyMMFFMolProperties object for a molecule; the
    PyMMFFMolProperties object is required by MMFFGetMoleculeForceField()
    and can be used to get/set MMFF properties

  * MMFFGetMoleculeForceField(mol, pyMMFFMolProperties,
      nonBondedThresh = 100.0, confId = -1, ignoreInterfragInteractions
      = true)

    returns a MMFF force field for a molecule (just like
    UFFGetMoleculeForceField)

  * MMFFHasAllMoleculeParams(mol)

    checks if MMFF parameters are available for all of a molecule's atoms
    (just like UFFHasAllMoleculeParams)

  There are also a few methods connected to the PyMMFFMolProperties class
  which mirror those available from C++ for the MMFFMolProperties class:

  * GetMMFFAtomType(idx)

    Retrieves MMFF atom type for atom with index idx

  * GetMMFFFormalCharge(idx)

    Retrieves MMFF formal charge for atom with index idx

  * GetMMFFPartialCharge(idx)

    Retrieves MMFF partial charge for atom with index idx

  * SetMMFFDielectricModel(dielModel = 1)

    sets the DielModel MMFF property (1: constant; 2: distance-dependent;
    defaults to constant)

  * SetMMFFDielectricConstant(dielConst = 1.0)

    Sets the DielConst MMFF property (defaults to 1.0)

  * SetMMFFBondTerm(state = True)

    Sets the bond term to be included in the MMFF equation (defaults
    to True)

  * SetMMFFAngleTerm(state = True)

    Sets the angle term to be included in the MMFF equation (defaults
    to True)

  * SetMMFFStretchBendTerm(state = True)

    Sets the stretch-bend term to be included in the MMFF equation (defaults
    to True)

  * SetMMFFOopTerm(state = True)

    Sets the out-of-plane bend term to be included in the MMFF equation
    (defaults to True)

  * SetMMFFTorsionTerm(state = True)

    Sets the torsional term to be included in the MMFF equation (defaults
    to True)

  * SetMMFFVdWTerm(state = True)

    Sets the Van der Waals term to be included in the MMFF equation
    (defaults to True)

  * SetMMFFEleTerm(state = True)

    Sets the electrostatic term to be included in the MMFF equation
    (defaults to True)

  * SetMMFFVariant(mmffVariant = "MMFF94")

    Sets the MMFF variant to be used ("MMFF94" or "MMFF94s"; defaults to
    "MMFF94")

  * SetMMFFVerbosity(verbosity = 0)

    Sets the MMFF verbosity (0: none; 1: low; 2: high; defaults to 0)

  Hence, most users will do something like this to optimize a molecule
  structure obtained from a SMILES string:

  from rdkit import Chem
  from rdkit.Chem import AllChem

  m = Chem.MolFromSmiles("O=C(C)c1cccnc1", False)
  AllChem.SanitizeMMFFMol(m)
  m2 = Chem.AddHs(m)
  AllChem.EmbedMolecule(m2)
  # Opt
  AllChem.MMFFOptimizeMolecule(m2)
  print >>file('structure_min.sdf','w'), Chem.MolToMolBlock(m2)

  Those willing to play a bit more with MMFF properties may do the
  following:

  from rdkit import Chem
  from rdkit.Chem import AllChem

  m = Chem.MolFromSmiles("O=C(C)c1cccnc1", False)
  AllChem.SanitizeMMFFMol(m)
  m2 = Chem.AddHs(m)
  AllChem.EmbedMolecule(m2)
  pyMP = AllChem.SetupMMFFForceField(m2)
  pyMP.SetMMFFVariant("MMFF94s")
  pyMP.SetMMFFDielectricModel(2)
  pyFF = AllChem.MMFFGetMoleculeForceField(m2, pyMP)
  pyFF.Minimize()
  print >>file('structure_min.sdf','w'), Chem.MolToMolBlock(m2)
  print 'Energy = {0:12.4f}'.format(pyFF.CalcEnergy())
  i = 0
  for i in range(0, m2.GetNumAtoms()):
    print '{0:4d} {1:4d} {2:8.4f} {3:8.4f}'.format(i + 1,
      int(pyMP.GetMMFFAtomType(i)),
      float(pyMP.GetMMFFFormalCharge(i)),
      float(pyMP.GetMMFFPartialCharge(i)))

- OOP backport to UFF. I added the inversion term to the UFF
  implementation following the original UFF paper by Rappe'. I have already
  modified the figures in a couple of test files to reflect the new energy
  values.

- 2-bit neighbor matrix and graph-based angle enumeration now reflect
  the MMFF implementation.
2013-09-16 12:08:02 +02:00
Greg Landrum
0212d82468 compiles and passes all tests 2013-08-19 05:53:49 +02:00
ptosco
3f4297fa44 Created a new MMFF branch. I moved some files/folders from
Code/ForceField and Code/GraphMol/ForceFieldHelpers to the
respective UFF subfolders since from now on UFF will not be
the only available force field anymore. I updated the
relevant CMakeLists.txt files accordingly.

Paolo
2013-08-18 09:11:29 +02:00
Greg Landrum
b4f2ec374e Fixes: #61 2013-07-04 18:00:09 +02:00
Greg Landrum
e29aa466db github-28: fix a doc wart 2013-05-09 14:45:12 +00:00