Code/ForceField/MMFF/testMMFFForceField.h
- re-prepared the SDF files from their MOL2 counterparts present in the
original MMFF validation suites. For this purpose a C++ program was
written which only uses information in MOL2 files and in .fc files
provided by Halgren. The C++ program does not depend on RDKit.
- re-prepared the SMILES files from their SDF counterparts using
a Python script which calls MolToSmiles()
- found an issue which affects 2 files in the test suite, namely
ERULE_05 and PO02A (only the hypervalent notation). The issue is
connected with removal of hydrogens bonded to phosphorus and
appears to be fixed by the modifications in Code/GraphMol/AddHs.cpp
and Code/GraphMol/SmilesParse/SmilesWrite.cpp. This issue is
independent of the changes in the SDF files; indeed, it has
always been present, and had been previously addressed by
manual correction of the two SMILES strings
data races in the multithreaded test
- Removed a spurious #include in Code/ForceField/Wrap/ForceField.cpp
- Restored caching in Code/GraphMol/Descriptors/Crippen.cpp
and scoring functions
- Added cost, weight and scoring functions using atom-based
Crippen logP contributions
- Added relevant tests for the new functionality
- Fixed a bug in the O3A::trans() method (it returned the
weighted RMSD rather than the unweighted one returned by
O3A::align()). The API of the method has also changed: it
now takes a reference to a Transform3D object instead of
returning a pointer to a newly created one
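To illustrate the O3A::trans() change described above, here is a
minimal sketch of the "fill a caller-owned object" pattern. It is not
the RDKit code: ToyAligner and the Transform3D struct below are
hypothetical stand-ins, and the transform is simply set to identity.
The sketch only shows the out-parameter signature and that the
returned value is the unweighted RMSD.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a 4x4 transform (row-major); not RDGeom::Transform3D.
struct Transform3D {
  std::array<double, 16> data{};
  void setToIdentity() {
    data.fill(0.0);
    for (std::size_t i = 0; i < 4; ++i) data[i * 4 + i] = 1.0;
  }
};

// Hypothetical aligner holding squared distances between matched atom
// pairs and the weight assigned to each pair.
class ToyAligner {
 public:
  ToyAligner(std::vector<double> sqDists, std::vector<double> weights)
      : d_sqDists(std::move(sqDists)), d_weights(std::move(weights)) {
    assert(!d_sqDists.empty() && d_sqDists.size() == d_weights.size());
  }

  // Unweighted RMSD: every matched pair counts equally.
  double rmsd() const {
    double sum = 0.0;
    for (double d : d_sqDists) sum += d;
    return std::sqrt(sum / static_cast<double>(d_sqDists.size()));
  }

  // Weighted RMSD: the value the old trans() returned by mistake.
  double weightedRmsd() const {
    double sum = 0.0, wSum = 0.0;
    for (std::size_t i = 0; i < d_sqDists.size(); ++i) {
      sum += d_weights[i] * d_sqDists[i];
      wSum += d_weights[i];
    }
    return std::sqrt(sum / wSum);
  }

  // New-style signature: the caller owns the transform, so nothing is
  // heap-allocated and nothing has to be deleted afterwards.
  double trans(Transform3D &t) const {
    t.setToIdentity();  // placeholder; a real aligner would store the fit here
    return rmsd();      // unweighted, consistent with align()
  }

 private:
  std::vector<double> d_sqDists;
  std::vector<double> d_weights;
};
```

With this shape the caller declares a Transform3D on the stack and
passes it in, which removes the ownership question raised by a
function that returns a pointer to a newly created object.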
Important points:
-----------------
- The constructors now accept optional pointers to
MolHistogram objects instead of pointers to arrays of
double. I think it is better for performance, since
rebuilding the histogram involves running through two
nested loops over all atoms, even though the
3DDistanceMat is provided to the O3A constructor.
This change breaks binary compatibility with previous
C++ programs linking to the RDKit MolAlign library; I do
not think this is a big issue, and while I was there I made
other changes which cause binary incompatibility.
If needed, we may preserve binary compatibility by
reverting the MolHistogram change and the other ones.
- From Python, nothing changed in the interface to previous
MMFFO3A functionality. It MIGHT be more appropriate to
change the "GetO3A()" function into "GetMMFFO3A()", but
I have not done that to avoid breaking existing scripts;
the choice is yours
- As of now, the code contains a number of conditional
compilation directives checking for the
USE_O3A_CONSTRUCTOR macro; if USE_O3A_CONSTRUCTOR is
defined, then the code is built with an alternative O3A
constructor which allows choosing whether one wishes
to use MMFF or Crippen descriptors. Otherwise, no
alternative constructor is built, but rather two
functions which return a pointer to an O3A object.
I prefer by far the first solution (namely, with
USE_O3A_CONSTRUCTOR defined), but again, the choice is
yours. The code is tested and works in both cases.
- Custom cost, weight and scoring functions can easily
be defined in external programs without rebuilding
RDKit, which allows flexibility. The new custom
functionality can be accessed by calling the "bigger"
O3A constructor.
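As an illustration of the last point, the following is a rough sketch
of how client code might plug its own functions into a constructor
that takes function pointers. Every name and signature here
(AtomPairData, WeightFunc, ScoringFunc, ToyO3A) is an assumption made
for the example and does not reproduce the actual O3A constructor;
the cost function is left out for brevity.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-atom descriptor values (e.g. logP contributions)
// for the probe and the reference molecule.
struct AtomPairData {
  std::vector<double> prbValues;
  std::vector<double> refValues;
};

// Hypothetical callback signatures; the real ones may differ.
using WeightFunc = double (*)(std::size_t prbIdx, std::size_t refIdx,
                              const AtomPairData &data);
using ScoringFunc = double (*)(std::size_t prbIdx, std::size_t refIdx,
                               double sqDist, const AtomPairData &data);

// Example "built-in" weight: pairs with similar descriptors weigh more.
double similarityWeight(std::size_t i, std::size_t j, const AtomPairData &d) {
  return 1.0 / (1.0 + std::fabs(d.prbValues[i] - d.refValues[j]));
}

// Example "built-in" score: the weight damped by a Gaussian of the distance.
double gaussianScore(std::size_t i, std::size_t j, double sqDist,
                     const AtomPairData &d) {
  return similarityWeight(i, j, d) * std::exp(-sqDist);
}

// Functions a client program could define on its own, with no change
// to the library: a flat weight and a hard distance cutoff.
double unitWeight(std::size_t, std::size_t, const AtomPairData &) {
  return 1.0;
}
double stepScore(std::size_t, std::size_t, double sqDist,
                 const AtomPairData &) {
  return sqDist < 1.0 ? 1.0 : 0.0;
}

// Hypothetical aligner whose "bigger" constructor accepts the callbacks.
class ToyO3A {
 public:
  ToyO3A(const AtomPairData &data, WeightFunc w, ScoringFunc s)
      : d_data(data), d_weight(w), d_score(s) {
    assert(w != nullptr && s != nullptr);
  }
  double weight(std::size_t i, std::size_t j) const {
    return d_weight(i, j, d_data);
  }
  double scorePair(std::size_t i, std::size_t j, double sqDist) const {
    return d_score(i, j, sqDist, d_data);
  }

 private:
  AtomPairData d_data;
  WeightFunc d_weight;
  ScoringFunc d_score;
};
```

Because the callbacks are plain function pointers, a program linking
against the library can supply unitWeight and stepScore (or anything
else with a matching signature) at construction time, while the
built-in functions remain available through the same entry point.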