In Code/Numerics/Optimizer/BFGSOpt.h, the gradient-convergence check
computed:

    double term = std::max(funcVal * gradScale, 1.0);
    ...
    test /= term;
    if (test < gradTol) return 0;

When funcVal (the current energy) is negative, funcVal * gradScale is
negative and std::max clamps the denominator to 1.0. The convergence
test therefore divides the gradient norm by 1 instead of by the
intended |E| * gradScale, which over-tightens the criterion by a factor
of |funcVal * gradScale| whenever |funcVal * gradScale| > 1.
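
To make the clamping concrete, here is a minimal standalone sketch of the two denominators. The helper names (clampedTerm, magnitudeTerm, overTighteningFactor) are hypothetical and do not exist in RDKit, and the sketch assumes gradScale is positive.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: denominator as quoted above. A negative
// funcVal * gradScale is clamped to 1.0 by std::max.
inline double clampedTerm(double funcVal, double gradScale) {
  assert(gradScale > 0.0);  // assumption of this sketch
  return std::max(funcVal * gradScale, 1.0);
}

// Hypothetical helper: denominator built from the magnitude of the energy.
inline double magnitudeTerm(double funcVal, double gradScale) {
  assert(gradScale > 0.0);  // assumption of this sketch
  return std::max(std::fabs(funcVal) * gradScale, 1.0);
}

// Ratio by which the clamped denominator inflates the convergence test
// relative to the magnitude-based one (1.0 means no difference).
inline double overTighteningFactor(double funcVal, double gradScale) {
  return magnitudeTerm(funcVal, gradScale) / clampedTerm(funcVal, gradScale);
}
```

With funcVal = -50.0 and gradScale = 0.5, clampedTerm returns 1.0 while magnitudeTerm returns 25.0, so the test value is 25 times larger than intended. For positive energies the two helpers agree.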
Negative energies are a normal mid-minimization state for force fields
that include stabilizing terms (MMFF94, UFF with charges, AMBER-style
potentials), so this affects realistic workloads: extra BFGS iterations
or, occasionally, hitting MAXITS and returning the "too many iterations"
status when convergence would otherwise have been reached.
The fix is to use |funcVal| in the denominator, matching the pattern
used three lines below ('std::max(fabs(pos[i]), 1.0)') and matching the
intended interpretation as a magnitude.
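
As a rough illustration of the corrected check, the sketch below wraps it in a free function. The function gradientConverged is hypothetical and not part of BFGSOpt.h. The loop body is elided in the snippet above, so the per-coordinate scaling here is an assumption modeled on the std::max(fabs(pos[i]), 1.0) pattern.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not RDKit API): returns true when the scaled
// gradient test falls below gradTol.
inline bool gradientConverged(const std::vector<double> &pos,
                              const std::vector<double> &grad,
                              double funcVal, double gradScale,
                              double gradTol) {
  assert(pos.size() == grad.size());
  // Use the magnitude of the energy so that negative energies are not
  // clamped to 1.0.
  const double term = std::max(std::fabs(funcVal) * gradScale, 1.0);
  double test = 0.0;
  for (std::size_t i = 0; i < pos.size(); ++i) {
    // Assumed scaling: largest gradient component, weighted by the
    // magnitude of the matching coordinate (at least 1.0).
    const double temp =
        std::fabs(grad[i]) * std::max(std::fabs(pos[i]), 1.0);
    test = std::max(test, temp);
  }
  test /= term;
  return test < gradTol;
}
```

Under this sketch a state with funcVal = -1000 and funcVal = +1000 yields the same verdict, which is the symmetry the fix is meant to restore.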
A new test case 'testBFGSOptimizationNegativeEnergy' in
testOptimizer.cpp minimizes a 2D quadratic whose value is always
negative along the convergence path and verifies the optimizer reaches
the analytic minimum.
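
For readers who want a feel for such a test without building RDKit, here is a self-contained sketch of an objective of that kind. Everything in it is an assumption for illustration: the specific quadratic, the function names, and the fixed-step gradient descent that stands in for the BFGS optimizer. It is not the code of testBFGSOptimizationNegativeEnergy.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical objective: f(x, y) = (x - 1)^2 + (y + 2)^2 - 100.
// Analytic minimum at (1, -2) with f = -100; f is negative wherever
// (x - 1)^2 + (y + 2)^2 < 100.
inline double quadEnergy(const double *pos) {
  const double dx = pos[0] - 1.0;
  const double dy = pos[1] + 2.0;
  return dx * dx + dy * dy - 100.0;
}

// Writes the gradient into grad and returns a scale factor of 1.0
// (assumed convention for this sketch).
inline double quadGradient(const double *pos, double *grad) {
  grad[0] = 2.0 * (pos[0] - 1.0);
  grad[1] = 2.0 * (pos[1] + 2.0);
  return 1.0;
}

// Stand-in minimizer (plain gradient descent, NOT BFGS). Returns 0 on
// convergence and 1 if maxIts iterations were used up.
inline int descend(double *pos, double gradTol, unsigned int maxIts,
                   unsigned int &numIters) {
  assert(gradTol > 0.0);
  double grad[2];
  for (numIters = 0; numIters < maxIts; ++numIters) {
    const double gradScale = quadGradient(pos, grad);
    const double funcVal = quadEnergy(pos);
    // Magnitude-based denominator, as in the fix described above.
    const double term = std::max(std::fabs(funcVal) * gradScale, 1.0);
    double test = 0.0;
    for (int i = 0; i < 2; ++i) {
      test = std::max(
          test, std::fabs(grad[i]) * std::max(std::fabs(pos[i]), 1.0));
    }
    if (test / term < gradTol) return 0;
    // Step of 0.25 halves the distance to the minimum each iteration.
    for (int i = 0; i < 2; ++i) pos[i] -= 0.25 * grad[i];
  }
  return 1;
}
```

Starting from (0, 0), where the energy is -95, every iterate stays inside the region of negative energy, so the denominator is exercised with funcVal < 0 on each pass.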
git blame attributes the original line to commit e08e0d16d (Nov 2015),
when the optimizer was restructured; the surrounding code does use
absolute values, so this reads as an oversight rather than an
intentional choice.
RDKit
What is it?
The RDKit is a collection of cheminformatics and machine-learning software written in C++ and Python.
- BSD license - a business friendly license for open source
- Core data structures and algorithms in C++
- Python 3.x wrapper generated using Boost.Python
- Java and C# wrappers generated with SWIG
- JavaScript (generated with emscripten) and CFFI wrappers around important functionality
- 2D and 3D molecular operations
- Descriptor and Fingerprint generation for machine learning
- Molecular database cartridge for PostgreSQL supporting substructure and similarity searches as well as many descriptor calculators
- Cheminformatics nodes for KNIME
- Contrib folder with useful community-contributed software harnessing the power of the RDKit
Installation and getting started
If you are working in Python and using conda (our recommendation), installation is super easy:
$ conda install -c conda-forge rdkit
You can then take a look at our Getting Started in Python guide.
More detailed installation instructions are available in Docs/Book/Install.md.
Documentation
Documentation is available on the RDKit page and in the Docs folder on GitHub.
The RDKit blog often has useful tips and tricks.
Support and Community
If you have questions, comments, or suggestions, the best places for those are:
If you've found a bug or would like to request a feature, please create an issue.
We also have a LinkedIn group.
We have a yearly user group meeting (the UGM) where members of the community give presentations and lightning talks on things they've done with the RDKit. Materials from past UGMs, which can be quite useful, are also online:
- 2012 UGM, London
- 2013 UGM, Hinxton
- 2014 UGM, Darmstadt
- 2015 UGM, Zurich
- 2016 UGM, Basel
- 2017 UGM, Berlin
- 2018 UGM, Cambridge
- 2019 UGM, Hamburg
- 2020 UGM, virtual
- 2021 UGM, virtual
- 2022 UGM, Berlin
- 2023 UGM, Mainz
- 2024 UGM, Zurich
License
Code released under the BSD license.