* CIPLabeler performance: Store vector of bonds
CIPLabeler refers to bonds by index over and over again. This
causes a measurable hit in performance in findConfigs() because
we iterate over a bitset of "allowed" bonds. For very large
molecules with many bonds, this can be a rate-limiting step!
This affects many PDB-sized structures.
2J3N goes from 0.7s to 0.25s with this change.
In another example, the findBondWithIdx() call was taking 500ms
of a 700ms call (after the performance update in #9171 was
implemented).
* yikes, XXL reserve
thanks, greg
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
---------
For example, PDB ID 2ZP8 goes from 133s to 53s on my M3 laptop.
Boost graph edges (our bonds) cannot be accessed by index; they
must be searched for linearly. This protein has only 27,000
bonds (after addition of explicit hydrogens), but that's still
a lot.
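A minimal sketch of the idea, not the actual RDKit code: the Bond struct, BondList, findBondLinear(), buildBondTable() and makeChain() are all hypothetical names, and the sketch assumes bond indices run from 0 to n-1. It contrasts a linear search per lookup with a table of bond pointers that is built once and then indexed directly.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical stand-in for a bond; this is not RDKit's Bond class.
struct Bond {
  unsigned idx;
  unsigned beginAtom;
  unsigned endAtom;
};

// Hypothetical container whose elements can only be walked in order,
// mimicking a graph that offers no random access to its edges.
using BondList = std::list<Bond>;

// Linear search: every call may walk the whole list.
const Bond *findBondLinear(const BondList &bonds, unsigned idx) {
  for (const auto &b : bonds) {
    if (b.idx == idx) {
      return &b;
    }
  }
  return nullptr;
}

// Build the lookup table once; each later access is a plain vector index.
// Assumes indices are in [0, bonds.size()); others are skipped.
std::vector<const Bond *> buildBondTable(const BondList &bonds) {
  std::vector<const Bond *> table(bonds.size(), nullptr);
  for (const auto &b : bonds) {
    if (b.idx < table.size()) {
      table[b.idx] = &b;
    }
  }
  return table;
}

// Helper for experiments: a chain where bond i joins atoms i and i + 1.
BondList makeChain(unsigned n) {
  BondList bonds;
  for (unsigned i = 0; i < n; ++i) {
    bonds.push_back({i, i, i + 1});
  }
  return bonds;
}
```

With n bonds and a loop that touches every bond, the linear version does on the order of n*n comparisons, while the table costs one pass to build and constant time per access afterwards. The table holds pointers into the list, so it must not outlive it.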
* atropisomer handling added
* fixed non-used variables, linking directives
* BOOST LIB start/stop fixes, linking fix
* Fixes for RDKIT CI errors
* minimalLib fix
* changed vector<enum> for java builds
* check for extra chars in CIP labeling
* removed wrong deprecated message
* fix ostrstream output error?
* restored _ChiralAtomRank to lowercase first letter
* changes for merged master
* Fixed catch label for new Catch package
* update expected psql results
* get swig wrappers building
* restore MolFileStereochem to FileParsers
* fix java wrapper for reapplyMolBlockWedging
* some suggestions
* move a couple functions out of Bond
* Merge branch 'master' into pr/atropisomers2
* merged master
* Renamed setStereoanyFromSquiggleBond
* atropisomers in cdxml, rationalize atrop wedging, stereoGroups in drawMol
* fix for CI build
* attempt to fix java build in CI
* attempt to fix java build in CI #2
* New routine to remove non-explicit 3D-generated chirality
* changed to use pair for atrop atoms and related bonds
* Changes as per PR reviews
* PR review responses
* PR review response - more
* Fix merge from master
* fixing java ci after merge
* Updated the help doc for atropisomers
* update the atropisomer docs
* improve the images
* add the source CXSMILES
---------
Co-authored-by: greg landrum <greg.landrum@gmail.com>
* Fixes #4996
also switches to using the GraphMol version of catch_main.cpp so builds are faster
* Fixes #4998
we should probably discuss this one
* compare with previous results
* add port of centres
* Several changes:
- Added a test based on RDKit issue 2984
(default RDKit fails it, this gets it right)
- Use bond directions for bond stereo (label is no longer required)
- Fix bugs in rules 4b and 5new
- Fix some mem errors
- clang-formatted
- some other minor cleanups
* Several changes and some improvements:
- Added LGPL license, as well as a mention in the doc.
- Fix/update/add some comments
- Fix typo/bug in Mancude calculation
- Fix bug in rules 4b, 5New
- Fix Sp2 Bond dir reference
- Re clang-format
- other minor changes suggested by Dan
* Another bunch of changes:
- require integer-order bonds; kekulize when required
- fix fraction comparison
- rename sq Cis/Trans e/z
- replace queues with vectors
- update copyright notices
- revert LGPL changes
- fix Asymmetric typo
* move to separate lib/mod, add python validation test
* Moving away from the original implementation:
- Rename to CIPLabeler
- Remove the abstraction layer
- Remove some stats stuff
- Push some CIPMol functions down to Node
- Use RDKit's isotope info
* Another bundle of changes. The most relevant ones:
- fix parity translation
- use cis trans as bond reference -- breaks #2984 test
- kill a lot of unused code
- use lists for queues
- store nodes and edges in digraph
- add prefixes to class data member names
- update changeRoot() test
- use fastFindRings() for mancude rings
- update docs
- add references to the scientific paper
- Document the Mancude functions
- Fix Mancude atom types and their comments
- remove mol data member from SequenceRule
- replace Fraction with boost::rational
- update comments, docstrings and the doc
* fix building the test
* Changes here include:
- adding bitset overload for the labeling function
- python wrap of the overload
- handling trigonal pyramids with implicit H
- setting bond labels sets stereo atoms, cis/trans
- nix LEFT/RIGHT/TOGETHER/OPPOSITE constants
- don't use GLOB in cmake
- a decent amount of refactoring
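A rough sketch of what a mask-based overload can look like, under stated assumptions: Atom, labelAtom() and both assignLabels() overloads are hypothetical names, and std::vector<bool> stands in for whatever bitset type the real code uses. Only entries whose bit is set get processed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical atom record; a real label would come from the CIP rules.
struct Atom {
  std::string label;
};

// Placeholder for the real labeling work.
void labelAtom(Atom &atom) { atom.label = "labeled"; }

// Label every atom.
void assignLabels(std::vector<Atom> &atoms) {
  for (auto &atom : atoms) {
    labelAtom(atom);
  }
}

// Overload: label only the atoms whose bit is set in the mask.
// Positions beyond the end of the mask are treated as not allowed.
void assignLabels(std::vector<Atom> &atoms, const std::vector<bool> &allowed) {
  for (std::size_t i = 0; i < atoms.size() && i < allowed.size(); ++i) {
    if (allowed[i]) {
      labelAtom(atoms[i]);
    }
  }
}
```

Calling assignLabels(atoms, {true, false, true, false}) on four atoms touches only the first and third, which is useful when a caller cares about a few centers in a large molecule.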
* Minor edits to new_CIP_labeling (#6)
* Some changes for clarity
Added some documentation and changed some variable names to match
my understanding. Also ran clang-tidy to ensure that all blocks
were brace-enclosed.
* Return a reference instead of a copy for performance
This is called many times and showed up after some light
profiling. This change bumped throughput by about 20%
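To illustrate the kind of change described here (a sketch only; the Node class and its accessors are hypothetical and not taken from the CIPLabeler sources), compare an accessor that returns its container by value with one that returns a const reference:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical class holding a list of edge ids.
class Node {
 public:
  explicit Node(std::vector<int> edges) : d_edges(std::move(edges)) {}

  // Returns by value: every call allocates and copies the whole vector.
  std::vector<int> getEdgesCopy() const { return d_edges; }

  // Returns a const reference: no allocation, no copy.
  const std::vector<int> &getEdges() const { return d_edges; }

 private:
  std::vector<int> d_edges;
};
```

A loop such as `for (int e : node.getEdges())` then reads the stored vector directly. The reference is only valid while the Node is alive, which is the usual trade-off of this pattern.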
* move out of GraphMol
* move .hpp headers to .h
* update documentation; add label set of atoms test
* Address comments:
- Added references to centres to CIPLabeler.h and Python Wrap.
- Update validation test to skip sanitization.
- Document mancude fractional atomic number calculation.
- Use unittest assertions in python test.
- Update mancude docstrings to 'resonance' instead of 'tautomers'.
- Rename prioritise() to prioritize().
- Add postcondition to check carriers size in Tetrahedral.cpp.
- Use getNeighbors() in Tetrahedral.cpp.
- Move findStereoAtoms to Chirality namespace.
- Move code back into GraphMol.
- Fix typos and reformat doc.
* More comments:
- Mention why we use boost's unordered map rather than the std one.
- Fix include in Python wrapper.
* Addressed second batch of comments:
- fix the bug in rule 4b
- fix docstring for rule 2
- move atomic mass calculation from rule 2 to node
- addressed some build warnings
- simplify sp2bond::label(comp)
- add start/end atoms to Sp2Bond constructor
- update system/local includes
Co-authored-by: Dan N <dan.nealschneider@schrodinger.com>