40 Commits

Author SHA1 Message Date
Dan Nealschneider
67fc0708e5 CIPLabeler performance: Store vector of bonds (#9250)
* CIPLabeler performance: Store vector of bonds

CIPLabelling refers to bonds by index over and over again. This
causes a measurable hit in performance in findConfigs() because
we iterate over a bitset of "allowed" bonds. For very large
molecules with many bonds, this can be a rate-limiting step!

This affects many PDB-sized structures.

2J3N goes from 0.7s to 0.25s with this change.

I had another example for which the findBondWithIdx() call was
taking 500ms of a 700ms call (after the performance update
in #9171 was implemented)

* yikes, XXL reserve

thanks, greg

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>

---------

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2026-05-06 11:57:28 +02:00
Dan Nealschneider
1663989053 CIP labeller performance: Don't calculate auxiliary descriptors unnecessarily (#9171)
* CIP labeller: Don't calculate auxiliary descriptors unnecessarily

The first 3 rules (the constitutional rules) are pretty easy
to understand. After rule 3, we need to calculate auxiliary
stereo descriptors to break ties.

However, we _were actually_ calculating auxiliary stereodescriptors
for all centers! We should only need to calculate auxiliary
stereocenters for sites that are needed to break ties.

This cost time - it also caused errors if the auxiliary descriptors
needed a graph expansion, because bonds in the digraph might be
pointed in the wrong direction.

Example case PDB ID 4AXM
Before this commit, errored with "Could not calculate parity! Carrier mismatch"
after 14s. After this commit, completes successfully in 0.036s.
Labelled centers all match (for the centers that had labels in
the failure case).

Includes a test that I can imagine breaking with this optimization.
The reference labels are from before this change

* Ensure all "arms" of stereo bonds and atropisomer bonds are expanded

For tetrahedral centers, ranking using the constitutional rules
always expands as far as is needed (but no further). For SP2bond
and atropisomers, if the first side is not resolvable, the
second side is never visited.

If the constitutional rules don't resolve a side, we need to
label the auxiliary centers. It's important to label all
auxiliary centers that _will_ be visited, so we need to know
what centers will be visited.

This commit updates the label() call in SP2 and Atropisomer
bonds to always attempt to label both sides if using the
constitutional rule set.

The constitutional rules are cheap, and if they fail, we
always go on to the full rule set. It is not a savings to skip
the search on the second side if we're going to keep going
anyway!

Includes a test that reproduces Ricardo's example.

This has no measurable effect on performance relative to the
original solution

* If any parts of the center have been seen, label it.

I couldn't make an example hit this, but Ric is totally
theoretically right

* Greg's ranges suggestion #2

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>

* any_of for container search

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>

---------

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2026-05-06 06:12:50 +02:00
Ricardo Rodriguez
5f85fafb75 Fixes #9153 (#9154)
* add tests

* fix issue

* make test pass under both legacy and modern stereo

* remove the tests
2026-03-12 18:29:59 +01:00
Ricardo Rodriguez
86902488e9 Store CIP-ranked anchors after CIP labeling. (#9056)
* add the _CIPNeighborRanks property

* store CIP-ranked chiral neighbors

* store CIP-ranked SP2 bond and atropisomer anchors

* add a test

* boost headers in test

* add Atom::NOATOM

* add NOATOM test

* amend and clarify implicit H in Tetrahedral

* rename property

* rename property to _CIPNeighborOrder

* deprecate Chirality::StereoInfo::NOATOM
2026-01-29 18:23:44 +01:00
Ricardo Rodriguez
163dd42d4c implement hasPrimaryLabel (#9052) 2026-01-19 14:53:03 +01:00
Ricardo Rodriguez
d523c6d818 Make assignCIPLabels Ctrl+c interruptable (#8589)
* add ctrl+c interrupt to assignCIPLabels

* add missing include
2025-06-27 08:27:44 +02:00
Ricardo Rodriguez
d570dee093 CIP labeler: attempt to resolve "easy" stereo centers first (#8582)
* resolve easy chirality labels first

* add a test
2025-06-15 16:22:22 +02:00
Niels Maeder
3ec88d643a Add the safeSetattr to the rdMolFiles param objects (#8457)
* add the safeSetattr to the rdMolFile param objects

* 'fix' test

* removed unused param
2025-04-22 06:45:06 +02:00
Greg Landrum
86141183c1 Moving towards getting all tests to pass when using the new stereo code (#8409)
* Fixes #8379

* check in some working tests

* test passes

* test passes

* test passes

* test passes

* test passes

* ensure that the invariants flush the streams on failure

* tests pass

* test passes

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* Fixes #8391

* tests pass

* fix a test with legacy
not clear why this was not causing problems before

* make a test work

* Fixes #8396

* gcc builds work

* fingerprint tests pass

* mention backwards incompatible change

* fix a problem with FindMolChiralCenters

* more testing details

* enable the test status output

* Fixes #8432

fix a bug in double-bond stereo handling for template matching

* all depictor tests pass

* use the new-stereo chiral ranks in the depiction code

* always assign new-stereo chiral ranks

* make _ChiralAtomRank a computed property
This is analogous to _CIPRank

* tweak to the way the atom ordering is computed for 2D coordinate generation

* update two expected results

* backup

* response to review

* tests pass

* tests pass

---------

Co-authored-by: = <=>
2025-04-15 14:00:32 +02:00
Paolo Tosco
71c4103475 Suppress large amounts of 'BOOST_NO_CXX98_FUNCTION_BASE macro redefined' warnings in clang/emcc builds (#7747) 2025-03-29 20:04:58 +01:00
Ricardo Rodriguez
db0df54347 Fix some minor issues reported by ubsan and the compiler (#8015)
* initialize chiralityPossible

* fix build warning

* Fix integer overflow

* fix downcasting MarvinMolBase to MarvinMol

* Fix buildwarning

* increase PairList container to 64 bit

* fix testDict

* Update Code/RDGeneral/testDict.cpp

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>

* Update Code/GraphMol/CIPLabeler/rules/Pairlist.h

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>

* Update Code/GraphMol/CIPLabeler/rules/Pairlist.h

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>

* Fix catch_tests.cpp

---------

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2024-11-20 09:09:22 +01:00
tadhurst-cdd
57a9d2928f Fix incorrect CIP values for some aromatic atropisomers (#7957) 2024-11-09 10:32:46 +01:00
Ricardo Rodriguez
ccfb1fa688 ... and more mem errors fixed (#7924)
* fixleak in CIP labels catch test

* fix leak in Murtagh clustering

* do not leak writers in streambuf

* fix leaks in fingerprintgeneratorwrapper

* remove 'minor leak' comments
2024-10-25 07:01:34 +02:00
tadhurst-cdd
0de215a1f8 Fix canonicalization of stereogroups (#7041)
* atropisomer handling added

* fixed non-used variables,  linking directives

* BOOST LIB start/stop fixes, linking fix

* Fixes for RDKIT CI errors

* minimalLib fix

* changed vector<enum> for java builds

* check for extra chars in CIP labeling

* removed wrong deprecated message

* fix ostrstream output error?

* restored _ChiralAtomRank to lowercase first letter

* changes for merged master

* Fixed catch label for new Catch package

* update expected psql results

* get swig wrappers building

* restore MolFileStereochem to FileParsers

* fix java wrapper for reapplyMolBlockWedging

* test changes

* some suggestions

* move a couple functions out of Bond

* Merge branch 'master' into pr/atropisomers2

* merged master

* Renamed setStereoanyFromSquiggleBond

* atropisomers in cdxml, rationalize atrop wedging, stereoGroups in drawMol

* Merge branch 'master' into pr/specialQueries

* changes from previous PR

* Iclude false chiral

* rigorous enhnced stereo canoncalization

* Added more tests and clenup

* removed commented out code

* corrected init of SmilesWriteParams

* added MolFileStereoChem.h to the header files

* Renamed Rxn parser to MrvBlockToChemicalReaction

* To make catch2 work, and match the checksum

* Fixed Structchecker errors

* fix CI for DetermineBonds catch test

* error in catch_test for CI

* Allow custom  smileWriteParams  in GetMolLayers

* misnamed entry point

* ReactionFromMrvString change name

* remove adding writeParams to GetMolLayers

* make rigorous enhanced stereo the default, and fix tests

* only one abs group no longer needs Rigorous Enhanced treatment

* changed string_view to string in catch test

* Canonicalize Enhnaced Stereo only resturne unique smiles

* Now allows or and and groups together

* internal routines inside detail scope

* fix test error

* changed string back to string_view and fixed a CHECK

* Fixes for PR review tests

* Fix RDKit_Book.rst failure on build test

* fix xqm sql test

* updated expected files for cxsmiles_test

* Fixed removal of atom attrs

* Fixed tests after merge of master

* More efficient version of Stereo Groups Canonicalization

* Fixes for ctests

* removed debug code

* readded cipLabel test

* fix generalizedSubstruct/catch_tests.cpp error

* hueristics to improve speed

* Rationaized control of abs groups

* removed unused routine

* added rigorous stereo group treatment to test

* some suggested changes

* Changes per PR review and removed some changes to smiles

* Fixed CI errors

* changes per PR review

* more PR review vhanges and cleanup

* Fixed PSql PKL change

* changes as per PR review

* Restored error type for bad mols for canonicalizeStereoGroups and added a test

* Merge master and fix test in MolDraw2D

* Fix for randomize test error and other PR review comments

* Removed unsued variable to fix mac CI

* do not force aromatization in canonicalizeStereoGroups

* changes as per PR review

---------

Co-authored-by: greg landrum <greg.landrum@gmail.com>
2024-10-11 17:09:18 +02:00
Dan Nealschneider
c1de3aefb5 Another performance improvement in CIP labelling (#7854)
For example, PDB ID 2ZP8 goes from 133s to 53s on my M3 laptop.
Boost graph edges (our bonds) cannot be accessed by index, they
must be searched for - linearly. This protein only has 27,000
bonds (after addition of explicit hydrogens), but that's still
a lot.
2024-10-06 05:25:14 +02:00
Ricardo Rodriguez
ea7f51f3b1 Fix some mem errors in 2024.09.1 (#7867) 2024-10-03 16:05:43 +02:00
Dan Nealschneider
a4fe959523 Simple speed up for CIPLabeler (#7826)
* Speed up CIPLabeler

std::list is almost never the data structure that you want. vector
uses fewer allocations, and the data has better locality. This
speeds up the CIPLabeler on some protein examples by about 2x.

* Greg's recommendation to use std::vector

It's just as fast. For my example, this made queue sizes of like
1,186,390. Reserving ahead doesn't affect performance.
2024-10-01 04:11:03 +02:00
Greg Landrum
da6cd73168 Run clang-format across everything (#7849)
* run clang-format-18 across Code/*.cpp and Code/*.h

* run clang-format-18 across External
2024-09-26 13:39:02 +02:00
Greg Landrum
138bdc8d58 Fix some missing headers when doing "make install" (#7667)
* remove unnecessary include

* add a couple of missing include files to rdkit_headers

* fix header destination
2024-07-25 11:07:03 -04:00
tadhurst-cdd
4ce1a5640a Ring stereo atropisomers (#7486)
* changed string_view to string in catch test

* fix for ring-stereo atropisomers

* removed debug cout

* removed change to build in debug mode

* change catch arg to a rerference

* changes as per PR review
2024-06-13 08:35:42 +02:00
tadhurst-cdd
a2b149a806 No coords atropisomers - fix smiles output of atrop wedges after reordering (#7418)
* removed string_view in favor of string for catch test

* add parsing and generation of atropisomers when coords not present

* changed string_view to string in catch test

* more docs

* reformulation of the docs

* make an error message a little bit more useful

* small optimization
clang-format

* add `BondWedgingParameters` to new function

* changes for CIP test errors

* Updated internal doc to match what it does

* changes per PR review

* removed cout statements in tests

---------

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2024-05-07 17:06:33 +02:00
Greg Landrum
6bede685e0 Support atropisomers in the conformer generator (#7098)
* change a define to a "using"

* basic atropiosomer support for confgen

currently only supports bonds with two exo substituents on each end

* support systems with two exo bonds

* cleanup

* add test for #7109

Fixes #7109

* clean the tests up a bit
2024-02-06 15:48:25 +01:00
tadhurst-cdd
73b4da2a44 New tests for specical query atoms and atropisomers (#7010)
* New tests for speical query atoms and atropisomers

* fixed error, and used unique_ptrs

* Removed test that makes GraphMol depend on GenericGroups

* More to remove GraphMol dependency on GenericGroups
2024-01-11 05:54:14 +01:00
tadhurst-cdd
d5d4d194ec atropisomer handling added (#6903)
* atropisomer handling added

* fixed non-used variables,  linking directives

* BOOST LIB start/stop fixes, linking fix

* Fixes for RDKIT CI errors

* minimalLib fix

* changed vector<enum> for java builds

* check for extra chars in CIP labeling

* removed wrong deprecated message

* fix ostrstream output error?

* restored _ChiralAtomRank to lowercase first letter

* changes for merged master

* Fixed catch label for new Catch package

* update expected psql results

* get swig wrappers building

* restore MolFileStereochem to FileParsers

* fix java wrapper for reapplyMolBlockWedging

* some suggestions

* move a couple functions out of Bond

* Merge branch 'master' into pr/atropisomers2

* merged master

* Renamed setStereoanyFromSquiggleBond

* atropisomers in cdxml, rationalize atrop wedging, stereoGroups in drawMol

* fix for CI build

* attempt to fix java build in CI

* attempt to fix java build in CI #2

* New routine to remove non-explicit  3D-geneated chirality

* changed to use pair for atrop atoms and related bonds

* Changes as per PR reviews

* PR review respnses

* PR review reponse - more

* Fix merge from master

* fixing java ci after merge

* Updated the help doc for atripisomers

* update the atropisomer docs

* improve the images

* add the source CXSMILES

---------

Co-authored-by: greg landrum <greg.landrum@gmail.com>
2023-12-22 04:58:18 +01:00
Greg Landrum
2957ab4576 switch to catch2 v3 (#6898)
* switch to catch2 v3
Fixes #6894

* fix a couple of problems noticed in the CI builds

* more warning cleanup

* changes in response to review
2023-11-15 06:45:42 +01:00
Ric
880a8e5725 Reformat Python code for 2023.03 release (#6294)
* run yapf

* run isort

---------

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2023-04-28 06:53:56 +02:00
Ric
58d135a874 Reformat C/C++ code ahead of 2023.03 release (#6295)
* format files

* format template files too
2023-04-28 04:42:35 +02:00
tadhurst-cdd
309ea55d16 Add a timeout protection for CIP calculations (#5772)
* Add a timeout protection for CIP calculations

Authored-by: Tad Hurst <tad.hurst@collaborativedrug.com>

* Update Code/GraphMol/CIPLabeler/CIPLabeler.h

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>

* Update Code/GraphMol/CIPLabeler/CIPLabeler.cpp

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>

* Update Code/GraphMol/CIPLabeler/catch_tests.cpp

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>

* Update Code/GraphMol/CIPLabeler/catch_tests.cpp

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>

* Update Code/GraphMol/CIPLabeler/CIPLabeler.cpp

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>

* changes as per PR review by Greg Landrum

* Update Code/GraphMol/CIPLabeler/CIPLabeler.h

Co-authored-by: Ric <ricrogz@users.noreply.github.com>

* Added python wrapper for CIP interation error message

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
Co-authored-by: Ric <ricrogz@users.noreply.github.com>
2022-12-09 05:28:38 +01:00
Greg Landrum
594c58f86c make the catch tests build faster (#5284)
* reorg the catch tests
the goal here is to make the builds faster

* make that easier
2022-05-17 04:39:33 +02:00
Ric
c02e35d4e7 Fixes #5142 (#5164)
* add a test

* update more atomic num == 0 rules
2022-04-07 16:51:48 +02:00
Eisuke Kawashima
ba6d8e0d3b clang-tidy: readability-simplify-boolean-expr (#4639) 2022-03-17 13:50:50 +01:00
Eisuke Kawashima
27f711a658 Run clang-tidy (readability-braces-around-statements) (#4977)
https://github.com/rdkit/rdkit/pull/3024#discussion_r526549843
2022-03-10 08:00:10 +01:00
Greg Landrum
9626726b85 Fixes #4996 and #4998 (#5001)
* Fixes #4996

also switches to using the GraphMol version of catch_main.cpp so builds are faster

* Fixes #4998

we should probably discuss this one

* compare with previous results
2022-02-12 04:18:17 +01:00
Eisuke Kawashima
11532089de Run clang-format against cpp (#4358) 2021-10-20 04:25:27 +02:00
Eisuke Kawashima
78aac3c1bc Run clang-format against header files (#4143) 2021-06-08 07:57:51 +02:00
Ric
c9199cf1da Address #2753 (#3750)
* add test export heder to gitignore

* define export macros in separate file

* install new header

* patch GA with the new macros

* fix struct declarations

* fix conformerparser exports

* fix MolSGroupParsing ParseV3000Array export

* fix java wrappers

* export exceptions

* remove duplicated exports

* Build RDGeneral exceptions into lib

* export queries, only for *nix

* fix RingDecomposerLib header manipulation

* fix CIP labeler test issues
2021-02-15 14:29:04 +01:00
Eisuke Kawashima
ee31ec96be Replace − (U+2212, MINUS SIGN) with -- (#3777)
Fix #3738
2021-02-04 10:03:21 +01:00
Greg Landrum
19bdd21de1 Updated code for chirality perception (#3324)
* add new test (it fails, of course)

* isAtomPotentialTetrahedralCenter() there and tested
tests cases for molecular stereo written (but failing, of course)
create new_chirality.cpp, we will probably want to change this at some point
new StereoInfo structure

* more infrastructure
- isBondPotentialStereoBond()
- two getStereoInfo() functions
- associated unit tests

* backup

* oops

* backup

* switch to always using four atoms for bonds

* backup

* add new test (it fails, of course)

* isAtomPotentialTetrahedralCenter() there and tested
tests cases for molecular stereo written (but failing, of course)
create new_chirality.cpp, we will probably want to change this at some point
new StereoInfo structure

* more infrastructure
- isBondPotentialStereoBond()
- two getStereoInfo() functions
- associated unit tests

* backup

* oops

* backup

* switch to always using four atoms for bonds

* backup

* this now actually works

* doc update

* add a test to demo that ring stereo is not working

* more testing

* add a fun CIP test

* add review note

* debugging

* remove extraneous debugging
turn off tests for ring-double bond stereo

* disable the ring-stereo fix... this breaks a few tests, but we will recover

* works, needs cleanup, chirality code needs re-testing

* nothing works

* Fixes #3322

* Python and C++ tests now pass

* clang-format

* first pass at python wrappers

* improve doctest

* basic optimization...
stop with the copying

* rename

* all tests passing again

* optimization

* fix the sort in the tests

* looks like this might fix the windows-dll build problems

* update tests

* the fun never ends

* comment cleanup

* handle deliberately unspecified atoms/bonds

* add cleanIt option

* add flagPossible

* add option to use the new code to the SMILES parser

* additional testing

* additional testing

* a bit of additional testing never hurts

* changes in response to review

* fixes a bug with potential parastereo not being cleared

other changes in response to review

* update docs
2020-09-02 15:00:29 +02:00
Greg Landrum
c0a62388a2 switch to using target_compile_definitions instead of add_definitions (#3350)
* switch to using target_compile_definitions instead of add_definitions

* missed one
2020-08-21 04:49:07 +02:00
Ric
d54e77e375 Add new CIP labelling algorithm (#3234)
* add port of centres

* Several changes:
    - Added a test based on RDKit issue 2984
        (default RDKit fails it, this gets it right)
    - Use bond directions for bond stereo (label is no longer required)
    - Fix bugs in rules 4b and 5new
    - Fix some mem errors
    - clang-formatted
    - some other minor cleanups

* Several changes and some improvements:
    - Added LGPL license, as well as a mention in the doc.
    - Fix/update/add some comments
    - Fix typo/bug in Mancude calculation
    - Fix bug in rules 4b, 5New
    - Fix Sp2 Bond dir reference
    - Re clang-format
    - other minor changes suggested by Dan

* Another bunch of changes:
  - require integer-order bonds; kekulize when required
  - fix fraction comparison
  - rename sq Cis/Trans e/z
  - replace queues with vectors
  - update copyright notices
  - revert LGPL changes
  - fix Asymmetric typo

* move to separate lib/mod, add python validation test

* Moving away from the original implementation:
    - Rename to CIPLabeler
    - Remove the abstraction layer
    - Remove some stats stuff
    - Push some CIPMol functions down to Node
    - Use RDKit's isotope info

* Another bundle of changes. The most relevant ones:
    - fix parity translation
    - use cis trans as bond reference -- breaks #2984 test
    - kill a lot of unused code
    - use lists for queues
    - store nodes and edges in digraph
    - add prefixes to class data member names
    - update changeRoot() test
    - use fastFindRings() for mancude rings
    - update docs
    - add references to the scientific paper
    - Document the Mancude functions
    - Fix Mancude atom types and their comments
    - remove mol data member from SequenceRule
    - replace Fraction with boost::rational
    - update comments, docstrings and the doc

* fix building the test

* Changes here include:
    - adding bitset overload for the labeling function
    - python wrap of the overload
    - handling trigonal pyramids with implicit H
    - setting bond labels sets stereo atoms, cis/trans
    - nix LEFT/RIGHT/TOGETHER/OPPOSITE constants
    - don't use GLOB in cmake
    - a decent amount of refactoring

* Minor edits to new_CIP_labeling (#6)

* Some changes for clarity

Added some documentation and changed some variable names to match
my understanding. Also a ran clang-tidy to ensure that all blocks
were brace-enclosed.

* Return a reference instead of a copy for performance

This is called many times and showed up after some light
profiling. This change bumped throughput by about 20%

* move out of Graphmol

* move .hpp headers to .h

* update documentation; add label set of atoms test

* Address comments:
    - Added references to centres to CIPLabeler.h and Python Wrap.
    - Update validation test to skip sanitization.
    - Document mancude fractional atomic number calculation.
    - Use unittest assertions in python test.
    - Update mancude docstrings to 'resonance' instad of 'tautomers'.
    - Rename prioritise() to prioritize().
    - Add postcondition to check carriers size in Tetrahedral.cpp.
    - Use getNeighbors() in Tetrahedral.cpp.
    - Move findStereoAtoms to Chirality namespace.
    - Move code back into GraphMol.
    - Fix typos and reformat doc.

* More comments:
    - Mention why we use boost's unordered map rather than the std one.
    - Fix include in Python wrapper.

* Addressed second batch of comments:
    - fix the bug in rule 4b
    - fix docstring for rule 2
    - move atomic mass calculation from rule 2 to node
    - addressed some build warnings
    - simplify sp2bond::label(comp)
    - add start/end atoms to Sp2Bond constructor
    - update system/local includes

Co-authored-by: Dan N <dan.nealschneider@schrodinger.com>
2020-07-07 20:34:33 +02:00