143 Commits

Author SHA1 Message Date
Gareth Jones
8e9adcd467 Adds some features to the C# SWIG wrappers (#9274) 2026-05-14 09:07:56 +02:00
Yakov Pechersky
c6cabf4153 Speed-up tautomer canonicalization, no API changes (#9134)
* Speed up tautomer canonicalization by deferring on SSSR calc

* Lazy kekulization for tautomer enumeration

Defer kekulization of tautomers until they are actually needed for
transform matching. This avoids creating kekulized copies for:
1. The initial tautomer (until first iteration)
2. New tautomers that may never be processed (if enumeration ends early)

The Tautomer class now supports lazy initialization of the kekulized
form via getKekulized() method.

Performance improvement: ~7% additional speedup (total ~22-24% from baseline)

* Use count-only substructure matching in tautomer scoring

* Add SubstructMatchCount regression test

* MolStandardize: reduce enumerate overhead

* MolStandardize: avoid per-tautomer ring recomputation

* Atom: cache PeriodicTable pointer in valence calcs

* Atom: reuse PeriodicTable in getEffectiveAtomicNum

* PeriodicTable: add atomic fast path for getTable

* GraphMol: reduce ROMol copy reallocations

* MolStandardize: use quickCopy for per-match product copies

Use RWMol(*kmol, true) in tautomer enumeration to avoid copying properties/bookmarks/conformers for each candidate. This reduces deep-copy overhead without changing chemistry.

* MolStandardize: pre-filter scoring patterns by element/connectivity

For tautomer scoring, pre-compute which SubstructTerms are relevant for
a given input molecule. Since tautomerization only moves H atoms and
changes bond orders (never creates/destroys heavy-atom bonds), patterns
requiring missing elements or connectivity can be skipped for all
tautomers of that molecule.

Two-stage filtering:
1. Element check: skip patterns requiring atoms not in the molecule
2. Connectivity check: skip patterns whose bond-order-agnostic structure
   doesn't match the input molecule's connectivity

This reduces the number of VF2 substructure calls per tautomer from 12
to typically 3-5, depending on the molecule's composition.

* MolStandardize: preserve molecule properties for canonical tautomer

Copy molecule properties from the original input to the canonical tautomer
result. Since quickCopy during enumeration skips d_props to avoid overhead,
extended SMILES data like link nodes (LN) was lost. This restores them
on the final result.

* TautomerQuery: preserve molecule properties (e.g. link nodes) in tautomers

TautomerQuery::fromMol() uses TautomerEnumerator::enumerate() which uses
quickCopy for performance. This doesn't copy molecule properties like
_molLinkNodes. Without this fix, XQMol output would lose link node
extensions in the SMILES.

Copy properties from the original query molecule to all enumerated
tautomers before constructing the TautomerQuery. This preserves extended
SMILES data without impacting enumeration performance.

* MolStandardize: use parallel iteration and cache bond lookups

Replace O(n) getAtomWithIdx/getBondWithIdx calls with parallel iteration
over atom/bond ranges in canonicalizeInPlace and enumerate. Cache bond
lookups in setTautomerStereoAndIsoHs to avoid repeated O(n) searches.

* perf: add specialized matchers for simple tautomer scoring patterns

Replace VF2 graph matching with O(n) loops for 6 simple patterns:
- countDoubleOrAromaticBonds: C=O, N=O, P=O patterns
- countMethyls: [CX4H3] methyl groups
- countCarbonDoubleHetero: [C]=[/home/dcvuser/rdkit;Code/GraphMol/MolStandardize/Tautomer.h] aliphatic C=hetero
- countAromaticCarbonExocyclicN: [c]=aromatic C=exocyclic N
Complex patterns (benzoquinone, oxim, guanidine, aci-nitro) still use VF2.
Combined with the pre-filtering optimization, this achieves ~3.7x speedup
(~2500ms vs ~9300ms original) for tautomer canonicalization.

* Fix tautomer canonicalize dropping conformers from quickCopy

quickCopy (RWMol(*mol, true)) skips conformers, so tautomer
enumeration products lose 2D/3D coordinates. This causes InChI
generation to omit the /b (double bond E/Z stereo) layer, since
E/Z is derived from atomic coordinates.

Fix: copy conformers from the original molecule onto the canonical
tautomer after pickCanonical in TautomerEnumerator::canonicalize().

Tests: SMILES-based E/Z check in testTautomer.cpp, molblock-based
conformer preservation check in catch_tests.cpp.

* add test on canonicalize losing stereo

* add regression test for exocyclic C=C tautomer canonicalization

The getTautomerStateKey() pre-filter (commit 2595ef748) can falsely
deduplicate distinct tautomers when their atom-index-ordered state
patterns happen to match, leading canonicalize() to pick the wrong
canonical form for molecules with STEREOTRANS-pinned exocyclic C=C
bonds after RemoveHs.

Test verifies that O=C(CC1=CC2=CC=COC2)NC1=O canonicalizes to the
exocyclic form O=C1CC(=CC2=CC=COC2)C(=O)N1, not the endocyclic form
O=C1C=C(C=C2CC=COC2)C(=O)N1.

Currently expected to FAIL until the state key dedup bug is fixed.

* MolStandardize: expand tautomer connectivity SMARTS

* MolStandardize: scope tautomer pattern enum

* MolStandardize: trim tautomer pattern enum

* MolStandardize: use symmetric ring scoring
2026-03-31 06:42:40 +02:00
Ricardo Rodriguez
680520e0ad Follow up to PR #8968 (#9168)
* implement consistency check

* add more consistency checks

* check direction consistency accross double bond

* clean up directions for non-stereo bonds

* fix counts for second from atom dirs; add check

* handle inconconsistent bond dirs

* add more tests, pubchem cases, and update existing

* drop statics

* fix typo

* make sourceBond arg const

* fix consistency check
2026-03-20 04:28:17 +01:00
Yakov Pechersky
0986d22c58 Deterministic kekulize, independent of atom and bond order (#9125)
* Make kekulization deterministic

* Add tautomer order-independence regression (python)

* Adjust tautomer tests for deterministic kekulization

* Update graphmol wedged-bond kekulization checks

* SmilesParse: update aromatic bond index expectations

* SmilesParse: refresh cxsmilesTest expected files

* Depictor: update testDepictor expected MolBlocks

* Depictor: update depictorCatch expectations

* Depictor Wrap: update expected MolBlock for pyDepictor

* MarvinParse: update testMrvToMol expected outputs

* FileParsers: refresh testAtropisomers expected outputs

* FileParsers: update tests for deterministic kekulization

* MolDraw2D: refresh brittle bond assertions

* RascalMCES: update expected cluster size

* MinimalLib: make cffi wedging check order-independent

* documentation fix

* MinimalLib: update Kekulé bond table in aligned-coords test

* Hoist duplicated lambdas to TEST_CASE scope

* Remove unused originalWedges variable

* Remove redundant bounds check; clarify wedge-end preference

* Pre-sort allAtms by wedge-end + rank

* Use mol.atomNeighbors() for neighbor iteration

* Check inAllAtms before linear-scanning done

* Drop redundant optsV/wedgedOptsV sorts

* Remove unused Canon.h include

* Add canonical parameter to Kekulize; skip ranking during sanitization

* Test canonical re-kekulization preserves stereo across atom orderings

* MinimalLib: update Kekulé bond orders in invertedWedges

* Change Kekulize canonical default to false, expose in Python wrappers

* keep rank order, push_back

* Revert "RascalMCES: update expected cluster size"

This reverts commit a81bb39495.

* docstring change

* expose new flag to python wrapper

* document changes in ReleaseNotes.md

* revert minimallib test changes again

* canonical = true defaults

* Revert "revert minimallib test changes again"

This reverts commit 039e1d84da.

* Reapply "RascalMCES: update expected cluster size"

This reverts commit 7b83a7a3e8.

---------

Co-authored-by: greg landrum <greg.landrum@gmail.com>
2026-03-19 08:43:13 +01:00
Yakov Pechersky
67b73acba4 when shifting double bonds in tautomerization, set double bond stereo to STEREOANY (#9119)
* when shifting double bonds in tautomerization, set double bond stereo to STEREOANY

fixes #9102

notably, do this only to non-ring bonds
move tests over to assert this
avoid index-based bond lookup in test assertions
since bond indexing can move in tautomers

* inchi unittest check

* fast rings
2026-02-19 19:29:17 +01:00
Niels Maeder
db93262a3e Add safeSetattr to more params / options objects (#8842)
* add safeSetattr to varios params objects

* added safeSetattr to further params / options
2025-10-08 16:15:20 +02:00
Ricardo Rodriguez
7b7a8a4e17 Refactor iostreams includes (#8846)
* refactor iostreams includes

* restore ostream to MonomerInfo.cpp
2025-10-08 16:08:01 +02:00
Ricardo Rodriguez
a4b63d7df5 Minor refactor of the python wrappers (#8847)
* refactor python wrappers

* fix FilterHierarchyMatcher converted already registered warning
2025-10-05 09:42:31 +02:00
Greg Landrum
86141183c1 Moving towards getting all tests to pass when using the new stereo code (#8409)
* Fixes #8379

* check in some working tests

* test passes

* test passes

* test passes

* test passes

* test passes

* ensure that the invariants flush the streams on failure

* tests pass

* test passes

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* tests pass

* Fixes #8391

* tests pass

* fix a test with legacy
not clear why this was not causing problems before

* make a test work

* Fixes #8396

* gcc builds work

* fingerprint tests pass

* mention backwards incompatible change

* fix a problem with FindMolChiralCenters

* more testing details

* enable the test status output

* Fixes #8432

fix a bug in double-bond stereo handling for template matching

* all depictor tests pass

* use the new-stereo chiral ranks in the depiction code

* always assign new-stereo chiral ranks

* make _ChiralAtomRank a computed property
This is analogous to _CIPRank

* tweak to the way the atom ordering is computed for 2D coordinate generation

* update two expected results

* backup

* response to review

* tests pass

* tests pass

---------

Co-authored-by: = <=>
2025-04-15 14:00:32 +02:00
Greg Landrum
8eb02b8bed switch to C++20 (#8039)
* c++20 builds working

* get MolStandardize building with clang19

* get FMCS building with clang-19

* set cxx version to c++20

* remove a few more compiler warnings

* bump min boost version, CI cleanup

* boost 1.81 is not available from conda-forge

* remove unused constants

* bump linux version for CI

* remove another unused variable

* fix (hopefully) cartridge CI builds

* simplify cartridge environment

* try postgresql14 in CI

* start the postgresql service

* change the columns used in the pandastools nbtest

* remove missed merge conflict artifact

* get github4823 test to pass with numpy 2.2

* remove a compiler warning/error with g++13
2025-04-09 11:57:17 +02:00
Greg Landrum
32608ae0b4 Atoms bonded to metal atoms should always have their H counts explicit in SMILES (#8318)
* refactor the code to determine whether or not an atom is in brackets

* move the definition of isMetal to QueryOps

* atoms bound to metals in SMILES should always be in square brackets

Implementation and some test updates

needs confirmation that all of tests run

* basic tests pass

* java tests pass

* update js tests

* doc updates

* Update Code/GraphMol/catch_graphmol.cpp

Co-authored-by: Ricardo Rodriguez <ricrogz@users.noreply.github.com>

* Update Code/GraphMol/SmilesParse/test.cpp

Co-authored-by: Ricardo Rodriguez <ricrogz@users.noreply.github.com>

* finish fixing tests

* bump yaehmop version to allow compilation to work

---------

Co-authored-by: Ricardo Rodriguez <ricrogz@users.noreply.github.com>
2025-03-29 07:26:03 +01:00
Greg Landrum
fa048eacc5 Replace GetImplicitValence() and GetExplicitValence() with GetValence() (#7926) 2025-01-28 21:09:03 +01:00
Hussein Faara
8411f4535e remove no-op macros and dead code (pt 2) (#8035)
* remove no-op macros and dead code (pt 2)

* test failures due to whitespace changes?

* actually run the testFeatures tests

---------

Co-authored-by: greg landrum <greg.landrum@gmail.com>
2025-01-11 14:10:00 +01:00
Brian Kelley
9495dd5413 Expose tautomer scoring functions to python (#7994)
* Expose tautomer scoring functions to python

* Add more tests/documentation

* Rename getDefaultTautomerSubstructs to getDefaultTautomerScoreSubstructs

* Remove ROMOL_SPTR

* Add full custom scoring function example

* Run clang format

* Use proper BOOST_PYTHON_FUNCTION_OVERLOADS

* Use default copy constructor
2024-11-15 05:37:35 +01:00
Yakov Pechersky
ad4ee83aec Fixes #7044 (#7137)
* oxime tests

* Support allenes in canonicalizing double bonds

* alternate solution to the problem

* expand comment

* reactivate conjugated nitro test

* Fix conjugated nitro tests, have a bondstereo test

* Empty commit to re-proc tests

---------

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2024-10-18 05:32:20 +02:00
Greg Landrum
7d2598267a Fixes #7689 (#7851) 2024-09-26 19:22:26 +02:00
Greg Landrum
da6cd73168 Run clang-format across everything (#7849)
* run clang-format-18 across Code/*.cpp and Code/*.h

* run clang-format-18 across External
2024-09-26 13:39:02 +02:00
Riccardo Vianello
add896f578 refactor MolStandardize::PipelineOptions::allowEmptyMolecules (#7699) 2024-08-13 15:33:13 +02:00
Greg Landrum
b1663052b8 Remove Descriptors as a dependency of many other RDKit libraries (#7700)
* move mol weight and formula calculators to MolOps and refactor them a bit.
The descriptors are still there and should remain.

* remove other unnecessary dependencies on Descriptors

* Update adapter.cpp

Co-authored-by: Paolo Tosco <paolo.tosco.mail@gmail.com>

---------

Co-authored-by: Paolo Tosco <paolo.tosco.mail@gmail.com>
2024-08-13 13:22:43 +02:00
Riccardo Vianello
3f7caf0147 Extend RDKit::MolStandardize with a validation and standardization Pipeline (#7582)
* Extend RDKit::MolStandardize with a validation and standardization Pipeline

* suggested changes

* apply clang-format

* apply yapf

* MolStandardize::FeaturesValidation optionally disallow dative bonds

* add allowDativeBondType to MolStandardize::PipelineOptions

* apply clang-format

* make the API of other validation classes more consistent with MolStandardize::FeaturesValidation

* apply clang-format

* PipelineStage to enum class
remove virtual functions from Pipeline class
be explicit about enums

* light refactoring to avoid what I think is an unnecessary call to `parse`

* a bit of modernization

* make the pipeline configurable

* make parse and serialize configurable too

* switch to storing pipeline stages using uints

* add a simple test for providing a pipeline

* update pointer alignment for clang-format

* test modifying the parser and serializer

* update swig requirement

* changes in response to review

* changes in response to review

* rename PipelineResult's *MolBlock members to *MolData

* upgrade swig to 4.2 in the CI environments

* add a few missing export directives

---------

Co-authored-by: greg landrum <greg.landrum@gmail.com>
2024-07-30 17:09:16 +02:00
Greg Landrum
9d26fc229d Fixes #7642 (#7643) 2024-07-25 04:48:00 +02:00
Riccardo Vianello
99288ff988 fix one case of undesired 1-3 charge recombination (#7561) 2024-06-25 15:44:36 +02:00
Greg Landrum
724716b2c6 Switch to isoelectronic valence model (#7491)
* change valence model to use isolobal analogy

Remove support for five-coordinate C+ and, by analogy, five-coordinate N+2

Removes support for charge states that take atoms past the end of the periodic table
  i.e. [Lv-4] is no longer supported

* update the tests for that

* remove valence state of 6 for Al

* fix representation of phosphate in the mol2 parser

this is a correction of what was done during #5973

* cleanup the exceptions for P, S, As, and Se

* drop valence states:

Si 6, P 7, As 7

* a couple of additional changes from #7397

* update java tests

* fix an inconsistency: Rb now supports valence -1

* documentation

* - replace operator[] with at() for bounds check
- extract some code into a function to avoid duplication
- use TAB as separator throughout in the periodic table data for consistency

* removing the .at() usage

We know that these vectors aren't empty, so there's no need for the bounds check.

---------

Co-authored-by: ptosco <paolo.tosco@novartis.com>
2024-06-25 15:38:49 +02:00
Riccardo Vianello
256515f40d Optionally limit the MolStandardize::Uncharger to only alter the protonation state (#7458)
* refactoring of the Uncharger class

* optionally limit the Uncharger to only change the protonation state

* changes in response to review

* refactor the handling of positively charged atoms

* changes in response to review

* changes in response to review

* Revert "changes in response to review"

This reverts commit d42e1286d5.

* changes in response to review
2024-06-13 08:37:24 +02:00
Riccardo Vianello
187d2ca007 Prevent the normalization of conjugated cations from applying to the oxime oxygen (#7403) 2024-04-30 17:06:07 +02:00
Riccardo Vianello
24aba6904e Fix the Uncharger 'force' option w/ non-neutralizable negatively charged sites (#7382) 2024-04-24 09:19:27 +02:00
Riccardo Vianello
0979a9c391 fix 1,3- 1,5- conjugated cation normalizer transforms (#7330)
* fix 1,3- 1,5- conjugated cation normalizer transforms

* small refactoring

* add some test cases related to the normalization of conjugated cations
2024-04-09 06:00:32 +02:00
Riccardo Vianello
35ac4accd9 restrict the application of 1,3- 1,5- conjugated cation normalization (#7287) 2024-04-01 06:38:38 +02:00
Riccardo Vianello
84f8186954 Fix Uncharger applying to already neutralized perhalic groups (#7211) 2024-03-07 07:57:24 +01:00
Riccardo Vianello
06d2e2e89f Add a 'force' option to MolStandardizer::Uncharger (#7088)
* Add a 'force' option to MolStandardize::Uncharger

* update comment

* add more test cases exercising MolStandardize::Uncharger

* fix the neutralization of surplus negative charges

* changes in response to review

* Add a test case for MolStandardize::Uncharger

* refactor the neutralization of negative charges in MolStandardize::Uncharger
2024-02-06 15:34:28 +01:00
Riccardo Vianello
b3cb9cab21 refactoring of MolStandardize validation module (#7085)
* refactoring of MolStandardize validation module

* redefine MolStandardize::ValidationErrorInfo as an alias for std::string

* changes in response to review

* describe the backward incompatible changes to MolStandardize in the release notes
2024-01-24 12:58:25 +01:00
tadhurst-cdd
d5d4d194ec atropisomer handling added (#6903)
* atropisomer handling added

* fixed non-used variables,  linking directives

* BOOST LIB start/stop fixes, linking fix

* Fixes for RDKIT CI errors

* minimalLib fix

* changed vector<enum> for java builds

* check for extra chars in CIP labeling

* removed wrong deprecated message

* fix ostrstream output error?

* restored _ChiralAtomRank to lowercase first letter

* changes for merged master

* Fixed catch label for new Catch package

* update expected psql results

* get swig wrappers building

* restore MolFileStereochem to FileParsers

* fix java wrapper for reapplyMolBlockWedging

* some suggestions

* move a couple functions out of Bond

* Merge branch 'master' into pr/atropisomers2

* merged master

* Renamed setStereoanyFromSquiggleBond

* atropisomers in cdxml, rationalize atrop wedging, stereoGroups in drawMol

* fix for CI build

* attempt to fix java build in CI

* attempt to fix java build in CI #2

* New routine to remove non-explicit  3D-geneated chirality

* changed to use pair for atrop atoms and related bonds

* Changes as per PR reviews

* PR review respnses

* PR review reponse - more

* Fix merge from master

* fixing java ci after merge

* Updated the help doc for atripisomers

* update the atropisomer docs

* improve the images

* add the source CXSMILES

---------

Co-authored-by: greg landrum <greg.landrum@gmail.com>
2023-12-22 04:58:18 +01:00
Greg Landrum
c7c9ad3328 Add in place and multithread support for more of the MolStandardize code (#6970) 2023-12-12 17:21:18 +01:00
Paolo Tosco
2b4202867e Add Python modules to generate stubs and automatically patch docstrings (#6919)
* - added gen_rdkit_stubs Python module to generate rdkit-stubs
- added patch_rdkit_docstrings Python module to patch existing C++ sources to fix docstrings missing self parameter and add named parameters taken from C++ signatures where possible
- added rdkit-stubs/CMakeLists.txt to build rdkit-stubs as part of the RDKit build
- added an option to CMakeLists.txt to enable building rdkit-stubs as part of the RDKit build (defaults to OFF)

* fixed CMakeLists.txt, rdkit-stubs/CMakeLists.txt and a doctest

* - added missing cmp_func parameter
- fixed case with overloads with optional parameters
- do not trim params if expected_param_count == -1
- add dummy parameter names if we could not find any
- keep into account member functions when making up parameter names
- address __init__ and make_constructor __init__ functions
- fix incorrectly assigned staticmethods

* patched sources

* address residual few remarks

---------

Co-authored-by: ptosco <paolo.tosco@novartis.com>
2023-11-30 04:54:18 +01:00
Greg Landrum
15751b3651 Add multi-threaded versions of some MolStandardize operations (#6909)
* initial addition of MT support to MolStandardize

* do the other inplace functions

* add mt ops to python wrappers
including tests

* release the GIL

* remove exploratory code added during dev

* make normalizer thread safe

* refactor some repeated code
2023-11-24 18:36:17 -05:00
Greg Landrum
2957ab4576 switch to catch2 v3 (#6898)
* switch to catch2 v3
Fixes #6894

* fix a couple of problems noticed in the CI builds

* more warning cleanup

* changes in response to review
2023-11-15 06:45:42 +01:00
Greg Landrum
086d2deab4 remove the deprecated python MolStandardize implementation (#6819) 2023-10-22 05:24:07 +02:00
Ric
9737b10143 fix some more leaks (#6684)
* fix some more leaks

* review comments
2023-09-04 06:38:01 +02:00
Ric
2bd6481280 Fix some build warnings (#6618)
* fixes a copy constructor issue:

In copy constructor ‘boost::shared_ptr<T>::shared_ptr(const boost::shared_ptr<T>&) [with T = RDKit::ROMol]’,
    inlined from ‘PyObject* RDKit::RunReactant(ChemicalReaction*, T, unsigned int) [with T = boost::python::api::object]’ at /tmp/rdkit_builder/rdkit/Code/GraphMol/ChemReactions/Wrap/rdChemReactions.cpp:129:14:
/usr/include/boost/smart_ptr/shared_ptr.hpp:416:66: warning: ‘*(const boost::shared_ptr<RDKit::ROMol>*)((char*)&<unnamed> + offsetof(boost::python::extract<boost::shared_ptr<RDKit::ROMol> >,boost::python::extract<boost::shared_ptr<RDKit::ROMol> >::<unnamed>.boost::python::converter::extract_rvalue<boost::shared_ptr<RDKit::ROMol> >::m_data.boost::python::converter::rvalue_from_python_data<boost::shared_ptr<RDKit::ROMol> >::<unnamed>.boost::python::converter::rvalue_from_python_storage<boost::shared_ptr<RDKit::ROMol> >::storage)).boost::shared_ptr<RDKit::ROMol>::px’ may be used uninitialized [-Wmaybe-uninitialized]
  416 |     shared_ptr( shared_ptr const & r ) BOOST_SP_NOEXCEPT : px( r.px ), pn( r.pn )
      |                                                                ~~^~
/tmp/rdkit_builder/rdkit/Code/GraphMol/ChemReactions/Wrap/rdChemReactions.cpp: In function ‘PyObject* RDKit::RunReactant(ChemicalReaction*, T, unsigned int) [with T = boost::python::api::object]’:
/tmp/rdkit_builder/rdkit/Code/GraphMol/ChemReactions/Wrap/rdChemReactions.cpp:129:30: note: ‘<anonymous>’ declared here
  129 |   ROMOL_SPTR react = python::extract<ROMOL_SPTR>(reactant);
      |                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/boost/smart_ptr/shared_ptr.hpp:17:
In copy constructor ‘boost::detail::shared_count::shared_count(const boost::detail::shared_count&)’,
    inlined from ‘boost::shared_ptr<T>::shared_ptr(const boost::shared_ptr<T>&) [with T = RDKit::ROMol]’ at /usr/include/boost/smart_ptr/shared_ptr.hpp:416:72,
    inlined from ‘PyObject* RDKit::RunReactant(ChemicalReaction*, T, unsigned int) [with T = boost::python::api::object]’ at /tmp/rdkit_builder/rdkit/Code/GraphMol/ChemReactions/Wrap/rdChemReactions.cpp:129:14:
/usr/include/boost/smart_ptr/detail/shared_count.hpp:438:67: warning: ‘((const boost::detail::shared_count*)((char*)&<unnamed> + offsetof(boost::python::extract<boost::shared_ptr<RDKit::ROMol> >,boost::python::extract<boost::shared_ptr<RDKit::ROMol> >::<unnamed>.boost::python::converter::extract_rvalue<boost::shared_ptr<RDKit::ROMol> >::m_data.boost::python::converter::rvalue_from_python_data<boost::shared_ptr<RDKit::ROMol> >::<unnamed>.boost::python::converter::rvalue_from_python_storage<boost::shared_ptr<RDKit::ROMol> >::storage)))[1].boost::detail::shared_count::pi_’ may be used uninitialized [-Wmaybe-uninitialized]
  438 |     shared_count(shared_count const & r) BOOST_SP_NOEXCEPT: pi_(r.pi_)
      |                                                                 ~~^~~
/tmp/rdkit_builder/rdkit/Code/GraphMol/ChemReactions/Wrap/rdChemReactions.cpp: In function ‘PyObject* RDKit::RunReactant(ChemicalReaction*, T, unsigned int) [with T = boost::python::api::object]’:
/tmp/rdkit_builder/rdkit/Code/GraphMol/ChemReactions/Wrap/rdChemReactions.cpp:129:30: note: ‘<anonymous>’ declared here
  129 |   ROMOL_SPTR react = python::extract<ROMOL_SPTR>(reactant);
      |                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Fixes a (weird) allocation warning in GCC 12

In file included from /usr/include/c++/12/bits/alloc_traits.h:33,
                 from /usr/include/c++/12/ext/alloc_traits.h:34,
                 from /usr/include/c++/12/bits/basic_string.h:39,
                 from /usr/include/c++/12/string:53,
                 from /usr/include/c++/12/bits/locale_classes.h:40,
                 from /usr/include/c++/12/bits/ios_base.h:41,
                 from /usr/include/c++/12/streambuf:41,
                 from /usr/include/c++/12/bits/streambuf_iterator.h:35,
                 from /usr/include/c++/12/iterator:66,
                 from /usr/include/boost/iterator/iterator_traits.hpp:10,
                 from /usr/include/boost/range/iterator_range_core.hpp:26,
                 from /usr/include/boost/range/iterator_range.hpp:13,
                 from /usr/include/boost/iostreams/traits.hpp:38,
                 from /usr/include/boost/iostreams/detail/call_traits.hpp:15,
                 from /usr/include/boost/iostreams/detail/adapter/device_adapter.hpp:22,
                 from /usr/include/boost/iostreams/tee.hpp:18,
                 from /tmp/rdkit_builder/rdkit/Code/RDGeneral/RDLog.h:17,
                 from /tmp/rdkit_builder/rdkit/Code/GraphMol/ChemTransforms/testChemTransforms.cpp:11:
In function ‘void std::_Construct(_Tp*, _Args&& ...) [with _Tp = pair<int, int>; _Args = {const pair<int, int>&}]’,
    inlined from ‘_ForwardIterator std::__do_uninit_copy(_InputIterator, _InputIterator, _ForwardIterator) [with _InputIterator = const pair<int, int>*; _ForwardIterator = pair<int, int>*]’ at /usr/include/c++/12/bits/stl_uninitialized.h:120:21,
    inlined from ‘static _ForwardIterator std::__uninitialized_copy<_TrivialValueTypes>::__uninit_copy(_InputIterator, _InputIterator, _ForwardIterator) [with _InputIterator = const std::pair<int, int>*; _ForwardIterator = std::pair<int, int>*; bool _TrivialValueTypes = false]’ at /usr/include/c++/12/bits/stl_uninitialized.h:137:32,
    inlined from ‘_ForwardIterator std::uninitialized_copy(_InputIterator, _InputIterator, _ForwardIterator) [with _InputIterator = const pair<int, int>*; _ForwardIterator = pair<int, int>*]’ at /usr/include/c++/12/bits/stl_uninitialized.h:185:15,
    inlined from ‘_ForwardIterator std::__uninitialized_copy_a(_InputIterator, _InputIterator, _ForwardIterator, allocator<_Tp>&) [with _InputIterator = const pair<int, int>*; _ForwardIterator = pair<int, int>*; _Tp = pair<int, int>]’ at /usr/include/c++/12/bits/stl_uninitialized.h:372:37,
    inlined from ‘void std::vector<_Tp, _Alloc>::_M_range_insert(iterator, _ForwardIterator, _ForwardIterator, std::forward_iterator_tag) [with _ForwardIterator = const std::pair<int, int>*; _Tp = std::pair<int, int>; _Alloc = std::allocator<std::pair<int, int> >]’ at /usr/include/c++/12/bits/vector.tcc:808:38,
    inlined from ‘std::vector<_Tp, _Alloc>::iterator std::vector<_Tp, _Alloc>::insert(const_iterator, std::initializer_list<_Tp>) [with _Tp = std::pair<int, int>; _Alloc = std::allocator<std::pair<int, int> >]’ at /usr/include/c++/12/bits/stl_vector.h:1409:17,
    inlined from ‘void testReplaceCoreMatchVectMultipleMappingToCore()’ at /tmp/rdkit_builder/rdkit/Code/GraphMol/ChemTransforms/testChemTransforms.cpp:974:17:
/usr/include/c++/12/bits/stl_construct.h:119:7: warning: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’ writing 56 bytes into a region of size 48 overflows the destination [-Wstringop-overflow=]
  119 |       ::new((void*)__p) _Tp(std::forward<_Args>(__args)...);
      |       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/x86_64-linux-gnu/c++/12/bits/c++allocator.h:33,
                 from /usr/include/c++/12/bits/allocator.h:46,
                 from /usr/include/c++/12/string:41:
In member function ‘_Tp* std::__new_allocator<_Tp>::allocate(size_type, const void*) [with _Tp = std::pair<int, int>]’,
    inlined from ‘static _Tp* std::allocator_traits<std::allocator<_CharT> >::allocate(allocator_type&, size_type) [with _Tp = std::pair<int, int>]’ at /usr/include/c++/12/bits/alloc_traits.h:464:28,
    inlined from ‘std::_Vector_base<_Tp, _Alloc>::pointer std::_Vector_base<_Tp, _Alloc>::_M_allocate(std::size_t) [with _Tp = std::pair<int, int>; _Alloc = std::allocator<std::pair<int, int> >]’ at /usr/include/c++/12/bits/stl_vector.h:378:33,
    inlined from ‘void std::vector<_Tp, _Alloc>::_M_range_insert(iterator, _ForwardIterator, _ForwardIterator, std::forward_iterator_tag) [with _ForwardIterator = const std::pair<int, int>*; _Tp = std::pair<int, int>; _Alloc = std::allocator<std::pair<int, int> >]’ at /usr/include/c++/12/bits/vector.tcc:799:40,
    inlined from ‘std::vector<_Tp, _Alloc>::iterator std::vector<_Tp, _Alloc>::insert(const_iterator, std::initializer_list<_Tp>) [with _Tp = std::pair<int, int>; _Alloc = std::allocator<std::pair<int, int> >]’ at /usr/include/c++/12/bits/stl_vector.h:1409:17,
    inlined from ‘void testReplaceCoreMatchVectMultipleMappingToCore()’ at /tmp/rdkit_builder/rdkit/Code/GraphMol/ChemTransforms/testChemTransforms.cpp:974:17:
/usr/include/c++/12/bits/new_allocator.h:137:55: note: at offset [8, 56] into destination object of size 56 allocated by ‘operator new’
  137 |         return static_cast<_Tp*>(_GLIBCXX_OPERATOR_NEW(__n * sizeof(_Tp)));
      |                                                       ^

* finalForce maybe uninitialized in inlined copy constructor

* failure is being used uninitialized

* MaeWriter::write overrides a method of the base class without being marked 'override'

This one is annoying because it happens in an exported header, so it propagates
to any code using the header!

* otq is never used

* fix implicitly declared assignment operator warning

* variables in catch statements that are never used

* fix type (sign) mismatch warning

* drop duplicate export macro
2023-08-11 06:04:55 +02:00
Greg Landrum
ac54eb3209 Add an in place version of most of the MolStandardize functionality (#6491)
* reionizer and uncharger and normalizer can now operate in place

* add removeUnmatchedAtoms argument to in-place version of runReactant

When set to false atoms which are not explicitly removed by the reaction are preserved

* Fix a case where transforms were incorrectly updating atomic numbers

* add more inplace operations to MolStandardize

* support those in the Python layer

* support inplace for the rest of the python wrappers

* move a few more functions over to the inplace code
2023-07-21 08:44:41 +02:00
Ric
880a8e5725 Reformat Python code for 2023.03 release (#6294)
* run yapf

* run isort

---------

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2023-04-28 06:53:56 +02:00
Ric
58d135a874 Reformat C/C++ code ahead of 2023.03 release (#6295)
* format files

* format template files too
2023-04-28 04:42:35 +02:00
David Cosgrove
aef426b247 Fix oxidation numbers calculation (#6266)
* Skip dative bonds when calculating oxidation numbers.

* Add extra test file.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
2023-04-03 09:55:59 +02:00
David Cosgrove
5fe20d270f Improved handling of organometallics (#6139)
* First stab at disconnecting organometallics the Syngenta way.

* Add support for Li and Na compounds, with tests.

* Correct docstrings.

* Add -1 valence to sodium.

* Add changes from PR 5997.

* Move declaration of disconnectOrganometallics.

* Re-factor tests for disconnectOrganometallics to read V3000 files rather than hard-code in file.

* Add optional options to constructor.

* Python wrapping for disconnectOrganometallics.

* Python tests.

* Correct docString.

* Test using in place overload of disconnect free function.

* options_ should not be a reference.

* Re-work the wrappers.

* Added test MOL files.

* Correct charge on iron atoms.

* Fix tests for charged iron.

* Added some more test complexes.

* Port Marco's oxidation number Python to C++ Descriptor.

* Rename functions.

* Another test.

* Python wrappers.

* Comment in test.

* Throws exception if atom-based function called with non-kekulized parent.

* Fix typo.

* Allow potassium as well.

* Whole molecule calculations for oxidation number only.
Rename prop to OxidationNumber.

* Add OxidationNumber to common_properties.

* Extra comment.

* When deleting atom, update any ENDPTS props on bonds.
Copied in from stale PR ExtraDoc.

* Added hapticBondsToDative with python wrapper.

* Extra test.

* Re-factor endpts parsing.

* Typo.

* Add function for haptic end points inc. Python wrapper.

* Add dative bonds in separate step.

* C++ version of dativeBondstoHaptic.

* Update test mols.

* Python wrapper.

* Fix indents in Python test script.

* Corrected expected test result.

* Only do non-metal to metal conversion of single bond to dative if the explicit valence of the non-metal allows it.

* Fix test broken in a non-material way by previous change.

* Export for DLL.

* Remove redundant function declaration.

* Dave hates Windows.

* Move hapticBondEndpoints to Molops::details.

* And take it out of the Python wrappers.

* Run yapf on reformatted test script.

* Get the DLL builds going.

* addDativeBond -> addHapticBond.

* Batch edits.

* Position arithmetic.

* setQuery.

* Dummy positions for all confs.

* Fix tests for dummy positions for all confs.

* Move tests to catch_organometallics.cpp.

* Modern docString.

* Change member variable names.

* sProp length, bonus batchEdit.

* Add options object to disconnectOrganometallics.

* Tidied license.
Atom precondition.

* GetIntProp.

* Test opts aren't defaults.

* Python wrapper for disconnectOrganometallics with options.

* Minor edit.

* Slightly random attempt to fix Java build.

* Response to review.

* Another stab at the doc strings.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2023-03-18 04:36:38 +01:00
David Cosgrove
42782d31cf Change IUPAC metal->non-metal single bonds to dative (#6038)
* Change atom to metal bonds from single to dative if appropriate.

* Pedantic change whilst I was in the area.

* Reinstate all tests, leave in debugging writes to see failing tests.

* Re-did it.  Failing tests now pass.

* Move any positive charge from the non-metal to the metal.
Fix expected test results.

* Write dative bond to JSON.

* Bump currentRDKitJSONVersion to 11, but allow parser to still read 10.

* Only move 1 unit of charge at a time from non-metal to metal.

* Greg's hack to not do it for O+ and N+ etc.
Explicitly exclude H, He, F, Ne from dative bonds.
Fix tests.

* Update expected PostGres json version to 11.

* suggestions for PR

* Correct comment.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
Co-authored-by: greg landrum <greg.landrum@gmail.com>
2023-02-06 04:37:32 +01:00
Ric
0e7e44c9f4 Fix some minor leaks (#6029)
* fix deserializing leaks ringinfo

* fix leak in FingerprintGenerator

* fix leak in MolStandardize test
2023-01-31 17:30:34 +01:00
David Cosgrove
29c84984e9 Fix disconnection of magnesium acetate in covalent form. (#5998)
* Skip -1 on valence lists for metals.

* Don't give metal a charge that exceeds the maximum valence.

* Re-format file.

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
2023-01-24 11:56:31 +01:00
Greg Landrum
d883803a4b stop caching ring-finding results (#5955)
* stop caching ring-rinding results

* add backwards incompatibility note

* changes in response to review
2023-01-11 09:32:54 -05:00
Greg Landrum
ff6451447a Fixes #5784 (#5817)
catch kekulization errors during the tautomer enumeration
I have tested this on ~100K ChEMBL molecules and encountered
no further problems.
2022-12-01 17:11:50 +01:00
Greg Landrum
522811b8d4 Fixes #5402 (#5542)
* support transforms with branches

* improve output when doing verbose canonicalization

* Fixes #5402
2022-09-09 05:06:42 +02:00