Commit Graph

5381 Commits

Author SHA1 Message Date
Paolo Tosco
bd8289738d Fix #8027 (#8031)
* - Fix #8029
- avoid unnecessary rounding errors in the JSON writer
- remove a warning when compiling MinimalLib without SubstructLibrary support

* changes in response to review

* changes in response to review

---------

Co-authored-by: ptosco <paolo.tosco@novartis.com>
2024-11-30 07:20:20 +01:00
David Cosgrove
bbac292b4c Synthon fingerprint search (#8025)
* First pass at splitting molecule.

* Interim commit.  Reading libraries from file in original format.

* Basic search seems to be working.

* Pattern fingerprint screening.

* Connector region heuristic.

* Fixed triazole (aromatic/non-aromatic connectors).

* Fix search with non-split parent query, where query is substructure of a single reagent.

* Remove duplicate hits by reaction/reagents used.

* Implement largest fragment heuristic.

* Extra test files.

* Read/write binary file.
Program for conversion from text format to binary format.

* Remove empty reagent sets on reading, probably due to synthon number counting from 1 rather than 0.

* Tidy SSSearch functions.

* Stash pending major surgery for triazole bug.

* Revert to using unique_ptr.
Correct use of reagent order.

* Function to summarise Hyperspace.

* Delay building hits till end and put cutoff on number.

* Earlier bale-out in getHitReagents.

* Streamline checkConnectorRegions.

* Remove free functions for search.

* Correct name of Python test.

* First stage of Python wrappers.

* Rename namespace.

* Parameters object.

* Mysterious windows export thing.

* Fix bug - not matching number of connectors in fragment and synthon.

* Back like it was.  The connector count wasn't the problem.

* Put the substructure results into their own class.

* gcc 14 didn't like my use of std::reduce.
Update expected test results.

* Remove write statement.

* Tidy.

* Tidy.

* Enable random sample of hits.

* Test that complex SMARTS works.
Update Python wrappers.

* Rename Hyperspace to SynthonSpace.

* More renaming.
Python test.

* Enable Python test.
Remove write.

* Plug memory leak.

* Response to Greg's initial look.

* More response to Greg's initial look.

* get the windows DLL builds working

* Do away with mutable.
Purge a few more uses of reagent in favour of synthon.
Remove the c++ exe for converting text to binary databases.

* Better Synthon c'tor.

* More feedback from Greg.

* Tidy the Python wrapper.

* Remove tags from catch tests.

* Don't allow copying of SubstructureResults.

* Revert to allow copying of SubstructureResults.  The Python wrapper needs it.

* Refinements based on CLion/clangd suggestions.

* Allow for map numbers in connectors in space file.

* Refactor to make the searcher a separate class from the space.

* Transfer Greg's review suggestions from Hyperspace merge.

* First cut of fingerprint searcher.

* Python wrapper.
Some tidying.

* Better random selection.

* Fix bug in preparing frags for fingerprints.
Re-factor.

* Minor-refactor.

* Sort hits by similarity if available.

* Option for a few different fingerprint types.  Pending a better solution.

* Write fingerprints to binary file.

* Use any fingerprint generator for similarity searching.  No Python wrapper yet.

* Python wrapper.

* Change random selection to use distribution weighted by number of hits in each reaction.

* Lots of suggestions from CLion/clang.

* Use boost discrete_distribution for cross-platform consistency.

* Tidy test up.

* Try boost rng as well.

* uniform_int_distribution to boost also.

* Small tidy.

* Method to write enumerated library.

* Windows export thing.

* Windows export thing.

* Allow for commas in tab-separated fields.

* win64 dll builds now work

* More aliphatic synthon, aromatic product joy.

* Force ring finding if it hasn't been done.

* Fingerprint hits not being sorted if maxHits reached.

* Remove debugging write.  Doh!

* Response to review of SynthonSpace2.

* Missed one.

* Add test file.

* Hand merge Greg's #8050.

* Discard nodiscard.

* Move include of export.h inside include guards.

* Response to review.

* Fix memory leaks.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2024-11-29 13:07:32 +01:00
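The random hit selection mentioned in the commit above ("distribution weighted by number of hits in each reaction") can be illustrated with a small sketch. This is not the SynthonSpace code: the function name `sampleReactions` and its parameters are hypothetical, and it uses the standard-library generator and distribution rather than the boost ones the commit switched to, so the exact sequence of picks may differ between standard library implementations.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Pick `numSamples` reaction indices, each chosen with probability
// proportional to the number of hits recorded for that reaction.
// Assumption: hitsPerReaction contains at least one non-zero entry.
inline std::vector<std::size_t> sampleReactions(
    const std::vector<std::size_t> &hitsPerReaction, std::size_t numSamples,
    unsigned int seed) {
  std::mt19937 rng(seed);
  // The weights are taken directly from the per-reaction hit counts.
  std::discrete_distribution<std::size_t> pick(hitsPerReaction.begin(),
                                               hitsPerReaction.end());
  std::vector<std::size_t> chosen;
  chosen.reserve(numSamples);
  for (std::size_t i = 0; i < numSamples; ++i) {
    chosen.push_back(pick(rng));
  }
  return chosen;
}
```

A reaction with many hits is drawn more often than one with few, and a reaction with no hits is effectively never drawn.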
David Cosgrove
8b7f773593 Improve doc string for ParseAbbreviations in python wrapper. (#8049)
Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
2024-11-28 13:54:04 +01:00
David Cosgrove
43229cf933 Synthon space2 (#8048)
* Fix for connector regions and missing ringinfo.

* Merge in the fix for comma-separated names in tab-separated space files.

* Response to review.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
2024-11-28 13:12:35 +01:00
Dan Nealschneider
003225206d Remove explicit default constructor from Compute2DCoordParameters (#8046)
The explicit default constructor actually blocks the other
defaults. It also means this is no longer a "plain struct",
and so designated initializers aren't available.

I wanna be able to do:

    Compute2DCoordParameters params{.useRingTemplates = true};

Or even:

    RDDepict::compute2DCoords(*mol, {.useRingTemplates = true});
2024-11-28 08:29:44 +01:00
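The effect described in the commit above can be reproduced in isolation. The two structs below are hypothetical stand-ins, not the real `Compute2DCoordParameters`, and the sketch assumes a C++20 compiler for designated initializers. A user-provided constructor makes a type a non-aggregate, so the designated-initializer form is only available for the plain struct.

```cpp
#include <type_traits>

// Hypothetical parameter struct with default member initializers only.
// It is an aggregate, so designated initializers work.
struct PlainParams {
  bool useRingTemplates = false;
  bool canonOrient = false;
  int sampleSeed = 0;
};

// Same members plus a user-provided default constructor.
// This type is no longer an aggregate.
struct CtorParams {
  CtorParams() {}
  bool useRingTemplates = false;
  bool canonOrient = false;
  int sampleSeed = 0;
};

static_assert(std::is_aggregate_v<PlainParams>);
static_assert(!std::is_aggregate_v<CtorParams>);

// Only the named member is overridden; the others keep their defaults.
inline PlainParams ringTemplateParams() {
  return PlainParams{.useRingTemplates = true};
}
```

Writing `CtorParams{.useRingTemplates = true}` would fail to compile, which is the limitation the commit removes.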
Eric Boittier
64fcea7391 [fix] issue #7572, precondition rootAtAtom if more than one fragment exists (#7811)
* [fix] re: issue #7572; added precondition check to prevent setting a root atom when more than one fragment exists

* tests for #7572, precondition rootAtAtom if more than one fragment exists

* test the fix to issue #7572

* [fix] moved the precondition into the block that gets the atom at the index, to prevent unhandled exceptions; MHFP tests pass now

---------

Co-authored-by: Eric Boittier <ericdavid.boittier@unibas.ch>
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2024-11-28 07:06:44 +01:00
Brian Kelley
7b31d5307b MergeQueryHs better detects hydrogens in OR queries (#8043)
* Fixes #7687

* Make sure high and low precedence ands also work

* really fix the issue

* Fix merge conflict

* Add AtomType to MergeH check with test
2024-11-26 04:44:32 +01:00
Ricardo Rodriguez
1b0a20c372 Fix shifting of a potentially negative number (#8014)
* fix shifting a potentially negative number

* add a test

* fix signed comparison
2024-11-24 06:50:16 +01:00
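The commit above does not show its code, but the class of problem is a familiar one: before C++20, left-shifting a negative signed value is undefined behaviour, and sanitizers such as UBSan flag it. One common remedy, sketched here with a hypothetical helper name and not taken from the RDKit change, is to do the shift on the unsigned bit pattern.

```cpp
#include <cstdint>

// Hypothetical helper: shift the bit pattern of a possibly negative value.
// Converting to unsigned first makes the shift well defined. The shift count
// is reduced to 0..31 because shifting by the full width or more is also
// undefined.
inline std::uint32_t shiftBitsLeft(std::int32_t value, unsigned int bits) {
  return static_cast<std::uint32_t>(value) << (bits % 32u);
}
```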
Greg Landrum
a0a21e6ec6 Fixes #8023 (#8024) 2024-11-21 14:28:19 +01:00
Paolo Tosco
fc6e37d11e fix #8019 (#8020)
Co-authored-by: ptosco <paolo.tosco@novartis.com>
2024-11-20 09:09:56 +01:00
Ricardo Rodriguez
db0df54347 Fix some minor issues reported by ubsan and the compiler (#8015)
* initialize chiralityPossible

* fix build warning

* Fix integer overflow

* fix downcasting MarvinMolBase to MarvinMol

* Fix build warning

* increase PairList container to 64 bit

* fix testDict

* Update Code/RDGeneral/testDict.cpp

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>

* Update Code/GraphMol/CIPLabeler/rules/Pairlist.h

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>

* Update Code/GraphMol/CIPLabeler/rules/Pairlist.h

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>

* Fix catch_tests.cpp

---------

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2024-11-20 09:09:22 +01:00
Ivan Tubert-Brohman
d8bc5d61f8 Catch exceptions in MultithreadedMolSupplier callbacks (#7810)
* Catch exceptions in MultithreadedMolSupplier callbacks

* In next(), simply ignore any exceptions from nextCallback.
* In reader(), if readCallback throws, log a warning and proceed using
  the unmodified record.
* (The writer() was already handling exceptions from writeCallback.)

* Remove unused parameter names

Hopefully this will placate the warning/error settings used by the Linux
build.
2024-11-19 17:22:25 +01:00
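The reader-side behaviour described in the commit above (warn and continue with the unmodified record when the callback throws) follows a general pattern that can be sketched as below. The names `RecordCallback` and `applyReadCallback` are hypothetical and the callback signature is simplified; this is not the `MultithreadedMolSupplier` interface.

```cpp
#include <exception>
#include <functional>
#include <iostream>
#include <string>

using RecordCallback = std::function<std::string(const std::string &)>;

// Apply an optional user callback to a raw record. If the callback throws,
// print a warning and fall back to the unmodified record.
inline std::string applyReadCallback(const RecordCallback &callback,
                                     const std::string &record) {
  if (!callback) {
    return record;
  }
  try {
    return callback(record);
  } catch (const std::exception &e) {
    std::cerr << "warning: read callback failed: " << e.what() << '\n';
  } catch (...) {
    std::cerr << "warning: read callback failed with an unknown exception\n";
  }
  return record;
}
```

Keeping the try/catch inside the wrapper means an exception from user code cannot escape into a worker thread, where it would otherwise terminate the program.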
Ricardo Rodriguez
39d4662ae7 Throw when attempting to normalize a Zero RDGeom::Point (#8008)
* throw if close to zero

* fix moldraw2DTestCatch

* Fix testRGroupDecomp

* fix one test in distGeomHelpersCatch

* fix tests in distGeomHelpersCatch

* retry finding a dir vector when adding Hs

* push UFF fixes to calculateCosY

* fix the setTerminalAtomCoords deg 4 patch

* add a test

* reduce zero tolerance
2024-11-19 04:33:22 +01:00
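A minimal sketch of the idea in the commit above: refuse to normalize a vector whose length is too close to zero instead of silently dividing by it. `Vec3`, `normalizeInPlace`, and the tolerance value are illustrative assumptions, not the `RDGeom::Point` implementation.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical minimal 3D point; not the RDGeom class.
struct Vec3 {
  double x;
  double y;
  double z;
};

// Scale `p` to unit length. Throws if the length is below `zeroTol`,
// since the direction of a (near-)zero vector is undefined.
// The default tolerance is an illustrative value.
inline void normalizeInPlace(Vec3 &p, double zeroTol = 1e-16) {
  const double len = std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z);
  if (len < zeroTol) {
    throw std::runtime_error("cannot normalize a zero-length vector");
  }
  p.x /= len;
  p.y /= len;
  p.z /= len;
}
```

Callers that can legitimately produce a zero vector then need a fallback, which matches the follow-up items in the commit such as retrying to find a direction vector when adding Hs.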
Ricardo Rodriguez
ffb100d928 avoid division by zero (#8013) 2024-11-19 04:32:44 +01:00
Hussein Faara
f35e7e6414 remove no-op macros and dead code (pt 1) (#8012) 2024-11-19 04:31:56 +01:00
David Cosgrove
eaf544ab6f SynthonSpace Search (#7978)
* First pass at splitting molecule.

* Interim commit.  Reading libraries from file in original format.

* Basic search seems to be working.

* Pattern fingerprint screening.

* Connector region heuristic.

* Fixed triazole (aromatic/non-aromatic connectors).

* Fix search with non-split parent query, where query is substructure of a single reagent.

* Remove duplicate hits by reaction/reagents used.

* Implement largest fragment heuristic.

* Extra test files.

* Read/write binary file.
Program for conversion from text format to binary format.

* Remove empty reagent sets on reading, probably due to synthon number counting from 1 rather than 0.

* Tidy SSSearch functions.

* Stash pending major surgery for triazole bug.

* Revert to using unique_ptr.
Correct use of reagent order.

* Function to summarise Hyperspace.

* Delay building hits till end and put cutoff on number.

* Earlier bale-out in getHitReagents.

* Streamline checkConnectorRegions.

* Remove free functions for search.

* Correct name of Python test.

* First stage of Python wrappers.

* Rename namespace.

* Parameters object.

* Mysterious windows export thing.

* Fix bug - not matching number of connectors in fragment and synthon.

* Back like it was.  The connector count wasn't the problem.

* Put the substructure results into their own class.

* gcc 14 didn't like my use of std::reduce.
Update expected test results.

* Remove write statement.

* Tidy.

* Tidy.

* Enable random sample of hits.

* Test that complex SMARTS works.
Update Python wrappers.

* Rename Hyperspace to SynthonSpace.

* More renaming.
Python test.

* Enable Python test.
Remove write.

* Plug memory leak.

* Response to Greg's initial look.

* More response to Greg's initial look.

* get the windows DLL builds working

* Do away with mutable.
Purge a few more uses of reagent in favour of synthon.
Remove the c++ exe for converting text to binary databases.

* Better Synthon c'tor.

* More feedback from Greg.

* Tidy the Python wrapper.

* Remove tags from catch tests.

* Don't allow copying of SubstructureResults.

* Revert to allow copying of SubstructureResults.  The Python wrapper needs it.

* Refinements based on CLion/clangd suggestions.

* Allow for map numbers in connectors in space file.

* Response to review.

* update binary file spec

* Changes after review.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2024-11-17 08:13:54 +01:00
Ricardo Rodriguez
ff93b2919f look at y if x is very small (#8011) 2024-11-17 06:22:25 +01:00
Paolo Tosco
8c1bf34ed7 implemented JSON parsers for SanitizeFlags and RemoveHsParameters for CFFI and MinimalLib (#8000)
Co-authored-by: ptosco <paolo.tosco@novartis.com>
2024-11-16 05:16:25 +01:00
Ricardo Rodriguez
999d9097c1 fix potential division by zero (#8007) 2024-11-15 16:16:29 +01:00
Brian Kelley
9495dd5413 Expose tautomer scoring functions to python (#7994)
* Expose tautomer scoring functions to python

* Add more tests/documentation

* Rename getDefaultTautomerSubstructs to getDefaultTautomerScoreSubstructs

* Remove ROMOL_SPTR

* Add full custom scoring function example

* Run clang format

* Use proper BOOST_PYTHON_FUNCTION_OVERLOADS

* Use default copy constructor
2024-11-15 05:37:35 +01:00
tadhurst-cdd
c2168cd8be Fix for kekuleAtrop wedge error (#7992)
* Fix for kekuleAtrop error

* Consolidate repeating code in new tests

* as per PR review - simplification
2024-11-15 04:39:38 +01:00
David Cosgrove
1aa30412cd Mol to smiles docs (#8005)
* Use new Morgan fingerprint generator.

* Add script to build fragments database and amend score script to use it.

* Remove redundant imports.

* Response to review.

* Clarify interaction of canonical and random in MolToSmiles.

---------

Co-authored-by: Dave Cosgrove <david@cozchemix.co.uk>
2024-11-14 15:34:30 +01:00
Greg Landrum
7872341b92 fixes #8001 (#8002) 2024-11-14 05:14:51 +01:00
Paolo Tosco
9c63cf6c54 Expose molzip functionality to MinimalLib (#7959)
* Expose molzip functionality to MinimalLib

* changes from code review

---------

Co-authored-by: ptosco <paolo.tosco@novartis.com>
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2024-11-12 17:16:14 +01:00
Hussein Faara
9db22a9178 Update SMILES parsing syntax error to include bad token position (#7979)
* Implements #TODO

This attempts to improve the SMILES parsing procedure to provide more
information about bad inputs by pointing to the general location of the offending token.
The improved error messages currently only apply to syntax errors, inputs with
extra close parentheses, and inputs with too large numbers, but this will
provide a way to extend that in the future.

Some examples of the improved error messages:

| old version | new version |
|-------------|-------------|
| syntax error while parsing: c%()ccccc%() | syntax error while parsing input: <br /> `c%()ccccc%()` <br /> `~~~^` |
| syntax error while parsing: c%(100000)ccccc%(100000) | syntax error while parsing input: <br /> `c%(100000)ccccc%(100000)` <br /> `~~~~~~~~^` |
| syntax error while parsing: COc(c1)cccc1C# | syntax error while parsing input: <br /> `COc(c1)cccc1C#` <br /> `~~~~~~~~~~~~~^` |
| extra close parentheses while parsing: C) | extra close parentheses while parsing input: <br /> `C)` <br /> `~^` |
| extra close parentheses while parsing: C1C)foo | extra close parentheses while parsing input: <br /> `C1C)foo` <br /> `~~~^` |
| number too large while parsing: [555555555555555555C] | number too large while parsing input: <br /> `[555555555555555555C]` <br /> `~~~~~~~~~~^` |

This was achieved by extending the lexing and parsing procedure to track
the current token position with a new variable, so the reported position
may not be 100% accurate at all times but should be helpful in reducing
the amount of work done to find the bad location.

Additionally, I updated the build instructions to strip the #line macros
from the generated C++ files, which required modern flex/bison
versions.

* fix build failure on windows

* static vars are not captured implicitly

* copy generated files to fix windows build failure

* fix long input truncation

* copied the wrong files again :(

* Review suggestions

* update error messages for extra open brackets

These are example error messages:

```
[09:54:03] SMILES Parse Error: extra open parentheses while parsing: C1CC1(CC
[09:54:03] SMILES Parse Error: check for mistakes around position 6:
[09:54:03] C1CC1(CC
[09:54:03] ~~~~~^

[09:54:03] SMILES Parse Error: Failed parsing SMILES 'C1CC1(CC' for input: 'C1CC1(CC'
[09:54:03] SMILES Parse Error: extra open parentheses while parsing: C1CC1(CC(CC
[09:54:03] SMILES Parse Error: check for mistakes around position 6:
[09:54:03] C1CC1(CC(CC
[09:54:03] ~~~~~^

[09:54:03] SMILES Parse Error: extra open parentheses while parsing: C1CC1(CC(CC
[09:54:03] SMILES Parse Error: check for mistakes around position 9:
[09:54:03] C1CC1(CC(CC
[09:54:03] ~~~~~~~~^
```

* fix bad merge artefact.
2024-11-12 14:47:15 +01:00
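The marker lines in the examples above follow a simple rule: one tilde for each character before the reported 1-based position, then a caret. The sketch below reproduces that rule and, for the extra-open-parenthesis case, finds the reported positions with a stack. Both function names are hypothetical and the parenthesis scan is a simplification that ignores the rest of SMILES syntax; the real parser tracks token positions in its flex/bison lexer and grammar.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Build the marker line printed under the input: tildes up to the reported
// (1-based) position, then a caret.
inline std::string positionMarker(std::size_t position) {
  return std::string(position > 0 ? position - 1 : 0, '~') + '^';
}

// Return the 1-based positions of '(' characters that are never closed.
// Simplification: every '(' and ')' is counted, wherever it appears.
inline std::vector<std::size_t> unclosedParens(const std::string &smiles) {
  std::vector<std::size_t> open;
  for (std::size_t i = 0; i < smiles.size(); ++i) {
    if (smiles[i] == '(') {
      open.push_back(i + 1);
    } else if (smiles[i] == ')' && !open.empty()) {
      open.pop_back();
    }
  }
  return open;
}
```

For the input `C1CC1(CC(CC` this yields positions 6 and 9, the two positions shown in the log output quoted in the commit message.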
Hussein Faara
384e296ba0 Remove unused code from GraphMol/SmilesParse (#7996) 2024-11-11 05:58:12 +01:00
tadhurst-cdd
57a9d2928f Fix incorrect CIP values for some aromatic atropisomers (#7957) 2024-11-09 10:32:46 +01:00
Brian Kelley
75e8858a93 Fixes #7989 Incorrect benzyl deprotection reaction (#7990) 2024-11-07 18:34:59 +01:00
tadhurst-cdd
649a62a39d Fix for trimethylcyclohexane error (#7949) 2024-11-07 06:15:32 +01:00
Paolo Tosco
b1d322555b Expose propertyFlags to CFFI and MinimalLib (#7960)
* - added property support to CFFI library
- added propertyFlags JSON parser
- added support for setting propertyFlags to MinimalLib

* added missing files

* fix SWIG builds

---------

Co-authored-by: ptosco <paolo.tosco@novartis.com>
2024-11-07 06:08:27 +01:00
Greg Landrum
5b943e3a55 Fixed #7986 (#7987)
* Fixed #7986

* get builds working on linux

* try again

* oops, use std::uint32_t
2024-11-05 13:58:48 -05:00
Michael Cho
b16b6026e8 Use .dylib on macOS for PostgreSQL 16+ (#7869)
See b55f62abb2
2024-11-01 06:18:02 +01:00
Greg Landrum
2f119c2693 Convert reaction fingerprinter to use FingerprintGenerators (#7931)
* FingerprintGenerator improvements

1. simplify construction by adding ctors taking FingerprintArguments
2. remove inexplicable boost::noncopyable from FingerprintArguments

* Switch FingerprintType to be an enum class

* Fixes #7521

* dumb mistake

* initialize everything

* get the defaults right

* Update Code/GraphMol/ChemReactions/ReactionFingerprints.cpp

Co-authored-by: Paolo Tosco <paolo.tosco.mail@gmail.com>

---------

Co-authored-by: Paolo Tosco <paolo.tosco.mail@gmail.com>
2024-10-31 06:58:02 +01:00
Paolo Tosco
64061b6ca7 - remove deprecation warnings by switching MinimalLib to fingerprint generators (#7938)
- add missing Fingerprints dependency to MinimalLib CMakeLists.txt
- remove conditional RGroupDecomposition dependency in MinimalLib CMakeLists.txt, as it is always needed because of relabelMappedDummies()

wip

Co-authored-by: ptosco <paolo.tosco@novartis.com>
2024-10-30 06:15:58 +01:00
Brian Kelley
eacc365b27 GitHub 7865 haspropwithvaluequery leaks (#7872)
* Properly cleanup Dict::Pair when serializing HasPropWithQueryValue

* Make sure pickling doesn't change original molecule

* Fix bad cut and paste

* Add PairHolder utility class for memory management of non Dict Dict::Pairs, fix mem leak in pickler

* Edit comment to force a rebuild

* Ignore PairHolder from Java/Swig builds

* Ignore PairHolder API from swig

* Responses to review

* Add backward incompatible change

* Make release note a bullet point
2024-10-30 06:12:40 +01:00
PatrickPenner
08bf105e5d Allow inclusion of molecule names in ScaffoldNetwork (#7956)
* Add molecule names to ScaffoldNetwork

- added parameter to ScaffoldNetwork Params to include molecule names in the
  ScaffoldNetwork nodes corresponding to input molecules

* Document _Name assumption

- fixed a bitwise and that should have been a boolean and

* Forgot has prop check before access

* Misunderstood semantics of CHECK vs. REQUIRE
2024-10-30 06:11:30 +01:00
Paolo Tosco
511d4d941a Fixes a regression introduced in #7582 which made all SWIG enums become type-unsafe (#7972)
* fixes a regression introduced in #7582 which made all SWIG enums become type-unsafe

* also fix PipelineStage asserts

---------

Co-authored-by: ptosco <paolo.tosco@novartis.com>
2024-10-30 06:08:14 +01:00
Ricardo Rodriguez
1201f214c4 One final round of mem errors (#7943)
* fix descriptors

* fix filtercatalog wrapping

* fix Chem reaction enumeration

* register shared_ptr

* change order of declarations

* fix leaks in SimDivPickers

* Manage ptr arrays in ForceField/BFGSOpt
2024-10-29 06:46:43 +01:00
Brian Kelley
d0fcbdac0d Need to reserve space before assignment (#7964) 2024-10-28 14:11:06 +01:00
Paolo Tosco
f66ad7e7c1 Replace awful enum reflection macros with Better Enums (#7913)
* - moved SMILES and RGroupDecomp JSON parsers to their own translation units
- added missing DLL export decorators that had been previously forgotten
- changed the signatures of MolToCXSmiles and updateCXSmilesFieldsFromJSON
to replace enum parameters with the underlying types
- updated ReleaseNotes.md

* make sure cxSmilesFields is only updated if the JSON string contains keys
belonging to the CXSmilesFields enum

* added missing copyright notices

---------

Co-authored-by: ptosco <paolo.tosco@novartis.com>
2024-10-25 08:17:07 +02:00
Ricardo Rodriguez
ccfb1fa688 ... and more mem errors fixed (#7924)
* fix leak in CIP labels catch test

* fix leak in Murtagh clustering

* do not leak writers in streambuf

* fix leaks in fingerprintgeneratorwrapper

* remove 'minor leak' comments
2024-10-25 07:01:34 +02:00
esiaero
7546167033 Fix 4.3.0--4.4.0 sql upgrade script (#7774)
Updates the script to drop the operator family so the subsequent creation attempt can pass.
2024-10-25 04:57:33 +02:00
Kollin Trujillo
2e5f7ce80c Support Writing CX Extensions for Reactions (#7838)
* Support writing CX extensions in reactions

* fixed merge conflicts

* wip

* Updated for getCXExtensions

* Refactored and deleted extraneous file.

* Updated function signatures

* Updated some tests

* Removed extraneous include from debugging

* Removed comment in reactionwriter.cpp

* Updated some tests with expected strings

* Updated to add logging for linknodes and substance group hierarchies

* Addressed some issues

* updated tests

* Addressed Greg's comments

* Updated for recommendations

---------

Co-authored-by: Rachel Walker <rachel.walker@schrodinger.com>
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2024-10-24 19:31:04 +02:00
Ricardo Rodriguez
9e2b3f233e rename test (#7952) 2024-10-24 10:47:54 +02:00
John Mayfield
2553cfce73 Improved handling of SP/TB/OH reordering in SMILES/SMARTS. (#6777)
* Improved handling of SP/TB/OH reordering in SMILES/SMARTS.

- add a getMaxNbors(tag) utility function to avoid repeated logic in multiple places
- tweak getChiralPermutation() for handling implicit/missing ligands (uses -1), allow inverse lookup
- use the tweaked chiral ordering in the reading/writing from SMILES

* clang-format run

* Reviewer requested changes.
- curly braces on all if conditions
- null check raw pointers with precondition.

* Correct additional test case introduced in the last year
2024-10-21 04:55:08 +02:00
Greg Landrum
cf29f0895f Documentation updates (#7933)
* doc updates

* add rdShapeAlign to the docs

* fix shape docs

* let's not add too much here, revert deprecation

* response to review
2024-10-19 17:19:59 +02:00
Kevin Boyd
54025259d3 Only save the most recent CIP rank for sorting (#7932)
* Only save the most recent rank for sorting

* Remove padding code
2024-10-19 07:17:54 +02:00
Ricardo Rodriguez
077493b209 Fixes #7929 (#7930)
* add a test

* fix issue

* Update Code/GraphMol/Chirality.cpp

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>

---------

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2024-10-18 05:34:50 +02:00
Yakov Pechersky
ad4ee83aec Fixes #7044 (#7137)
* oxime tests

* Support allenes in canonicalizing double bonds

* alternate solution to the problem

* expand comment

* reactivate conjugated nitro test

* Fix conjugated nitro tests, have a bondstereo test

* Empty commit to re-proc tests

---------

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2024-10-18 05:32:20 +02:00
Kevin Boyd
5d840450be Sort CIP entries in place using a lightweight wrapper, only sort tied entries (#7911)
* Sort CIP entries in place using a lightweight wrapper, only sort tied regions

* Address review comments

* Clang-format
2024-10-17 05:42:07 +02:00