18 Commits

Author SHA1 Message Date
David Cosgrove
e712d201a6 Fix Rascal out of memory bug. (#8648)
* Make test

* Cap number of equivalent cliques that will be produced.

* Oh FFFS!

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
2025-07-24 16:43:09 +02:00
David Cosgrove
d7e1ce7cf4 Rascal Fix 8360 (#8376)
* Use distances on all valid paths rather than just shortest distance.

* Optimise BondPaths.

* Optimise BondPaths.

* Hash coded for the bond paths.

* Faster find all paths.

* Build in gcc working.

* Comment.

* Remove debugging code.

* Update GettingStartedInPython.rst.

* Now need to split the clique and keep the largest fragment.
Lots of warnings about how slow this is.
Split out long tests.

* Back out a lot of changes.  Remove the distance check with singleLargestFrag when building modular product.

* Tidy code.
Update docstrings.
Add explanation to GettingStartedInPython.rst.

* Fix single fragment test.

* Response to review.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
2025-04-08 10:12:58 +02:00
Rachel Walker
7d02ac82fd Add RascalMCES option to require complete RingInfo rings (#8305)
* Attempt at completeRingsOnly in mces

* changed option name and added a test

* code review

* check aromaticity and bond type matches before checking ring equivalencies

* update if statement

* typo
2025-03-21 13:54:10 +01:00
David Cosgrove
4eb1ea9985 Initial Rascal results sort the wrong way round on number of atoms and bonds. (#8258)
* Initial results sort the wrong way round on number of atoms and bonds.

* Improve sort for singleLargestFrag.

* Response to review - don't sort the fragment sets twice when opts.singleLargestFrag is true.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
2025-02-15 15:34:16 +01:00
David Cosgrove
b0083ccfda Rascal left some high atomic numbers in the output SMARTS (#8247)
* Fixes bug in output SMARTS when ringMatchesRingOnly causes an extra &R.

* Punctuation.

* Run clang-format.

* Don't use recursive SMARTS.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2025-02-14 16:15:11 +01:00
David Cosgrove
343b0698ca Rascal: Add option to specify minimum clique size directly. (#8248)
* Add option to specify minimum clique size directly.

* Make the minCliqueSize uint and default to 0.

* Brain-fade.

* Send timeout and 'too many bonds' message to BOOST_LOG rather than std::cout, std::cerr respectively.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
2025-02-13 17:05:23 +01:00
David Cosgrove
1fba446e89 Rascal bond label error (#8199)
* Fix atom ordering in bond label.

* Response to review.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2025-01-29 15:55:43 +01:00
David Cosgrove
c23977aac3 Trim results after singleLargestFrag = true applied. (#8202)
* Trim results after singleLargestFrag = true applied.

* Fix sort.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
2025-01-28 14:04:16 +01:00
David Cosgrove
736e309f10 Fix empty results bug. (#8099)
Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
2024-12-14 18:21:29 +01:00
David Cosgrove
6db3f982cb Allow fragments of aromatic rings to match in RascalMCES (#8088)
* Allow fragments of aromatic rings to match.

* Use existing code for checking bond is in ring.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
2024-12-13 07:20:24 +01:00
David Cosgrove
bbac292b4c Synthon fingerprint search (#8025)
* First pass at splitting molecule.

* Interim commit.  Reading libraries from file in original format.

* Basic search seems to be working.

* Pattern fingerprint screening.

* Connector region heuristic.

* Fixed triazole (aromatic/non-aromatic connectors).

* Fix search with non-split parent query, where query is substructure of a single reagent.

* Remove duplicate hits by reaction/reagents used.

* Implement largest fragment heuristic.

* Extra test files.

* Read/write binary file.
Program for conversion from text format to binary format.

* Remove empty reagent sets on reading, probably due to synthon number counting from 1 rather than 0.

* Tidy SSSearch functions.

* Stash pending major surgery for triazole bug.

* Revert to using unique_ptr.
Correct use of reagent order.

* Function to summarise Hyperspace.

* Delay building hits till end and put cutoff on number.

* Earlier bale-out in getHitReagents.

* Streamline checkConnectorRegions.

* Remove free functions for search.

* Correct name of Python test.

* First stage of Python wrappers.

* Rename namespace.

* Parameters object.

* Mysterious windows export thing.

* Fix bug - not matching number of connectors in fragment and synthon.

* Back like it was.  The connector count wasn't the problem.

* Put the substructure results into their own class.

* gcc 14 didn't like my use of std::reduce.
Update expected test results.

* Remove write statement.

* Tidy.

* Tidy.

* Enable random sample of hits.

* Test that complex SMARTS works.
Update Python wrappers.

* Rename Hyperspace to SynthonSpace.

* More renaming.
Python test.

* Enable Python test.
Remove write.

* Plug memory leak.

* Response to Greg's initial look.

* More response to Greg's initial look.

* get the windows DLL builds working

* Do away with mutable.
Purge a few more uses of reagent in favour of synthon.
Remove the c++ exe for converting text to binary databases.

* Better Synthon c'tor.

* More feedback from Greg.

* Tidy the Python wrapper.

* Remove tags from catch tests.

* Don't allow copying of SubstructureResults.

* Revert to allow copying of SubstructureResults.  The Python wrapper needs it.

* Refinements based on CLion/clangd suggestions.

* Allow for map numbers in connectors in space file.

* Refactor to make the searcher a separate class from the space.

* Transfer Greg's review suggestions from Hyperspace merge.

* First cut of fingerprint searcher.

* Python wrapper.
Some tidying.

* Better random selection.

* Fix bug in preparing frags for fingerprints.
Re-factor.

* Minor-refactor.

* Sort hits by similarity if available.

* Option for a few different fingerprint types.  Pending a better solution.

* Write fingerprints to binary file.

* Use any fingerprint generator for similarity searching.  No Python wrapper yet.

* Python wrapper.

* Change random selection to use distribution weighted by number of hits in each reaction.

* Lots of suggestions from CLion/clang.

* Use boost discrete_distribution for cross-platform consistency.

* Tidy test up.

* Try boost rng as well.

* uniform_int_distribution to boost also.

* Small tidy.

* Method to write enumerated library.

* Windows export thing.

* Windows export thing.

* Allow for commas in tab-separated fields.

* win64 dll builds now work

* More aliphatic synthon, aromatic product joy.

* Force ring finding if it hasn't been done.

* Fingerprint hits not being sorted if maxHits reached.

* Remove debugging write.  Doh!

* Response to review of SynthonSpace2.

* Missed one.

* Add test file.

* Hand merge Greg's #8050.

* Discard nodiscard.

* Move include of export.h inside include guards.

* Response to review.

* Fix memory leaks.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2024-11-29 13:07:32 +01:00
David Cosgrove
d44758ee5e Rascal match exact atom type (#7673)
* New option - exactAtomTypeMatch.

* Python wrapper.

* REQUIRE(res.size()) not CHECK.  If it fails, you get seg fault on subsequent tests otherwise.

* Change option name.
2024-09-16 04:45:18 +02:00
David Cosgrove
80381c61fd Rascal atom and bond equivalences (#7612)
* C++ implementation of equivalent atoms via SMARTS.

* Python wrapper.

* Tidy.

* Check for too many classes.

* Better handling of spaces in split.

* Pass string by reference.  Doh.

* Add ignoreBondOrders option.
2024-07-23 06:02:33 +02:00
David Cosgrove
b7e999bfd8 Rascal exactConnectionsMatch bug (#7359)
* Fix bug where different atom orders gave different MCES sizes.

* Update function description.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
2024-06-23 07:05:35 +02:00
David Cosgrove
29a166dff6 Add option for RASCAL to restrict atom matching to atoms of same degree (#7344)
* Add exactConnectionsMatch option.

* Better python test.

* Better C++ test.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
2024-04-11 16:20:38 +02:00
Greg Landrum
2957ab4576 switch to catch2 v3 (#6898)
* switch to catch2 v3
Fixes #6894

* fix a couple of problems noticed in the CI builds

* more warning cleanup

* changes in response to review
2023-11-15 06:45:42 +01:00
Ric
8176f5c962 Fail CI builds on compiler warnings + some fixes (#6675)
* enable Werror on Mac and Linux

* do not fail on boost multiprecision pessimizing move

* fix eigen array_bounds warning

* Fix unused arg in Rascal MCS

* fix range-loop-construct warning in Rascal MCES

* fix sign mismatched comparison

* drop unused lambda capture

* allow FMCS timeout test more time under Debug (not a warning!)

* fix fwd declaration of struct RascalClusterOptions

* fix deallocator mismatch

* fix two minor leaks

* fix a real leak

* more minor leaks

* fix another real leak, plus some potential ones

* fix std::move preventing copy elision

* allow longer run time for debug builds

* make maxBondMatchPairs and getLargestFragSize unsigned int

* make snake case camel case

* update to current master, fix new warnings

* update again and more fixes

* add #include <optional>

* fix char array deallocation

* update and fixes in Marvin writer

* unsigned int

* more copy elision fixes

* more copy elision fixes, and typos

* and some more
2023-09-02 04:38:45 +02:00
David Cosgrove
2dd9c5f3cd RASCAL MCES (#6568)
2023-08-27 13:51:49 +02:00