* Create a function to extract some specified atoms from a ROMol as a new ROMol by creating a new graph (#8742)
This adds a new API, `RDKit::MolOps::ExtractMolFragment`, to allow efficient
extraction of mol fragments from large mols. Compared to the approach where
we delete "unwanted" atoms/bonds from the input mol, this API is about 2x
faster for small mols and at least 3x faster for big mols
(it was 10x faster for "CCC"*1000).
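To illustrate the difference between the two strategies, here is a toy sketch
on a plain atom list plus bond list. It is not RDKit code: `extract_by_copy`
and `extract_by_delete` are hypothetical names chosen for this illustration,
and the real implementation may differ. Building the new graph touches each
atom and bond once, while the delete-based version renumbers the remaining
bonds after every removal.

```python
def extract_by_copy(atoms, bonds, keep):
    """Build a new graph that contains only the atoms listed in `keep`.

    atoms: list of atom labels; bonds: list of (i, j) atom-index pairs.
    """
    keep = sorted(set(keep))
    # Map old atom indices to their position in the new graph.
    new_index = {old: new for new, old in enumerate(keep)}
    new_atoms = [atoms[i] for i in keep]
    # Keep a bond only if both of its end atoms were kept.
    new_bonds = [(new_index[i], new_index[j]) for i, j in bonds
                 if i in new_index and j in new_index]
    return new_atoms, new_bonds


def extract_by_delete(atoms, bonds, keep):
    """Copy the whole graph, then delete the unwanted atoms one at a time."""
    atoms = list(atoms)
    bonds = list(bonds)
    keep = set(keep)
    # Walk from the highest index down so lower indices stay valid.
    for idx in range(len(atoms) - 1, -1, -1):
        if idx in keep:
            continue
        del atoms[idx]
        # Each deletion drops the atom's bonds and renumbers all the others.
        bonds = [(i - (i > idx), j - (j > idx))
                 for i, j in bonds if idx not in (i, j)]
    return atoms, bonds
```

Both functions return the same fragment; only the amount of work differs.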
* clang-format
* review comments
* cleanup
* Consolidate copying subsets of molecules
* Readd missing tests
* Update comment to restart build
* Remove missing test
* Remove debugging comment, fix warnings
* Fix warnings on gcc11
* Add docs
* Make vector<bool> dynamic_bitset<>
* Update copyright
* Add swig wrappers
* Use new designated constructor API
* Fix windows builds
* Change enum values from unsigned int to integer
* Fix unsigned int variable
* Update Code/GraphMol/Wrap/test_subset.py
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
* Update Code/GraphMol/Subset.cpp
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
* Update Code/JavaWrappers/gmwrapper/src-test/org/RDKit/ChemTransformsTests.java
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
* Response to review
* Fix documentation
* Remove comments
* Remove unnecessary comments
* Fix one liners
* Change assertion to be clearer (and not one-liners)
* Run clang-format
---------
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Hussein Faara <hussein.faara@schrodinger.com>
Co-authored-by: Brian Kelley <bkelley@glysade.com>
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
* do not remove hydrides by default
* add a minimal test
* add release note about behavior change
* require Hydrides to have degree 1
* also allow hydrides with degree 0 (ionic bond)
* suggested changes
---------
Co-authored-by: greg landrum <greg.landrum@gmail.com>
* Trim spaces from RDProp strings to simulate reading from SDFiles
* Update documentation
* Use the correct doc strings
---------
Co-authored-by: Brian Kelley <bkelley@glysade.com>
* backup
builds but no tests
* deprecate old form
* initial basic tests for bond property lists
* Tests pass
Fixes #8777
* add deprecations to release notes
* - avoid computing size of a constant at runtime
- replace multiple instances of a numeric constant with a literal constant
- avoid unnecessary copying of strings in iterations and function calls
---------
Co-authored-by: ptosco <paolo.tosco@novartis.com>
* First pass at CDX support
* Enable CDX support for reading (also) in the CDXMLParser API
* Add cdxml test files
* Update swig wrappers for CDXMLFormat and Parameters
* Add constructor to ChemDrawParserParams
* Add Java SWIG support for ChemDraw
* Add chemdraw define to rdconfig
* Add missing chemdraw deps
* Remove direct expat link
* Fix Java linkages for ChemDraw
* Remove bad merge code
* Fix csharp builds
* Add sniffer for the ChemDraw DataStream
* Include filesystem
* Fix test on windows
* Add more CDX tests
* Ensure streams are open in binary mode to support CDX on windows
* Fix text to show that a Block is the text input, not a file
* Fix CSharp test
* Disable CDX tests when not building chemdraw
* Turn back on chemdraw
* Response to review
* Turn off chemdraw support for the limited external test
---------
Co-authored-by: Brian Kelley <bkelley@glysade.com>
* get SynthonSpace.cpp to build also when RDK_USE_BOOST_SERIALIZATION is
not defined
* test should not fail when RDK_USE_BOOST_SERIALIZATION is not defined
* - expose reading/writing PNG metadata to CFFI and MinimalLib
- add relevant CFFI and MinimalLib unit tests
- add RDK_USE_BOOST_PROGRAM_OPTIONS CMake option
- enable using standalone zlib in the absence of boost::iostreams for parsing PNG files
- enable linking against maeparser in the absence of boost::iostreams also on Windows
- enable building RDKit in the absence of boost::program_options
* add check for boost::program_options
* change size_t into std::uint64_t in SearchResults for consistency with doTheSearch() which uses std::uint64_t
* change size_t into std::uint64_t in SearchResults for consistency with
SynthonSpaceSearcher::doTheSearch()
* set CMake policy to allow YAeHMOP to require a version which is not
actually supported
* reverted External/YAeHMOP/CMakeLists.txt to master version
* check if Windows build will work
* fix build
* configure zlib install location
* build zlib dependency
* include zlib header directory
* explicitly set PropertyFlags.AllProps so the test does not fail on
static builds
---------
Co-authored-by: ptosco <paolo.tosco@novartis.com>
* Adds a df_forceStop to stop readers and writers, clears out queues on destructor
* Properly implement close function, requires protected closeStreams
* changes from greg's version
* close() needs to be called in the derived destructors
* Close the writers before the reader to avoid deadlock
* Don't process trailing new lines
* Don't accept pushes if the queue is done
* Add mutex protecting d_threadCounter, remove unneeded forceStop checks
* Update Code/GraphMol/FileParsers/MultithreadedMolSupplier.cpp
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
* Add comment for the d_threadCounterMutex unlock
---------
Co-authored-by: = <=>
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
* Parsing SCSR
* add scsrol to mol
* removed bad include file
* loosen distGeom test slightly
* add wrap test for SCSRMol
* Add test for scsr in python
* tests added for scsr and strict parsing removed
* remove extra stuff
* More fully specified use of SCSRMol for PR CI build
* Added flags for SCSR expansion to not include any leaving groups
* Added MolFromScsrParams to Wrap for python
* added SCSRMol destructor
* Added two tests for RNA macromols, and fixed a bug they revealed
* Added new tests and expected files
* changes as per PR review
* SCSR changes for leaving groups
* fixed testScsr.py
* hydrogen bond treatment
* in SCSR expand, allow Hbond to be automatically detected
* changes as per code review
* Adding new test file
* changes for SCSR constructors, destructors for CI build
* fixed python for SCSR hydrogen bond modes, and added tests
* Added new test files
* fixed edge case for SCSR
* fix checksum for inchi
* consistent capitalization of SCSR throughout
* switch to enum class
* make things shorter
* simplify
* get rid of the ATTCHORD class
* New section for SCSR in RDKit_book
* added section to RDKit_Book
* SCSRMol is no longer exposed in Python
* fix leak in MolFromSCSRFile()
light refactoring
* expose MolFromSCSRFile() to python
make the MolFromSCSR functions work with default args
a bit more testing
* removed C++ access to SCSRMol
* CXSmiles now outputs hbonds, fix to template matching, and a few other things
* Addl fix for bad aromaticity in Hbond rings
* Test files needed
* try to fix CI build errors
* CI error fix
* Added missing test file
* CMake version - for CI build
* remove full file comparison from macromol test file
* accidental change to debug restored to release
* Code review changes
* As per PR review
---------
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
* use std::span for substruct match callbacks
This removes a copy from every evaluation of potential matches
* some cleanup/modernization
* some modernization
* deprecate chiralAtomCompat
* small optimization
* remove naked pointers
* improve new_timings.py script
* changes suggested in review
* response to review
* response to review
* Fixes #8379
* check in some working tests
* test passes
* test passes
* test passes
* test passes
* test passes
* ensure that the invariants flush the streams on failure
* tests pass
* test passes
* tests pass
* tests pass
* tests pass
* tests pass
* tests pass
* tests pass
* tests pass
* tests pass
* tests pass
* tests pass
* tests pass
* tests pass
* tests pass
* tests pass
* Fixes #8391
* tests pass
* fix a test with legacy
not clear why this was not causing problems before
* make a test work
* Fixes #8396
* gcc builds work
* fingerprint tests pass
* mention backwards incompatible change
* fix a problem with FindMolChiralCenters
* more testing details
* enable the test status output
* Fixes #8432
fix a bug in double-bond stereo handling for template matching
* all depictor tests pass
* use the new-stereo chiral ranks in the depiction code
* always assign new-stereo chiral ranks
* make _ChiralAtomRank a computed property
This is analogous to _CIPRank
* tweak to the way the atom ordering is computed for 2D coordinate generation
* update two expected results
* backup
* response to review
* tests pass
* tests pass
---------
Co-authored-by: = <=>
* Speed up GetProp Python keyerrors
A common pattern _in Python_ for checking for the presence or
absence of a key is:
    try:
        return mol.GetProp('mykey')
    except KeyError:
        return None
Shockingly, this is really slow with boost python objects! I was
recently profiling a workflow and 90% of the time or more was
spent in failed GetProp calls (mostly on bonds, some on atoms
or mols).
I sped up the workflow by protecting the calls using HasProp. But
I think this is a silly trap we've set for our users.
The problem arises because boost::python uses a C++ exception to
indicate that there is already a Python exception set. In C++,
exceptions are slow: they require unwinding the stack. In Python,
exceptions are about the same speed as any other control flow!
This commit speeds up GetProp failures by circumventing the
boost throw_exception_already_set() mechanism.
In my testing, this speeds up failed GetProp substantially:
* Factor of 1000x on Mac
* Factor of 40x on Linux
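For reference, the HasProp guard mentioned above as a workaround can be
sketched as follows. `get_prop_or_none` is a hypothetical helper name, and
`StubMol` is only a stand-in so the snippet runs without RDKit; the helper
relies on nothing beyond the `HasProp`/`GetProp` methods that RDKit mols,
atoms and bonds provide.

```python
def get_prop_or_none(obj, key):
    """Return obj.GetProp(key), or None when the property is not set.

    Checking HasProp first means a missing key never raises an exception.
    """
    if obj.HasProp(key):
        return obj.GetProp(key)
    return None


class StubMol:
    """Hypothetical stand-in exposing the two methods the helper uses."""

    def __init__(self, props):
        self._props = dict(props)

    def HasProp(self, key):
        return key in self._props

    def GetProp(self, key):
        # Like the try/except pattern expects, a missing key is a KeyError.
        return self._props[key]
```

With this commit the plain try/except form should no longer need such a
guard for performance reasons.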
* Update typed GetXXXProp to bypass boost exceptions
Based on PR #8372
Updates the typed GetIntProp, GetDoubleProp, etc. to bypass C++
exceptions in access. This speeds up missing key errors
significantly. For instance, calling mol.GetIntProp with a
missing prop 100,000 times:
Before: 28s
After: 0.05s
* refactor the code to determine whether or not an atom is in brackets
* move the definition of isMetal to QueryOps
* atoms bound to metals in SMILES should always be in square brackets
Implementation and some test updates
needs confirmation that all of the tests run
* basic tests pass
* java tests pass
* update js tests
* doc updates
* Update Code/GraphMol/catch_graphmol.cpp
Co-authored-by: Ricardo Rodriguez <ricrogz@users.noreply.github.com>
* Update Code/GraphMol/SmilesParse/test.cpp
Co-authored-by: Ricardo Rodriguez <ricrogz@users.noreply.github.com>
* finish fixing tests
* bump yaehmop version to allow compilation to work
---------
Co-authored-by: Ricardo Rodriguez <ricrogz@users.noreply.github.com>
* first pass, does not pass all tests
* add an option to control the new behavior
* add that to the python wrapper too
Fixes #8304
* Update Code/GraphMol/MolOps.h
Co-authored-by: Ricardo Rodriguez <ricrogz@users.noreply.github.com>
* undo some extra comment reformatting
* typo
Co-authored-by: Ricardo Rodriguez <ricrogz@users.noreply.github.com>
---------
Co-authored-by: Ricardo Rodriguez <ricrogz@users.noreply.github.com>
* fix SetPositions when using strided numpy array
Previously, SetPositions assumed that the provided numpy array was C-contiguous
* cast to a const pointer to avoid compiler warning
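The class of bug fixed here can be illustrated without numpy, using a strided
`memoryview` from the standard library. This is only an analogy under stated
assumptions: `read_respecting_strides` and `read_assuming_contiguous` are
hypothetical names, and the code does not reproduce the actual SetPositions
implementation.

```python
from array import array

# Flat buffer of xyz coordinates for three points: (1,2,3), (4,5,6), (7,8,9).
coords = array("d", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])

# A strided, non-contiguous view selecting every third value (the x values).
xs = memoryview(coords)[::3]


def read_respecting_strides(view):
    """Read the values the view actually describes."""
    return view.tolist()


def read_assuming_contiguous(view):
    """Buggy reader: takes len(view) consecutive items from the start of
    the underlying buffer and ignores the view's strides."""
    base = memoryview(view.obj)
    return base[: len(view)].tolist()
```

The first reader returns the x values of the three points, while the second
returns the first point's x, y and z, which is the kind of silent
misreading a contiguity assumption produces.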
* For GetAtomsMatchingQuery, note that "Atom query options are given in the rdkit.Chem.rdqueries module"
* wording change
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
---------
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
* Use new Morgan fingerprint generator.
* Add script to build fragments database and amend score script to use it.
* Remove redundant imports.
* Response to review.
* Clarify interaction of canonical and random in MolToSmiles.
---------
Co-authored-by: Dave Cosgrove <david@cozchemix.co.uk>
* fix leak in CIP labels catch test
* fix leak in Murtagh clustering
* do not leak writers in streambuf
* fix leaks in fingerprintgeneratorwrapper
* remove 'minor leak' comments
* atropisomer handling added
* fixed non-used variables, linking directives
* BOOST LIB start/stop fixes, linking fix
* Fixes for RDKIT CI errors
* minimalLib fix
* changed vector<enum> for java builds
* check for extra chars in CIP labeling
* removed wrong deprecated message
* fix ostrstream output error?
* restored _ChiralAtomRank to lowercase first letter
* changes for merged master
* Fixed catch label for new Catch package
* update expected psql results
* get swig wrappers building
* restore MolFileStereochem to FileParsers
* fix java wrapper for reapplyMolBlockWedging
* test changes
* some suggestions
* move a couple functions out of Bond
* Merge branch 'master' into pr/atropisomers2
* merged master
* Renamed setStereoanyFromSquiggleBond
* atropisomers in cdxml, rationalize atrop wedging, stereoGroups in drawMol
* Merge branch 'master' into pr/specialQueries
* changes from previous PR
* Include false chiral
* rigorous enhanced stereo canonicalization
* Added more tests and cleanup
* removed commented out code
* corrected init of SmilesWriteParams
* added MolFileStereoChem.h to the header files
* Renamed Rxn parser to MrvBlockToChemicalReaction
* To make catch2 work, and match the checksum
* Fixed Structchecker errors
* fix CI for DetermineBonds catch test
* error in catch_test for CI
* Allow custom smileWriteParams in GetMolLayers
* misnamed entry point
* ReactionFromMrvString change name
* remove adding writeParams to GetMolLayers
* make rigorous enhanced stereo the default, and fix tests
* only one abs group no longer needs Rigorous Enhanced treatment
* changed string_view to string in catch test
* Canonicalize Enhanced Stereo only returns unique smiles
* Now allows OR and AND groups together
* internal routines inside detail scope
* fix test error
* changed string back to string_view and fixed a CHECK
* Fixes for PR review tests
* Fix RDKit_Book.rst failure on build test
* fix xqm sql test
* updated expected files for cxsmiles_test
* Fixed removal of atom attrs
* Fixed tests after merge of master
* More efficient version of Stereo Groups Canonicalization
* Fixes for ctests
* removed debug code
* readded cipLabel test
* fix generalizedSubstruct/catch_tests.cpp error
* heuristics to improve speed
* Rationalized control of abs groups
* removed unused routine
* added rigorous stereo group treatment to test
* some suggested changes
* Changes per PR review and removed some changes to smiles
* Fixed CI errors
* changes per PR review
* more PR review changes and cleanup
* Fixed PSql PKL change
* changes as per PR review
* Restored error type for bad mols for canonicalizeStereoGroups and added a test
* Merge master and fix test in MolDraw2D
* Fix for randomize test error and other PR review comments
* Removed unused variable to fix mac CI
* do not force aromatization in canonicalizeStereoGroups
* changes as per PR review
---------
Co-authored-by: greg landrum <greg.landrum@gmail.com>
* Fixes #7873
* Resolve MonomerInfo class for deletion
* Add regression test for setMonomerInfo
---------
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
* fixes
* do not leak MolCatalogParams
* do not leak points on align failures
* give python ownership of pointers returned in getFingerprintsHelper
* clean up ScaffoldNetwork ptr if createNetworkHelper fails
* manage FF ptrs during construction
* wire in ownsBondInvGenerator in getMorganGenerator
* manage weights in rdMolAlign CalcRMS
* fix ownership of matches list/tuple in generateRmsdTransMatchPyTuple
* manage stream in createForwardSupplier during construction
* drop redundant Point3D allocations in GetUSRDistributionsFromPoints
* fix signed comparison mismatch