* Parsing SCSR
* add scsrol to mol
* removed bad include file
* loosen distGeom test slightly
* add wrap test for SCSRMol
* Add test for scsr in python
* tests added for scsr and strict parsing removed
* remove extra stuff
* More fully specified use of SCSRMol for PR CI build
* Added flags for SCSR expansion to not include any leaving groups
* Added MolFromScsrParams to Wrap for python
* added SCSRMol destructor
* Added two tests for RNA macromols, and fixed a bug they revealed
* Added new tests and expected files
* changes as per PR review
* SCSR changes for leaving groups
* fixed testScsr.py
* hydrogen bond treatment
* in SCSR expand, allow Hbond to be automatically detected
* changes as per code review
* Adding new test file
* changes for SCSR constructors, destructors for CI build
* fixed Python for SCSR hydrogen bond modes, and added tests
* Added new test files
* fixed edge case for SCSR
* fix checksum for inchi
* consistent capitalization of SCSR throughout
* switch to enum class
* make things shorter
* simplify
* get rid of the ATTCHORD class
* New section for SCSR in RDKit_book
* added section to RDKit_Book
* SCSRMol is no longer exposed in Python
* fix leak in MolFromSCSRFile()
light refactoring
* expose MolFromSCSRFile() to python
make the MolFromSCSR functions work with default args
a bit more testing
* removed C++ access to SCSRMol
* CXSmiles now outputs hbonds, fix to template matching, and a few other things
* Addl fix for bad aromaticity in Hbond rings
* Test files needed
* Test files needed
* try to fix CI build errors
* CI error fix
* Added missing test file
* CMake version - for CI build
* remove full file comparison from macromol test file
* accidental change to debug restored to release
* Code review changes
* As per PR review
---------
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
An internal user encountered the existing error message and became alarmed by the implication that no aromaticity detection was even attempted.
I think that the new error message more accurately reflects the logic - we halt the search eventually if the ring system is too large.
Possible ring combinations expand combinatorially, so for large ring systems, just checking for ring fusions may be slow. For a reported example of 900 rings, the calculation never completes. A system of 300 rings requires about 2 million iterations, which takes approximately 1 minute.
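
A minimal Python sketch of the idea described above: enumerate connected combinations of fused rings under an iteration budget, so the search halts on a very large ring system instead of running indefinitely. The function name, the adjacency representation, and the budget value are illustrative assumptions, not RDKit's actual implementation or its real limit.

```python
def count_connected_ring_subsets(adjacency, max_iterations):
    """Enumerate connected subsets of fused rings, stopping at a budget.

    adjacency: dict mapping ring index -> list of ring indices fused to it
               (hypothetical representation for this sketch).
    Returns (number_of_subsets_visited, completed).
    """
    seen = set()
    stack = [frozenset([ring]) for ring in adjacency]
    iterations = 0
    while stack:
        subset = stack.pop()
        if subset in seen:
            continue
        if iterations >= max_iterations:
            # Ring system too large for the budget: give up early.
            return len(seen), False
        iterations += 1
        seen.add(subset)
        # Grow the subset by one fused neighbour at a time.
        for ring in subset:
            for nbr in adjacency[ring]:
                if nbr not in subset:
                    stack.append(subset | {nbr})
    return len(seen), True


# Linear chain of four fused rings: 0-1-2-3
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
full = count_connected_ring_subsets(chain, max_iterations=1000)
capped = count_connected_ring_subsets(chain, max_iterations=5)
```

Here `full` finishes the enumeration, while `capped` reports that the search was stopped by the budget. That is the situation the reworded error message is meant to describe.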
* fixes #4721
* - store in RingInfo the index of the ring(s) each atom and bond belongs to rather than just their size
- expand the RingInfo API with a few useful methods
- identify rings that are certainly aliphatic upfront
- avoid unnecessary copying atomRings when RingInfo is already initialized
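
A rough Python sketch of the data-structure change described above: store, for each atom, the indices of the rings it belongs to, and derive ring sizes from those indices on demand. The helper names below are hypothetical and only mirror the idea; they are not the RingInfo API.

```python
def build_atom_members(atom_rings, n_atoms):
    """For each atom, list the indices of the rings that contain it."""
    members = [[] for _ in range(n_atoms)]
    for ring_idx, ring in enumerate(atom_rings):
        for atom_idx in ring:
            members[atom_idx].append(ring_idx)
    return members


def atom_ring_sizes(atom_rings, members, atom_idx):
    """Ring sizes are still recoverable from the stored ring indices."""
    return [len(atom_rings[ring_idx]) for ring_idx in members[atom_idx]]


# Two fused six-membered rings sharing atoms 4 and 5 (naphthalene-like).
atom_rings = [(0, 1, 2, 3, 4, 5), (4, 5, 6, 7, 8, 9)]
members = build_atom_members(atom_rings, n_atoms=10)
shared_atom_rings = members[4]
shared_atom_sizes = atom_ring_sizes(atom_rings, members, 4)
```

Storing ring indices is strictly more informative than storing sizes: the sizes follow from the indices, but not the other way around.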
* - code modernization and cleanup
- better handling of dummies in aromatic rings
- exposed atomMembers() and bondMembers()
- added several tests
* - avoid order dependency on rings
- added test for the above
* changes in response to review
Co-authored-by: Tosco, Paolo <paolo.tosco@novartis.com>
* remove trailing spaces
* 3256: Envelope aromaticity not detected in complex fused system
Removes stopping point in aromaticity detection when all atoms
are "done". This also markedly improves the performance of
aromaticity detection for very large molecules - for example,
aromatization of 3EOH from the PDB was dominated by done atom
checking before this commit.
Some aromatic bonds were missed before this commit in complex fused
systems. This happened if all atoms in the fused system were also
in some smaller aromatic ring and there was at least one fused edge
that was single in the kekule form.
Some example molecules for which envelope aromaticity failed
before this commit:
c1cc2n(c1)c1cccn1c1cccn21
-> became c1cc2n(c1)-c1cccn1-c1cccn1-2 before this commit
c1cc2c3cc[nH]c3n3cccc3n2c1
-> became c1cc2n(c1)-c1cccn1-c1[nH]ccc1-2 before this commit
c1cc2c3cc[nH]c3c3cc[nH]c3n2c1
-> became c1cc2n(c1)-c1[nH]ccc1-c1[nH]ccc1-2 before this commit
Here's a similar example that didn't fail even before this
commit. The central ring only shares double bonds with the
exterior rings.
* c1cc2c([nH]1)c1cc[nH]c1c1cc[nH]c21
Requires updates to some MQN descriptors tests because some
bonds become aromatic (MQN includes counts of single and
double bonds of kekule form).
FWIW, for the molecule that had a change in counts, the counts
were incorrect both before and after this commit, because
MQN uses an approximation (dividing aromatic bonds evenly
between single and double bonds) to avoid kekulization.
This approximation is invalid when there are oodles of
nitrogen lone pairs participating in the aromatic
bonds.
(the failing line was 2558 in aromat_regress.txt: Cc1cc2n(n1)c1cc(C)nn1c1c(C=O)c(C)nn21)
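
To illustrate why the even-split approximation mentioned above breaks down, here is a small Python sketch. The function is a hypothetical stand-in for the approximation as described in this message, not the MQN code itself. Benzene has six aromatic bonds and a Kekulé form with three single and three double bonds, so the split is exact. Pyrrole has five aromatic ring bonds, but its Kekulé form has three single and two double bonds because the nitrogen lone pair contributes to the aromatic system.

```python
def even_split(n_aromatic_bonds):
    """Approximate (single, double) counts by halving the aromatic bonds."""
    half = n_aromatic_bonds / 2
    return half, half


# Kekulé-form (single, double) ring-bond counts, stated by hand.
benzene_kekule = (3, 2 + 1)   # C1=CC=CC=C1: 3 single, 3 double
pyrrole_kekule = (3, 2)       # C1=CC=CN1:   3 single, 2 double

benzene_approx = even_split(6)
pyrrole_approx = even_split(5)

benzene_exact = benzene_approx == benzene_kekule
pyrrole_exact = pyrrole_approx == pyrrole_kekule
```

The more lone-pair-donating nitrogens take part in the aromatic system, the further the halved counts drift from the Kekulé counts.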
* Detect envelope aromaticity in fused systems
In #3253, we proposed removing doneAtoms for performance, and it was
noted that it also fixed detection of envelope aromaticity in some
fused systems. However, when I completely removed doneAtoms, I saw
hangs in sanitization of things like nanotubes. Using doneBonds
allows envelope aromaticity, while preserving a reasonable brake
on runaway work for crazy molecules.
The performance issue was addressed by caching the ring bond
count.
Here are some sanitize timings on proteins from the RCSB PDB:
Before this commit:
* 3eoh 1.21s
* 2j3n 0.77s
* 1nks 0.053s
Afterwards:
* 3eoh 0.42s
* 2j3n 0.15s
* 1nks 0.046s
* Use boost::dynamic_bitset instead of unordered_set
to count ring bonds.
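
A Python analogue of the bitset approach, as a sketch only: one flag per bond index replaces a hash set when counting distinct ring bonds. A `bytearray` stands in for `boost::dynamic_bitset` here, and the function name is hypothetical.

```python
def count_ring_bonds(bond_rings, n_bonds):
    """Count distinct bonds that appear in at least one ring."""
    in_ring = bytearray(n_bonds)  # one flag per bond index, all zero
    for ring in bond_rings:
        for bond_idx in ring:
            in_ring[bond_idx] = 1
    return sum(in_ring)


# Two fused six-membered rings sharing bond 5.
bond_rings = [(0, 1, 2, 3, 4, 5), (5, 6, 7, 8, 9, 10)]
n_ring_bonds = count_ring_bonds(bond_rings, n_bonds=11)
```

Because bond indices are dense small integers, a flat flag array avoids hashing entirely, and the resulting count can be computed once and cached.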
* run clang-tidy with readability-braces-around-statements
clang-format the results
clean up all the parts that clang-tidy-8 broke
* fix problem on windows
* Fix for issue #1730
setAromaticity() now works even if aromatic atoms are already present; the relevant test case is added
* Removed setaromaticity flag
* Removes ATOM/BOND_SPTR in boost::graph in favor of raw pointers
* Actually delete atoms and bonds...
* RWMol::clear now calls destroy to handle atom/bond deletion
* Changes broken Atom lookup for windows/gcc
* Adds tests for running with valgrind
* Adds test designed for valgrind and molecule deletions
* Removes RNG, actually tests bond deletions
* update swig wrappers
* deal with most recent changes on the main branch
* backup
* Add a couple more tests and an exclusion for triple bonds.
* expose the MDL aromaticity model to python and test it.
* exocyclic bonds should not “steal” electrons in the MDL model
* backup (partial) update for aromaticity model documentation
* add examples to testGithub1622 for aromatic and nonaromatic compounds
* updates to aromaticity model and docs based on additional information from @bannanc
* some additional examples from @bannanc
* add rule to allow exocyclic multiple bonds to disqualify an atom.
* minor doc update
* address some review comments
* add a SmilesParserParams object to prepare for this
* add a SmilesParserParams object to prepare for this
* add tests for the SmilesParserParams
* support name parsing, should it be the default?
* rename CXNSmiles to CXSmiles;
add a spirit parser for CXSmiles coordinates that is at least syntactically correct
* abandon boost::spirit for now; crude atom token parser
* support params in smiles parser (not tested, may not build)
* can read coords and atom labels along with mol names; crude, but works
* read coordinate bonds
* remove some compiler warnings with VS2015
* remove a bunch of compiler warnings on windows
* remove more warnings on windows
* remove more warnings on windows
* backup commit: first pass at parsing query features
* radical spec parsing
* handle attachment points using atom mapping
* switch to a named property for atom labels
* fix handling of the "A" atom query
* add functions to construct A and Q queries (needs more work)
* fix a problem created while cleaning up warnings earlier
* add some additional convenience functions for making generic atoms.
Still need M and to recognize these while writing CXSMILES
* add M queries; update some tests
* fix a linux compile problem
* get the cxsmiles stuff working in python; basic testing
* support "M" in CXSMILES
* first pass
* Fixes #623
* fix a merge problem
* move the aromaticity perception to a helper fn
* python doc update
* replace setSimpleAromaticity() with a parameter to setAromaticity()
* add simple test for the custom aromaticity function
o rdkit gains a RDKit::common_properties namespace that contains common string value properties
o Dict.h and below gain getPropIfPresent that attempts to retrieve a property and returns
true/false on success or failure. This is used to optimize access.
o rdkit learns how to pass property keys by reference, not value.
A new namespace has been added to RDKit, common_properties
that contains the std::string values for commonly used
properties. This helps to avoid typos in string values
but also avoids the creation of std::strings from character
values. All accessors (has/get/clear and getPropIfPresent) now pass
the key by reference.
Additionally, getPropIfPresent removes the double lookup
of hasProp/getProp which can be a significant speedup
in the smiles and smarts parsers (10-20%)
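
The double-lookup point can be sketched in Python as follows. This is an analogue only: the class and method names are hypothetical and simply mimic the has/get versus get-if-present pattern described above, with a counter to make the number of dictionary lookups visible.

```python
_MISSING = object()  # sentinel distinguishing "absent" from a stored None


class PropDict:
    """Toy property container that counts key lookups."""

    def __init__(self):
        self._data = {}
        self.lookups = 0

    def set_prop(self, key, value):
        self._data[key] = value

    def has_prop(self, key):
        self.lookups += 1
        return key in self._data

    def get_prop(self, key):
        self.lookups += 1
        return self._data[key]

    def get_prop_if_present(self, key):
        """Single lookup: returns (found, value)."""
        self.lookups += 1
        value = self._data.get(key, _MISSING)
        if value is _MISSING:
            return False, None
        return True, value


props = PropDict()
props.set_prop("_Name", "ethanol")

# Old pattern: two lookups for one value.
if props.has_prop("_Name"):
    name_old = props.get_prop("_Name")
lookups_old = props.lookups

# New pattern: one lookup.
props.lookups = 0
found, name_new = props.get_prop_if_present("_Name")
lookups_new = props.lookups
```

In a parser that reads properties on every atom and bond, halving the lookups on this hot path is where the reported speedup would come from.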