* Issue #2403: Speed up SSSR symmetrization
For my horrible example molecule (a highly symmetric
nanotube with 2400 atoms and > 1000 rings), this speeds up
symmetrizeSSSR() from 5s to about 0.002s. findSSSR() takes
another 0.4s or so.
* Refactor after Ricardo's suggestions
* Greg's review comments. use std::vector
This removes some redundancy from some of the test code in order to bring
the runtime down. This does not affect test coverage and shouldn't do
anything bad to the overall test quality.
* Potential implementation of copying enhanced stereo groups
Copies the enhanced stereo if all atoms in the reactant
end up in the same molecule of the product with valid
ChiralTags.
Current implementation: Only copy StereoGroup if all atoms are "valid" in the product.
Possible implementation: Copy StereoGroup for all atoms that are "valid" in the product.
Details:
Uses ChiralTag invalidation to decide whether StereoGroup should be copied. If
the product atoms have valid ChiralTag, then the reaction was able to
meaningfully propagate chirality from the reactant to the product. This means
that it is also meaningful to propagate the StereoGroup from the reactant to
the product.
The only exception to this is if the product template defines a specific
absolute configuration for an atom. This means that the reaction defines the
stereochemistry for the atom, so the stereochemistry of that atom is no longer
relative.
If an atom from a reactant StereoGroup appears multiple times in the product,
all copies of that atom are put in the same product StereoGroup.
Still developing test cases.
from rdkit import Chem
from rdkit.Chem import AllChem
# Duplicate a molecule example:
mol1 = Chem.MolFromSmiles('Cl[C@@H](Br)C[C@H](Br)CCO |&1:1,4|')
mol2 = Chem.MolFromSmiles('CC(=O)C')
rxn = AllChem.ReactionFromSmarts('[O:1].[C:2]=O>>[O:1][C:2][O:1]')
for prods in rxn.RunReactants([mol1, mol2]):
    for p in prods:
        # Clear the per-atom properties so they are not written to the CXSMILES.
        for a in p.GetAtoms():
            for k in a.GetPropsAsDict():
                a.ClearProp(k)
        print(Chem.MolToCXSmiles(p))
Output:
[21:26:08] product atom-mapping number 1 found multiple times.
CC(C)(OCC[C@@H](Br)C[C@@H](Cl)Br)OCC[C@@H](Br)C[C@@H](Cl)Br |&1:6,9,15,18|
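A rough sketch of the all-or-nothing rule described in this commit, written as a plain-Python model rather than the actual implementation. The names `ProductAtom` and `copy_groups_all_or_nothing`, and the layout of the mapping, are hypothetical and exist only for illustration. The requirement that all atoms end up in the same product molecule is left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ProductAtom:
    """Hypothetical record for one product copy of a reactant atom."""
    idx: int                        # atom index in the product
    valid_chiral_tag: bool          # chirality survived the reaction
    template_defined: bool = False  # product template set an absolute configuration


def copy_groups_all_or_nothing(
    groups: List[Tuple[str, List[int]]],
    mapping: Dict[int, List[ProductAtom]],
) -> List[Tuple[str, List[int]]]:
    """Copy a group only if every one of its atoms is still "valid" in the product."""
    copied = []
    for group_type, reactant_atoms in groups:
        product_atoms: List[int] = []
        keep = True
        for r_idx in reactant_atoms:
            copies = mapping.get(r_idx, [])
            # One missing, invalidated, or template-defined atom drops the whole group.
            if not copies or any(
                (not c.valid_chiral_tag) or c.template_defined for c in copies
            ):
                keep = False
                break
            # Every copy of a duplicated atom goes into the same product group.
            product_atoms.extend(c.idx for c in copies)
        if keep:
            copied.append((group_type, sorted(product_atoms)))
    return copied


# Mirrors the duplication example: reactant atoms 1 and 4 each appear twice.
duplicated = {
    1: [ProductAtom(9, True), ProductAtom(18, True)],
    4: [ProductAtom(6, True), ProductAtom(15, True)],
}
print(copy_groups_all_or_nothing([("and", [1, 4])], duplicated))
# -> [('and', [6, 9, 15, 18])]
```

In this model a single atom whose ChiralTag was invalidated is enough to drop the entire group, which is the behaviour the "Possible implementation" above would relax.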
* Issue 2366: Documentation and fix stereo group invalidation
Adds some documentation to EnhancedStereo.md
Also invalidates StereoGroup if a reaction specifies the
stereochemistry of a center. This destroys the relative
relationship of the center to other centers.
* Demo python file examples for Enhanced Stereochemistry in reactions
This is not intended to be pushed. These probably will become test
cases. The output looks like this:
0a. Reaction preserves stereo:
[C@:1]>>[C@:1]
F[C@H](Cl)Br |o1:1|
>>
F[C@H](Cl)Br |o1:1|
0b. Reaction preserves stereo:
[C@:1]>>[C@:1]
F[C@@H](Cl)Br |&1:1|
>>
F[C@@H](Cl)Br |&1:1|
0c. Reaction preserves stereo:
[C@:1]>>[C@:1]
FC(Cl)Br
>>
FC(Cl)Br
1a. Reaction ignores stereo:
[C:1]>>[C:1]
F[C@H](Cl)Br |a:1|
>>
F[C@H](Cl)Br |a:1|
1b. Reaction ignores stereo:
[C:1]>>[C:1]
F[C@@H](Cl)Br |&1:1|
>>
F[C@@H](Cl)Br |&1:1|
1c. Reaction ignores stereo:
[C:1]>>[C:1]
FC(Cl)Br
>>
FC(Cl)Br
2a. Reaction inverts stereo:
[C@:1]>>[C@@:1]
F[C@H](Cl)Br |o1:1|
>>
F[C@@H](Cl)Br |o1:1|
2b. Reaction inverts stereo:
[C@:1]>>[C@@:1]
F[C@@H](Cl)Br |&1:1|
>>
F[C@H](Cl)Br |&1:1|
2c. Reaction inverts stereo:
[C@:1]>>[C@@:1]
FC(Cl)Br
>>
FC(Cl)Br
3a. Reaction destroys stereo:
[C@:1]>>[C:1]
F[C@H](Cl)Br |o1:1|
>>
FC(Cl)Br
3b. Reaction destroys stereo:
[C@:1]>>[C:1]
F[C@@H](Cl)Br |&1:1|
>>
FC(Cl)Br
3c. Reaction destroys stereo:
[C@:1]>>[C:1]
FC(Cl)Br
>>
FC(Cl)Br
3d. Reaction destroys stereo (but preserves unaffected group):
[C@:1]F>>[C:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:1,&2:3|
>>
FC(Cl)[C@@H](Cl)Br |&1:3|
3e. Reaction destroys stereo:
[C@:1]F>>[C:1]F
F[C@H](Cl)[C@@H](Cl)Br |&1:1,3|
>>
FC(Cl)[C@@H](Cl)Br
4a. Reaction creates stereo:
[C:1]>>[C@@:1]
F[C@H](Cl)Br |o1:1|
>>
F[C@@H](Cl)Br
4b. Reaction creates stereo:
[C:1]>>[C@@:1]
F[C@@H](Cl)Br |&1:1|
>>
F[C@@H](Cl)Br
4c. Reaction creates stereo:
[C:1]>>[C@@:1]
FC(Cl)Br
>>
F[C@@H](Cl)Br
4d. Reaction creates stereo (preserve unaffected group):
[C:1]F>>[C@@:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:1,&2:3|
>>
F[C@@H](Cl)[C@@H](Cl)Br |&1:3|
4e. Reaction creates stereo:
[C:1]F>>[C@@:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:1,3|
>>
F[C@@H](Cl)[C@@H](Cl)Br
5a. Reaction preserves unrelated stereo:
[C@:1]F>>[C@:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
>>
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
5b. Reaction ignores unrelated stereo:
[C:1]F>>[C:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
>>
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
5c. Reaction inverts unrelated stereo:
[C@:1]F>>[C@@:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
>>
F[C@@H](Cl)[C@@H](Cl)Br |o1:3|
5d. Reaction destroys unrelated stereo:
[C@:1]F>>[C:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
>>
FC(Cl)[C@@H](Cl)Br |o1:3|
5e. Reaction creates unrelated stereo:
[C:1]F>>[C@@:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
>>
F[C@@H](Cl)[C@@H](Cl)Br |o1:3|
6e. Reaction splits StereoGroup atoms into two Mols:
[C:1]OO[C:2]>>[C:2]O.O[C:1]
F[C@H](Cl)OO[C@@H](Cl)Br |o1:1,5|
>>
O[C@@H](Cl)Br + O[C@H](F)Cl
>>
O[C@H](F)Cl + O[C@@H](Cl)Br
7. Add two copies:
[O:1].[C:2]=O>>[O:1][C:2][O:1]
Cl[C@@H](Br)C[C@H](Br)CCO |&1:1,4| + CC(=O)C
[17:15:38] product atom-mapping number 1 found multiple times.
>>
CC(C)(OCC[C@@H](Br)C[C@@H](Cl)Br)OCC[C@@H](Br)C[C@@H](Cl)Br |&1:6,9,15,18|
8. Add two copies:
[O:1].[C:2]=O>>[O:1][C:2][O:1]
Cl[C@@H](Br)C[C@H](Br)CCO |&1:1,4| + CC(=O)C
[17:15:38] product atom-mapping number 1 found multiple times.
>>
CC(C)(OCC[C@@H](Br)C[C@@H](Cl)Br)OCC[C@@H](Br)C[C@@H](Cl)Br |&1:6,9,15,18|
* Updates StereoGroup strategy in reactions to copy all possible atoms.
Copy all atoms for which the stereochemistry was not created or destroyed
in the reaction. Any StereoGroup which has at least one atom will appear
in the product.
Also updates the documentation to match this description, and adds C++
and Python tests which fail before this PR and pass after. The Python
tests are more extensive.
Test output was validated by hand (especially the stereo groups
generated). I'm less confident in the reaction processing in my head,
but I trusted the existing validation there.
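A minimal plain-Python model of the per-atom strategy described in this commit, not the real code. The function name `copy_surviving_atoms` and the `survivors` mapping (reactant atom index to product atom indices, containing only atoms whose stereochemistry was neither created nor destroyed) are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def copy_surviving_atoms(
    groups: List[Tuple[str, List[int]]],
    survivors: Dict[int, List[int]],
) -> List[Tuple[str, List[int]]]:
    """Keep each surviving atom; keep a group if at least one atom survives."""
    copied = []
    for group_type, reactant_atoms in groups:
        kept = sorted(
            p_idx
            for r_idx in reactant_atoms
            for p_idx in survivors.get(r_idx, [])
        )
        if kept:  # groups left with no atoms are dropped
            copied.append((group_type, kept))
    return copied


# Group with atoms 1 and 3; the reaction destroys stereo on atom 1 only.
print(copy_surviving_atoms([("and", [1, 3])], {3: [3]}))
# -> [('and', [3])]
```

Compared with the earlier all-or-nothing rule, a group that loses one atom shrinks here instead of disappearing.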
For future diagnosis: Python unittest failures will look like:
AssertionError: 'F[C@H](Cl)Br' != 'F[C@H](Cl)Br |&1:1|'
- F[C@H](Cl)Br
+ F[C@H](Cl)Br |&1:1|
? +++++++
For future diagnosis: C++ Catch2 failures will look like:
CHECK( MolToCXSmiles(*p) == "F[C@H](Cl)Br |o1:1|" )
with expansion:
"FC(Cl)[C@@H](Cl)Br |&1:3|"
==
"F[C@H](Cl)Br |o1:1|"
* Add a couple of new tests.
* rename "relative" to "enhanced"
some reformatting
* Factor out test helper function.
* Actually, enhanced stereo groups are exposed to Python
* Added discussion of enhanced stereochemistry in reactions to docs
* Fix new test
* Fixes #2311
at least I hope it does
* Stop using deprecated boost functionality
* allow the Murtagh module to import even if the code isn't built
update the associated tests
* update release notes
* typo
* fix integer division
* Fixes #2383 (tests coming in the next commit)
Minor typo fix
Fixes a "bug" in one of the default transforms
* Adds support for directly providing normalization parameter data
instead of requiring the use of a text file.
* allow fragment removers to be initialized with string data
* remove unicode
* allow the reionizer to be initialized from a stream
* modify the uncharger to use a canonical atom ordering
* add doCanonical cleanup parameter
make canonical ordering the default
document the change
* Add neutralization of additional negative groups (not just acids).
This may not be the right thing to do.
* expose the new parameter to python
* changes in response to review
* Move DetectAtomStereoChemistry to Molops::assignChiralTypesFromBondDirs
DetectAtomStereoChemistry in MolFileStereochem is more broadly
useful. Additionally, it was not named very clearly for what
it was actually doing.
* Wraps assignChiralTypesFromBondDirs for use in Python
Makes assignChiralTypesFromBondDirs available in Python
and adds a test demonstrating that availability.
* Correct typo in thiophene_E pattern: !H0,!H1 is "always true"; it should be !H0!H1
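Why the comma form is always true: in SMARTS, `,` is OR and adjacent primitives are implicitly ANDed, so `!H0,!H1` reads "not zero hydrogens OR not one hydrogen", which every atom satisfies. A small plain-Python sketch of the two conditions (the helper names are made up for illustration; no SMARTS parsing is involved):

```python
def matches_comma_form(h_count: int) -> bool:
    """Logic of '!H0,!H1': (not H0) OR (not H1)."""
    return (h_count != 0) or (h_count != 1)


def matches_and_form(h_count: int) -> bool:
    """Logic of '!H0!H1': (not H0) AND (not H1)."""
    return (h_count != 0) and (h_count != 1)


for h in range(4):
    print(h, matches_comma_form(h), matches_and_form(h))
# 0 True False
# 1 True False
# 2 True True
# 3 True True
```

Only the AND form restricts the match to atoms carrying two or more hydrogens.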
* Errors in ring closure translation from original SLN.
* Make queries agnostic to aromaticity model.
* Redundant recursive SMARTS
* More queries that benefit from optional aromaticity.
* Update the (.in) files from previous commit.
* Update thiaz_ene_A in line with CSV file.
* Move RDBoostStreams to RDStreams
* RDBoostStreams->RDStreams
* RDBoostStreams->RDStreams
* Wrap SWIG (with Java test)
* Fix missing declaration
* Use the file that already exists
* Revert to original version
* Revert to CXSMILES version
* Update boost version
* Remove redundant code
* Add zlib
* check for win32
* FileParsers now builds static on windows
* added a set of test files for SGroups.
Many thanks to Gerd Blanke for providing these
* Partial version of the wrapper
Definitely needs more work
* add some properties
* basic SGroup property change test
* not working; backup commit
* disable writing for now
* add ClearMolSGroups() function
* review response: add a couple missing methods
* remove spaces from filenames
* update filename in test
* changes in response to review
* add operator== to SGroups
* solve lifetime problems with a vector_indexing_suite
* add SKIP_IF_ALL_MATCH argument to FragmentRemover
Refactor FragmentRemover::remove() to make it more efficient
* implement and test SKIP_IF_ALL_MATCH
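A toy illustration of what a skip-if-all-match option could mean, assuming (from the option name alone) that the input is left untouched when every fragment would be removed. The function `remove_fragments`, its arguments, and the use of SMILES strings as fragments are hypothetical and do not reflect the real FragmentRemover interface.

```python
from typing import Callable, List


def remove_fragments(
    fragments: List[str],
    matches: Callable[[str], bool],
    skip_if_all_match: bool = False,
) -> List[str]:
    """Drop matching fragments, optionally refusing to drop all of them."""
    kept = [frag for frag in fragments if not matches(frag)]
    if skip_if_all_match and not kept:
        # Every fragment matched: return the input unchanged instead of nothing.
        return list(fragments)
    return kept


def is_salt(frag: str) -> bool:
    """Stand-in matcher; a real one would use substructure patterns."""
    return frag in {"[Na+]", "[Cl-]"}


print(remove_fragments(["CCO", "[Na+]"], is_salt))
# -> ['CCO']
print(remove_fragments(["[Na+]", "[Cl-]"], is_salt, skip_if_all_match=True))
# -> ['[Na+]', '[Cl-]']
```

Without the flag, the second call would return an empty list, which is the case such an option would be meant to guard against.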
* expose the extra option to Python
* add info to logger
* change to make the SWIG builds work on windows
* add the wrapper. Still needs tests
* first rgd java wrapper test, does not pass
* get static builds working on windows
* unused vars in bison parser cleanup
* initialization order in TopologicalTorsionGenerator
* unused params in SLN bison
* sln flex unused params
* throwing destructor in TDTWriter
* signed comparison in substructmethods
* unused input param in smiles/smarts bison
* unused ms param in sln bison
* signed comparison in FingerprintGenerator
* store return of fscanf in StructCheckerOptions
* unreferenced var in catch
* uninitialized value in FileParserUtils
* avoid override overload warning in MolDraw2DSVG
* non-final overrides in Validate.h
* unused static var in Avalon
* unused vars in catch blocks
* make AvalonTools avalonSimilarityBits & avalonSSSBits const int
* assert fscanf result in StructCheckerOptions
* Allow Atoms to be copied in Python.
The dunder copy method is the idiomatic way to support
making copies in Python. Includes a test to make sure that
copied atoms are usable.
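For readers unfamiliar with the protocol: `copy.copy()` calls an object's `__copy__` method when one is defined. The sketch below uses a made-up `ToyAtom` class as a stand-in, not the wrapped Atom class, to show the shape of the idiom.

```python
import copy


class ToyAtom:
    """Hypothetical stand-in for a wrapped atom; not the real class."""

    def __init__(self, symbol: str, props: dict = None):
        self.symbol = symbol
        self.props = dict(props or {})

    def __copy__(self) -> "ToyAtom":
        # Plays the role of a copy constructor: build an independent object.
        return ToyAtom(self.symbol, self.props)


original = ToyAtom("C", {"note": "original"})
duplicate = copy.copy(original)  # dispatches to ToyAtom.__copy__
duplicate.props["note"] = "copy"

print(original.props["note"], duplicate.props["note"])
# -> original copy
```

The check that the copy can be modified without touching the original is the same kind of "copied atoms are usable" test the commit message mentions.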
* Use RWMol in Code/GraphMol/Wrap/rough_test.py
Co-Authored-By: d-b-w <dan.nealschneider@schrodinger.com>
* Allow access to an atom's copy constructor in Python
* Add writing of enhanced stereo to cxsmiles
* changes in response to review
* fix an interaction between the fragment and enhanced smiles bits
* fix a logic error
* remove all of the "from __future__" imports
* remove the first batch of rdkit.six imports/uses
* next step of rdkit.six removal
* removing xrange, range, and some maps
* next round of removals
* next round of cleanups
* fix inchi test
* last bits of "from rdkit.six" are gone
* and the last of the six stuff is gone
* strange importlib problem