* Potential implementation of copying enhanced stereo groups
Copies the enhanced stereo if all atoms in the reactant
end up in the same molecule of the product with valid
ChiralTags.
Current implementation: Only copy StereoGroup if all atoms are "valid" in the product.
Possible implementation: Copy StereoGroup for all atoms that are "valid" in the product.
Details:
Uses ChiralTag invalidation to decide whether StereoGroup should be copied. If
the product atoms have valid ChiralTag, then the reaction was able to
meaningfully propogate chirality from the reactant to the product. This means
that it is also meaningful to propogate the StereoGroup from the reactant to
the product.
The only exception to this is if the product template defines a specific
absolute configuration for an atom. This means that the reaction defines the
stereochemistry for the atom, so the stereochemistry of that atom is no longer
relative.
If an atom from a reactant StereoGroup appears multiple times in the product,
all copies of that atom are put in the same product StereoGroup.
Still developing test cases.
from rdkit import Chem
from rdkit.Chem import AllChem
# Duplicate a molecule example:
mol1 = Chem.MolFromSmiles('Cl[C@@H](Br)C[C@H](Br)CCO |&1:1,4|')
mol2 = Chem.MolFromSmiles('CC(=O)C')
rxn = AllChem.ReactionFromSmarts('[O:1].[C:2]=O>>[O:1][C:2][O:1]')
for prods in rxn.RunReactants([mol1, mol2]):
for p in prods:
for a in p.GetAtoms():
for k in a.GetPropsAsDict():
a.ClearProp(k)
print(Chem.MolToCXSmiles(p))
Output:
[21:26:08] product atom-mapping number 1 found multiple times.
CC(C)(OCC[C@@H](Br)C[C@@H](Cl)Br)OCC[C@@H](Br)C[C@@H](Cl)Br |&1:6,9,15,18
* Issue 2366: Documentation and fix stereo group invalidation
Adds some documentation to EnhancedStereo.md
Also invalidates StereoGroup if a reaction specifies the
stereochemistry of a center. This destroys the relative
relationship of the center to other centers.
* Demo python file examples for Enhanced Stereochemistry in reactions
This is not intended to be pushed. These probably will become test
cases. For the output looks like this:
0a. Reaction preserves stereo:
[C@:1]>>[C@:1]
F[C@H](Cl)Br |o1:1|
>>
F[C@H](Cl)Br |o1:1|
0b. Reaction preserves stereo:
[C@:1]>>[C@:1]
F[C@@H](Cl)Br |&1:1|
>>
F[C@@H](Cl)Br |&1:1|
0c. Reaction preserves stereo:
[C@:1]>>[C@:1]
FC(Cl)Br
>>
FC(Cl)Br
1a. Reaction ignores stereo:
[C:1]>>[C:1]
F[C@H](Cl)Br |a:1|
>>
F[C@H](Cl)Br |a:1|
1b. Reaction ignores stereo:
[C:1]>>[C:1]
F[C@@H](Cl)Br |&1:1|
>>
F[C@@H](Cl)Br |&1:1|
1c. Reaction ignores stereo:
[C:1]>>[C:1]
FC(Cl)Br
>>
FC(Cl)Br
2a. Reaction inverts stereo:
[C@:1]>>[C@@:1]
F[C@H](Cl)Br |o1:1|
>>
F[C@@H](Cl)Br |o1:1|
2b. Reaction inverts stereo:
[C@:1]>>[C@@:1]
F[C@@H](Cl)Br |&1:1|
>>
F[C@H](Cl)Br |&1:1|
2c. Reaction inverts stereo:
[C@:1]>>[C@@:1]
FC(Cl)Br
>>
FC(Cl)Br
3a. Reaction destroys stereo:
[C@:1]>>[C:1]
F[C@H](Cl)Br |o1:1|
>>
FC(Cl)Br
3b. Reaction destroys stereo:
[C@:1]>>[C:1]
F[C@@H](Cl)Br |&1:1|
>>
FC(Cl)Br
3c. Reaction destroys stereo:
[C@:1]>>[C:1]
FC(Cl)Br
>>
FC(Cl)Br
3d. Reaction destroys stereo (but preserves unaffected group):
[C@:1]F>>[C:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:1,&2:3|
>>
FC(Cl)[C@@H](Cl)Br |&1:3|
3e. Reaction destroys stereo:
[C@:1]F>>[C:1]F
F[C@H](Cl)[C@@H](Cl)Br |&1:1,3|
>>
FC(Cl)[C@@H](Cl)Br
4a. Reaction creates stereo:
[C:1]>>[C@@:1]
F[C@H](Cl)Br |o1:1|
>>
F[C@@H](Cl)Br
4b. Reaction creates stereo:
[C:1]>>[C@@:1]
F[C@@H](Cl)Br |&1:1|
>>
F[C@@H](Cl)Br
4c. Reaction creates stereo:
[C:1]>>[C@@:1]
FC(Cl)Br
>>
F[C@@H](Cl)Br
4d. Reaction creates stereo (preserve unaffected group):
[C:1]F>>[C@@:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:1,&2:3|
>>
F[C@@H](Cl)[C@@H](Cl)Br |&1:3|
4e. Reaction creates stereo:
[C:1]F>>[C@@:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:1,3|
>>
F[C@@H](Cl)[C@@H](Cl)Br
5a. Reaction preserves unrelated stereo:
[C@:1]F>>[C@:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
>>
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
5b. Reaction ignores unrelated stereo:
[C:1]F>>[C:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
>>
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
5c. Reaction inverts unrelated stereo:
[C@:1]F>>[C@@:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
>>
F[C@@H](Cl)[C@@H](Cl)Br |o1:3|
5d. Reaction destroys unrelated stereo:
[C@:1]F>>[C:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
>>
FC(Cl)[C@@H](Cl)Br |o1:3|
5e. Reaction creates unrelated stereo:
[C:1]F>>[C@@:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
>>
F[C@@H](Cl)[C@@H](Cl)Br |o1:3|
6e. Reaction splits StereoGroup atoms into two Mols:
[C:1]OO[C:2]>>[C:2]O.O[C:1]
F[C@H](Cl)OO[C@@H](Cl)Br |o1:1,5|
>>
O[C@@H](Cl)Br + O[C@H](F)Cl
>>
O[C@H](F)Cl + O[C@@H](Cl)Br
7. Add two copies:
[O:1].[C:2]=O>>[O:1][C:2][O:1]
Cl[C@@H](Br)C[C@H](Br)CCO |&1:1,4| + CC(=O)C
[17:15:38] product atom-mapping number 1 found multiple times.
>>
CC(C)(OCC[C@@H](Br)C[C@@H](Cl)Br)OCC[C@@H](Br)C[C@@H](Cl)Br |&1:6,9,15,18|
8. Add two copies:
[O:1].[C:2]=O>>[O:1][C:2][O:1]
Cl[C@@H](Br)C[C@H](Br)CCO |&1:1,4| + CC(=O)C
[17:15:38] product atom-mapping number 1 found multiple times.
>>
CC(C)(OCC[C@@H](Br)C[C@@H](Cl)Br)OCC[C@@H](Br)C[C@@H](Cl)Br |&1:6,9,15,18|
* Updates StereoGroup strategy in reactions to copy all possible atoms.
Copy all atoms for which the stereochemistry was not created or destroyed
in the reaction. Any StereoGroup which has at least one atom will appear
in the product.
Also updates the documentation to match this description, and adds C++
and Python tests which fail before this PR and pass after. The Python
tests are more extensive.
Test output was validated by hand (especially the stereo groups
generated. I'm less confident in the reaction processing in my head,
but I truested the existing validation there.)
For future diagnosis: Python unittest failures will look like:
AssertionError: 'F[C@H](Cl)Br' != 'F[C@H](Cl)Br |&1:1|'
- F[C@H](Cl)Br
+ F[C@H](Cl)Br |&1:1|
? +++++++
For future diagnosis: C++ Catch2 failures will look like:
CHECK( MolToCXSmiles(*p) == "F[C@H](Cl)Br |o1:1|" )
with expansion:
"FC(Cl)[C@@H](Cl)Br |&1:3|"
==
"F[C@H](Cl)Br |o1:1|"
* Add a couple of new tests.
* rename "relative" to "enhanced"
some reformatting
* Factor out test helper function.
* Actually, enhanced stereo groups are exposed ot Python
* Added discussion of enhanced stereochemistry in reactions to docs
* Fix new test
2.8 KiB
Enhanced Stereochemistry in the RDKit
Greg Landrum (greg.landrum@t5informatics.com)
September 2018
This is still a DRAFT
Overview
We are going to follow, at least for the initial implementation, the enhanced stereo representation used in V3k mol files: groups of atoms with specified stereochemistry with an ABS, AND, or OR marker indicating what is known. The general idea is that AND indicates mixtures and OR indicates unknown single substances.
Here are some illustrations of what the various combinations mean:
| What's drawn | Mixture? | What it means |
|---|---|---|
![]() |
mixture | ![]() |
![]() |
mixture | ![]() |
![]() |
mixture | ![]() |
![]() |
single | ![]() |
![]() |
single | ![]() |
![]() |
single | ![]() |
![]() |
mixture | ![]() |
![]() |
single | ![]() |
Use cases
The initial target is to not lose data on an V3k mol -> RDKit -> V3k mol round trip. Manipulation, depiction, and searching is a secondary goal.
Representation
Stored as a vector of StereoGroup objects.
A StereoGroup contains an enum with the type as well as pointers to the atoms involved. We will need to adjust this when atoms are removed or replaced.
Enumeration
The Python-only function rdkit.Chem.EnumerateStereoisomers.EnumerateStereoisomers can be used to enumerate
all stereoisomers represented by the StereoGroups. It is possible to enumerate only the Stereo Groups,
ignoring other unspecified centers.
Reactions
The reaction runner is aware of stereo groups associated with a Mol. Atoms in a Stereo Group are copied to all products in which they appear, unless the reaction created or destroyed stereochemistry at that atom. If an atom from a reactant StereoGroup appears multiple times in the product, all copies of that atom are put in the same product StereoGroup.
Searching
This will be handled by searching in MolBundle objects produced by the enumeration code. Substructure searching and canonicalization do not yet support stereo groups.
Depiction
Something needs to be added to the depiction code to allow the groups to be seen.











