Enhanced Stereochemistry in the RDKit
Greg Landrum (greg.landrum@t5informatics.com)
September 2018
This is still a DRAFT
Overview
We are going to follow, at least for the initial implementation, the enhanced stereo representation used in V3k mol files: groups of atoms with specified stereochemistry with an ABS, AND, or OR marker indicating what is known. The general idea is that AND indicates mixtures and OR indicates unknown single substances.
Here are some illustrations of what the various combinations mean:
| What's drawn | Mixture? | What it means |
|---|---|---|
| ![]() | mixture | ![]() |
| ![]() | mixture | ![]() |
| ![]() | mixture | ![]() |
| ![]() | single | ![]() |
| ![]() | single | ![]() |
| ![]() | single | ![]() |
| ![]() | mixture | ![]() |
| ![]() | single | ![]() |
Use cases
The initial target is to not lose data on a V3k mol -> RDKit -> V3k mol round trip. Manipulation, depiction, and searching are secondary goals.
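To make the round-trip goal concrete, here is a minimal sketch in plain Python that reads the enhanced stereo lines of a V3k collection block and writes them back unchanged. It is not the RDKit parser. It assumes the usual V3k keywords (`MDLV30/STEABS` for ABS, `MDLV30/STERACn` for AND, `MDLV30/STERELn` for OR) and that the first number inside `ATOMS=(...)` is the atom count. The names `read_groups` and `write_groups` are hypothetical.

```python
import re
from typing import List, Tuple

# Assumed keywords: STEABS -> ABS, STERACn -> AND, STERELn -> OR.
_LINE = re.compile(r"^M  V30 MDLV30/STE(ABS|RAC|REL)(\d*) ATOMS=\(([\d ]+)\)\s*$")
_KIND = {"ABS": "ABS", "RAC": "AND", "REL": "OR"}
_TAG = {kind: tag for tag, kind in _KIND.items()}

# (group kind, group number, atom indices as written in the file)
Group = Tuple[str, int, Tuple[int, ...]]


def read_groups(block: str) -> List[Group]:
    """Collect the enhanced stereo groups found in a collection block."""
    groups = []
    for line in block.splitlines():
        match = _LINE.match(line)
        if not match:
            continue  # BEGIN/END COLLECTION and unrelated lines are skipped
        tag, number, atoms = match.groups()
        values = [int(v) for v in atoms.split()]
        count, indices = values[0], tuple(values[1:])
        if count != len(indices):
            raise ValueError(f"atom count mismatch in: {line!r}")
        groups.append((_KIND[tag], int(number) if number else 0, indices))
    return groups


def write_groups(groups: List[Group]) -> str:
    """Write the groups back out as a collection block."""
    lines = ["M  V30 BEGIN COLLECTION"]
    for kind, number, indices in groups:
        label = "STEABS" if kind == "ABS" else f"STE{_TAG[kind]}{number}"
        atoms = " ".join(str(v) for v in (len(indices), *indices))
        lines.append(f"M  V30 MDLV30/{label} ATOMS=({atoms})")
    lines.append("M  V30 END COLLECTION")
    return "\n".join(lines)


EXAMPLE = "\n".join([
    "M  V30 BEGIN COLLECTION",
    "M  V30 MDLV30/STEABS ATOMS=(1 1)",
    "M  V30 MDLV30/STERAC1 ATOMS=(2 3 5)",
    "M  V30 MDLV30/STEREL1 ATOMS=(1 7)",
    "M  V30 END COLLECTION",
])
```

Calling `write_groups(read_groups(EXAMPLE))` reproduces `EXAMPLE` for this input, which is the property the round-trip target asks for.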
Representation
Enhanced stereo information is stored as a vector of StereoGroup objects.
A StereoGroup contains an enum with the type as well as pointers to the atoms involved. These pointers will need to be adjusted when atoms are removed or replaced. StereoGroups are not exposed to Python, as the implementation is still tentative.
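The shape of this data structure can be sketched in a few lines of plain Python. This is only an illustration, not the C++ class: `GroupType`, `StereoGroupSketch`, and `remove_atom` are hypothetical names, atom indices stand in for the atom pointers, and the policy of dropping any group that contains the removed atom is one possible choice that we assume here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class GroupType(Enum):
    ABSOLUTE = "ABS"
    AND = "AND"
    OR = "OR"


@dataclass(frozen=True)
class StereoGroupSketch:
    group_type: GroupType
    atoms: Tuple[int, ...]  # atom indices stand in for atom pointers


def remove_atom(groups: List[StereoGroupSketch], removed: int) -> List[StereoGroupSketch]:
    """Drop groups containing the removed atom and renumber the others.

    The renumbering is only needed because this sketch uses indices;
    a pointer-based representation would not need it.
    """
    kept = []
    for group in groups:
        if removed in group.atoms:
            continue  # assumed policy: the whole group is discarded
        shifted = tuple(a - 1 if a > removed else a for a in group.atoms)
        kept.append(StereoGroupSketch(group.group_type, shifted))
    return kept
```

For example, removing atom 2 from a molecule with groups ABS(0), AND(2, 4), and OR(5) leaves ABS(0) and OR(4) under these assumptions.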
Enumeration
The existing stereoisomer enumeration code needs to be updated to handle enhanced stereo groups correctly. This is the key piece for canonicalization and substructure searching.
We probably need to add an option to allow enumeration only of stereo groups (to ignore unspecified centers).
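A rough sketch of what such an enumeration could look like, in plain Python rather than the existing RDKit code: centers are reduced to "R"/"S" labels, each AND or OR group is either kept as drawn or inverted as a whole, and ABS centers are left alone. The function name `enumerate_isomers` and the `only_stereo_groups` flag are hypothetical, and the sketch assumes every atom in a group has a drawn label.

```python
from itertools import product
from typing import Dict, List, Optional, Sequence, Tuple

FLIP = {"R": "S", "S": "R"}
Centers = Dict[int, Optional[str]]  # atom index -> "R", "S", or None if unspecified
Groups = Sequence[Tuple[str, Sequence[int]]]  # (kind, atom indices)


def enumerate_isomers(centers: Centers, groups: Groups,
                      only_stereo_groups: bool = True) -> List[Centers]:
    """List the configurations covered by the drawn structure and its groups."""
    choices = []
    for kind, atoms in groups:
        if kind == "ABS":
            continue  # absolute centers stay exactly as drawn
        # AND/OR groups carry relative stereo: all members flip together.
        as_drawn = {a: centers[a] for a in atoms}
        inverted = {a: FLIP[centers[a]] for a in atoms}
        choices.append((as_drawn, inverted))
    if not only_stereo_groups:
        # Optionally expand centers that have no drawn configuration.
        for atom, label in centers.items():
            if label is None:
                choices.append(({atom: "R"}, {atom: "S"}))
    isomers = []
    for combination in product(*choices):
        isomer = dict(centers)
        for partial in combination:
            isomer.update(partial)
        isomers.append(isomer)
    return isomers
```

With the flag at its default, a structure with one ABS center, one two-atom AND group, and one unspecified center yields two configurations; switching the flag off yields four. Whether the result is read as a mixture (AND) or as alternative single substances (OR) is not modeled here.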
Searching
This will be handled by searching in MolBundle objects produced by the enumeration code.
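The idea can be illustrated without any RDKit objects. In this sketch a bundle is just a list of enumerated configurations (atom index mapped to an "R"/"S" label), and we assume that the bundle counts as a hit when any one of its members satisfies the query. `member_matches` and `bundle_matches` are hypothetical helpers, not MolBundle methods.

```python
from typing import Dict, Iterable, Optional

Isomer = Dict[int, Optional[str]]  # atom index -> "R", "S", or None


def member_matches(member: Isomer, query: Dict[int, str]) -> bool:
    """A member matches when it agrees with every label the query specifies."""
    return all(member.get(atom) == label for atom, label in query.items())


def bundle_matches(bundle: Iterable[Isomer], query: Dict[int, str]) -> bool:
    """Assumed semantics: the bundle matches if any member matches."""
    return any(member_matches(member, query) for member in bundle)
```

For a bundle holding the two members of a relative-stereo pair, a query that specifies either member's configuration is a hit, while a query that contradicts a center shared by both members is not.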
Depiction
Something needs to be added to the depiction code to allow the groups to be seen.











