Enhanced Stereochemistry in the RDKit

Greg Landrum (greg.landrum@t5informatics.com)

September 2018

This is still a DRAFT

Overview

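We are going to follow, at least for the initial implementation, the enhanced stereo representation used in V3k mol files: groups of atoms with specified stereochemistry, each carrying an ABS, AND, or OR marker indicating what is known. ABS means the configuration is as drawn. AND indicates a mixture: both the drawn configuration and the one with every center in the group inverted are present. OR indicates a single substance whose configuration is unknown: it is either the drawn one or the fully inverted one.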

Here are some illustrations of what the various combinations mean:

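| What's drawn | Mixture? | What it means |
|--------------|----------|---------------|
| img1a | mixture | img1b |
| img2a | mixture | img2b |
| img3a | mixture | img3b |
| img4a | single | img4b |
| img5a | single | img5b |
| img5a | single | img5b |
| img6a | mixture | img6b |
| img7a | single | img7b |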

Use cases

The initial target is to not lose data on a V3k mol -> RDKit -> V3k mol round trip. Manipulation, depiction, and searching are secondary goals.
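To make the round-trip goal concrete, here is a minimal sketch in plain Python of reading and re-writing the stereo entries of a V3k collection block. It is not the RDKit reader or writer. It assumes the usual collection names (`MDLV30/STEABS` for ABS, `MDLV30/STERACn` for AND, `MDLV30/STERELn` for OR), assumes single-line entries with exact spacing, and the function names `parse_collection` and `write_collection` are hypothetical.

```python
import re

# Assumed mapping between V3k collection names and the ABS/AND/OR markers.
PREFIX_TO_KIND = {"STEABS": "ABS", "STERAC": "AND", "STEREL": "OR"}
KIND_TO_PREFIX = {kind: prefix for prefix, kind in PREFIX_TO_KIND.items()}

# Matches e.g. "M  V30 MDLV30/STERAC1 ATOMS=(2 5 7)": a count, then atom indices.
LINE_RE = re.compile(
    r"^M  V30 MDLV30/(STEABS|STERAC|STEREL)(\d*) ATOMS=\((\d+)((?: \d+)*)\)$"
)


def parse_collection(lines):
    """Return a list of (kind, group_number, atom_indices) tuples."""
    groups = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # BEGIN/END markers and anything this sketch does not model
        prefix, number, count, atoms = match.groups()
        indices = [int(token) for token in atoms.split()]
        if len(indices) != int(count):
            raise ValueError(f"atom count mismatch in: {line!r}")
        group_number = int(number) if number else None
        groups.append((PREFIX_TO_KIND[prefix], group_number, indices))
    return groups


def write_collection(groups):
    """Write the groups back out in the same single-line form."""
    lines = ["M  V30 BEGIN COLLECTION"]
    for kind, number, indices in groups:
        name = KIND_TO_PREFIX[kind] + ("" if number is None else str(number))
        atoms = " ".join(str(idx) for idx in indices)
        lines.append(f"M  V30 MDLV30/{name} ATOMS=({len(indices)} {atoms})")
    lines.append("M  V30 END COLLECTION")
    return lines
```

A round trip is lossless in this sketch when `write_collection(parse_collection(block))` reproduces `block`; real files can also contain continuation lines and other collections, which are ignored here.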

Representation

Enhanced stereo information is stored on the molecule as a vector of StereoGroup objects.

A StereoGroup contains an enum with the type as well as pointers to the atoms involved. The atom pointers will need to be adjusted when atoms are removed or replaced. StereoGroups are not exposed to Python, as the implementation is still tentative.
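The shape of this data structure can be sketched in plain Python. This is only an illustrative model, not the C++ class: `StereoGroupSketch`, `GroupType`, and `remove_atom` are hypothetical names, atom indices stand in for atom pointers, and the policy of dropping any group that contains a removed atom is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class GroupType(Enum):
    ABS = "abs"
    AND = "and"
    OR = "or"


@dataclass(frozen=True)
class StereoGroupSketch:
    group_type: GroupType
    atoms: tuple  # atom indices here; the real class holds atom pointers


def remove_atom(groups, removed_idx):
    """Return the groups that survive removal of one atom.

    Assumed policy: a group containing the removed atom is dropped entirely.
    Indices above the removed one shift down by one, which is only needed
    because this sketch stores indices rather than pointers.
    """
    kept = []
    for group in groups:
        if removed_idx in group.atoms:
            continue
        shifted = tuple(a - 1 if a > removed_idx else a for a in group.atoms)
        kept.append(StereoGroupSketch(group.group_type, shifted))
    return kept
```

For example, with an AND group on atoms (1, 2) and an OR group on atoms (4, 5), removing atom 2 leaves only the OR group, now on atoms (3, 4).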

Enumeration

The existing stereoisomer enumeration code needs to be updated to handle enhanced stereo groups correctly. This is the key piece for canonicalization and substructure searching.

We probably need to add an option to allow enumeration only of stereo groups (to ignore unspecified centers).
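As a rough sketch of what such an enumeration could look like, the plain-Python function below expands a set of stereo groups into the configurations they represent. It is not the existing RDKit enumeration code. The names `enumerate_stereo` and `only_stereo_groups` are hypothetical, centers are modelled as simple "CW"/"CCW" tags, and every atom in a group is assumed to have a specified tag.

```python
from itertools import product

FLIP = {"CW": "CCW", "CCW": "CW"}


def enumerate_stereo(centers, groups, only_stereo_groups=True):
    """Yield one dict of atom index -> tag per represented stereoisomer.

    centers: atom index -> "CW", "CCW", or None (unspecified)
    groups: list of (kind, atom_indices) with kind in {"ABS", "AND", "OR"}
    """
    # Each AND/OR group has two states: as drawn, or all members inverted
    # together. ABS groups stay fixed.
    variable = [atoms for kind, atoms in groups if kind in ("AND", "OR")]
    # Unspecified centers are expanded only when the option is switched off.
    unspecified = [] if only_stereo_groups else sorted(
        idx for idx, tag in centers.items() if tag is None
    )
    for flips in product((False, True), repeat=len(variable)):
        base = dict(centers)
        for atoms, flip in zip(variable, flips):
            if flip:
                for idx in atoms:
                    base[idx] = FLIP[base[idx]]
        for tags in product(("CW", "CCW"), repeat=len(unspecified)):
            isomer = dict(base)
            isomer.update(zip(unspecified, tags))
            yield isomer
```

AND and OR groups expand identically in this sketch. The difference is in interpretation: for AND all enumerated forms are present together, while for OR exactly one of them is the actual substance.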

Searching

This will be handled by searching in MolBundle objects produced by the enumeration code.
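The idea of searching over a bundle can be sketched as "a query matches if it matches any member". The plain-Python model below assumes each bundle member is a dict of atom index -> "CW"/"CCW" tag and reduces matching to comparing tags. Real substructure matching also has to find the atom mapping, so `matches_member` and `bundle_has_match` are purely hypothetical stand-ins.

```python
def matches_member(member, query):
    """True when every center the query specifies has the same tag in the member."""
    return all(member.get(idx) == tag for idx, tag in query.items())


def bundle_has_match(bundle, query):
    """True when at least one enumerated member of the bundle matches the query."""
    return any(matches_member(member, query) for member in bundle)
```

With a bundle holding the two forms of an AND group on atoms 1 and 2, a query for either form matches, while a query for a configuration that is in neither member does not.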

Depiction

Something needs to be added to the depiction code to allow the groups to be seen.
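One possible approach, sketched here only as an assumption about how the groups might be shown, is to attach a short text label to each atom in a group. The function name `atom_labels` and the label style ("abs", "and1", "or1") are hypothetical choices for this example.

```python
def atom_labels(groups):
    """Map atom index -> label such as "abs", "and1", or "or2".

    groups: list of (kind, atom_indices) with kind in {"ABS", "AND", "OR"}.
    AND and OR groups are numbered separately, in order of appearance.
    """
    counters = {"AND": 0, "OR": 0}
    labels = {}
    for kind, atoms in groups:
        if kind == "ABS":
            text = "abs"
        else:
            counters[kind] += 1
            text = f"{kind.lower()}{counters[kind]}"
        for idx in atoms:
            labels[idx] = text
    return labels
```

A drawing routine could then place each label next to its atom so that members of the same group are visibly tied together.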