* Updated cartridge documentation
Made examples compatible with latest chembl (25) and most recent conda versions of rdkit (2019.03.4.0, python 3.6.9) + rdkit-postgresql (2019.03.4.0)
* updated more query results
So far, the installation instructions for Linux rely on a specific
python version in order to include numpy. Replace the hard-coded path
with a call to numpy.get_include().
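
A minimal sketch of the idea, assuming the include path is passed to cmake through the PYTHON_NUMPY_INCLUDE_PATH variable mentioned later in this log. The helper name `numpy_include_flag` is hypothetical and not part of the install instructions:

```python
import importlib.util


def numpy_include_flag():
    """Return a cmake -D flag pointing at the numpy headers, or None if numpy is absent."""
    if importlib.util.find_spec("numpy") is None:
        return None
    import numpy

    # numpy.get_include() reports the header directory of whichever numpy is
    # installed, so no Python-version-specific path has to be hard-coded.
    return "-DPYTHON_NUMPY_INCLUDE_PATH=" + numpy.get_include()


flag = numpy_include_flag()
print(flag)
```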
Greg already had a look at this and mailed me this:
"The approach you suggest is certainly simpler. Unfortunately, it's also a lot slower. Here's timing information for converting a single fingerprint on my linux box:
In [6]: %timeit numpy.asarray(fp)
1.59 ms ± 5.93 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [8]: %timeit arr=numpy.zeros((1,));DataStructs.ConvertToNumpyArray(fp,arr)
48.5 µs ± 2.39 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
The simple approach takes about 30 times as long.
This likely isn't a performance critical piece of code, so I would be willing to make the change anyway."
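
As a quick check of the figures quoted above, the ratio of the two mean timings can be computed directly. This only redoes the arithmetic on the reported numbers; it does not rerun the benchmark:

```python
# Mean time per loop, copied from the %timeit output quoted above.
slow_seconds = 1.59e-3   # numpy.asarray(fp)
fast_seconds = 48.5e-6   # DataStructs.ConvertToNumpyArray(fp, arr)

ratio = slow_seconds / fast_seconds
print(f"numpy.asarray is about {ratio:.0f} times slower")
```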
* update version of japanese docs
* Remove external labels from cores
* Fix syntax errors
* Add better autodetection of labels, add dummyatom label, don't fall back to indexes when onlyMatchAtRgroups are set
* Move autodetection before alignment, fix final core labelling
* Fix stupid bit twiddling mistake
* None of the original mols should actually match the cores with onlyMatchAtRgroups
* Convert PRECONDITION to CHECK_INVARIANT
* Run clang-format
* use nullptr instead of 0 for pointers
* Handle cases where molecules don't have anything for an R-group properly.
Here's the python demo of the bug:
```
In [14]: scaffold2 = Chem.MolFromSmiles('c1c([*:1])cncn1')
In [15]: scaffold = Chem.MolFromSmiles('c1c([*:1])cccn1')
In [19]: mols2 = [Chem.MolFromSmiles(smi) for smi in 'c1c(F)cc(O)cn1 c1c(F)cncn1 c1c(Cl)cc(O)cn1'.split()]
In [20]: print(rdRGroupDecomposition.RGroupDecompose([scaffold,scaffold2],mols2,asSmiles=True,asRows=False))
({'Core': ['c1ncc([*:2])cc1[*:1]', 'c1ncc([*:1])cn1', 'c1ncc([*:2])cc1[*:1]'], 'R1': ['F[*:1]', 'F[*:1]', 'Cl[*:1]'], 'R2': ['[H]O[*:2]', '[H]O[*:2]', '']}, [])
```
* Fixes #2471
* Potential implementation of copying enhanced stereo groups
Copies the enhanced stereo if all atoms in the reactant
end up in the same molecule of the product with valid
ChiralTags.
Current implementation: Only copy StereoGroup if all atoms are "valid" in the product.
Possible implementation: Copy StereoGroup for all atoms that are "valid" in the product.
Details:
Uses ChiralTag invalidation to decide whether StereoGroup should be copied. If
the product atoms have valid ChiralTag, then the reaction was able to
meaningfully propagate chirality from the reactant to the product. This means
that it is also meaningful to propagate the StereoGroup from the reactant to
the product.
The only exception to this is if the product template defines a specific
absolute configuration for an atom. This means that the reaction defines the
stereochemistry for the atom, so the stereochemistry of that atom is no longer
relative.
If an atom from a reactant StereoGroup appears multiple times in the product,
all copies of that atom are put in the same product StereoGroup.
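
The difference between the "current" and "possible" implementations can be sketched in plain Python. This is a hypothetical model for illustration only: the function names and the set-of-indices representation are assumptions, not RDKit API.

```python
# Hypothetical model: a StereoGroup is a list of reactant atom indices, and
# valid_in_product is the set of those atoms that kept a valid ChiralTag.
# Atoms whose configuration is fixed by the product template are assumed to
# have been left out of valid_in_product already.


def copy_group_all_or_nothing(group_atoms, valid_in_product):
    """'Current implementation': copy the group only if every atom is valid."""
    if all(atom in valid_in_product for atom in group_atoms):
        return list(group_atoms)
    return []


def copy_group_valid_atoms(group_atoms, valid_in_product):
    """'Possible implementation': copy whichever atoms are still valid."""
    return [atom for atom in group_atoms if atom in valid_in_product]


reactant_group = [1, 4]
valid_in_product = {4}  # atom 1 lost its stereochemistry in the reaction

strict = copy_group_all_or_nothing(reactant_group, valid_in_product)
lenient = copy_group_valid_atoms(reactant_group, valid_in_product)
print(strict, lenient)
```

In this toy case the strict rule drops the whole group, while the lenient rule keeps a one-atom group for atom 4.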
Still developing test cases.
from rdkit import Chem
from rdkit.Chem import AllChem
# Duplicate a molecule example:
mol1 = Chem.MolFromSmiles('Cl[C@@H](Br)C[C@H](Br)CCO |&1:1,4|')
mol2 = Chem.MolFromSmiles('CC(=O)C')
rxn = AllChem.ReactionFromSmarts('[O:1].[C:2]=O>>[O:1][C:2][O:1]')
for prods in rxn.RunReactants([mol1, mol2]):
    for p in prods:
        # Clear atom properties so they do not show up in the CXSMILES output
        for a in p.GetAtoms():
            for k in a.GetPropsAsDict():
                a.ClearProp(k)
        print(Chem.MolToCXSmiles(p))
Output:
[21:26:08] product atom-mapping number 1 found multiple times.
CC(C)(OCC[C@@H](Br)C[C@@H](Cl)Br)OCC[C@@H](Br)C[C@@H](Cl)Br |&1:6,9,15,18|
* Issue 2366: Documentation and fix stereo group invalidation
Adds some documentation to EnhancedStereo.md
Also invalidates StereoGroup if a reaction specifies the
stereochemistry of a center. This destroys the relative
relationship of the center to other centers.
* Demo python file examples for Enhanced Stereochemistry in reactions
This is not intended to be pushed. These will probably become test
cases. The output looks like this:
0a. Reaction preserves stereo:
[C@:1]>>[C@:1]
F[C@H](Cl)Br |o1:1|
>>
F[C@H](Cl)Br |o1:1|
0b. Reaction preserves stereo:
[C@:1]>>[C@:1]
F[C@@H](Cl)Br |&1:1|
>>
F[C@@H](Cl)Br |&1:1|
0c. Reaction preserves stereo:
[C@:1]>>[C@:1]
FC(Cl)Br
>>
FC(Cl)Br
1a. Reaction ignores stereo:
[C:1]>>[C:1]
F[C@H](Cl)Br |a:1|
>>
F[C@H](Cl)Br |a:1|
1b. Reaction ignores stereo:
[C:1]>>[C:1]
F[C@@H](Cl)Br |&1:1|
>>
F[C@@H](Cl)Br |&1:1|
1c. Reaction ignores stereo:
[C:1]>>[C:1]
FC(Cl)Br
>>
FC(Cl)Br
2a. Reaction inverts stereo:
[C@:1]>>[C@@:1]
F[C@H](Cl)Br |o1:1|
>>
F[C@@H](Cl)Br |o1:1|
2b. Reaction inverts stereo:
[C@:1]>>[C@@:1]
F[C@@H](Cl)Br |&1:1|
>>
F[C@H](Cl)Br |&1:1|
2c. Reaction inverts stereo:
[C@:1]>>[C@@:1]
FC(Cl)Br
>>
FC(Cl)Br
3a. Reaction destroys stereo:
[C@:1]>>[C:1]
F[C@H](Cl)Br |o1:1|
>>
FC(Cl)Br
3b. Reaction destroys stereo:
[C@:1]>>[C:1]
F[C@@H](Cl)Br |&1:1|
>>
FC(Cl)Br
3c. Reaction destroys stereo:
[C@:1]>>[C:1]
FC(Cl)Br
>>
FC(Cl)Br
3d. Reaction destroys stereo (but preserves unaffected group):
[C@:1]F>>[C:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:1,&2:3|
>>
FC(Cl)[C@@H](Cl)Br |&1:3|
3e. Reaction destroys stereo:
[C@:1]F>>[C:1]F
F[C@H](Cl)[C@@H](Cl)Br |&1:1,3|
>>
FC(Cl)[C@@H](Cl)Br
4a. Reaction creates stereo:
[C:1]>>[C@@:1]
F[C@H](Cl)Br |o1:1|
>>
F[C@@H](Cl)Br
4b. Reaction creates stereo:
[C:1]>>[C@@:1]
F[C@@H](Cl)Br |&1:1|
>>
F[C@@H](Cl)Br
4c. Reaction creates stereo:
[C:1]>>[C@@:1]
FC(Cl)Br
>>
F[C@@H](Cl)Br
4d. Reaction creates stereo (preserve unaffected group):
[C:1]F>>[C@@:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:1,&2:3|
>>
F[C@@H](Cl)[C@@H](Cl)Br |&1:3|
4e. Reaction creates stereo:
[C:1]F>>[C@@:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:1,3|
>>
F[C@@H](Cl)[C@@H](Cl)Br
5a. Reaction preserves unrelated stereo:
[C@:1]F>>[C@:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
>>
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
5b. Reaction ignores unrelated stereo:
[C:1]F>>[C:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
>>
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
5c. Reaction inverts unrelated stereo:
[C@:1]F>>[C@@:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
>>
F[C@@H](Cl)[C@@H](Cl)Br |o1:3|
5d. Reaction destroys unrelated stereo:
[C@:1]F>>[C:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
>>
FC(Cl)[C@@H](Cl)Br |o1:3|
5e. Reaction creates unrelated stereo:
[C:1]F>>[C@@:1]F
F[C@H](Cl)[C@@H](Cl)Br |o1:3|
>>
F[C@@H](Cl)[C@@H](Cl)Br |o1:3|
6e. Reaction splits StereoGroup atoms into two Mols:
[C:1]OO[C:2]>>[C:2]O.O[C:1]
F[C@H](Cl)OO[C@@H](Cl)Br |o1:1,5|
>>
O[C@@H](Cl)Br + O[C@H](F)Cl
>>
O[C@H](F)Cl + O[C@@H](Cl)Br
7. Add two copies:
[O:1].[C:2]=O>>[O:1][C:2][O:1]
Cl[C@@H](Br)C[C@H](Br)CCO |&1:1,4| + CC(=O)C
[17:15:38] product atom-mapping number 1 found multiple times.
>>
CC(C)(OCC[C@@H](Br)C[C@@H](Cl)Br)OCC[C@@H](Br)C[C@@H](Cl)Br |&1:6,9,15,18|
8. Add two copies:
[O:1].[C:2]=O>>[O:1][C:2][O:1]
Cl[C@@H](Br)C[C@H](Br)CCO |&1:1,4| + CC(=O)C
[17:15:38] product atom-mapping number 1 found multiple times.
>>
CC(C)(OCC[C@@H](Br)C[C@@H](Cl)Br)OCC[C@@H](Br)C[C@@H](Cl)Br |&1:6,9,15,18|
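
For reading the output above: in the CXSMILES suffix, `a:` marks an absolute group, `o<n>:` an OR group and `&<n>:` an AND group, each followed by comma-separated atom indices. A minimal sketch of a parser for just that part of the suffix follows. The function name is hypothetical, and other CXSMILES fields such as coordinates are not handled:

```python
import re


def parse_stereo_groups(cx_suffix):
    """Parse an enhanced-stereo suffix such as '|o1:1,&2:3|' into (label, atoms) pairs."""
    groups = []
    for token in cx_suffix.strip().strip("|").split(","):
        if not token:
            continue
        match = re.fullmatch(r"(a|o\d+|&\d+):(\d+)", token)
        if match:
            # A token with a label starts a new group.
            label, first_atom = match.groups()
            groups.append((label, [int(first_atom)]))
        elif token.isdigit() and groups:
            # A bare index belongs to the group opened most recently.
            groups[-1][1].append(int(token))
        else:
            raise ValueError(f"unsupported token: {token!r}")
    return groups


print(parse_stereo_groups("|o1:1,&2:3|"))
print(parse_stereo_groups("|&1:6,9,15,18|"))
```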
* Updates StereoGroup strategy in reactions to copy all possible atoms.
Copy all atoms for which the stereochemistry was not created or destroyed
in the reaction. Any StereoGroup which has at least one atom will appear
in the product.
Also updates the documentation to match this description, and adds C++
and Python tests which fail before this PR and pass after. The Python
tests are more extensive.
Test output was validated by hand (especially the stereo groups
generated). I'm less confident in the reaction processing in my head,
but I trusted the existing validation there.
For future diagnosis: Python unittest failures will look like:
AssertionError: 'F[C@H](Cl)Br' != 'F[C@H](Cl)Br |&1:1|'
- F[C@H](Cl)Br
+ F[C@H](Cl)Br |&1:1|
? +++++++
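
A message of this shape can be reproduced with the standard library alone. This is a sketch that compares the two strings directly rather than running the actual RDKit tests:

```python
import unittest

# A bare TestCase instance is enough to call the assertion helpers.
case = unittest.TestCase()
message = ""
try:
    # First argument: CXSMILES without the stereo group; second: expected value.
    case.assertEqual("F[C@H](Cl)Br", "F[C@H](Cl)Br |&1:1|")
except AssertionError as err:
    message = str(err)

print(message)
```

The trailing `?` line in the diff marks the characters that are present only in the expected string, which makes a missing stereo group easy to spot.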
For future diagnosis: C++ Catch2 failures will look like:
CHECK( MolToCXSmiles(*p) == "F[C@H](Cl)Br |o1:1|" )
with expansion:
"FC(Cl)[C@@H](Cl)Br |&1:3|"
==
"F[C@H](Cl)Br |o1:1|"
* Add a couple of new tests.
* rename "relative" to "enhanced"
some reformatting
* Factor out test helper function.
* Actually, enhanced stereo groups are exposed to Python
* Added discussion of enhanced stereochemistry in reactions to docs
* Fix new test
* Implementation of SGroups
* remove sample files test
* update gitignore with test outputs
* fix RevisionModifier
* re-enable tests
* backup commit; things seem to work so far
* some refactoring; obvious s group tests pass now
* more refactoring
* everything now out of the public API
* not sure why this was still in there
* rename functions; all tests now pass
* remove getNextFreeSGroupId; readd comment in copy SGroups
* clang-format
* squash-merge current master
* squash merge master
* Address comments on PR
- Update to current master.
- Move SGroup parse time checks to SGroupChecks namespace.
- Store SGroups in ROMol as vector<SGroups>.
- SGroup methods return references instead of pointers.
- Use atom/bond/sgroup indexes for properties instead of pointers.
- Have SGroups inherit from RDProps; move properties to RDProps.
- Remove trivial/unused methods.
- Add a link to the SD specification atop SGroup.h
* add documents translated into Japanese
* add conf.py
* change links to images
* change Translation_into Japanese to Translation_into_Japanese
* move japanese translation to Book_jp
* update image paths
* makefile and config
* remove Book/Translation_into_Japanese
* Fingerprint generator first prototype
* Added some more details to the prototype
* Update based on comments
* Added additional outputs and return type changes
* FingerprintGenerator updated and placeholder implementation added
* Added getFingerprint implementation to FingerprintGenerator
* Added comments for FingerprintGenerator
* WIP: Atom pairs fingerprint implementation for FingerprintGenerator
* Removed templates and added comments
* Fixed AtomPairEnvGenerator creating duplicate environments
* Added an atom pair old version compatibility test
* Moved the FingerprintGenerator related tests to a new file
* Added new comments and changes from the PR comments
* using int types from std instead of boost and remove cleanUpEnvironments
* Minor refactoring for atom-pair atom code generation
* Added more tests for AtomPairGenerator
* Removed additional clean up method from FingerprintGenerator
* Added additional output for atom-pair fingerprint and a test
* Removed leftover code
* Default argument changes
* Removed leftover include
* Default invariant generation logic separated from env generation logic for AtomPairs
* Implemented fingerprint as bit vector type and added the test for it
* Folded fingerprint implementation and a test case added
* String representation for fingerprint generator is added
* Python wrapper for fingerprint generator added with a simple test
* Removed unused linked libraries
* AtomPair related wrapper code moved to its own file
* Python wrapper methods for different fingerprint output types added
* Wrappers for invariants generators and tests are added
* Added more comments and tests
* Changed python side names for FingerprintGenerator and removed extra wrappers used for invariant generators on python
* Fixed object lifetime problems for invariant generators in Python
* Fixed typo
* Added a list of test molecules and made fingerprint generator related classes noncopyable
* Morgan fingerprint python wrappers
* Removed argument helper class for wrapper
* Morgan Fingerprint simple implementation
* Added more invariants generators for Morgan
* Fixed a bug in Morgan bond invariant generator
* Added invariant generator combination tests
* Added atom pair generator to the invariant generator combination test
* Fixed a problem in morgan feature invariant generator
* Overriding invariants without generators is made possible
* Added comments and documentation
* Radius changed for morgan fingerprint test
* RDKit fingerprint generator implementation with cpp tests
* 32 bit and 64 bit fingerprint support for FingerprintGenerator
* Common utilities moved to FingerprintUtil.h and code duplication reduced
* Solved undefined reference issues for FPGenerator templates
* Topological torsion fp generator added
* Fingerprint notebook added
* Python wrappers updated
* Morgan tests added
* Tests expanded and reduced excess amounts of collision in folded output
* Expanded tests
* More documentation
* Python docs for atom pair
* Updated fingerprint generator notebook
* Python wrapper documentation added
* Separated FingerprintGenerator implementations into separate files again
* Python wrapper names updated to reflect new naming
* getCountFingerprint now returns 32 bit output and count simulation does not affect count fingerprints
* Python 3 compatibility for fingerprint generator tests
* a bit of ABC cleanup
* some comment formatting got screwed up
* <sigh>
* fix an uninitialized memory problem
* Added copyright statement to new files
* Corrected some comments and docs according to the latest changes
* Bulk fingerprint generation and tests
* Convenience function wrappers and size limiting for getSparseFingerprint
* Copyright text fixed
* Info string added to python wrappers
* Some changes to get the swig wrappers building again
* Fixes #2057
docs still need to be updated
* docs update
* Update getting started doc.
This still needs to have the doctests run and should probably be
proofread and tweaked
* some doc updates
* change in response to review
* add a couple test files
* backup
* first pass at some theory documentation
* it's a draft
* Update enhanced stereochemistry documentation
Adds initial target use case and caveats about the tentative
nature of the current implementation.
* Support read/write of molfile enhanced stereochemistry
This includes reading and writing of enhanced stereochemistry
from v3000 molfiles (sdf). Enhanced stereochemistry encodes
the relative configuration of stereocenters, allowing
representation of racemic mixtures and compounds with
unknown absolute stereochemistry.
It does not include:
* Python wrapping
* invalidation of the enhanced stereochemistry
* use of enhanced stereochemistry in search
* depiction of enhanced stereochemistry.
* Update to reflect changes from #1971
* change names of enum elements to allow compilation in VS2017
I think it's also clearer to do things this way
* Addressed most review comments.
* Run missed test "testEnhancedStereoChemistry"
* In tests, added size checks to group equality checks
* Updated copyright statements
* Deleted mol created for a test
* Use perfect forwarding in RWMol::setStereoGroups()
* use references for stereo groups that are checked in write and pickle
* Updated stereogroup.h in hopes of fixing compilation on Windows.
* clang-format
* try allowing a switch to boost regex and requiring it for g++-4.8
* do a better job of that
* typo
* Code review comments. Updated Copyright notice.
* When an atom is deleted, delete stereo groups containing it.
Also updates StereoGroup to use accessors instead of
constant member attributes. This allows StereoGroups to be moved.
* RDKit style guide
* Add header required on Windows.
* get the SWIG wrappers to build
* add basic fingerprint bit rendering code
This was inspired by/adapted from the CheTo code from Nadine
* add some tests
* add documentation
add option to control which example of the bit is used
Also fixes a couple other lines that didn't get parsed as code blocks.
Not sure whether or not you're OK pointing users to the unofficial conda-forge feedstock. Installation is probably easier via the conda-forge package if you are already using other packages supplied via conda-forge. But those users might already know about the rdkit package?
* Update Install.md
* Allow optimization in linux/anaconda build.
Remove '-DRDK_OPTIMIZE_NATIVE=OFF' from linux build in anaconda env, to allow optimization.
* Set path to numpy headers for cmake build
Set PYTHON_NUMPY_INCLUDE_PATH variable to find numpy headers in linux/anaconda build.
* Updated note on numpy headers
Replaced instructions to link numpy headers into anaconda/include dir with note on the headers being hidden inside the package.
* this is a rough first pass, needs to be finished and is a strong argument for changing the names of some of the #defines that are currently used
* rationalize the rest of the #defines
add something to the release notes about this