2764 Commits

Author SHA1 Message Date
Greg Landrum
d0c8c3cf8f Fixes #2411 and #2414 (#2412)
* clang-tidy-7 pass

* Fixes #2411

* Fixes #2414
2019-04-19 21:51:41 -04:00
Dan N
47acdc8b73 Issue #2403: Speed up SSSR symmetrization (#2410)
* Issue #2403: Speed up SSSR symmetrization

For my horrible example molecule (a highly symmetric
nanotube with 2400 atoms and > 1000 rings), this speeds up
symmetrizeSSSR() from 5s to about 0.002s. findSSSR() takes
another .4s or so.

* Refactor after Ricardo's suggestions

* Greg's review comments. use std::vector
2019-04-18 07:11:15 +02:00
Greg Landrum
ec31bea97b clang-tidy-7 pass (#2408) 2019-04-16 12:05:47 -04:00
Ric
d0520ef2a6 fix (#2406) 2019-04-15 14:15:51 +01:00
Paolo Tosco
6046bb2d03 - avoid a build failure on Windows (#2399)
* - avoid a build failure on Windows

* - fix compilation error when building compressed suppliers

* - fix accidental commit

* - proper fix
2019-04-07 11:38:26 +02:00
Greg Landrum
09325d32ff Speed up some of the tests (#2398)
This removes some redundancy from some of the test code in order to bring
the runtime down. This does not affect test coverage and shouldn't do
anything bad to the overall test quality.
2019-04-07 06:06:51 +02:00
Dan N
2bcb7ea692 I2366 Preserve enhanced stereo in reactions (#2377)
* Potential implementation of copying enhanced stereo groups

Copies the enhanced stereo if all atoms in the reactant
end up in the same molecule of the product with valid
ChiralTags.

Current implementation: Only copy StereoGroup if all atoms are "valid" in the product.
Possible implementation: Copy StereoGroup for all atoms that are "valid" in the product.

Details:
Uses ChiralTag invalidation to decide whether StereoGroup should be copied. If
the product atoms have valid ChiralTag, then the reaction was able to
meaningfully propagate chirality from the reactant to the product. This means
that it is also meaningful to propagate the StereoGroup from the reactant to
the product.

The only exception to this is if the product template defines a specific
absolute configuration for an atom. This means that the reaction defines the
stereochemistry for the atom, so the stereochemistry of that atom is no longer
relative.

If an atom from a reactant StereoGroup appears multiple times in the product,
all copies of that atom are put in the same product StereoGroup.

Still developing test cases.

    from rdkit import Chem
    from rdkit.Chem import AllChem

    # Duplicate a molecule example:
    mol1 = Chem.MolFromSmiles('Cl[C@@H](Br)C[C@H](Br)CCO |&1:1,4|')
    mol2 = Chem.MolFromSmiles('CC(=O)C')
    rxn = AllChem.ReactionFromSmarts('[O:1].[C:2]=O>>[O:1][C:2][O:1]')
    for prods in rxn.RunReactants([mol1, mol2]):
        for p in prods:
            for a in p.GetAtoms():
                for k in a.GetPropsAsDict():
                    a.ClearProp(k)
            print(Chem.MolToCXSmiles(p))

Output:

[21:26:08] product atom-mapping number 1 found multiple times.
CC(C)(OCC[C@@H](Br)C[C@@H](Cl)Br)OCC[C@@H](Br)C[C@@H](Cl)Br |&1:6,9,15,18|

* Issue 2366: Documentation and fix stereo group invalidation

Adds some documentation to EnhancedStereo.md

Also invalidates StereoGroup if a reaction specifies the
stereochemistry of a center. This destroys the relative
relationship of the center to other centers.

* Demo python file examples for Enhanced Stereochemistry in reactions

This is not intended to be pushed. These will probably become test
cases. The output looks like this:

    0a. Reaction preserves stereo:
      [C@:1]>>[C@:1]
        F[C@H](Cl)Br |o1:1|
          >>
          F[C@H](Cl)Br |o1:1|

    0b. Reaction preserves stereo:
      [C@:1]>>[C@:1]
        F[C@@H](Cl)Br |&1:1|
          >>
          F[C@@H](Cl)Br |&1:1|

    0c. Reaction preserves stereo:
      [C@:1]>>[C@:1]
        FC(Cl)Br
          >>
          FC(Cl)Br

    1a. Reaction ignores stereo:
      [C:1]>>[C:1]
        F[C@H](Cl)Br |a:1|
          >>
          F[C@H](Cl)Br |a:1|

    1b. Reaction ignores stereo:
      [C:1]>>[C:1]
        F[C@@H](Cl)Br |&1:1|
          >>
          F[C@@H](Cl)Br |&1:1|

    1c. Reaction ignores stereo:
      [C:1]>>[C:1]
        FC(Cl)Br
          >>
          FC(Cl)Br

    2a. Reaction inverts stereo:
      [C@:1]>>[C@@:1]
        F[C@H](Cl)Br |o1:1|
          >>
          F[C@@H](Cl)Br |o1:1|

    2b. Reaction inverts stereo:
      [C@:1]>>[C@@:1]
        F[C@@H](Cl)Br |&1:1|
          >>
          F[C@H](Cl)Br |&1:1|

    2c. Reaction inverts stereo:
      [C@:1]>>[C@@:1]
        FC(Cl)Br
          >>
          FC(Cl)Br

    3a. Reaction destroys stereo:
      [C@:1]>>[C:1]
        F[C@H](Cl)Br |o1:1|
          >>
          FC(Cl)Br

    3b. Reaction destroys stereo:
      [C@:1]>>[C:1]
        F[C@@H](Cl)Br |&1:1|
          >>
          FC(Cl)Br

    3c. Reaction destroys stereo:
      [C@:1]>>[C:1]
        FC(Cl)Br
          >>
          FC(Cl)Br

    3d. Reaction destroys stereo (but preserves unaffected group):
      [C@:1]F>>[C:1]F
        F[C@H](Cl)[C@@H](Cl)Br |o1:1,&2:3|
          >>
          FC(Cl)[C@@H](Cl)Br |&1:3|

    3e. Reaction destroys stereo:
      [C@:1]F>>[C:1]F
        F[C@H](Cl)[C@@H](Cl)Br |&1:1,3|
          >>
          FC(Cl)[C@@H](Cl)Br

    4a. Reaction creates stereo:
      [C:1]>>[C@@:1]
        F[C@H](Cl)Br |o1:1|
          >>
          F[C@@H](Cl)Br

    4b. Reaction creates stereo:
      [C:1]>>[C@@:1]
        F[C@@H](Cl)Br |&1:1|
          >>
          F[C@@H](Cl)Br

    4c. Reaction creates stereo:
      [C:1]>>[C@@:1]
        FC(Cl)Br
          >>
          F[C@@H](Cl)Br

    4d. Reaction creates stereo (preserve unaffected group):
      [C:1]F>>[C@@:1]F
        F[C@H](Cl)[C@@H](Cl)Br |o1:1,&2:3|
          >>
          F[C@@H](Cl)[C@@H](Cl)Br |&1:3|

    4e. Reaction creates stereo:
      [C:1]F>>[C@@:1]F
        F[C@H](Cl)[C@@H](Cl)Br |o1:1,3|
          >>
          F[C@@H](Cl)[C@@H](Cl)Br

    5a. Reaction preserves unrelated stereo:
      [C@:1]F>>[C@:1]F
        F[C@H](Cl)[C@@H](Cl)Br |o1:3|
          >>
          F[C@H](Cl)[C@@H](Cl)Br |o1:3|

    5b. Reaction ignores unrelated stereo:
      [C:1]F>>[C:1]F
        F[C@H](Cl)[C@@H](Cl)Br |o1:3|
          >>
          F[C@H](Cl)[C@@H](Cl)Br |o1:3|

    5c. Reaction inverts unrelated stereo:
      [C@:1]F>>[C@@:1]F
        F[C@H](Cl)[C@@H](Cl)Br |o1:3|
          >>
          F[C@@H](Cl)[C@@H](Cl)Br |o1:3|

    5d. Reaction destroys unrelated stereo:
      [C@:1]F>>[C:1]F
        F[C@H](Cl)[C@@H](Cl)Br |o1:3|
          >>
          FC(Cl)[C@@H](Cl)Br |o1:3|

    5e. Reaction creates unrelated stereo:
      [C:1]F>>[C@@:1]F
        F[C@H](Cl)[C@@H](Cl)Br |o1:3|
          >>
          F[C@@H](Cl)[C@@H](Cl)Br |o1:3|

    6e. Reaction splits StereoGroup atoms into two Mols:
      [C:1]OO[C:2]>>[C:2]O.O[C:1]
        F[C@H](Cl)OO[C@@H](Cl)Br |o1:1,5|
          >>
          O[C@@H](Cl)Br + O[C@H](F)Cl
          >>
          O[C@H](F)Cl + O[C@@H](Cl)Br

    7. Add two copies:
      [O:1].[C:2]=O>>[O:1][C:2][O:1]
        Cl[C@@H](Br)C[C@H](Br)CCO |&1:1,4| + CC(=O)C
    [17:15:38] product atom-mapping number 1 found multiple times.
          >>
          CC(C)(OCC[C@@H](Br)C[C@@H](Cl)Br)OCC[C@@H](Br)C[C@@H](Cl)Br |&1:6,9,15,18|

    8. Add two copies:
      [O:1].[C:2]=O>>[O:1][C:2][O:1]
        Cl[C@@H](Br)C[C@H](Br)CCO |&1:1,4| + CC(=O)C
    [17:15:38] product atom-mapping number 1 found multiple times.
          >>
          CC(C)(OCC[C@@H](Br)C[C@@H](Cl)Br)OCC[C@@H](Br)C[C@@H](Cl)Br |&1:6,9,15,18|

* Updates StereoGroup strategy in reactions to copy all possible atoms.

Copy all atoms for which the stereochemistry was not created or destroyed
in the reaction. Any StereoGroup which has at least one atom will appear
in the product.

Also updates the documentation to match this description, and adds C++
and Python tests which fail before this PR and pass after. The Python
tests are more extensive.

Test output was validated by hand (especially the stereo groups
generated). I'm less confident in the reaction processing in my head,
but I trusted the existing validation there.

For future diagnosis: Python unittest failures will look like:

    AssertionError: 'F[C@H](Cl)Br' != 'F[C@H](Cl)Br |&1:1|'
    - F[C@H](Cl)Br
    + F[C@H](Cl)Br |&1:1|
    ?             +++++++

For future diagnosis: C++ Catch2 failures will look like:

      CHECK( MolToCXSmiles(*p) == "F[C@H](Cl)Br |o1:1|" )
    with expansion:
      "FC(Cl)[C@@H](Cl)Br |&1:3|"
      ==
      "F[C@H](Cl)Br |o1:1|"

* Add a couple of new tests.

* rename "relative" to "enhanced"
some reformatting

* Factor out test helper function.

* Actually, enhanced stereo groups are exposed to Python

* Added discussion of enhanced stereochemistry in reactions to docs

* Fix new test
2019-04-07 06:06:28 +02:00
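
The final strategy described in commit 2bcb7ea692 (copy every StereoGroup atom whose stereochemistry the reaction neither created nor destroyed, and keep any group that still has at least one atom) can be modelled in a few lines of plain Python. This is a minimal sketch, not RDKit's implementation: the function name `copy_stereo_groups`, the tuple representation of a group, and the `surviving` mapping are all hypothetical and exist only for illustration.

```python
def copy_stereo_groups(reactant_groups, surviving):
    """Toy model of the StereoGroup copy rule; this is NOT RDKit code.

    reactant_groups: list of (group_type, [reactant atom indices]),
                     e.g. ("or", [1]) for the CXSMILES suffix |o1:1|.
    surviving: dict mapping a reactant atom index to the product atom
               indices that carry its stereo forward (preserved or
               inverted). Atoms whose stereo was created or destroyed
               by the reaction are absent or map to an empty list.
    """
    product_groups = []
    for group_type, atoms in reactant_groups:
        kept = []
        for idx in atoms:
            # every product copy of a reactant atom joins the same group
            kept.extend(surviving.get(idx, []))
        if kept:
            # a group with at least one remaining atom appears in the product
            product_groups.append((group_type, sorted(kept)))
    return product_groups


# Example 3d from the log: stereo on atom 1 is destroyed, atom 3 is untouched.
example_3d = copy_stereo_groups([("or", [1]), ("and", [3])], {3: [3]})

# Example 7 from the log: each reactant atom ends up twice in the product.
example_7 = copy_stereo_groups([("and", [1, 4])], {1: [9, 18], 4: [6, 15]})
```

Under this model, example 3d keeps only the `and` group on atom 3, and example 7 collects all four product atoms into one group, which agrees with the CXSMILES suffixes `|&1:3|` and `|&1:6,9,15,18|` shown in the commit message.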
Greg Landrum
941d7abb5f Fixes #2392 (#2393)
* Fixes #2392

* update release notes
2019-04-06 07:16:55 -04:00
Ric
27f50e332d ignore HTML char codes (#2395) 2019-04-06 05:36:56 +02:00
Greg Landrum
b337415094 Fix github #2311 (#2394)
* Fixes #2311
at least I hope it does

* Stop using deprecated boost functionality

* allow the Murtagh module to import even if the code isn't built
update the associated tests

* update release notes

* typo

* fix integer division
2019-04-04 10:20:56 +02:00
Greg Landrum
9f103a9913 Allow components of the MolStandardize code to be initialized from streams (#2385)
* Fixes #2383 (tests coming in the next commit)
Minor typo fix
Fixes a "bug" in one of the default transforms

* Adds support for directly providing normalization parameter data
instead of requiring the use of a text file.

* allow fragment removers to be initialized with string data

* remove unicode

* allow the reionizer to be initialized from a stream
2019-04-03 04:48:05 +02:00
Brian Kelley
46d68bbe67 Add ExplicitBitVect prop and query (#2384)
* Add ExplicitBitVect prop and query

* Fix for review comments
2019-04-03 04:46:14 +02:00
Greg Landrum
531d3a2b7e Fixes for zlib and windows builds (#2390)
This now builds on windows both with a local normal boost install and the boost install that's currently provided by conda.
2019-04-02 08:51:12 -04:00
Greg Landrum
5a79190261 rename SGroup -> SubstanceGroup (#2375)
We leave the names of the bits connected with Mol files as SGroups, since that is
appropriate there, but the more generic pieces are renamed
2019-03-30 14:53:24 -04:00
Greg Landrum
255b254690 Fixes #2332 (#2378) 2019-03-30 08:43:07 -04:00
Greg Landrum
b0617ebc17 Fixes: #2368 (#2373)
* Fixes #2368 (#2369)

* Another use of potentially null environment variables
2019-03-29 21:03:21 -04:00
Greg Landrum
1d01874678 improvements to the Uncharge functionality (#2374)
* modify the uncharger to use a canonical atom ordering

* add doCanonical cleanup parameter
make canonical ordering the default
document the change

* Add neutralization of additional negative groups (not just acids).
This may not be the right thing to do.

* expose the new parameter to python

* changes in response to review
2019-03-29 21:02:55 -04:00
Brian Kelley
75096ac33c [WIP] property custom handlers (#2293)
Allow custom type-handlers in the RDProps interface
2019-03-28 17:21:00 +01:00
Dan N
f730d38ceb Removes an extra debugging cerr statement (#2360)
This will show up in every 2D SD file read... Whoops!
2019-03-22 15:50:04 +01:00
Dan N
10c3488441 #2329 wrap detect atom stereochemistry (#2351)
* Move DetectAtomStereoChemistry to Molops::assignChiralTypesFromBondDirs

DetectAtomStereoChemistry in MolFileStereochem is more broadly
useful. Additionally, it was not named very clearly for what
it was actually doing.

* Wraps assignChiralTypesFromBondDirs for use in Python

Makes assignChiralTypesFromBondDirs available in Python
and adds a test demonstrating that availability.
2019-03-19 10:54:28 +01:00
John Mayfield
da60d20aca Patch/pains updates (#2272)
* Correct typo in thiophene_E pattern, !H0,!H1 is "always true" should be !H0!H1

* Errors in ring closure translation from original SLN.

* Make queries agnostic to aromaticity model.

* Redundant recursive SMARTS

* More queries that benefit from optional aromaticity.

* Update the (.in) files from previous commit.

* Update thiaz_ene_A inline with CSV file.
2019-03-18 13:42:31 -04:00
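
The thiophene_E fix in commit da60d20aca rests on SMARTS operator semantics: inside an atom expression a comma means OR, while adjacent primitives are joined by an implicit AND. A minimal plain-Python sketch of the two readings follows. The helper names are hypothetical, and only the total hydrogen count `h` is modelled.

```python
def not_h0_or_not_h1(h):
    # models the buggy fragment "!H0,!H1": (h != 0) OR (h != 1)
    return h != 0 or h != 1


def not_h0_and_not_h1(h):
    # models the corrected fragment "!H0!H1": (h != 0) AND (h != 1)
    return h != 0 and h != 1


counts = range(5)

# no hydrogen count can equal 0 and 1 at once, so the OR form never fails
always_true = all(not_h0_or_not_h1(h) for h in counts)

# the AND form only accepts atoms with two or more hydrogens
matching = [h for h in counts if not_h0_and_not_h1(h)]
```

The OR form accepts every hydrogen count, which is why the commit message calls it "always true"; the AND form restricts the match to atoms carrying at least two hydrogens.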
Paolo Tosco
60ce95f07f - small fixes to get DLLs to build on Windows (#2356) 2019-03-18 05:33:39 +01:00
Brian Kelley
d2f716a2e4 Adds gzstream stream, exposes to swig (#2314)
* Move RDBoostStreams to RDStreams

* RDBoostStreams->RDStreams

* RDBoostStreams->RDStreams

* Wrap SWIG (with Java test)

* Fix missing declaration

* Use the file that already exists

* Revert to original version

* Revert to CXSMiles version

* Update boost version

* Remove redundant code

* Add zlib

* check for win32

* FileParsers now builds static on windows
2019-03-18 05:32:42 +01:00
Greg Landrum
b739a2c208 Add a read-only Python wrapper for SGroups (#2343)
* added a set of test files for SGroups.
Many thanks to Gerd Blanke for providing these

* Partial version of the wrapper
Definitely needs more work

* add some properties

* basic SGroup property change test

* not working; backup commit

* disable writing for now

* add ClearMolSGroups() function

* review response: add a couple missing methods

* remove spaces from filenames

* update filename in test

* changes in response to review

* add operator== to SGroups

* solve lifetime problems with a vector_indexing_suite
2019-03-15 08:50:32 -04:00
Greg Landrum
435521453c Fixes #2346 (#2347) 2019-03-14 09:48:40 -04:00
Greg Landrum
55fb9034a6 Add a skip_all_if_match option to the FragmentRemover (#2338)
* add SKIP_IF_ALL_MATCH argument to FragmentRemover
    Refactor FragmentRemover::remove() to make it more efficient

* implement and test SKIP_IF_ALL_MATCH

* expose the extra option to Python

* add info to logger
2019-03-14 09:32:08 -04:00
Brian Kelley
af6d413ccc Exposes substructlibrary to swig (#2337)
* SWIG wrap SubstructLibrary

* Fix tests

* Fix virtual overload for tagAtoms

* Add SubstructLibrary to swig
2019-03-12 16:35:22 +01:00
Greg Landrum
26c07e67d6 fixes #908 (#2328)
Code is a mess and really should be refactored
2019-03-10 21:43:20 -04:00
Ric
6224a42516 Build warnings revisited (#2318)
* unused vars in bison parser cleanup

* initialization order in TopologicalTorsionGenerator

* unused params in SLN bison

* sln flex unused params

* throwing destructor in TDTWriter

* signed comparison in substructmethods

* unused input param in smiles/smarts bison

* unused ms param in sln bison

* signed comparison in FingerprintGenerator

* store return of fscanf in StructCheckerOptions

* unreferenced var in catch

* uninitialized value in FileParserUtils

* avoid override overload warning in MolDraw2DSVG

* non-final overrides in Validate.h

* unused static var in Avalon

* unused vars in catch blocks

* make AvalonTools avalonSimilarityBits & avalonSSSBits const int

* assert fscanf result in StructCheckerOptions
2019-03-08 16:42:54 +01:00
Dan N
3095d08cd1 Allow copying atoms in Python (#2322)
* Allow Atoms to be copied in Python.

The dunder copy method is the idiomatic way to support
making copies in Python. Includes a test to make sure that
copied atoms are usable.

* Use RWMol in Code/GraphMol/Wrap/rough_test.py

Co-Authored-By: d-b-w <dan.nealschneider@schrodinger.com>

* Allow access to an atom's copy constructor in Python
2019-03-08 10:05:16 -05:00
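
A short sketch of what commit 3095d08cd1 enables, assuming an RDKit build that includes it. Variable names are illustrative; the calls used (`copy.copy` on an atom, the `Chem.Atom` copy constructor, `RWMol.AddAtom`) are the ones the commit message refers to.

```python
import copy

from rdkit import Chem

mol = Chem.MolFromSmiles('CCO')
oxygen = mol.GetAtomWithIdx(2)

# __copy__ support means the standard library's copy.copy() works on atoms
clone = copy.copy(oxygen)
clone.SetFormalCharge(-1)  # changes the copy only, not the atom in mol

# the copy constructor is reachable from Python as well
second = Chem.Atom(oxygen)

# copied atoms are usable: add one to an empty editable molecule
rw = Chem.RWMol()
new_idx = rw.AddAtom(second)
```

The copy is independent of the molecule that owns the original atom, so editing `clone` leaves `mol` unchanged.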
Greg Landrum
f23bde46d3 fixes an r-group symmetrization problem (#2324)
* fixes a r-group symmetrization problem

* clang-tidy

* changes in response to review

* typo
2019-03-08 09:11:15 -05:00
Ric
be3170d0d5 Mem errors clean up (#2305)
* fix test leaks

* fix "invalid read" when casting Query to EqualityQuery

* fix error cleanup in SMILES/SMARTS parsers

* SMILES/SMARTS parser fix updated *.cmake

* fix error cleanup in SLN parser

* SLN parser updated *.cmake

* updated suppressions

* update loop in sln bison
2019-03-08 05:39:59 +01:00
greg landrum
92ca0da5f9 Merge branch 'doc-update' of https://github.com/greglandrum/rdkit into greglandrum-doc-update 2019-03-07 21:15:00 +01:00
Greg Landrum
e06b51ae7c Write enhanced stereo to cxsmiles (#2290)
* Add writing of enhanced stereo to cxsmiles

* changes in response to review

* fix an interaction between the fragment and enhanced smiles bits

* fix a logic error
2019-03-07 05:46:45 +01:00
Greg Landrum
24f1737839 Remove a bunch of Python2-related warts (#2315)
* remove all of the "from __future__" imports

* remove the first batch of rdkit.six imports/uses

* next step of rdkit.six removal

* removing xrange, range, and some maps

* next round of removals

* next round of cleanups

* fix inchi test

* last bits of "from rdkit.six" are gone

* and the last of the six stuff is gone

* strange importlib problem
2019-03-06 20:43:49 -05:00
Greg Landrum
84c1ea5e7a some much-needed optimization work on the new property lists (#2317) 2019-03-06 16:06:32 -05:00
Ric
a611e7f917 Update maeparser & coordgen libraries (#2302)
* update maeparser & coordgen

* fix iostreams linking

* add shared_ptr constructor
2019-03-06 08:26:39 +01:00
Greg Landrum
2aed95bcf4 Add definition of MolFragmentToCXSmiles (#2307)
* Add definition of MolFragmentToCXSmiles

* expose MolToCXSmiles and MolFragmentToCXSmiles to python

* improve a test
2019-03-05 08:16:06 -05:00
Greg Landrum
fb5e325705 update docstrings in the wrappers too 2019-03-05 11:40:28 +01:00
Greg Landrum
f5e1686055 Fixes #2303 (#2304) 2019-03-04 08:58:12 +01:00
Greg Landrum
180c15fe0e support reading/writing atom props from SD files (#2297)
* first crude pass

* fix a deprecation

* change naming scheme, support bools

* add standalone function

* add a default value for missings

* support long lines

* stupid typo

* make operator[] work

* revisit missing value handling

* modify missing value handling

* switch to an alternate scheme for specifying missing values

* clang-format

* First pass at property list parser
still needs more tests

* add test for processMolPropertyLists

* get this working as part of the ForwardSDMolSupplier

* first pass at python wrappers and tests

* clang-format run

* add creation of property lists at the mol level

* wrap long lines on output

* remove PoC implementation

* fix python wrappers

* remove out-of-date reference to the Python PoC

* changes in response to review
2019-03-03 13:17:13 -05:00
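
As a hedged illustration of the "creation of property lists at the mol level" step in commit 180c15fe0e: the sketch below packs a per-atom integer property into a single molecule-level property, which is the form written to SD files. It assumes an RDKit build containing this commit and the `atom.iprop.` prefix from the final naming scheme; the property name `rank` is made up for the example.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('CCO')
for atom in mol.GetAtoms():
    # hypothetical per-atom property used only for this example
    atom.SetIntProp('rank', atom.GetIdx() + 1)

# collect the per-atom values into one molecule-level property list
Chem.CreateAtomIntPropertyList(mol, 'rank')
packed = mol.GetProp('atom.iprop.rank')
print(packed)
```

Writing `mol` with an SD writer afterwards would store the packed list as an ordinary SD data field, and reading it back through an SD supplier is what the `processMolPropertyLists` bullet covers.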
Brian Kelley
57a891bff2 Add serialization to SubstructLibrary (#2295)
* WIP - Substruct Library Serialization

* Add serialization to SubstructLibrary

* Add SubstructLibraryDefs

This holds the definition of whether the substruct 
library is serializable

* Wrap serialization in python

* Remove .h file, add .h.in file

* Configure header file into source dir

* Use RDConfig.h to configure serialization

* Move serialization code to separate header file

* Fixes for review comments

* Removes some code redundancy

* Make pickling mols less memory intensive

* Check if molholders come back as the right types
2019-03-02 16:31:40 +01:00
Greg Landrum
5c1341f1dc Fixes #2299 (#2300) 2019-03-01 13:22:01 -05:00
Brian Kelley
fa61fa717d Add test for issue #2285, fix molbundle test (#2301) 2019-02-28 13:47:03 -05:00
Greg Landrum
334b1558bc Fixes #2258 (#2286) 2019-02-22 07:30:31 -07:00
Greg Landrum
733167258e Fixes #2257 (#2276)
* very basics of writing as PoC: atom labels end up in output

* further progress, does not currently work

* stop writing coordinate bonds;
we represent these in SMILES already

* add coords

* fix typo in dealing with atom properties

* get atom props working

* changes in response to PR
2019-02-22 06:50:06 +01:00
Ric
9c336565fd Parse enhanced stereo information from CXSMILES (#2282)
* parser implemented

* added a couple of tests

* use logger instead of stderr
2019-02-22 04:51:26 +01:00
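
A minimal sketch of reading enhanced stereo from CXSMILES in Python, assuming an RDKit build that includes this parser (9c336565fd) and the CXSMILES writer added later in e06b51ae7c. The input string is taken from the examples elsewhere in this log.

```python
from rdkit import Chem

# one "or" stereo group containing atom 1
mol = Chem.MolFromSmiles('F[C@H](Cl)Br |o1:1|')

groups = mol.GetStereoGroups()
for group in groups:
    atom_ids = [atom.GetIdx() for atom in group.GetAtoms()]
    print(group.GetGroupType(), atom_ids)

# write the molecule back out, including the enhanced stereo block
print(Chem.MolToCXSmiles(mol))
```

Each entry in `groups` reports its type (`STEREO_ABSOLUTE`, `STEREO_OR` or `STEREO_AND`) and the atoms it covers, which mirrors the `a`, `o` and `&` markers of the CXSMILES extension block.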
Ric
ce354051dc Store extra CXSMILES data as a property (#2281)
* store cx smiles

* Cpp test

* py test

* fix substr
2019-02-21 17:43:09 +01:00
Greg Landrum
b52ad644b2 Robustify parsing of CTABs and SGROUPs (#2283)
* Fixes #2277

* changes in response to review
the big one is to move the PXA parser into the normal mol file parsing

* move the PXA changes to the writer as well

* SCN actually only needs 7 characters

* add test

* fixes in response to review

* handle blanks (instead of zeros) in the counts line.
The ctfile.pdf doc says we should do this

* Make the SGroup reader more robust w.r.t. bad data
The current behavior leads to uncaught exceptions when a line is too short.
This should clear that up so that we always throw the usual FileParseException

* make error messages a bit easier to read
2019-02-21 17:39:39 +01:00
Greg Landrum
094f65f5f4 Fix #2148 and #2244 (#2275)
* not quite done yet

* Fixes #2244

* Fixes #2148

This fixes a few of the knock-on effects of the actual fix.

* Test that we still write SMILES properly

* Fixes #2266 (#2269)

* Fixes #2268 (#2270)

* Improve interactivity of output SVG (#2253)

* Add clickable atoms when tagAtoms() is called

* add python tests

* add class tags for atoms and bonds

* add marker to allow easy insertion of extra text
2019-02-21 08:03:28 -07:00