16 Commits

Author SHA1 Message Date
Rachael Pirie
090dba9cc8 Getting Started with Contributing to RDKit (#7813)
* test fork commit

test complete, remove file

squashed test

* add skeleton file

* outline of the Python dev section

* add if not developer

* Update GettingStartedWithContributing.md

* structure things a bit more

* add summary of blog post

* update dev instructions

* update for devs

* Add example on how to add unit tests

* add style section

* added github issues and discussion

* add style example image

* update for devs

* add hyperlinks

* Move CodingStandards to GettingStartedWithContributing

Also updated the supported C++ version to C++17. I put in g++ 8.0 as a
rough guess of a version expected to support this standard.

* add initial git pull steps

* add how to check for bugs

* added how to contribute code

* Add a section on how to run tests

* Update "running tests" section

Fix a MD syntax error and add a paragraph encouraging people to run the tests! :-)

* add pullreq

* add pull_req2

* Add links to new GettingStartedWithContributing.md

* add missing image files

* update what and how section

* update how to pull req

* Update GettingStartedWithContributing.md

* update running tests

* tweak formatting

* start docs contribs

* typo fixes

* add rst overview

* move image to correct location, update python and C++ sections

* add how to do Python bindings

* Add batch 1 of Greg's suggested changes

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>

* add Greg changes

* Revert "add Greg changes"

This reverts commit 3f7d8eed6c.

* make Greg's (actual) changes

---------

Co-authored-by: mikey <m.k.blakey@icloud.com>
Co-authored-by: martin-sicho <sicho.martin@gmail.com>
Co-authored-by: dehaenw <66372095+dehaenw@users.noreply.github.com>
Co-authored-by: Franz Waibl <waiblfranz@gmail.com>
Co-authored-by: Rasmus Mejborg Borup <borup@eduroam-hci-dock-1-081.intern.ethz.ch>
Co-authored-by: Ivan Tubert-Brohman <Ivan.Tubert-Brohman@schrodinger.com>
Co-authored-by: Ivan Tubert-Brohman <ivan.tubert@gmail.com>
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2024-12-06 06:14:08 +01:00
Eisuke Kawashima
ee31ec96be Replace − (U+2212, MINUS SIGN) with -- (#3777)
Fix #3738
2021-02-04 10:03:21 +01:00
shrey183
8ea1ac6112 [GSoC-2020] Generalized and Multithreaded File Reader (#3363)
* fixed issue #2965

* added test case for issue #2965

* fixed formatting and added comment.

* update

* General Reader files

* removed dependency on boost filesystems

* removed class

* clang-format

* added-comments

* further-cleanup

* added clang-formatting

* braces-for-if-else

* changed error messages, added option for windows file path

* fixed getFileName function

* cleanup

* option for filename without path

* further-cleanup

* added tests for determineFileFormat

* cleanup, const arguments for validate function

* init

* cleanup

* cleanup

* clang-format does not work for CMake

* added RDK_TEST_MULTITHREADED option

* add-flag

* cleanup

* Delete ConcurrentQueue.h

This PR deals with the Generalized File Reader.

* Delete testConcurrentQueue.cpp

This PR deals with the Generalized File Reader.

* no change

* concurrent queue

* print values

* Single Producer Multiple Consumer works

* cleanup

* Producer Consumer Example

* update queue methods and tests

* cleanup

* test

* fixed tests

* cleanup, updated tests

* Delete ProducerConsumer.h

* Delete testProducerConsumer.cpp

* cleanup

* further cleanup

* changes based on feedback

* make queue non copyable

* pseudocode

* possible implementation

* untested implementation

* change class to typename

* basic-setup

* need to fix segfault

* need to fix blocking

* need to fix blocking

* need to fix blocking

* fix indentation

* one possibility

* without lambda function

* possible fix with some test cases

* performance tests

* added support for record id and item text

* cleanup

* cleanup

* fixed memory leak and added methods with tests for getting last id and item text

* cleanup

* added more test cases with different smi files

* cleanup

* SD mol supplier

* modified the parsing for SDMolSupplier

* cleanup

* cleanup

* new file for testing

* added support for reading molecule properties with tests

* thread-safe logging and exception handling

* cleanup

* without thread safe logging

* cleanup

* cleanup, modified MultithreadedSmilesMolSupplier

* cleanup, made reader and writer functions private

* move O2.sdf

* basic python wrapper with tests

* cleanup, added new methods for python wrappers

* made changes suggested by Andrew

* file and compression formats are case-insensitive

* cannot open files with gzstream

* cleanup

* possible fix for opening compressed streams (SMILES)

* removed seekg() and tellg() methods from multithreaded suppliers

* cleanup

* test cases for python wrappers

* some wrapper cleanup

* cleanup, removed unused functions

* update the MT tests so that they actually do some work
also includes some cleanup here

* cleanup

* remove iterator_next header include

* added support for multithreaded readers

* use getNumThreadsToUse for multithreaded suppliers

* fixed documentation for multithreaded python wrappers

* commented performance test

* first draft of final evaluation report

* removed inline variables

* first draft getting started in python

* fixed typos in getting started in python

* fixed typos

* fix documentation tests

* fixed documentation tests

* added links to important files and PR

* added performance results

* first version of wrappers with compressed streams

* getting rid of streambuf stream method

* modified General File Reader

* make this work when building in non-threads mode

* rename a test

* rename a function in the python API

* rearrange the python test a bit

* disable the stream-based constructors in Python

* mark the multithreaded classes as experimental

Co-authored-by: greg landrum <greg.landrum@gmail.com>
2020-10-09 04:31:05 +02:00
Ric
d54e77e375 Add new CIP labelling algorithm (#3234)
* add port of centres

* Several changes:
    - Added a test based on RDKit issue 2984
        (default RDKit fails it, this gets it right)
    - Use bond directions for bond stereo (label is no longer required)
    - Fix bugs in rules 4b and 5new
    - Fix some mem errors
    - clang-formatted
    - some other minor cleanups

* Several changes and some improvements:
    - Added LGPL license, as well as a mention in the doc.
    - Fix/update/add some comments
    - Fix typo/bug in Mancude calculation
    - Fix bug in rules 4b, 5New
    - Fix Sp2 Bond dir reference
    - Re clang-format
    - other minor changes suggested by Dan

* Another bunch of changes:
  - require integer-order bonds; kekulize when required
  - fix fraction comparison
  - rename sq Cis/Trans e/z
  - replace queues with vectors
  - update copyright notices
  - revert LGPL changes
  - fix Asymmetric typo

* move to separate lib/mod, add python validation test

* Moving away from the original implementation:
    - Rename to CIPLabeler
    - Remove the abstraction layer
    - Remove some stats stuff
    - Push some CIPMol functions down to Node
    - Use RDKit's isotope info

* Another bundle of changes. The most relevant ones:
    - fix parity translation
    - use cis trans as bond reference -- breaks #2984 test
    - kill a lot of unused code
    - use lists for queues
    - store nodes and edges in digraph
    - add prefixes to class data member names
    - update changeRoot() test
    - use fastFindRings() for mancude rings
    - update docs
    - add references to the scientific paper
    - Document the Mancude functions
    - Fix Mancude atom types and their comments
    - remove mol data member from SequenceRule
    - replace Fraction with boost::rational
    - update comments, docstrings and the doc

* fix building the test

* Changes here include:
    - adding bitset overload for the labeling function
    - python wrap of the overload
    - handling trigonal pyramids with implicit H
    - setting bond labels sets stereo atoms, cis/trans
    - nix LEFT/RIGHT/TOGETHER/OPPOSITE constants
    - don't use GLOB in cmake
    - a decent amount of refactoring

* Minor edits to new_CIP_labeling (#6)

* Some changes for clarity

Added some documentation and changed some variable names to match
my understanding. Also ran clang-tidy to ensure that all blocks
were brace-enclosed.

* Return a reference instead of a copy for performance

This is called many times and showed up after some light
profiling. This change bumped throughput by about 20%

* move out of Graphmol

* move .hpp headers to .h

* update documentation; add label set of atoms test

* Address comments:
    - Added references to centres to CIPLabeler.h and Python Wrap.
    - Update validation test to skip sanitization.
    - Document mancude fractional atomic number calculation.
    - Use unittest assertions in python test.
    - Update mancude docstrings to 'resonance' instead of 'tautomers'.
    - Rename prioritise() to prioritize().
    - Add postcondition to check carriers size in Tetrahedral.cpp.
    - Use getNeighbors() in Tetrahedral.cpp.
    - Move findStereoAtoms to Chirality namespace.
    - Move code back into GraphMol.
    - Fix typos and reformat doc.

* More comments:
    - Mention why we use boost's unordered map rather than the std one.
    - Fix include in Python wrapper.

* Addressed second batch of comments:
    - fix the bug in rule 4b
    - fix docstring for rule 2
    - move atomic mass calculation from rule 2 to node
    - addressed some build warnings
    - simplify sp2bond::label(comp)
    - add start/end atoms to Sp2Bond constructor
    - update system/local includes

Co-authored-by: Dan N <dan.nealschneider@schrodinger.com>
2020-07-07 20:34:33 +02:00
Eisuke Kawashima
185ec927ab Unset executable flag 2019-10-10 20:18:43 +09:00
Dan N
2bcb7ea692 I2366 Preserve enhanced stereo in reactions (#2377)
* Potential implementation of copying enhanced stereo groups

Copies the enhanced stereo if all atoms in the reactant
end up in the same molecule of the product with valid
ChiralTags.

Current implementation: Only copy StereoGroup if all atoms are "valid" in the product.
Possible implementation: Copy StereoGroup for all atoms that are "valid" in the product.

Details:
Uses ChiralTag invalidation to decide whether StereoGroup should be copied. If
the product atoms have valid ChiralTag, then the reaction was able to
meaningfully propagate chirality from the reactant to the product. This means
that it is also meaningful to propagate the StereoGroup from the reactant to
the product.

The only exception to this is if the product template defines a specific
absolute configuration for an atom. This means that the reaction defines the
stereochemistry for the atom, so the stereochemistry of that atom is no longer
relative.

If an atom from a reactant StereoGroup appears multiple times in the product,
all copies of that atom are put in the same product StereoGroup.

Still developing test cases.

    from rdkit import Chem
    from rdkit.Chem import AllChem

    # Duplicate a molecule example:
    mol1 = Chem.MolFromSmiles('Cl[C@@H](Br)C[C@H](Br)CCO |&1:1,4|')
    mol2 = Chem.MolFromSmiles('CC(=O)C')
    rxn = AllChem.ReactionFromSmarts('[O:1].[C:2]=O>>[O:1][C:2][O:1]')
    for prods in rxn.RunReactants([mol1, mol2]):
        for p in prods:
            for a in p.GetAtoms():
                for k in a.GetPropsAsDict():
                    a.ClearProp(k)
            print(Chem.MolToCXSmiles(p))

Output:

[21:26:08] product atom-mapping number 1 found multiple times.
CC(C)(OCC[C@@H](Br)C[C@@H](Cl)Br)OCC[C@@H](Br)C[C@@H](Cl)Br |&1:6,9,15,18|

* Issue 2366: Documentation and fix stereo group invalidation

Adds some documentation to EnhancedStereo.md

Also invalidates StereoGroup if a reaction specifies the
stereochemistry of a center. This destroys the relative
relationship of the center to other centers.

* Demo python file examples for Enhanced Stereochemistry in reactions

This is not intended to be pushed. These probably will become test
cases. The output looks like this:

    0a. Reaction preserves stereo:
      [C@:1]>>[C@:1]
        F[C@H](Cl)Br |o1:1|
          >>
          F[C@H](Cl)Br |o1:1|

    0b. Reaction preserves stereo:
      [C@:1]>>[C@:1]
        F[C@@H](Cl)Br |&1:1|
          >>
          F[C@@H](Cl)Br |&1:1|

    0c. Reaction preserves stereo:
      [C@:1]>>[C@:1]
        FC(Cl)Br
          >>
          FC(Cl)Br

    1a. Reaction ignores stereo:
      [C:1]>>[C:1]
        F[C@H](Cl)Br |a:1|
          >>
          F[C@H](Cl)Br |a:1|

    1b. Reaction ignores stereo:
      [C:1]>>[C:1]
        F[C@@H](Cl)Br |&1:1|
          >>
          F[C@@H](Cl)Br |&1:1|

    1c. Reaction ignores stereo:
      [C:1]>>[C:1]
        FC(Cl)Br
          >>
          FC(Cl)Br

    2a. Reaction inverts stereo:
      [C@:1]>>[C@@:1]
        F[C@H](Cl)Br |o1:1|
          >>
          F[C@@H](Cl)Br |o1:1|

    2b. Reaction inverts stereo:
      [C@:1]>>[C@@:1]
        F[C@@H](Cl)Br |&1:1|
          >>
          F[C@H](Cl)Br |&1:1|

    2c. Reaction inverts stereo:
      [C@:1]>>[C@@:1]
        FC(Cl)Br
          >>
          FC(Cl)Br

    3a. Reaction destroys stereo:
      [C@:1]>>[C:1]
        F[C@H](Cl)Br |o1:1|
          >>
          FC(Cl)Br

    3b. Reaction destroys stereo:
      [C@:1]>>[C:1]
        F[C@@H](Cl)Br |&1:1|
          >>
          FC(Cl)Br

    3c. Reaction destroys stereo:
      [C@:1]>>[C:1]
        FC(Cl)Br
          >>
          FC(Cl)Br

    3d. Reaction destroys stereo (but preserves unaffected group):
      [C@:1]F>>[C:1]F
        F[C@H](Cl)[C@@H](Cl)Br |o1:1,&2:3|
          >>
          FC(Cl)[C@@H](Cl)Br |&1:3|

    3e. Reaction destroys stereo:
      [C@:1]F>>[C:1]F
        F[C@H](Cl)[C@@H](Cl)Br |&1:1,3|
          >>
          FC(Cl)[C@@H](Cl)Br

    4a. Reaction creates stereo:
      [C:1]>>[C@@:1]
        F[C@H](Cl)Br |o1:1|
          >>
          F[C@@H](Cl)Br

    4b. Reaction creates stereo:
      [C:1]>>[C@@:1]
        F[C@@H](Cl)Br |&1:1|
          >>
          F[C@@H](Cl)Br

    4c. Reaction creates stereo:
      [C:1]>>[C@@:1]
        FC(Cl)Br
          >>
          F[C@@H](Cl)Br

    4d. Reaction creates stereo (preserve unaffected group):
      [C:1]F>>[C@@:1]F
        F[C@H](Cl)[C@@H](Cl)Br |o1:1,&2:3|
          >>
          F[C@@H](Cl)[C@@H](Cl)Br |&1:3|

    4e. Reaction creates stereo:
      [C:1]F>>[C@@:1]F
        F[C@H](Cl)[C@@H](Cl)Br |o1:1,3|
          >>
          F[C@@H](Cl)[C@@H](Cl)Br

    5a. Reaction preserves unrelated stereo:
      [C@:1]F>>[C@:1]F
        F[C@H](Cl)[C@@H](Cl)Br |o1:3|
          >>
          F[C@H](Cl)[C@@H](Cl)Br |o1:3|

    5b. Reaction ignores unrelated stereo:
      [C:1]F>>[C:1]F
        F[C@H](Cl)[C@@H](Cl)Br |o1:3|
          >>
          F[C@H](Cl)[C@@H](Cl)Br |o1:3|

    5c. Reaction inverts unrelated stereo:
      [C@:1]F>>[C@@:1]F
        F[C@H](Cl)[C@@H](Cl)Br |o1:3|
          >>
          F[C@@H](Cl)[C@@H](Cl)Br |o1:3|

    5d. Reaction destroys unrelated stereo:
      [C@:1]F>>[C:1]F
        F[C@H](Cl)[C@@H](Cl)Br |o1:3|
          >>
          FC(Cl)[C@@H](Cl)Br |o1:3|

    5e. Reaction creates unrelated stereo:
      [C:1]F>>[C@@:1]F
        F[C@H](Cl)[C@@H](Cl)Br |o1:3|
          >>
          F[C@@H](Cl)[C@@H](Cl)Br |o1:3|

    6e. Reaction splits StereoGroup atoms into two Mols:
      [C:1]OO[C:2]>>[C:2]O.O[C:1]
        F[C@H](Cl)OO[C@@H](Cl)Br |o1:1,5|
          >>
          O[C@@H](Cl)Br + O[C@H](F)Cl
          >>
          O[C@H](F)Cl + O[C@@H](Cl)Br

    7. Add two copies:
      [O:1].[C:2]=O>>[O:1][C:2][O:1]
        Cl[C@@H](Br)C[C@H](Br)CCO |&1:1,4| + CC(=O)C
    [17:15:38] product atom-mapping number 1 found multiple times.
          >>
          CC(C)(OCC[C@@H](Br)C[C@@H](Cl)Br)OCC[C@@H](Br)C[C@@H](Cl)Br |&1:6,9,15,18|

    8. Add two copies:
      [O:1].[C:2]=O>>[O:1][C:2][O:1]
        Cl[C@@H](Br)C[C@H](Br)CCO |&1:1,4| + CC(=O)C
    [17:15:38] product atom-mapping number 1 found multiple times.
          >>
          CC(C)(OCC[C@@H](Br)C[C@@H](Cl)Br)OCC[C@@H](Br)C[C@@H](Cl)Br |&1:6,9,15,18|

* Updates StereoGroup strategy in reactions to copy all possible atoms.

Copy all atoms for which the stereochemistry was not created or destroyed
in the reaction. Any StereoGroup which has at least one atom will appear
in the product.

Also updates the documentation to match this description, and adds C++
and Python tests which fail before this PR and pass after. The Python
tests are more extensive.
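
The copy rule described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the RDKit implementation: the `StereoGroup` dataclass, the `copy_stereo_groups` function, and the `reactant_to_product` and `changed` arguments are hypothetical names introduced here, and atom indices are supplied by hand rather than computed from a reaction.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class StereoGroup:
    """Hypothetical stand-in for a stereo group: a kind plus atom indices."""
    kind: str                 # e.g. "abs", "or", "and"
    atoms: Tuple[int, ...]    # atom indices belonging to the group


def copy_stereo_groups(
    groups: Iterable[StereoGroup],
    reactant_to_product: Dict[int, List[int]],
    changed: Set[int],
) -> List[StereoGroup]:
    """Copy reactant stereo groups onto the product.

    reactant_to_product maps a reactant atom index to every product atom
    index it ends up as (an atom may be copied several times, or be absent).
    changed holds reactant atoms whose stereochemistry the reaction created
    or destroyed; those atoms are dropped from their group.
    """
    product_groups = []
    for group in groups:
        kept = []
        for atom in group.atoms:
            if atom in changed:
                continue  # stereo was created or destroyed: do not copy
            # All copies of the atom go into the same product group.
            kept.extend(reactant_to_product.get(atom, []))
        if kept:  # a group survives if at least one atom remains
            product_groups.append(StereoGroup(group.kind, tuple(sorted(kept))))
    return product_groups


# One group loses its only atom and disappears; the other is preserved.
identity = {i: [i] for i in range(6)}
groups = [StereoGroup("or", (1,)), StereoGroup("and", (3,))]
print(copy_stereo_groups(groups, identity, changed={1}))

# A reactant that appears twice in the product: both copies share a group.
doubled = {1: [9, 18], 4: [6, 15]}
print(copy_stereo_groups([StereoGroup("and", (1, 4))], doubled, changed=set()))
```

The two calls are modelled on cases 3d and 7 of the demo output in this commit message; the index mappings are written by hand for illustration.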

Test output was validated by hand (especially the stereo groups
generated). I'm less confident in the reaction processing in my head,
but I trusted the existing validation there.

For future diagnosis: Python unittest failures will look like:

    AssertionError: 'F[C@H](Cl)Br' != 'F[C@H](Cl)Br |&1:1|'
    - F[C@H](Cl)Br
    + F[C@H](Cl)Br |&1:1|
    ?             +++++++

For future diagnosis: C++ Catch2 failures will look like:

      CHECK( MolToCXSmiles(*p) == "F[C@H](Cl)Br |o1:1|" )
    with expansion:
      "FC(Cl)[C@@H](Cl)Br |&1:3|"
      ==
      "F[C@H](Cl)Br |o1:1|"

* Add a couple of new tests.

* rename "relative" to "enhanced"
some reformatting

* Factor out test helper function.

* Actually, enhanced stereo groups are exposed to Python

* Added discussion of enhanced stereochemistry in reactions to docs

* Fix new test
2019-04-07 06:06:28 +02:00
Greg Landrum
5a79190261 rename SGroup -> SubstanceGroup (#2375)
We leave the names of the bit connected with Mol files as SGroups, since that is
appropriate there, but the more generic pieces are renamed
2019-03-30 14:53:24 -04:00
Ric
d26d4b076e Support for parsing/writing SGroups in SD Mol files. (#2138)
* Implementation of SGroups

* remove sample files test

* update gitignore with test outputs

* fix RevisionModifier

* re-enable tests

* backup commit; things seem to work so far

* some refactoring; obvious s group tests pass now

* more refactoring

* everything now out of the public API

* not sure why this was still in there

* rename functions; all tests now pass

* remove getNextFreeSGroupId; readd comment in copy SGroups

* clang-format

* squash-merge current master

* squash merge master

* Address comments on PR

- Update to current master.
- Move SGroup parse time checks to SGroupChecks namespace.
- Store SGroups in ROMol as vector<SGroups>.
- SGroup methods return references instead of pointers.
- Use atom/bond/sgroup indexes for properties instead of pointers.
- Have SGroups inherit from RDProps; move properties to RDProps.
- Remove trivial/unused methods.
- Add a link to the SD specification atop SGroup.h
2019-01-22 15:42:27 +01:00
Dan N
eaa44b40c2 Enhanced stereo read/write support in SDF files. (#2022)
* add a couple test files

* backup

* first pass at some theory documentation

* it's a draft

* Update enhanced stereochemistry documentation

Adds initial target use case and caveats about the tentative
nature of the current implementation.

* Support read/write of molfile enhanced stereochemistry

This includes reading and writing of enhanced stereochemistry
from v3000 molfiles (sdf). Enhanced stereochemistry encodes
the relative configuration of stereocenters, allowing
representation of racemic mixtures and compounds with
unknown absolute stereochemistry.

It does not include:
* Python wrapping
* invalidation of the enhanced stereochemistry
* use of enhanced stereochemistry in search
* depiction of enhanced stereochemistry.
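
For orientation, a V3000 molfile stores enhanced stereochemistry as collection lines of the form `MDLV30/STEABS`, `MDLV30/STERACn` and `MDLV30/STERELn`, each followed by a counted atom list. The sketch below shows how such lines could be pulled out of a mol block in plain Python. It is an assumption-laden illustration, not the parser added in this PR: `parse_stereo_collections` and `SAMPLE` are hypothetical names, the sample fragment is written by hand, and continuation lines (ending in `-`) are not handled.

```python
import re

# Matches e.g. "M  V30 MDLV30/STERAC1 ATOMS=(2 1 4)".
# Groups: kind (ABS/REL/RAC), optional group number, atom count, atom list.
_STEREO_LINE = re.compile(
    r"^M  V30 MDLV30/STE(ABS|REL|RAC)(\d*) ATOMS=\((\d+)((?: \d+)*)\)\s*$"
)


def parse_stereo_collections(molblock):
    """Return a list of (kind, group_number, atom_indices) tuples.

    Atom indices are returned as written in the file (1-based).
    group_number is None for STEABS, which carries no number.
    """
    groups = []
    for line in molblock.splitlines():
        match = _STEREO_LINE.match(line)
        if match is None:
            continue  # not an enhanced stereo collection line
        kind, number, count, rest = match.groups()
        atoms = [int(token) for token in rest.split()]
        if len(atoms) != int(count):
            raise ValueError(f"atom count mismatch in line: {line!r}")
        groups.append((kind, int(number) if number else None, atoms))
    return groups


# Hand-written fragment of a collection block, for illustration only.
SAMPLE = """\
M  V30 BEGIN COLLECTION
M  V30 MDLV30/STEABS ATOMS=(1 3)
M  V30 MDLV30/STERAC1 ATOMS=(2 1 4)
M  V30 MDLV30/STEREL2 ATOMS=(1 6)
M  V30 END COLLECTION
"""

for group in parse_stereo_collections(SAMPLE):
    print(group)
```

Here `STEABS` marks centers with known absolute configuration, `STERACn` an "and" group (a mixture of both relative arrangements), and `STERELn` an "or" group (one arrangement, unknown which).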

* Update to reflect changes from #1971

* change names of enum elements to allow compilation in VS2017

I think it's also clearer to do things this way

* Addressed most review comments.

* Run missed test "testEnhancedStereoChemistry"
* In tests, added size checks to group equality checks
* Updated copyright statements
* Deleted mol created for a test
* Use perfect forwarding in RWMol::setStereoGroups()
* use references for stereo groups that are checked in write and pickle
* Updated stereogroup.h in hopes of fixing compilation on Windows.
* clang-format

* try allowing a switch to boost regex and requiring it for g++-4.8

* do a better job of that

* typo

* Code review comments. Updated Copyright notice.

* When an atom is deleted, delete stereo groups containing it.

Also updates StereoGroup to use accessors instead of
constant member attributes. This allows move of StereoGroups.

* RDKit style guide

* Add header required on Windows.

* get the SWIG wrappers to build
2018-09-26 15:44:23 +02:00
Greg Landrum
88a340eacc quick modern cxx note 2017-04-28 08:17:47 +02:00
Greg Landrum
d6b22d7f50 Update CodingStandards.md
This one just bit me.
2016-02-20 02:33:43 +01:00
Greg Landrum
5fed769d9e add partial first draft of coding standards doc 2015-11-28 02:39:49 +01:00
Greg Landrum
09946929d8 remove obsolete docs 2015-11-21 15:35:13 +01:00
Greg Landrum
d3d93a830c restructure this a bit 2008-10-07 05:08:17 +00:00
Greg Landrum
f3b0791b95 script cleanup 2006-05-08 19:51:32 +00:00
Greg Landrum
75a79b6327 initial import 2006-05-06 22:20:08 +00:00