Commit Graph

59 Commits

Author SHA1 Message Date
Brian Kelley
2b99ee477c Allow fragments to be grouped in cdxml (#7529)
* Allow fragments to be grouped in CDXML

* Add support for grouped reactants

* run clang-format

* Change github issue to 7528

* Add documents to the code

* response to review, check grouped reactants in cdxml against rxn file

* Remove unused code

* Add missing file

---------

Co-authored-by: Brian Kelley <bkelley@relaytx.com>
2024-06-23 07:02:19 +02:00
Brian Kelley
9764f70d41 Allow any bond (smiles ~) recognition in CDXML (#7363)
* Allow any bond (smiles ~) recognition in CDXML

* Move anybond.cdxml to the right place

* a bit of simplification

---------

Co-authored-by: Brian Kelley <bkelley@relaytx.com>
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2024-04-29 16:20:23 +02:00
Greg Landrum
b2606f5c20 fix a problem with reading older pickles (#7180) 2024-02-22 14:03:45 +01:00
tadhurst-cdd
d5d4d194ec atropisomer handling added (#6903)
* atropisomer handling added

* fixed unused variables, linking directives

* BOOST LIB start/stop fixes, linking fix

* Fixes for RDKIT CI errors

* minimalLib fix

* changed vector<enum> for java builds

* check for extra chars in CIP labeling

* removed wrong deprecated message

* fix ostrstream output error?

* restored _ChiralAtomRank to lowercase first letter

* changes for merged master

* Fixed catch label for new Catch package

* update expected psql results

* get swig wrappers building

* restore MolFileStereochem to FileParsers

* fix java wrapper for reapplyMolBlockWedging

* some suggestions

* move a couple functions out of Bond

* Merge branch 'master' into pr/atropisomers2

* merged master

* Renamed setStereoanyFromSquiggleBond

* atropisomers in cdxml, rationalize atrop wedging, stereoGroups in drawMol

* fix for CI build

* attempt to fix java build in CI

* attempt to fix java build in CI #2

* New routine to remove non-explicit 3D-generated chirality

* changed to use pair for atrop atoms and related bonds

* Changes as per PR reviews

* PR review responses

* PR review response - more

* Fix merge from master

* fixing java ci after merge

* Updated the help doc for atropisomers

* update the atropisomer docs

* improve the images

* add the source CXSMILES

---------

Co-authored-by: greg landrum <greg.landrum@gmail.com>
2023-12-22 04:58:18 +01:00
Greg Landrum
fee12b0aa6 Switch to determining atom chiral types from pseudo 3D instead of bond dirs (#6456)
* backup

* backup

* passes a lot of tests

* cleanup; still failing some tests

* pay attention to bond starting points... duh

* all tests pass

* invert y coords

* Scale bonds, make the Wedge detection cleaner, add more tests

* Readd comment

* Use document bond length

* Adds roundtrip test through a molblock

* a bit of cleanup

* remove the old code since we aren't using it any more.

* changes in response to review

---------

Co-authored-by: Brian Kelley <bkelley@relaytx.com>
2023-07-04 04:48:55 +02:00
Greg Landrum
fcfbb7b083 Fix a couple CDXML issues (#6463)
* Scale bonds, make the Wedge detection cleaner, add more tests

* Readd comment

* Use document bond length

* Adds roundtrip test through a molblock

* a bit of cleanup

* change expected results for a bogus structure
add a non-ambiguous version of it

* fixes #6462

* document incompatibility

---------

Co-authored-by: Brian Kelley <bkelley@relaytx.com>
2023-06-15 05:05:33 +02:00
Greg Landrum
82dd73d079 Fix a problem with pickling molecules with more than 255 rings (#5992)
* Fix a problem with pickling molecules with more than 255 rings

* fix doctest
2023-01-30 23:24:00 -05:00
Brian Kelley
6d10f90a8b Add wavy stereo bond support to cdxml (#5755)
* Add wavy stereo bonds

* Add wavy single bonds

* Single bonds have bond stereo stripped

Co-authored-by: Brian Kelley <bkelley@relaytx.com>
2022-11-17 04:58:46 +01:00
Brian Kelley
d1985caaa7 cdxml parser (#5273) 2022-09-28 05:49:27 +02:00
Greg Landrum
cd74dc2207 Initial support for non-tetrahedral stereochemistry (#5084)
* very basics: actually parsing the new atom stereochem features

* add some input verification for the chiral permutations

* fix a typo
add quadruple bond SMILES/SMARTS extension

* add forgotten files

* patch from Roger

* add Roger's parsing examples

* typo

* new tests

* adjusted version of next PR from Roger:
- add SP2D hybridization for square planar (this may change)
- some modernization of Chirality.cpp
- stop using < HybridizationType in Chirality.cpp (should probably do this elsewhere too)
- improved handling of hybridization assignment for new stereochem
- handle new stereo/hybridization in UFF
- tests for the above

* perception of non-tetrahedral stereo from 3D (from Roger S)
Basic testing of SP and TB based on opensmiles docs

* potential fixes for octahedral assignment
more tests

* docs update
need way more!

* map the TH tags directly to @ tags

* very basics of SMILES writing
this does not work with anything that changes the permutation order
like canonicalization or writing things in rings.

* start to support the getChiralAcross API

* more testing

* consistency

* add hasNonTetrahedralStereo() and getIdealAngleBetweenLigands()

* assignStereochemistry should only remove non-tetrahedral stereo

* re-simplify those tests

* cleanup matrix stream output

* initial pass at supporting nontet stereo in distgeom

* backup

* start on the reference docs

* TBP reference

* first pass at Oh finished

* update SP section

* more doc updates

* fix a typo

* add param to not remove Hs connected to non-tetrahedral atoms

* VERY basic coord generation for square planar

* TBP basics

* basic OH depiction

* start testing missing ligands
allow non-tet stereo in rings (ugly, but correct)

* add new TBP functions from Roger

* update depiction code for new API

* backup, the new tests work so far

* Finish the TB tests

* OH tests pass too

* cleanup

* first pass at getting correct SMILES with reordering
need way more testing than this

* ensure permutation 0 is correctly preserved

* some progress towards adding non-tetrahedral stereo to StereoInfo

* doc update

* add non-tet chiral classes to python wrappers

* make sure removeAllHs also gets neighbors of non-tetrahedral centers
more testing

* a bit of depictor cleanup

* make the assignment from 3D more tolerant
more testing

* improve the bulk testing

* cleanup

* remove a bit of redundant code

* ensure we don't write bogus permutation values to SMILES

* fix some rebase problems

* allow assignStereochemistryFrom3D() to be called without sanitization

* allow disabling the non-tetrahedral stereo when it's not explicit

* get that working on windows too
2022-05-20 09:07:16 +02:00
Eisuke Kawashima
b9a5be5a2d miscellaneous updates (#4284)
* Remove accidentally tracked files and unset x flag

* Ignore ComicNeue

* Unify test tag to `reader`

* Trivial destructors

* Bump CMAKE_CXX_STANDARD to 14 (#4165)
2021-07-13 06:57:29 +02:00
Greg Landrum
af3bb3e78b Allow partial deserialization of molecules (#4040)
* make pickling/depickling conformers optional

* make de-pickling properties optional

* support the new options in molecule ctors

* update doctest
2021-04-24 07:22:55 +02:00
Paolo Tosco
8030c36e5b Minor tweaks to SubstructLibrary (#3564)
* - added some missing const keywords
- added an addFingerprint overload to allow passing pointers
- added a test

* changes in response to review

* removed print

* added missing shared_ptr declaration

* added PatternNumBitsHolder serialization

* - merged with upstream changes and resolved conflicts
- got rid of PatternNumBitsHolder and leveraged the serialization version to get the PatternHolder to be backwards-compatible

* built substructLibV1.pkl with an older version of boost

* reverted serialization version to 1
only write numBits if != 2048 and only read numBits if it exists in the archive

* bogus commit just to trigger a rebuild
2020-12-09 19:42:38 -05:00
jones-gareth
9a864f4238 Sgroup (#3390)
* Changes to use SubstanceGroups in Java

* Forgot to add SWIG file

* Java test for SubstanceGroup wrappers

* Added RDKit boilerplate
2020-09-09 04:59:08 +02:00
Ric
d54e77e375 Add new CIP labelling algorithm (#3234)
* add port of centres

* Several changes:
    - Added a test based on RDKit issue 2984
        (default RDKit fails it, this gets it right)
    - Use bond directions for bond stereo (label is no longer required)
    - Fix bugs in rules 4b and 5new
    - Fix some mem errors
    - clang-formatted
    - some other minor cleanups

* Several changes and some improvements:
    - Added LGPL license, as well as a mention in the doc.
    - Fix/update/add some comments
    - Fix typo/bug in Mancude calculation
    - Fix bug in rules 4b, 5New
    - Fix Sp2 Bond dir reference
    - Re clang-format
    - other minor changes suggested by Dan

* Another bunch of changes:
  - require integer-order bonds; kekulize when required
  - fix fraction comparison
  - rename sq Cis/Trans e/z
  - replace queues with vectors
  - update copyright notices
  - revert LGPL changes
  - fix Asymmetric typo

* move to separate lib/mod, add python validation test

* Moving away from the original implementation:
    - Rename to CIPLabeler
    - Remove the abstraction layer
    - Remove some stats stuff
    - Push some CIPMol functions down to Node
    - Use RDKit's isotope info

* Another bundle of changes. The most relevant ones:
    - fix parity translation
    - use cis trans as bond reference -- breaks #2984 test
    - kill a lot of unused code
    - use lists for queues
    - store nodes and edges in digraph
    - add prefixes to class data member names
    - update changeRoot() test
    - use fastFindRings() for mancude rings
    - update docs
    - add references to the scientific paper
    - Document the Mancude functions
    - Fix Mancude atom types and their comments
    - remove mol data member from SequenceRule
    - replace Fraction with boost::rational
    - update comments, docstrings and the doc

* fix building the test

* Changes here include:
    - adding bitset overload for the labeling function
    - python wrap of the overload
    - handling trigonal pyramids with implicit H
    - setting bond labels sets stereo atoms, cis/trans
    - nix LEFT/RIGHT/TOGETHER/OPPOSITE constants
    - don't use GLOB in cmake
    - a decent amount of refactoring

* Minor edits to new_CIP_labeling (#6)

* Some changes for clarity

Added some documentation and changed some variable names to match
my understanding. Also ran clang-tidy to ensure that all blocks
were brace-enclosed.

* Return a reference instead of a copy for performance

This is called many times and showed up after some light
profiling. This change bumped throughput by about 20%

* move out of Graphmol

* move .hpp headers to .h

* update documentation; add label set of atoms test

* Address comments:
    - Added references to centres to CIPLabeler.h and Python Wrap.
    - Update validation test to skip sanitization.
    - Document mancude fractional atomic number calculation.
    - Use unittest assertions in python test.
    - Update mancude docstrings to 'resonance' instead of 'tautomers'.
    - Rename prioritise() to prioritize().
    - Add postcondition to check carriers size in Tetrahedral.cpp.
    - Use getNeighbors() in Tetrahedral.cpp.
    - Move findStereoAtoms to Chirality namespace.
    - Move code back into GraphMol.
    - Fix typos and reformat doc.

* More comments:
    - Mention why we use boost's unordered map rather than the std one.
    - Fix include in Python wrapper.

* Addressed second batch of comments:
    - fix the bug in rule 4b
    - fix docstring for rule 2
    - move atomic mass calculation from rule 2 to node
    - addressed some build warnings
    - simplify sp2bond::label(comp)
    - add start/end atoms to Sp2Bond constructor
    - update system/local includes

Co-authored-by: Dan N <dan.nealschneider@schrodinger.com>
2020-07-07 20:34:33 +02:00
Greg Landrum
b55376f284 Adds more options to adjustQueryProperties (#3235)
* add documentation

* backup

* first pass at 5-rings working

* add a static method to initialize an empty parameter object

* expose static method to python

* additional testing

* support the single bond adjustments

* cleanup

* preserve the symbol used in the query from a CTAB

* support the way the MDL code adjusts five-ring aromaticity in query rings

* in-code documentation

* while we're at it, cleanup the way Q and A atoms are handled in the v3k parser

* changes in response to review

* make this C++14 again.

* change in response to review
2020-06-22 09:17:50 -04:00
Greg Landrum
95613b6279 Allow SubstanceGroups to survive molecule edits (#3170)
* Progress on #3168

* Fixes #3167

* Fixes #3169

* deal with CBONDS too

* test PATOMS

* Fixes #3175

* a bit of code simplification and test updates

still needs more testing

* more testing

* handle s-group hierarchy
also a couple of other changes in response to the review

* add forgotten test file

* changes in response to review
2020-05-19 17:35:08 +02:00
Greg Landrum
ab061d532f fix start/end atoms when wedging bonds (#2861) 2019-12-27 07:21:49 -05:00
Eisuke Kawashima
185ec927ab Unset executable flag 2019-10-10 20:18:43 +09:00
Greg Landrum
069e920645 Fixes #2224 (#2234)
* Fixes #2224

* test the basics
2019-01-21 11:31:02 -05:00
Greg Landrum
d9b06a733b Fixes #1936 (#1945)
* Fixes #1936
doctests of the book still need to be verified

* a fix that is related to #1940

* add test for what was actually reported
2018-07-05 11:53:54 -04:00
Paolo Tosco
503b84995c - make bond stereo detection in rings consistent (#1727) 2018-02-01 04:28:10 +01:00
Greg Landrum
d253aabc86 remove an output file that never should have been checked in 2017-12-05 08:21:35 +01:00
Maciej Wójcikowski
10fbd483bb [MRG] Fix PDB reader + add argument to toggle proximity bonding (#1629)
* Add parameter to skip proximity bonding during PDB reading

* Test proximityBonding flag

* Remove multivalent Hs and bonds to metals in PDB

* Add tests for multivalent Hs and metal unbinding

* Remove covalent bonds to waters

* Test unbinding of HOHs

* Refactor functions

* Rename flag for consistency

* Include flavor in double bond perception

* Add metalorganic test (APW ligand)

* Validate input for IsBlacklistedPair and minor changes.
2017-11-15 06:53:31 +01:00
Greg Landrum
64399a46f0 Fixes github1497 (#1555)
* move detectBondStereoChemistry() into MolOps

* switch more code over to using the new function

* add an addStereoChemistryFrom3D() function. Needs testing still.

* add some tests

* cleanups and rename
2017-09-11 08:37:32 -04:00
Greg Landrum
9dcef9ac57 Fixes #607 (#1075) 2016-09-23 04:57:07 +02:00
Paolo Tosco
8b5176f8c9 - initial work to put the Trajectory code into a separate object 2016-05-09 19:05:15 +01:00
Greg Landrum
027d231e38 add a test (still fails) 2016-03-09 07:27:11 +01:00
Brian Kelley
c5f210e7e7 RDKit learns how to filter PAINS/BRENK/ZINC/NIH via FilterCatalog
FilterCatalogs give RDKit the ability to screen out or reject
undesirable molecules based on various criteria.  Supplied
with RDKit are the following filter sets:

  * PAINS - Pan assay interference patterns.  
    These are separated into three sets PAINS_A, PAINS_B and PAINS_C.
    Reference: Baell JB, Holloway GA. New Substructure Filters for 
               Removal of Pan Assay Interference Compounds (PAINS) 
               from Screening Libraries and for Their Exclusion in 
               Bioassays.
               J Med Chem 53 (2010) 2719–40. doi:10.1021/jm901137j.

  * BRENK - filters unwanted functionality due to potential tox reasons 
            or unfavorable pharmacokinetics.
    Reference: Brenk R et al. Lessons Learnt from Assembling Screening 
               Libraries for Drug Discovery for Neglected Diseases.
               ChemMedChem 3 (2008) 435-444. doi:10.1002/cmdc.200700139.

  * NIH - annotated compounds with problematic functional groups
     Reference: Doveston R, et al. A Unified Lead-oriented Synthesis of
                over Fifty Molecular Scaffolds. Org Biomol Chem 13
                (2014) 859–65.
                doi:10.1039/C4OB02287D.
     Reference: Jadhav A, et al. Quantitative Analyses of Aggregation,
                Autofluorescence, and Reactivity Artifacts in a Screen
                for Inhibitors of a Thiol Protease.
                J Med Chem 53 (2009) 37–51. doi:10.1021/jm901070c.

  * ZINC - Filtering based on drug-likeness and unwanted functional 
           groups
    Reference: http://blaster.docking.org/filtering/

The following are C++ and Python examples of how to filter molecules.

[C++]

#include <iostream>
#include <memory>
#include <vector>
#include <GraphMol/FilterCatalog.h>
using namespace RDKit;

    SmilesMolSupplier suppl(…);

    // setup the desired catalogs
    FilterCatalogParams params;
    params.addCatalog(FilterCatalogParams::PAINS_A);
    params.addCatalog(FilterCatalogParams::PAINS_B);
    params.addCatalog(FilterCatalogParams::PAINS_C);
    
    // create the catalog
    FilterCatalog catalog(params);

    std::unique_ptr<ROMol> mol; // automatically cleans up after us
    int count = 0;
    while(!suppl.atEnd()){
      mol.reset(suppl.next());
      TEST_ASSERT(mol.get());

      // Does a PAINS filter hit?
      if (catalog.hasMatch(*mol)) {
        std::cerr << "Warning: molecule failed filter " << std::endl;
      }
      
      // More detailed data by retrieving the catalog entry
      const FilterCatalogEntry *entry = catalog.getFirstMatch(*mol);
      if (entry) {
        std::cerr << "Warning: molecule failed filter: reason " <<
          entry->getDescription() << std::endl;
        
        // get the matched substructure atoms for visualization
        std::vector<FilterMatch> matches;
        if (entry->getFilterMatches(*mol, matches)) {
          for (std::vector<FilterMatch>::const_iterator it = matches.begin();
               it != matches.end(); ++it) {
            // Get the SmartsMatcherBase that matched
            const FilterMatch &fm = (*it);
            boost::shared_ptr<SmartsMatcherBase> matchingFilter =
              fm.filterMatch;

            // Get the matching atom indices; each pair is
            // (query atom index, molecule atom index)
            const MatchVectType &vect = fm.atomPairs;
            for (MatchVectType::const_iterator pairIt = vect.begin();
                 pairIt != vect.end(); ++pairIt) {
              int atomIdx = pairIt->second;
            }

          }
        }
      }
      count ++;
    } // end while

[Python]

  import sys
  from rdkit.Chem import FilterCatalog

  params = FilterCatalog.FilterCatalogParams()
  params.AddCatalog(FilterCatalog.FilterCatalogParams.FilterCatalogs.PAINS_A)
  params.AddCatalog(FilterCatalog.FilterCatalogParams.FilterCatalogs.PAINS_B)
  params.AddCatalog(FilterCatalog.FilterCatalogParams.FilterCatalogs.PAINS_C)
  catalog = FilterCatalog.FilterCatalog(params)
  
  ...
  for mol in mols:
      if catalog.HasMatch(mol):
         print("Warning: molecule failed filter", file=sys.stderr)
      # more detailed
      entry = catalog.GetFirstMatch(mol)
      if entry:
         print("Warning: molecule failed filter: reason %s"%(
           entry.GetDescription()), file=sys.stderr)
           
         # get the atoms involved in the substructure
         #  there may be many matching filters here...
         for filterMatch in entry.GetFilterMatches(mol):
             matcher = filterMatch.filterMatch
             # get a description of the matching filter
             print(matcher)
             for queryAtomIdx, atomIdx in filterMatch.atomPairs:
                 # do something with the substructure matches
                 pass

Advanced

 FilterCatalogs are fully serializable and can be stored for later use.

  To serialize a catalog, use the catalog.Serialize() method.
     std::string pickle = catalog.Serialize();
     
  To deserialize, pass the resulting string to the constructor:
     FilterCatalog catalog(pickle);


 The underlying matchers can be arbitrarily complicated and new
  ones with more complicated semantics can be created.  The default
  matching objects are:

  SmartsMatcher - match a smarts pattern or query molecule with a minimum 
                  and maximum count
  ExclusionList - returns false if any of the supplied matches exist

  And - combine two matchers
  Or  - true if either of the two matchers is true
  Not - invert the match (note that this can have confusing semantics
          when dealing with substructure matches)
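
  The way these matchers compose can be sketched in plain Python. This is
  only an illustration of the semantics described above, not RDKit's API:
  the names contains, and_, or_, not_ and exclusion_list are hypothetical,
  and a "molecule" is stood in for by a SMILES-like string that is tested
  by substring lookup rather than by real substructure search.

```python
from typing import Callable, Iterable

# Hypothetical stand-in: a "matcher" is any function from a target to bool.
Matcher = Callable[[str], bool]


def contains(fragment: str) -> Matcher:
    """Toy matcher: true when the fragment text occurs in the target string."""
    return lambda target: fragment in target


def and_(a: Matcher, b: Matcher) -> Matcher:
    """True only when both matchers are true."""
    return lambda target: a(target) and b(target)


def or_(a: Matcher, b: Matcher) -> Matcher:
    """True when either matcher is true."""
    return lambda target: a(target) or b(target)


def not_(a: Matcher) -> Matcher:
    """Invert a matcher; note there are no matched atoms left to report."""
    return lambda target: not a(target)


def exclusion_list(patterns: Iterable[Matcher]) -> Matcher:
    """Returns false if any of the supplied patterns matches."""
    patterns = list(patterns)
    return lambda target: not any(p(target) for p in patterns)


has_bromine = contains("Br")
has_nitro = contains("[N+](=O)[O-]")
no_bad_groups = exclusion_list([has_bromine, has_nitro])

print(no_bad_groups("CCO"))   # neither pattern is present
print(no_bad_groups("CCBr"))  # the bromine pattern is present
```

  The sketch also hints at why Not can be confusing: an inverted matcher
  succeeds precisely when nothing matched, so it has no atoms to point at.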

  Entries can be added at any time to a catalog:
  
   ExclusionList excludedList;   

    excludedList.addPattern(SmartsMatcher("Pattern 1", smarts)); 
    excludedList.addPattern(SmartsMatcher("Pattern 2", smarts2)); 
   

  A FilterCatalog supports a few different types of matching.  One is
  a traditional rejection filter where if a substructure exists in
  the target molecule, the molecule is rejected.

  These types of queries can indicate the substructure that triggered
  the rejection through the FilterCatalogEntry::GetMatch(mol)
  function.

  The FilterCatalog also supports acceptance filters, which are
  designed to indicate which molecules are OK.  These have
  to be transformed into rejection filters or simply wrapped in a
  Not( acceptanceFilter ) when entered into the catalog.  For example,
  from ZINC:

    carbons [#6] 40

  means that we allow a maximum of 40 carbon atoms.  We can write this by
  converting the max count to a min count (i.e. the pattern is triggered
  when the molecule has at least minCount matching atoms):

    const unsigned int minCount = 40+1;
    SmartsMatcher( "Too many carbons", "[#6]", minCount );

  This can be properly substructure searched.

  Or we can wrap this in a not:
  
    const unsigned int minCount = 0;
    const unsigned int maxCount = 40;
    Not( SmartsMatcher( "ok number of carbons", "[#6]", minCount, maxCount) );

  Note: Wrapping in a Not loses the ability to highlight the rejecting
    pattern when visualizing the molecule.
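
  The equivalence of the two formulations can be checked with a small
  Python sketch. It is a toy model rather than RDKit code: CountMatcher is
  a hypothetical class, a molecule is reduced to a list of element symbols,
  and the maximum count is assumed to be inclusive.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CountMatcher:
    """Hypothetical matcher: true when the symbol count lies in [min, max]."""
    name: str
    symbol: str
    min_count: int = 1
    max_count: Optional[int] = None  # None means "no upper bound"

    def matches(self, atoms: Sequence[str]) -> bool:
        n = sum(1 for a in atoms if a == self.symbol)
        if n < self.min_count:
            return False
        return self.max_count is None or n <= self.max_count


def rejects_as_min_count(atoms: Sequence[str], limit: int = 40) -> bool:
    # Rejection filter: triggered by limit + 1 or more carbons.
    too_many = CountMatcher("Too many carbons", "C", min_count=limit + 1)
    return too_many.matches(atoms)


def rejects_as_negated_range(atoms: Sequence[str], limit: int = 40) -> bool:
    # Acceptance filter (0..limit carbons is fine) wrapped in a "not".
    ok = CountMatcher("ok number of carbons", "C", min_count=0, max_count=limit)
    return not ok.matches(atoms)


for n_carbons in (0, 40, 41, 100):
    atoms = ["C"] * n_carbons + ["O"]
    same = rejects_as_min_count(atoms) == rejects_as_negated_range(atoms)
    print(n_carbons, rejects_as_min_count(atoms), same)
```

  Both functions give the same verdict for every carbon count, but only
  the min-count form corresponds to a pattern that actually matched, which
  is why it keeps the ability to highlight the offending atoms.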
2015-07-14 10:31:31 -04:00
Nadine Schneider
0cf0dd37ce Bugfix in SmilesWrite and some additional tests for getMolFrags function 2015-04-16 10:53:20 +02:00
Nadine Schneider
5d963846b8 merge 2015-04-10 09:44:18 +02:00
Greg Landrum
74125f685c Fixes #443 2015-03-05 06:38:38 +01:00
Greg Landrum
ad62f6241a update coords 2015-01-09 09:58:20 +01:00
Greg Landrum
baf26c053c not fixed; still lots of debugging printing; backup commit 2015-01-09 06:33:49 +01:00
Greg Landrum
1f4c2e915c fix a nasty canonicalization problem:
need to be sure to sort neighbors by their ranks in a *decreasing* order
2015-01-07 20:46:08 +01:00
Greg Landrum
23076b1cdb Fixes #298 2014-07-23 05:31:16 +02:00
Greg Landrum
86b9e6b089 Fixes #72 2013-08-25 06:36:10 +02:00
Greg Landrum
40ab2c06e3 Fixes #87 2013-08-21 08:20:19 +02:00
Greg Landrum
294cb24de4 fix and test issue 266 2012-11-17 07:39:39 +00:00
Greg Landrum
f5eb640766 fix and test Issue3549146 2012-07-26 14:34:35 +00:00
Greg Landrum
6157345dde this version passes the ZINC natural-products torture test 2012-06-28 06:45:42 +00:00
Greg Landrum
0f3b84cd28 backup commit; we are not quite there yet 2012-06-26 06:15:08 +00:00
Greg Landrum
0085012701 fix and test issue 3525076 2012-05-10 05:35:20 +00:00
Greg Landrum
813f4863ed fix and test Issue 3480481 2012-02-18 05:34:34 +00:00
Greg Landrum
5ad945b9ce some supplier updates 2012-02-10 16:24:42 +00:00
Greg Landrum
044fda398b further ring-finding algorithmic changes to fix issue 3185548 2011-02-19 04:35:53 +00:00
Greg Landrum
aa1610797e initial fix for Issue3184458, more work should still be done here. 2011-02-18 06:31:31 +00:00
Greg Landrum
db1f25b16c chirality support in the fix for issue 2951221 2010-02-14 06:39:51 +00:00
Greg Landrum
8daba8ff30 fix sf.net issue 2951221: note this doesn't add Hs to chiral centers correctly 2010-02-13 14:53:47 +00:00
Greg Landrum
c425bc8c03 fix and test sf.net Issue2788233 2009-06-01 13:04:33 +00:00