rdkit

mirror of https://github.com/rdkit/rdkit.git synced 2026-06-04 21:54:27 +08:00

Author	SHA1	Message	Date
Greg Landrum	af3bb3e78b	Allow partial deserialization of molecules (#4040) * make pickling/depickling conformers optional * make de-pickling properties optional * support the new options in molecule ctors * update doctest	2021-04-24 07:22:55 +02:00
Paolo Tosco	8030c36e5b	Minor tweaks to SubstructLibrary (#3564) * - added some missing const keywords - added an addFingerprint overload to allow passing pointers - added a test * changes in response to review * removed print * added missing shared_ptr declaration * added PatternNumBitsHolder serialization * - merged with upstream changes and resolved conflicts - got rid of PatternNumBitsHolder and leveraged the serialization version to get the PatternHolder to be backwards-compatible * built substructLibV1.pkl with an older version of boost * reverted serialization version to 1 only write numBits if != 2048 and only read numBits if it exists in the archive * bogus commit just to trigger a rebuild	2020-12-09 19:42:38 -05:00
jones-gareth	9a864f4238	Sgroup (#3390) * Changes to use SubstanceGroups in Java * Forgot to add SWIG file * Java test for SubstanceGroup wrappers * Added RDKit boilerplate	2020-09-09 04:59:08 +02:00
Ric	d54e77e375	Add new CIP labelling algorithm (#3234) * add port of centres * Several changes: - Added a test based on RDKit issue 2984 (default RDKit fails it, this gets it right) - Use bond directions for bond stereo (label is no longer required) - Fix bugs in rules 4b and 5new - Fix some mem errors - clang-formatted - some other minor cleanups * Several changes and some improvements: - Added LGPL license, as well as a mention in the doc. - Fix/update/add some comments - Fix typo/bug in Mancude calculation - Fix bug in rules 4b, 5New - Fix Sp2 Bond dir reference - Re clang-format - other minor changes suggested by Dan * Another bunch of changes: - require integer-order bonds; kekulize when required - fix fraction comparison - rename sq Cis/Trans e/z - replace queues with vectors - update copyright notices - revert LGPL changes - fix Asymmetric typo * move to separate lib/mod, add python validation test * Moving away from the original implementation: - Rename to CIPLabeler - Remove the abstraction layer - Remove some stats stuff - Push some CIPMol functions down to Node - Use RDKit's isotope info * Another bundle of changes.
The most relevant ones: - fix parity translation - use cis trans as bond reference -- breaks #2984 test - kill a lot of unused code - use lists for queues - store nodes and edges in digraph - add prefixes to class data member names - update changeRoot() test - use fastFindRings() for mancude rings - update docs - add references to the scientific paper - Document the Mancude functions - Fix Mancude atom types and their comments - remove mol data member from SequenceRule - replace Fraction with boost::rational - update comments, docstrings and the doc * fix building the test * Changes here include: - adding bitset overload for the labeling function - python wrap of the overload - handling trigonal pyramids with implicit H - setting bond labels sets stereo atoms, cis/trans - nix LEFT/RIGHT/TOGETHER/OPPOSITE constants - don't use GLOB in cmake - a decent amount of refactoring * Minor edits to new_CIP_labeling (#6) * Some changes for clarity Added some documentation and changed some variable names to match my understanding. Also ran clang-tidy to ensure that all blocks were brace-enclosed. * Return a reference instead of a copy for performance This is called many times and showed up after some light profiling. This change bumped throughput by about 20% * move out of Graphmol * move .hpp headers to .h * update documentation; add label set of atoms test * Address comments: - Added references to centres to CIPLabeler.h and Python Wrap. - Update validation test to skip sanitization. - Document mancude fractional atomic number calculation. - Use unittest assertions in python test. - Update mancude docstrings to 'resonance' instead of 'tautomers'. - Rename prioritise() to prioritize(). - Add postcondition to check carriers size in Tetrahedral.cpp. - Use getNeighbors() in Tetrahedral.cpp. - Move findStereoAtoms to Chirality namespace. - Move code back into GraphMol. - Fix typos and reformat doc.
* More comments: - Mention why we use boost's unordered map rather than the std one. - Fix include in Python wrapper. * Addressed second batch of comments: - fix the bug in rule 4b - fix docstring for rule 2 - move atomic mass calculation from rule 2 to node - addressed some build warnings - simplify sp2bond::label(comp) - add start/end atoms to Sp2Bond constructor - update system/local includes Co-authored-by: Dan N <dan.nealschneider@schrodinger.com>	2020-07-07 20:34:33 +02:00
Greg Landrum	b55376f284	Adds more options to adjustQueryProperties (#3235) * add documentation * backup * first pass at 5-rings working * add a static method to initialize an empty parameter object * expose static method to python * additional testing * support the single bond adjustments * cleanup * preserve the symbol used in the query from a CTAB * support the way the MDL code adjusts five-ring aromaticity in query rings * in-code documentation * while we're at it, cleanup the way Q and A atoms are handled in the v3k parser * changes in response to review * make this C++14 again. * change in response to review	2020-06-22 09:17:50 -04:00
Greg Landrum	95613b6279	Allow SubstanceGroups to survive molecule edits (#3170) * Progress on #3168 * Fixes #3167 * Fixes #3169 * deal with CBONDS too * test PATOMS * Fixes #3175 * a bit of code simplification and test updates still needs more testing * more testing * handle s-group hierarchy also a couple of other changes in response to the review * add forgotten test file * changes in response to review	2020-05-19 17:35:08 +02:00
Greg Landrum	ab061d532f	fix start/end atoms when wedging bonds (#2861)	2019-12-27 07:21:49 -05:00
Eisuke Kawashima	185ec927ab	Unset executable flag	2019-10-10 20:18:43 +09:00
Greg Landrum	069e920645	Fixes #2224 (#2234) * Fixes #2224 * test the basics	2019-01-21 11:31:02 -05:00
Greg Landrum	d9b06a733b	Fixes #1936 (#1945) * Fixes #1936 doctests of the book still need to be verified * a fix that is related to #1940 * add test for what was actually reported	2018-07-05 11:53:54 -04:00
Paolo Tosco	503b84995c	- make bond stereo detection in rings consistent (#1727)	2018-02-01 04:28:10 +01:00
Greg Landrum	d253aabc86	remove an output file that never should have been checked in	2017-12-05 08:21:35 +01:00
Maciej Wójcikowski	10fbd483bb	[MRG] Fix PDB reader + add argument to toggle proximity bonding (#1629) * Add parameter to skip proximity bonding during PDB reading * Test proximityBonding flag * Remove multivalent Hs and bonds to metals in PDB * Add tests for multivalent Hs and metal unbinding * Remove covalent bonds to waters * Test unbinding of HOHs * Refactor functions * Rename flag for consistency * Include flavor in double bond perception * Add metalorganic test (APW ligand) * Validate input for IsBlacklistedPair and minor changes.	2017-11-15 06:53:31 +01:00
Greg Landrum	64399a46f0	Fixes github1497 (#1555) * move detectBondStereoChemistry() into MolOps * switch more code over to using the new function * add an addStereoChemistryFrom3D() function. Needs testing still. * add some tests * cleanups and rename	2017-09-11 08:37:32 -04:00
Greg Landrum	9dcef9ac57	Fixes #607 (#1075)	2016-09-23 04:57:07 +02:00
Paolo Tosco	8b5176f8c9	- initial work to put the Trajectory code into a separate object	2016-05-09 19:05:15 +01:00
Greg Landrum	027d231e38	add a test (still fails)	2016-03-09 07:27:11 +01:00
Brian Kelley	c5f210e7e7	RDKit learns how to filter PAINS/BRENK/ZINC/NIH via FilterCatalog FilterCatalogs give RDKit the ability to screen out or reject undesirable molecules based on various criteria. Supplied with RDKit are the following filter sets: * PAINS - Pan assay interference patterns. These are separated into three sets PAINS_A, PAINS_B and PAINS_C. Reference: Baell JB, Holloway GA. New Substructure Filters for Removal of Pan Assay Interference Compounds (PAINS) from Screening Libraries and for Their Exclusion in Bioassays. J Med Chem 53 (2010) 2719–40. doi:10.1021/jm901137j. * BRENK - filters unwanted functionality due to potential tox reasons or unfavorable pharmacokinetics. Reference: Brenk R et al. Lessons Learnt from Assembling Screening Libraries for Drug Discovery for Neglected Diseases. ChemMedChem 3 (2008) 435-444. doi:10.1002/cmdc.200700139. * NIH - annotated compounds with problematic functional groups Reference: Doveston R, et al. A Unified Lead-oriented Synthesis of over Fifty Molecular Scaffolds. Org Biomol Chem 13 (2014) 859–65. doi:10.1039/C4OB02287D. Reference: Jadhav A, et al. Quantitative Analyses of Aggregation, Autofluorescence, and Reactivity Artifacts in a Screen for Inhibitors of a Thiol Protease. J Med Chem 53 (2009) 37–51. doi:10.1021/jm901070c. * ZINC - Filtering based on drug-likeness and unwanted functional groups Reference: http://blaster.docking.org/filtering/ The following are C++ and Python examples of how to filter molecules.
[C++] #include <GraphMol/FilterCatalog.h> using namespace RDKit; SmilesMolSupplier suppl(…); // setup the desired catalogs FilterCatalogParams params; params.addCatalog(FilterCatalogParams::PAINS_A); params.addCatalog(FilterCatalogParams::PAINS_B); params.addCatalog(FilterCatalogParams::PAINS_C); // create the catalog FilterCatalog catalog(params); std::unique_ptr<ROMol> mol; // automatically cleans up after us int count = 0; while(!suppl.atEnd()){ mol.reset(suppl.next()); TEST_ASSERT(mol.get()); // Does a PAINS filter hit? if (catalog.hasMatch(*mol)) { std::cerr << "Warning: molecule failed filter " << std::endl; } // More detailed data by retrieving the catalog entry FilterCatalog::CONST_SENTRY entry = catalog.getFirstMatch(*mol); if (entry) { std::cerr << "Warning: molecule failed filter: reason " << entry->getDescription() << std::endl; // get the matched substructure atoms for visualization std::vector<FilterMatch> matches; if (entry->getFilterMatches(*mol, matches)) { for(std::vector<FilterMatch>::const_iterator it = matches.begin(); it != matches.end(); ++it) { // Get the SmartsMatcherBase that matched const FilterMatch & fm = (*it); boost::shared_ptr<SmartsMatcherBase> matchingFilter = fm.filterMatch; // Get the matching atom indices const MatchVectType &vect = fm.atomPairs; for (MatchVectType::const_iterator ait=vect.begin(); ait != vect.end(); ++ait) { int atomIdx = ait->second; } } } } count ++; } // end while Python API import sys from rdkit.Chem import FilterCatalog params = FilterCatalog.FilterCatalogParams() params.AddCatalog(FilterCatalog.FilterCatalogParams.FilterCatalogs.PAINS_A) params.AddCatalog(FilterCatalog.FilterCatalogParams.FilterCatalogs.PAINS_B) params.AddCatalog(FilterCatalog.FilterCatalogParams.FilterCatalogs.PAINS_C) catalog = FilterCatalog.FilterCatalog(params) ...
for mol in mols: if catalog.HasMatch(mol): print("Warning: molecule failed filter", file=sys.stderr) # more detailed entry = catalog.GetFirstMatch(mol) if entry: print("Warning: molecule failed filter: reason %s"%( entry.GetDescription()), file=sys.stderr) # get to the atoms involved in the substructure # there may be many matching filters here... for filterMatch in entry.GetFilterMatches(mol): filter = filterMatch.filterMatch # get a description of the matching filter print(filter) for queryAtomIdx, atomIdx in filterMatch.atomPairs: # do something with the substructure matches Advanced FilterCatalogs are fully serializable and can be stored for later use. To serialize a catalog, use the catalog.Serialize() method. std::string pickle = catalog.Serialize(); To deserialize, pass the resulting string to the constructor: FilterCatalog catalog(pickle); The underlying matchers can be arbitrarily complicated and new ones with more complicated semantics can be created. The default matching objects are: SmartsMatcher - match a smarts pattern or query molecule with a minimum and maximum count ExclusionList - returns false if any of the supplied matches exist And - combine two matchers Or - true if any of two matchers are true Not - invert the match (note that this can have confusing semantics when dealing with substructure matches) Entries can be added at any time to a catalog: ExclusionList excludedList; excludedList.addPattern(SmartsMatcher("Pattern 1", smarts)); excludedList.addPattern(SmartsMatcher("Pattern 2", smarts2)); A FilterCatalog supports a few different types of matching. One is a traditional rejection filter where if a substructure exists in the target molecule, the molecule is rejected. These types of queries can indicate the substructure that triggered the rejection through the FilterCatalogEntry::GetMatch(mol) function. The FilterCatalog also supports acceptance filters, which are designed to indicate which molecules are ok.
These have to be transformed into rejection filters or simply wrapped in a Not( acceptanceFilter ) when entered into the catalog. For example, from Zinc: carbons [#6] 40 means that we have a maximum of 40 carbon atoms. We can write this by converting the max count to a min count (i.e. the pattern is triggered when the molecule has at least minCount atoms): const unsigned int minCount = 40+1; SmartsMatcher( "Too many carbons", "[#6]", minCount ); This can be properly substructure searched. Or we can wrap this in a Not: const unsigned int minCount = 0; const unsigned int maxCount = 40; Not( SmartsMatcher( "ok number of carbons", "[#6]", minCount, maxCount) ); Note: Wrapping in a Not loses the ability to highlight the rejecting pattern when visualizing the molecule.	2015-07-14 10:31:31 -04:00
Nadine Schneider	0cf0dd37ce	Bugfix in SmilesWrite and some additional tests for getMolFrags function	2015-04-16 10:53:20 +02:00
Nadine Schneider	5d963846b8	merge	2015-04-10 09:44:18 +02:00
Greg Landrum	74125f685c	Fixes #443	2015-03-05 06:38:38 +01:00
Greg Landrum	ad62f6241a	update coords	2015-01-09 09:58:20 +01:00
Greg Landrum	baf26c053c	not fixed; still lots of debugging printing; backup commit	2015-01-09 06:33:49 +01:00
Greg Landrum	1f4c2e915c	fix a nasty canonicalization problem: need to be sure to sort neighbors by their ranks in a decreasing order	2015-01-07 20:46:08 +01:00
Greg Landrum	23076b1cdb	Fixes #298	2014-07-23 05:31:16 +02:00
Greg Landrum	86b9e6b089	Fixes #72	2013-08-25 06:36:10 +02:00
Greg Landrum	40ab2c06e3	Fixes #87	2013-08-21 08:20:19 +02:00
Greg Landrum	294cb24de4	fix and test issue 266	2012-11-17 07:39:39 +00:00
Greg Landrum	f5eb640766	fix and test Issue3549146	2012-07-26 14:34:35 +00:00
Greg Landrum	6157345dde	this version passes the ZINC natural-products torture test	2012-06-28 06:45:42 +00:00
Greg Landrum	0f3b84cd28	backup commit; we are not quite there yet	2012-06-26 06:15:08 +00:00
Greg Landrum	0085012701	fix and test issue 3525076	2012-05-10 05:35:20 +00:00
Greg Landrum	813f4863ed	fix and test Issue 3480481	2012-02-18 05:34:34 +00:00
Greg Landrum	5ad945b9ce	some supplier updates	2012-02-10 16:24:42 +00:00
Greg Landrum	044fda398b	further ring-finding algorithmic changes to fix issue 3185548	2011-02-19 04:35:53 +00:00
Greg Landrum	aa1610797e	initial fix for Issue3184458, more work should still be done here.	2011-02-18 06:31:31 +00:00
Greg Landrum	db1f25b16c	chirality support in the fix for issue 2951221	2010-02-14 06:39:51 +00:00
Greg Landrum	8daba8ff30	fix sf.net issue 2951221: note this doesn't add Hs to chiral centers correctly	2010-02-13 14:53:47 +00:00
Greg Landrum	c425bc8c03	fix and test sf.net Issue2788233	2009-06-01 13:04:33 +00:00
Greg Landrum	b78abe26d6	fix and test sf.net issue 2787221	2009-05-05 12:54:41 +00:00
Greg Landrum	09455ef42e	resolves issue 2705543	2009-03-28 06:43:04 +00:00
Greg Landrum	a412664c65	just for backup	2009-03-27 16:12:54 +00:00
Greg Landrum	c0e1ffdc65	fix and test Issue2313979	2008-12-02 08:50:51 +00:00
Greg Landrum	eef831c2bf	fix and test issue 2316677; there probably should be a hand-crafted test for this as well, but that is going to be tricky	2008-11-20 05:55:46 +00:00
Greg Landrum	fc6c2b2f9a	fixes to issues 2196817 and 2208994	2008-10-30 06:56:39 +00:00
Greg Landrum	023f7b4f0f	Merge changes from the iterative chirality branch: https://rdkit.svn.sourceforge.net/svnroot/rdkit/branches/IterativeChirality_20Aug2008 into the trunk. This covers revs 798-828. Dependent chirality should now be correctly handled, but the handling of ring stereochemistry, i.e. things like: C[C@H]1CC[C@H](C)CC1 is still not 100% correct.	2008-09-19 09:40:15 +00:00
Greg Landrum	148a3a87c4	add getTotalDegree() method to Atom; support generating stereochem information from 3D coords	2008-05-26 14:54:35 +00:00
Greg Landrum	75a79b6327	initial import	2006-05-06 22:20:08 +00:00

48 Commits