rdkit

mirror of https://github.com/rdkit/rdkit.git synced 2026-06-05 22:04:27 +08:00

Author	SHA1	Message	Date
Rachel Walker	e1322f73c6	Sped up SSSR by not storing every path back to root (#3333 ) * Sped up SSSR by not storing every path back to root This change speeds up ring performance by not storing every path back to the root. Instead, it keeps track of parents and rebuilds paths from the parents once a cycle is found. It also stops the BFS once the depth of the BFS is larger than the smallest ring (i.e., we found a path that is longer than the smallest ring). Before this commit: 3EOH: 0.72s 2J3N: 0.26s 1NKS: 0.018s After this commit: 3EOH: 0.35s 2J3N: 0.07s 1NKS: 0.007s * Fixed ordering of atoms within SSSR rings Co-authored-by: Rachel Walker <rachel.walker@schrodinger.com>	2020-08-15 06:00:40 +02:00
Greg Landrum	a9010da8a4	Small bug fixes and cleanups from fuzz testing (#3299 ) * fix ossfuzz issue 24074 * fix ossfuzz issue 23896 * switch to throw exceptions when reading ints/floats * remove extraneous benchmarking code * change type of AH query * confirm an invariant while finding rings * no sense in adding these tests to github * switch to use fail() instead of failbit switch to acceptSpaces by default	2020-07-22 16:57:31 +02:00
Dan N	17c49b8b8b	Speed up ring finding by skipping nodes not in rings (#3254 ) When finding rings - If we've exhaustively searched an atom and found no rings, we should mark the bonds to that atom as "not in a ring". We can also mark any neighboring low degree atoms in the same way. This speeds up searches in large molecules, because a ring search that _doesn't_ find any rings is very expensive, and it's a bummer to pay for that search on two neighboring atoms, for instance.	2020-06-26 05:42:34 +02:00
Dan N	de869ef017	Improve SSSR performance for large molecules (#3236 ) RDKit sanitization is a bottleneck for some RDKit workflows that I have. Particularly for large molecules, SSSR is pretty slow. This speeds up SSSR from 6s to 2.5s for a protein that I was looking at (3EOH). I think there is room for improvement, but this is easy. Schrodinger's SSSR takes 0.02s for that same molecule (not a completely fair comparison, since Schrodinger doesn't use symmetrized SSSR). Co-authored-by: Greg Landrum <greg.landrum@gmail.com>	2020-06-21 04:48:17 +02:00
Greg Landrum	edd922c99c	Cleanup warnings from clang-10 (#3238 ) * stop returning local memory in exceptions * remove a couple unnecessary copies in loops * fix a bug in the way the default MMFF aromatic parameters are constructed * remove a bunch of loop-variable warnings * remove a bunch of clang warnings * disable clang warnings in python wrappers * remove some warnings when building the python wrappers	2020-06-19 17:16:22 -04:00
Ric	66a38d3751	Address build warnings (#3082 ) * do not throw in destructor * remove unused var; reserve * provide operator= for DiscreteValueVect * provide operator= for SparseIntVect * remove unknown 'omp' #pragmas; refactor loop * remove unused var * remove unused variables * give EmbeddedAtom a default constructor & early exit on self assignment * handle unused vars/args * catch exception by ref * address unused args * fix signed type comparison; refactor extra checks * remove unused variable * suppress switch fallthrough warning * handle signed type comparison * handle signed type comparison * potentially uninitialized vars * fix abs() of bool * unused vars in catch statements * remove unused variables * python::list returns will be copied * give ValidationMethod constructor & virtual destructor * remove extra semicolon	2020-04-17 14:40:15 +02:00
Greg Landrum	3851380800	Remove bogus URFLib library (#2900 ) * a couple of URF building cleanups * java wrapper build cleanups * no longer need URF.cpp	2020-01-28 08:52:58 -05:00
Greg Landrum	d41752d558	run clang-tidy with readability-braces-around-statements (#2899 ) * run clang-tidy with readability-braces-around-statements clang-format the results clean up all the parts that clang-tidy-8 broke * fix problem on windows	2020-01-25 14:19:32 +01:00
Eisuke Kawashima	5cd27a242f	Fix typo (#2862 ) * Fix typo * Reflect the comments * Fix more typos	2019-12-31 06:43:27 +01:00
Eisuke Kawashima	dc7cc84a0c	Fix typo [ci skip]	2019-10-17 17:45:50 +09:00
Greg Landrum	5dfd67a22a	Add new mol hashing code (#2636 ) * copy in, get building, add some basic tests * complete the testing Except for regioisomers, which do not work * regioisomers work now * backup commit; things work * remove last of NM macros from hashfunctions.cpp * remove last of NM macros from hashfunctions.cpp * remove dependency on the abstraction layer * typo * start using namespaces clang-format * switch to using enums for the HashFunctions and StripTypes * Add initial python wrapper (and tests) * move the new hashing code to the MolHash library still may want to revise the naming of this * Setup deprecation of the older hashing code * better release notes text * change in response to review	2019-09-24 07:55:21 -04:00
Greg Landrum	dd21db1b06	Integrate Unique Ring Families from RingDecomposerLib (#2558 ) * add the ring decomposer lib (temporarily?) * simplify makefile * very basics work * backup * basics working * builds and basic tests pass * get this building again * expose the ring families * add tests on the python side * make the pywrapper for this optional * remove some extra bits * cleanup * switch to using RDL as an external project * make sure this still works if we do not use the URF code * remove BUILD_ALWAYS * fix linkage of Java wrapper and cartridge (hopefully) * fix cmake for wrappers (hopefully) * forgot a semicolon * try to force URF lib to build first * improve memory management and interface * fix dependency specifier * make pointer initialization explicit This may not be necessary, but it feels safer. * not pleasing and needs to be cleaned up but it builds * not pleasing and needs to be cleaned up but it builds * cleanup in preparation for merging * cleanup in preparation for merging * switch to rareylab repo * fix updated copyright date * Fix updated copyright date * switch to a specific library tag Co-Authored-By: Florian Flachsenberg <flachsenberg@zbh.uni-hamburg.de> * change in response to review	2019-07-30 06:41:55 -04:00
Greg Landrum	d8c49e6dab	Code cleanups from PVS/Studio (#2531 ) * first round of cleanups based on PVS-studio suggestions * a couple more * a few more cleanups * another round of cleanups * undo one of those cleanups we want the integer rounding behavior here * add a comment to make that clear * Fix for filter catalog PRECONDITION redundancy	2019-07-13 07:25:37 +02:00
Dan N	47acdc8b73	Issue #2403 : Speed up SSSR symmetrization (#2410 ) * Issue #2403: Speed up SSSR symmetrization For my horrible example molecule (a highly symmetric nanotube with 2400 atoms and > 1000 rings), this speeds up symmetrizeSSSR() from 5s to about 0.002s. findSSSR() takes another .4s or so. * Refactor after Ricardo's suggestions * Greg's review comments. use std::vector	2019-04-18 07:11:15 +02:00
Brian Kelley	373a89021e	Change boost int types to std types (#2233 )	2019-01-22 17:45:03 +01:00
Greg Landrum	ba40ecaca1	Fixes #299 (#2100 ) add a fallback for when the original algorithm fails	2018-10-11 17:24:15 -04:00
Greg Landrum	ba12d98ad0	Removes ATOM/BOND_SPTR in boost::graph in favor of raw pointers (#1713 ) * Removes ATOM/BOND_SPTR in boost::graph in favor of raw pointers * Actually delete atoms and bonds... * RWMol::clear now calls destroy to handle atom/bond deletion * Changes broken Atom lookup for windows/gcc * Adds tests for running with valgrind * Adds test designed for valgrind and molecule deletions * Removes RNG, actually tests bond deletions * update swig wrappers * deal with most recent changes on the main branch	2018-01-07 14:19:47 -05:00
Greg Landrum	87786c08b5	Merge branch 'master' into modern_cxx # Conflicts: # .travis.yml # Code/GraphMol/FileParsers/MolFileParser.cpp # Code/GraphMol/FileParsers/MolFileStereochem.cpp # Code/GraphMol/ForceFieldHelpers/UFF/testUFFHelpers.cpp # Code/GraphMol/MolAlign/testMolAlign.cpp # Code/GraphMol/MolDraw2D/MolDraw2D.cpp # Code/GraphMol/MolDraw2D/Wrap/rdMolDraw2D.cpp # Code/GraphMol/QueryOps.cpp # Code/GraphMol/ROMol.cpp # Code/GraphMol/SmilesParse/test.cpp # Code/GraphMol/Trajectory/Trajectory.cpp # Code/GraphMol/Wrap/Atom.cpp # Code/GraphMol/Wrap/Bond.cpp # Code/GraphMol/new_canon.cpp # Code/RDGeneral/testDict.cpp # Code/SimDivPickers/Wrap/MaxMinPicker.cpp	2017-10-05 05:58:38 +02:00
Greg Landrum	83691e4f16	Fixes #1281 in a way (#1553 ) This actually just causes the molecule processing to fail in a reasonable amount of time; it is not an actual fix to the underlying ring-finding problem	2017-09-08 12:10:07 -04:00
Greg Landrum	f6ced134f0	a number of other small changes from manually reviewing the PR	2017-04-22 17:24:57 +02:00
Greg Landrum	915cf08faa	run clang-format with c++-11 style over that	2017-04-22 17:19:10 +02:00
Greg Landrum	7c0bb0b743	clang-tidy output	2017-04-22 17:09:24 +02:00
Brian Cole	893fa41e98	SSSR performance improvements to support larger systems (#1131 ) * findSSSR performance improvements for fragments without rings This makes Chem.SanitizeMol significantly faster when dealing with molecules with lots of disconnected fragments (like a box of water). The following is the runtime of Chem.SanitizeMol while adding 10,000 waters with explicit hydrogens when running Chem.SanitizeMol on every 1,000th water added. Before: 0 add_water = 0.00007s 0 Chem.SanitizeMol = 0.01991s 1000 add_water = 0.00009s 1000 Chem.SanitizeMol = 0.99659s 2000 add_water = 0.00013s 2000 Chem.SanitizeMol = 3.94565s 3000 add_water = 0.00018s 3000 Chem.SanitizeMol = 8.94760s 4000 add_water = 0.00023s 4000 Chem.SanitizeMol = 15.75187s 5000 add_water = 0.00035s 5000 Chem.SanitizeMol = 24.59318s 6000 add_water = 0.00048s 6000 Chem.SanitizeMol = 37.23530s 7000 add_water = 0.00042s 7000 Chem.SanitizeMol = 47.70860s 8000 add_water = 0.00105s 8000 Chem.SanitizeMol = 62.21912s 9000 add_water = 0.00056s 9000 Chem.SanitizeMol = 80.08511s After: 0 add_water = 0.00003s 0 Chem.SanitizeMol = 0.01219s 1000 add_water = 0.00004s 1000 Chem.SanitizeMol = 0.01004s 2000 add_water = 0.00012s 2000 Chem.SanitizeMol = 0.01058s 3000 add_water = 0.00018s 3000 Chem.SanitizeMol = 0.01158s 4000 add_water = 0.00018s 4000 Chem.SanitizeMol = 0.01530s 5000 add_water = 0.00022s 5000 Chem.SanitizeMol = 0.02010s 6000 add_water = 0.00036s 6000 Chem.SanitizeMol = 0.02397s 7000 add_water = 0.00033s 7000 Chem.SanitizeMol = 0.02978s 8000 add_water = 0.00037s 8000 Chem.SanitizeMol = 0.04446s 9000 add_water = 0.00040s 9000 Chem.SanitizeMol = 0.04419s * Refactor new_timings.py script a bit to be able to run only the first (reading molecules) test. * Removing O(N^2) behavior of finding the number of bonds in the fragment during SSSR. This only improves the case when there are long chains and a small number of rings in the fragment. 
Many ring systems are still dominated by the rest of the SSSR algorithm, and fragments with no ring systems don't reach this part of the code. For a test case with a single cyclopropane and adding carbons while calling Chem.SanitizeMol every 10,000 carbons added yields the following improvement in performance: before: 0 add_carbon = 0.00001s 0 Chem.SanitizeMol = 0.01237s 10000 add_carbon = 0.00017s 10000 Chem.SanitizeMol = 0.04453s 20000 add_carbon = 0.00017s 20000 Chem.SanitizeMol = 0.13038s 30000 add_carbon = 0.00029s 30000 Chem.SanitizeMol = 0.27671s 40000 add_carbon = 0.00063s 40000 Chem.SanitizeMol = 0.44774s 50000 add_carbon = 0.00106s 50000 Chem.SanitizeMol = 0.69433s 60000 add_carbon = 0.00181s 60000 Chem.SanitizeMol = 1.00577s after: 0 add_carbon = 0.00001s 0 Chem.SanitizeMol = 0.01264s 10000 add_carbon = 0.00013s 10000 Chem.SanitizeMol = 0.01349s 20000 add_carbon = 0.00022s 20000 Chem.SanitizeMol = 0.02724s 30000 add_carbon = 0.00040s 30000 Chem.SanitizeMol = 0.04292s 40000 add_carbon = 0.00076s 40000 Chem.SanitizeMol = 0.06172s 50000 add_carbon = 0.00193s 50000 Chem.SanitizeMol = 0.07658s 60000 add_carbon = 0.00147s 60000 Chem.SanitizeMol = 0.08625s Note, couldn't actually test a higher number of carbons as it led to a stack overflow due to recursion in findSSSR.	2016-10-29 04:38:14 +02:00
Greg Landrum	7a49dd3bb9	fixes #1023 (#1027 )	2016-08-18 16:29:29 -04:00
kelley	5dbec2fe85	Adds rdcasts where appropriate	2015-11-29 17:52:27 -05:00
Greg Landrum	e08e0d16d8	first pass, using google style	2015-11-14 14:58:11 +01:00
Greg Landrum	e37296d7c7	post review	2015-11-14 08:08:14 +01:00
Brian Kelley	5f59333a56	Silences unused parameters	2015-10-18 14:02:29 -04:00
Greg Landrum	a7a2ee9a62	Fixes #526	2015-06-20 04:54:54 +02:00
Riccardo Vianello	7c346d7c2e	Code/RDBoost/Exceptions.h moved to Code/RDGeneral	2015-03-16 22:31:48 +01:00
David Hall	37f6cb0f88	Increase limit for smallest ring size Story: I have a PDB I want to read into RDKit. It has a disulfide bond between two cysteines ~400 residues apart. This creates a very large ring. RDKit throws an error because the number of found rings is less than the expected number of rings. The ring wasn't found because RDKit thought all "smallest" rings should be 256 or smaller. Now, as long as your ring is UINT_MAX aka 4,294,967,295 or smaller, life is beautiful. I hope no one has a ring bigger than 4 billion atoms.	2015-02-04 16:40:26 -05:00
Brian Kelley	95a92282d1	Dictionary access is sanitized and optimized. o rdkit gains a RDKit::common_properties namespace that contains common string value properties o Dict.h and below gain getPropIfPresent that attempts to retrieve a property and returns true/false on success or failure. This is used to optimize access. o rdkit learns how to pass property keys by reference, not value. A new namespace has been added to RDKit, common_properties that contains the std::string values for commonly used properties. This helps to avoid typos in string values but also avoids a creation of std::strings from character values. All accessors (has/get/clear and getPropIfPresent) now pass the key by reference. Additionally, getPropIfPresent removes the double lookup of hasProp/getProp which can be a significant speedup in the smiles and smarts parsers (10-20%)	2015-01-15 12:23:29 -05:00
Greg Landrum	f5cf3322fe	code cleanup: removing compiler warnings	2014-05-08 06:06:07 +02:00
Greg Landrum	75be63fd6b	merge with trunk	2014-02-09 05:00:18 +01:00
Greg Landrum	9f4471f872	more on #204 A few other cleanups	2014-02-06 06:43:28 +01:00
Greg Landrum	64366007e1	more C++ style cleanups	2014-01-01 17:16:25 +01:00
Greg Landrum	0f92877061	stop messing up aromaticity and hybridization; all tests pass, but this is definitely not the most efficient thing.	2013-11-30 08:07:09 +01:00
Greg Landrum	a4734bbd43	start using the alternate getProp form	2013-07-20 07:26:06 -04:00
Greg Landrum	294cb24de4	fix and test issue 266	2012-11-17 07:39:39 +00:00
Greg Landrum	b67dd0f437	another bit of fixing sf.net issue 249	2012-09-01 04:06:02 +00:00
Greg Landrum	2d7315e8d4	fix and test sf.net issue 249	2012-08-30 05:45:59 +00:00
Greg Landrum	b5d1ad394e	accelerate FastFindRings	2012-08-29 02:28:03 +00:00
Greg Landrum	c6aefe9912	fix and test issue 3514824	2012-04-10 04:14:52 +00:00
Greg Landrum	acb61df900	update/reformat	2012-01-17 03:25:58 +00:00
Greg Landrum	3eeb1bf30c	initial version of a DFS-based ring finder	2012-01-02 15:34:43 +00:00
Greg Landrum	a99ad44859	clean up some compiler warnings	2011-12-31 15:58:57 +00:00
Greg Landrum	044fda398b	further ring-finding algorithmic changes to fix issue 3185548	2011-02-19 04:35:53 +00:00
Greg Landrum	aa1610797e	initial fix for Issue3184458, more work should still be done here.	2011-02-18 06:31:31 +00:00
Greg Landrum	3b3d44db16	remove exe property from source files	2011-01-13 04:22:56 +00:00
Greg Landrum	f3fbef45c5	update copyright statements	2010-09-26 17:04:37 +00:00
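
Note on commit e1322f73c6: the idea it describes (keep only a parent pointer per node during the BFS, rebuild the two paths once a ring-closing edge is found, and stop once the search is deeper than the smallest ring already found) can be illustrated roughly as below. This is a minimal standalone sketch and not RDKit's code: the function names, the adjacency-dict input, and the restriction to rings containing the start node are assumptions made for illustration only.

```python
from collections import deque


def path_to_root(parent, node):
    """Rebuild node -> ... -> root by following parent pointers."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


def smallest_ring_through(adjacency, root):
    """Return the smallest ring containing `root` as a list of nodes,
    or None if `root` is not in any ring.

    `adjacency` (hypothetical input format) maps each node to a list of
    its neighbours in a simple undirected graph.
    """
    parent = {root: None}   # one parent per node instead of full paths
    depth = {root: 0}
    queue = deque([root])
    best = None
    while queue:
        u = queue.popleft()
        # A ring closed from depth d has at least 2*d nodes, so nothing
        # found from here on can beat the current best.
        if best is not None and 2 * depth[u] >= len(best):
            break
        for v in adjacency[u]:
            if v not in parent:
                parent[v] = u
                depth[v] = depth[u] + 1
                queue.append(v)
            elif v != parent[u]:
                # Ring-closing edge: rebuild both paths from the parents.
                pu = path_to_root(parent, u)
                pv = path_to_root(parent, v)
                # Paths that meet before the root do not form a simple
                # ring through the root; skip them.
                if set(pu[:-1]) & set(pv[:-1]):
                    continue
                ring = pu[::-1] + pv[:-1]
                if best is None or len(ring) < len(best):
                    best = ring
    return best


# Six-membered ring 0..5 with a one-atom tail (node 6) attached to node 5.
hexagon = {0: [1, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4],
           4: [3, 5], 5: [4, 0, 6], 6: [5]}
# Fused system: three-membered ring 0-1-2 sharing bond 0-2 with ring 0-2-3-4.
fused = {0: [1, 2, 4], 1: [0, 2], 2: [1, 0, 3], 3: [2, 4], 4: [3, 0]}
```

Storing one parent per node keeps memory proportional to the number of visited nodes, and the path reconstruction cost is only paid when a ring-closing edge is actually encountered.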


65 Commits