27 Commits

Author SHA1 Message Date
Greg Landrum
a9477d2694 Modernization of some substructure code (#8450)
* use std::span for substruct match callbacks

This removes a copy from every evaluation of potential matches

* some cleanup/modernization

* some modernization

* deprecate chiralAtomCompat

* small optimization

* remove naked pointers

* improve new_timings.py script

* changes suggested in review

* response to review

* response to review
2025-05-12 06:33:25 +02:00
Greg Landrum
bb066c43f4 add mol processing API (#7773)
* fix aliasing bug in MultithreadedSDMolSupplier

* update GeneralFileReader to v2 API

* add backwards incompatibility note

* v1 of this

* The helper function needs to be inline

* forgot the tests

* allow non-threadsafe builds

* MultithreadedMolSuppliers can now be destroyed without being used.

This was previously not possible

* add callbacks to the multithreaded readers

* document the new functions

* switch to storing the queues in unique_ptrs

* does not work

* only do those tests when in MT mode

* more generalfilereader cleanup

* tests pass

* passes tests

* extremely basic python wrapper

* better wrapper

* does not work

* tests pass

* test data

* fix failing test on ARM macs

We need to follow up on why the wedging is different here.

* move some stuff to the cpp file

the idea is to have the windows DLL builds not break

* fix(?) win64 linkage problems

* remove a warning in non-multi-threaded builds

* fix non-multi-threaded work

* well, at least local windows builds work

* remove duplicated code

* refactoring
simplification?

* simplify mutex handling

* review suggestions
2024-09-19 18:42:25 +02:00
Ric
880a8e5725 Reformat Python code for 2023.03 release (#6294)
* run yapf

* run isort

---------

Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
2023-04-28 06:53:56 +02:00
Guy Rosin
135dfc59b5 Fix typo: quarternary --> quaternary (#5243) 2022-04-29 08:09:34 +02:00
Greg Landrum
36599e9ac3 add a script for benchmarking fingerprint screenout and substructure performance (#2523)
* add fingerprint_screenout script to benchmarking suite

* cleanup

* simplification
2019-07-03 04:54:10 +02:00
Greg Landrum
6dd6a9a4ec update benchmarking scripts 2019-04-03 08:02:39 +02:00
Greg Landrum
24f1737839 Remove a bunch of Python2-related warts (#2315)
* remove all of the "from __future__" imports

* remove the first batch of rdkit.six imports/uses

* next step of rdkit.six removal

* removing xrange, range, and some maps

* next round of removals

* next round of cleanups

* fix inchi test

* last bits of "from rdkit.six" are gone

* and the last of the six stuff is gone

* strange importlib problem
2019-03-06 20:43:49 -05:00
Brian Cole
893fa41e98 SSSR performance improvements to support larger systems (#1131)
* findSSSR performance improvements for fragments without rings

This makes Chem.SanitizeMol significantly faster when dealing with
molecules with lots of disconnected fragments (like a box of water).

The following timings were collected while adding 10,000 waters with
explicit hydrogens, calling Chem.SanitizeMol after every 1,000th water
added.

Before:
0 add_water = 0.00007s
0 Chem.SanitizeMol = 0.01991s
1000 add_water = 0.00009s
1000 Chem.SanitizeMol = 0.99659s
2000 add_water = 0.00013s
2000 Chem.SanitizeMol = 3.94565s
3000 add_water = 0.00018s
3000 Chem.SanitizeMol = 8.94760s
4000 add_water = 0.00023s
4000 Chem.SanitizeMol = 15.75187s
5000 add_water = 0.00035s
5000 Chem.SanitizeMol = 24.59318s
6000 add_water = 0.00048s
6000 Chem.SanitizeMol = 37.23530s
7000 add_water = 0.00042s
7000 Chem.SanitizeMol = 47.70860s
8000 add_water = 0.00105s
8000 Chem.SanitizeMol = 62.21912s
9000 add_water = 0.00056s
9000 Chem.SanitizeMol = 80.08511s

After:

0 add_water = 0.00003s
0 Chem.SanitizeMol = 0.01219s
1000 add_water = 0.00004s
1000 Chem.SanitizeMol = 0.01004s
2000 add_water = 0.00012s
2000 Chem.SanitizeMol = 0.01058s
3000 add_water = 0.00018s
3000 Chem.SanitizeMol = 0.01158s
4000 add_water = 0.00018s
4000 Chem.SanitizeMol = 0.01530s
5000 add_water = 0.00022s
5000 Chem.SanitizeMol = 0.02010s
6000 add_water = 0.00036s
6000 Chem.SanitizeMol = 0.02397s
7000 add_water = 0.00033s
7000 Chem.SanitizeMol = 0.02978s
8000 add_water = 0.00037s
8000 Chem.SanitizeMol = 0.04446s
9000 add_water = 0.00040s
9000 Chem.SanitizeMol = 0.04419s
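
The commit message does not say how ring-free fragments are recognized. As a rough illustration only, one standard check is the cycle rank: a connected fragment with A atoms and B bonds contains B - A + 1 independent rings, so a rank of 0 means the ring search can skip it. The sketch below is plain Python on a bond list, not RDKit code; `ring_counts_per_fragment` and `box_of_water` are hypothetical names introduced for this example.

```python
from collections import defaultdict


def ring_counts_per_fragment(num_atoms, bonds):
    """Return the cycle rank (bonds - atoms + 1) of each connected fragment.

    `bonds` is a list of (i, j) atom-index pairs. A rank of 0 means the
    fragment is acyclic, so a ring search could skip it entirely.
    Hypothetical helper for illustration; not part of RDKit.
    """
    parent = list(range(num_atoms))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the two fragments joined by each bond.
    for i, j in bonds:
        parent[find(i)] = find(j)

    atoms_in = defaultdict(int)
    bonds_in = defaultdict(int)
    for i in range(num_atoms):
        atoms_in[find(i)] += 1
    for i, _ in bonds:
        bonds_in[find(i)] += 1

    return [bonds_in[root] - atoms_in[root] + 1 for root in sorted(atoms_in)]


def box_of_water(n):
    """n disconnected H-O-H fragments: atom 3k is O, 3k+1 and 3k+2 are H."""
    bonds = []
    for k in range(n):
        bonds += [(3 * k, 3 * k + 1), (3 * k, 3 * k + 2)]
    return 3 * n, bonds


num_atoms, bonds = box_of_water(1000)
ranks = ring_counts_per_fragment(num_atoms, bonds)
# Fragments that would still need a ring search; empty for a box of water.
needs_ring_search = [r for r in ranks if r > 0]
```

With this kind of check, the cost of handling a box of water grows with the number of atoms and bonds rather than with the cost of a full ring search per fragment.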

* Refactor new_timings.py script a bit to be able to run only the first (reading molecules) test.

* Removing O(N^2) behavior of finding the number of bonds in the fragment during SSSR.

This only improves the case when there are long chains and a small
number of rings in the fragment. Many ring systems are still dominated
by the rest of the SSSR algorithm, and fragments with no ring systems
don't reach this part of the code.
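
As a generic illustration of this kind of fix (the actual findSSSR change is not shown in this log), the sketch below contrasts a bond count that rescans the whole bond list for every atom with one that visits each bond once. It is plain Python, not RDKit code; both function names and the `frag_of_atom` mapping are assumptions made for this example.

```python
def bonds_per_fragment_quadratic(frag_of_atom, bonds):
    """For every atom, rescan the whole bond list: O(atoms * bonds)."""
    counts = {}
    for atom, frag in enumerate(frag_of_atom):
        for i, _ in bonds:
            if i == atom:  # count each bond once, at its first atom
                counts[frag] = counts.get(frag, 0) + 1
    return counts


def bonds_per_fragment_linear(frag_of_atom, bonds):
    """Visit each bond once and credit it to its fragment: O(bonds)."""
    counts = {}
    for i, _ in bonds:
        frag = frag_of_atom[i]
        counts[frag] = counts.get(frag, 0) + 1
    return counts


# A cyclopropane (atoms 0-2) with a carbon chain grown from atom 2,
# all in a single fragment labelled 0.
n_chain = 500
bonds = [(0, 1), (1, 2), (2, 0)] + [(k, k + 1) for k in range(2, 2 + n_chain)]
frag_of_atom = [0] * (3 + n_chain)

slow = bonds_per_fragment_quadratic(frag_of_atom, bonds)
fast = bonds_per_fragment_linear(frag_of_atom, bonds)
```

Both functions return the same counts; only the amount of work differs, which is why the gain shows up for long chains where the fragment is large but the ring search itself is cheap.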

For a test case that starts from a single cyclopropane and adds carbons,
calling Chem.SanitizeMol after every 10,000 carbons added, this yields the
following improvement in performance:

before:
0 add_carbon = 0.00001s
0 Chem.SanitizeMol = 0.01237s
10000 add_carbon = 0.00017s
10000 Chem.SanitizeMol = 0.04453s
20000 add_carbon = 0.00017s
20000 Chem.SanitizeMol = 0.13038s
30000 add_carbon = 0.00029s
30000 Chem.SanitizeMol = 0.27671s
40000 add_carbon = 0.00063s
40000 Chem.SanitizeMol = 0.44774s
50000 add_carbon = 0.00106s
50000 Chem.SanitizeMol = 0.69433s
60000 add_carbon = 0.00181s
60000 Chem.SanitizeMol = 1.00577s

after:

0 add_carbon = 0.00001s
0 Chem.SanitizeMol = 0.01264s
10000 add_carbon = 0.00013s
10000 Chem.SanitizeMol = 0.01349s
20000 add_carbon = 0.00022s
20000 Chem.SanitizeMol = 0.02724s
30000 add_carbon = 0.00040s
30000 Chem.SanitizeMol = 0.04292s
40000 add_carbon = 0.00076s
40000 Chem.SanitizeMol = 0.06172s
50000 add_carbon = 0.00193s
50000 Chem.SanitizeMol = 0.07658s
60000 add_carbon = 0.00147s
60000 Chem.SanitizeMol = 0.08625s

Note: a higher number of carbons could not actually be tested, as it led
to a stack overflow due to recursion in findSSSR.
2016-10-29 04:38:14 +02:00
gedeck
e9af48ffd7 Issue1071/yapf (#1078)
* Issue #1071: add yapf configuration file

* yapf formatting of Code directory

* yapf formatting of Contrib directory

* yapf formatting of Data directory

* yapf formatting of Docs directory

* yapf formatting of External directory

* yapf formatting of Projects directory

* yapf formatting of Regress directory

* yapf formatting of Scripts directory

* yapf formatting of Web directory

* yapf formatting of rdkit directory
2016-09-23 04:58:46 +02:00
Paolo Tosco
0b1831b3e4 Timings on Windows with Python 3 (#1067)
* Small change to enable timings to be collected on Windows with Python 3

* better Python 3 fix for timings.py
2016-09-20 04:59:57 +02:00
Greg Landrum
b8d25f431f get the timing benchmarks working on py3 2015-11-05 03:25:34 +01:00
Andrew Dalke
866e6b831d removed bare exception and ported it to Python 2.x 2015-09-07 13:19:02 +02:00
Greg Landrum
31a1fd0101 check for failing embedding as well 2015-08-17 04:27:36 +02:00
Greg Landrum
ed9c96f544 still not really working 2015-08-15 14:26:52 +02:00
Greg Landrum
ee850b232f progress on #563 2015-08-14 07:23:18 +02:00
Greg Landrum
55f6fed10d progress on #563 2015-08-14 07:19:52 +02:00
Greg Landrum
73dd7e7265 add more timing tests 2015-02-17 03:48:12 +01:00
Greg Landrum
241bfc67a1 add new timing script 2015-02-13 06:18:36 +01:00
Riccardo Vianello
95f60d21bc python3 portability fixes for pandas and the ipython notebook 2014-09-11 23:49:45 +02:00
Greg Landrum
aa7095984e merge 2013-12-03 05:11:30 +01:00
Greg Landrum
e2b6435614 update cartridge benchmarking script 2012-10-07 04:38:27 +00:00
Greg Landrum
cc83064764 remove user names 2012-09-27 02:32:10 +00:00
Greg Landrum
1ec40268c4 add standard benchmarks for the cartridge 2012-09-26 04:10:39 +00:00
Greg Landrum
4b7e409248 testing data for postgres 2012-09-26 01:47:29 +00:00
Greg Landrum
1d226f7143 add a test or two 2010-02-18 05:34:18 +00:00
Greg Landrum
7268c21448 add two more tests 2010-02-08 08:32:18 +00:00
Greg Landrum
104efc5b60 add some data and scripts for regression testing and benchmarking 2009-06-12 05:24:49 +00:00