* Fix for RGD dummy atom bug
* Also fix labelling issues in the R group containing input dummy atom
* minor tweaks to the proposed fix
Co-authored-by: greg landrum <greg.landrum@gmail.com>
* Later cores with more R-groups should only be chosen when they are structurally related, i.e. when they are superstructures of earlier cores
* Update Code/GraphMol/RGroupDecomposition/testRGroupDecomp.cpp
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
* Update Code/GraphMol/RGroupDecomposition/testRGroupDecomp.cpp
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
Co-authored-by: Tosco, Paolo <paolo.tosco@novartis.com>
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
* Support wildcard in input structures
* Fix typos
* Handle R groups containing a single wildcard and wildcards with group numbers
* Reorder tests
* Use propety instead of isotope to mark input dummy
* fix the windows DLL builds
* Added constexpr for dummy input atom property
* Omitted to replace one instance of string 'INPUT_DUMMY'
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
* speed up scoring of permutations by clever caching of already settled
matches rather than recomputing scores for all matches every time
process() is called
* changes in response to review
* changes in response to review
Co-authored-by: Paolo Tosco <paolo.tosco@novartis.com>
* support substructure search parameters in RGD.
Still needs testing/verification of the enhanced stereo stuff
* test enhanced stereo
* add support to python wrapper
unfortunately some python reformatting got mixed in there.
* Most tests working
* All tests working
* Fixed tests after merge with master
* Create header and implementations for RCore
* Updated comments
* Removed old code
* DLL export for MolMatchFinalCheckFunctor
* Information line for failing Mac test
* Log replace core behaviour
* Ordering fix for OSX
* Possible fuzzer fix
* Removed debug output
* Fix unmatched user R group bug
* Code review changes
* Bug fix and ChemTransforms test
* - do not add unncessary R-labels
- use a boost::dynamic_bitset rather than a std::set for lookups
* - R group labels can be >0 or <0, not 0, so no need to check for >=0 when looking for user labels
- as soon as a core is found that requires no additional labels to accommodate a molecule, bail out from the loop as no better core can be found
- add a test to better describe the use case for this change
- remove a signed/unsigned warning
* - added an entry to Release Notes to describe the impact of #3969
* avoid French expressions in Release Notes
Co-authored-by: Paolo Tosco <paolo.tosco@novartis.com>
* 1) fixed residual kekulization issues when R-labels are removed from aromatic atoms on the core
2) added removeAllHydrogenRGroupsAndLabels flag, which defaults to True. If set to False,
unused user-defined labels on the core will be retained. If also removeAllHydrogenRGroups
is set to False, all-hydrogen user-defined R-groups will be included in the output.
Under no circumstance dynamically-added R-groups (when onlyMatchAtRGroups=false)
that are all-hydrogen will be included in the output, as that's not useful (change
compared to previous behavior)
3) added tests for (1), (2) and for removeAllHydrogenRGroups, that had no tests so far
4) fixed a few documentation typos in the Python code and added some documentation
to the C++ sources
* added missing test files
Co-authored-by: Paolo Tosco <paolo.tosco@novartis.com>
* make sure that unlabelled R-groups on aromatic nitrogens have correct valence and formal charge to avoid kekulization failures
* change in response to review
Co-authored-by: Paolo Tosco <paolo.tosco@novartis.com>
* backup
* simple first pass, passes all tests
* cleanup a bunch of existing uses
* ensure that we can safely add atoms/bonds while in edit mode
* add context manager on python side
* handle exceptions properly in those
* changes in response to review
* RGD modifications for any atom and index labels
* Continued development
* All tests working
* Added comment
* CR changes suggested by PTosco
* Fix catch_rgd for autocrlf
* Core dummy matches on output. RGroups on heavy atom. Dummy atoms User rgroups only when they are degree 1.
* Start work on test fixes
* testRGroupDecomp test working
* CPP and Python tests working
* Removed options for matching core query atoms on sidechains
* Windows build fix
* R groups off ring. User group matches single heavy substituent. Remove extraneous hydrogens
* Updated fingerprint variance score and tie selection
* Refactor fingerprint variance score functions to class
* Removed fingerprint distance score
* Boost::trim fix
* Updated RGD test notebook
* Fixed AddHs.cpp
* - fixes the kekulization issue
- avoids that empty R-group labels are included in cores
- makes sure that SMILES cores are always canonical
- adds a few missing const declarations and avoids unintentional copying
* Support for allowNonTerminalRGroups parameter. Remove R groups that contain H or Nothing. Ignore R group labels on non-dummy atoms
* Fixed tests for Paolo's changes. Rebuilt test notebook. Increased weighting of rgroup penalty in fingerprint variance score
* remove some debug output
Co-authored-by: Brian Kelley <fustigator@gmail.com>
Co-authored-by: greg landrum <greg.landrum@gmail.com>
* Exploration
* Initial work on GA fro Rgroup Symmetry
* GA for rgroup decomp and fingerprint rgroup symmetry scoring
* Continuing development
* Exploration
* Initial work on GA fro Rgroup Symmetry
* GA for rgroup decomp and fingerprint rgroup symmetry scoring
* Continuing development
* Further development
* Continued tweaks
* Function rename
* Continued tweaks
* Bug fix for variance calculation
* Copyright notices. Remove Eigen dependency. RdKit logging. Clock fix.
* Changes to fix build failures
* Fixes for Windows dynamic DLL build
* Included GA export.h file
* Fixed RGroupDecomp CMakeLists.txt
* Notebooks working, GGroup labelling bug fixed
* Fix windows build. More options for example GA program
* More bugs found and tests adjusted
* Fixed Python rgroup test
* Trivial change to trigger CI
* OSX java and windows build fixes
* Windows DLL fix
* Fix segmentation error
* proposed change
* Possible fix for segmentation fault
* CR fixes
* CR fixes
* CR fixes
* Recreates molecules from rgroups where possible
Co-authored-by: greg landrum <greg.landrum@gmail.com>
Co-authored-by: Brian Kelley <fustigator@gmail.com>
* - replaced set with vector for SMILES-based R-group equivalence
- the first GreedyChunk is constituted by chunkSize+1 mols
- labeled R-groups may not be extracted when onlyMatchAtRGroups==false
- labeled geminal R-groups are incorrectly scored
- my attempt to introduce consistency in R-group labeling was buggy
- added a DEBUG pre-processor directive to the tests to make debugging easier
- added a unit test
- fixed unit test results which were inconsistent with the expected behavior
* changes in response to review
* - Fixes three bugs in the R-group decomposition code
* - delete iterator properly during loop so the Mac does not complain
* added more tests
Co-authored-by: user173873 <user173873@FF026.local>
* refactor RGroupDecompositionParameters to directly initialize data members
add params getter for the RGroupDecomposition object
* initial implementation of timeout
* refactor the timeout check
We could move the function that checks and throws the exception somewhere else
* re-enable the tests I stupidly left disabled
* Fixes#3224
add option to skip symmetrization entirely
* test #3224
* stupid mistake
* run clang-tidy with modernize-use-default-member-init
* results from modernize-use-emplace
* one uniform initialization per line
otherwise SWIG is unhappy
Co-authored-by: Brian Kelley <fustigator@gmail.com>
* run clang-tidy with readability-braces-around-statements
clang-format the results
clean up all the parts that clang-tidy-8 broke
* fix problem on windows
* Better resolve ties in rgroup matches
* Break ties by choosing the permutation that adds the fewest rgroups
* Update aligned cores after tie breaking\nI think this results is more optimal
* Remove unused code
* Replace count with UsedLabels
* Remove whitespace
* Remove whitespace
* Remove redundant code and error message
* Revert test
* remove debug prints
* Fix core labelling and multi-core hydrogen removals
* Add comment that explains confusing bit about indexlabels for multicores
* Multi core fixes: hydrogens properly removed, fixed labelling
* clang-format
* update version of japanese docs
* Remove external labels from cores
* Fix syntax errors
* Add better autodetection of labels, add dummyatom label, don't fall back to indexes when onlyMatchAtRgroups are set
* Add better autodetection of labels, add dummyatom label, don't fall back to indexes when onlyMatchAtRgroups are set
* Move autodetection before alignment, fix final core labelling
* Fix stupid bit twiddling mistake
* None of the original mol's should actually match the cores with onlyMatchAtRgroups
* Convert PRECONDITION to CHECK_INVARIANT
* Run clang-format
* use nullptr instead of 0 for pointers
* Handle cases where molecules don't have anything for an R-group properly.
Here's the python demo of the bug:
```
In [14]: scaffold2 = Chem.MolFromSmiles('c1c([*:1])cncn1')
In [15]: scaffold = Chem.MolFromSmiles('c1c([*:1])cccn1')
In [19]: mols2 = [Chem.MolFromSmiles(smi) for smi in 'c1c(F)cc(O)cn1 c1c(F)cncn1 c1c(Cl)cc(O)cn1'.split()]
In [20]: print(rdRGroupDecomposition.RGroupDecompose([scaffold,scaffold2],mols2,asSmiles=True,asRows=False))
({'Core': ['c1ncc([*:2])cc1[*:1]', 'c1ncc([*:1])cn1', 'c1ncc([*:2])cc1[*:1]'], 'R1': ['F[*:1]', 'F[*:1]', 'Cl[*:1]'], 'R2': ['[H]O[*:2]', '[H]O[*:2]', '']}, [])
```
* Fixes#2471
* Tweak the scoring function to penalize non h matches considerably. Only full H rgroups get a one. Might need to tweak int the future
* Scale the hydrogens as 1/# mols unless they are a full group
* handle the heavy-atom degree queries differently
* Fixes#1563
* add a test for the heavy atom degree option
* Support (and test) adjustHeavyDegree in the cartridge too.
* test results
* Adds RGroupDecomp free function and python wrapper
* Fixes subtle bug, adds new RGroupDecomp API
* Updates results for the subtle bug fix. Verified results were correct.
* Removes smilesCaching.
* Changes RGroupDecompose ordering, adds docstrings and more tests
* Fixes#1550
* might as well update properties on the r groups too
* add option to remove Hs from sidechains;
expose a few more parameters to python;
expose ctor with parameter object to python