* RGD code cleanup
- gave variables more meaningful names (e.g., renamed most instances of "attachment (point)", which is ambiguous because it may be read as either the R-group atom or its neighbor atom on the core, which are two different things)
- replaced the old school removeAtom() method with begin/commitBatchEdit()
- added std::move and std::make_move_iterator where relevant to avoid potential unintended copying
- replaced instances of container.size() == 0 and !container.size() with container.empty() for better clarity
- replaced std::map::find() with std::map::at() where the key was not needed
- replaced expensive std::find_if with more efficient alternative
- added some missing const keywords and added references to avoid copying where appropriate
- replaced for loops with modern implicit looping alternatives where convenient
- avoid calling MolToSmiles when VERBOSE is not defined, since the result is not used in that case
- removed the "oops, exponential is a pain" code snippet, as I believe it is (1) never executed, (2) not tested, and (3) not correct
- removed check for data->matches.size() > 1 as I do not believe it is correct
- Use std::unique_ptr::reset instead of defining a new std::unique_ptr and moving it to the original one
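
A few of the cleanup idioms listed above, shown as a minimal standalone sketch rather than as the actual RGD code. The `RGroupInfo` struct and all helper function names are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-label record; not an RDKit type.
struct RGroupInfo {
  std::string smiles;
};

// map::at() returns the mapped value directly (and throws std::out_of_range
// if the key is absent), so no iterator is needed when only the value matters.
inline const RGroupInfo &lookup(const std::map<int, RGroupInfo> &byLabel,
                                int label) {
  return byLabel.at(label);
}

// Append src to dst by moving the elements instead of copying them.
inline void appendMoved(std::vector<std::string> &dst,
                        std::vector<std::string> &&src) {
  dst.insert(dst.end(), std::make_move_iterator(src.begin()),
             std::make_move_iterator(src.end()));
}

// reset() replaces the owned object in place; no temporary unique_ptr
// has to be created and then moved into the original one.
inline void replaceInfo(std::unique_ptr<RGroupInfo> &owner,
                        const std::string &smiles) {
  owner.reset(new RGroupInfo{smiles});
}

// empty() states the intent more directly than size() == 0.
inline bool hasNoMatches(const std::vector<std::string> &matches) {
  return matches.empty();
}
```

Note that `at()` and `find()` differ in behavior for a missing key (exception versus `end()`), so the replacement is only safe where the key is known to be present.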
* changes in response to review
* change in response to review
* replaced std::set with boost::dynamic_bitset to save time on std::set::insert and std::set::find
* make sure we do not go out of bounds
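
The idea behind swapping a set for a bitset, together with the bounds check, can be sketched as follows. To keep the sketch dependency-free it uses `std::vector<bool>` as a stand-in for `boost::dynamic_bitset`; the `IndexSet` class and its method names are hypothetical and are not part of RDKit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical index set backed by a bit vector. Membership tests and
// insertions are O(1) array accesses instead of O(log n) tree operations.
// std::vector<bool> is used here in place of boost::dynamic_bitset.
class IndexSet {
 public:
  explicit IndexSet(std::size_t numIndices) : d_bits(numIndices, false) {}

  // Returns false when idx is out of range instead of writing past the end.
  bool insert(std::size_t idx) {
    if (idx >= d_bits.size()) {
      return false;
    }
    d_bits[idx] = true;
    return true;
  }

  // Out-of-range indices are simply reported as absent.
  bool contains(std::size_t idx) const {
    return idx < d_bits.size() && d_bits[idx];
  }

 private:
  std::vector<bool> d_bits;
};
```

Unlike a `std::set`, a bitset has a fixed index range, which is why an explicit range check is needed before every access.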
---------
Co-authored-by: ptosco <paolo.tosco@novartis.com>
* speed up scoring of permutations by clever caching of already settled
matches rather than recomputing scores for all matches every time
process() is called
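
A generic sketch of this kind of caching, assuming that matches are only ever appended and that the total score is a sum of per-match scores. The `IncrementalScorer` class is hypothetical and much simpler than the real permutation scoring.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical scorer that remembers the total for already settled matches
// and only evaluates matches added since the previous process() call.
class IncrementalScorer {
 public:
  explicit IncrementalScorer(std::function<double(int)> scoreOne)
      : d_scoreOne(std::move(scoreOne)) {}

  // Assumes matches only grows between calls (earlier entries are settled).
  double process(const std::vector<int> &matches) {
    assert(matches.size() >= d_numSettled);
    for (std::size_t i = d_numSettled; i < matches.size(); ++i) {
      d_cachedTotal += d_scoreOne(matches[i]);
      ++d_numEvaluations;
    }
    d_numSettled = matches.size();
    return d_cachedTotal;
  }

  // Number of times the per-match scoring function has been invoked.
  std::size_t numEvaluations() const { return d_numEvaluations; }

 private:
  std::function<double(int)> d_scoreOne;
  double d_cachedTotal = 0.0;
  std::size_t d_numSettled = 0;
  std::size_t d_numEvaluations = 0;
};
```

With this layout, calling `process()` n times on a growing list costs n scoring evaluations in total rather than on the order of n squared.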
* changes in response to review
* changes in response to review
Co-authored-by: Paolo Tosco <paolo.tosco@novartis.com>
* - do not add unnecessary R-labels
- use a boost::dynamic_bitset rather than a std::set for lookups
* - R group labels can be >0 or <0, not 0, so no need to check for >=0 when looking for user labels
- as soon as a core is found that requires no additional labels to accommodate a molecule, bail out from the loop as no better core can be found
- add a test to better describe the use case for this change
- remove a signed/unsigned warning
* - added an entry to Release Notes to describe the impact of #3969
* avoid French expressions in Release Notes
Co-authored-by: Paolo Tosco <paolo.tosco@novartis.com>
* Fixes #3924
The scoring function was not taking into account empty rgroups which biased certain
arrangements of rgroups that didn't have an rgroup at every position.
* Change in response to review
* Rigger the no_rgroup into EMPTY_RGROUP
* Remove Fix Me label from the geminal rgroup test
* add testUserMatchTypesDefaultScore to test default score on user types
* Fix scoring RGroups with split attachments [*:1]O.[*:1]N.
* Test both scoring functions for testMutipleCoreRelabellingIssues and testUnprocessedMapping
* Update testRGroupInternals.cpp
remove a couple obsolete comments as suggested in review
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
* Tests for default function failures
* Revert tests
* Fix linker matches in the case that they span multiple rgroups
* Change to test required by linker fix
* Update based on Brian's changes
* get windows dll builds working
Co-authored-by: Brian Kelley <fustigator@gmail.com>
Co-authored-by: Greg Landrum <greg.landrum@gmail.com>
* Exploration
* Initial work on GA for Rgroup Symmetry
* GA for rgroup decomp and fingerprint rgroup symmetry scoring
* Continuing development
* Further development
* Continued tweaks
* Function rename
* Continued tweaks
* Bug fix for variance calculation
* Copyright notices. Remove Eigen dependency. RDKit logging. Clock fix.
* Changes to fix build failures
* Fixes for Windows dynamic DLL build
* Included GA export.h file
* Fixed RGroupDecomp CMakeLists.txt
* Notebooks working, RGroup labelling bug fixed
* Fix windows build. More options for example GA program
* More bugs found and tests adjusted
* Fixed Python rgroup test
* Trivial change to trigger CI
* OSX java and windows build fixes
* Windows DLL fix
* Fix segmentation error
* proposed change
* Possible fix for segmentation fault
* CR fixes
* CR fixes
* CR fixes
* Recreates molecules from rgroups where possible
Co-authored-by: greg landrum <greg.landrum@gmail.com>
Co-authored-by: Brian Kelley <fustigator@gmail.com>
* - replaced set with vector for SMILES-based R-group equivalence
- the first GreedyChunk is constituted by chunkSize+1 mols
- labeled R-groups may not be extracted when onlyMatchAtRGroups==false
- labeled geminal R-groups are incorrectly scored
- my attempt to introduce consistency in R-group labeling was buggy
- added a DEBUG pre-processor directive to the tests to make debugging easier
- added a unit test
- fixed unit test results which were inconsistent with the expected behavior
* changes in response to review
* - removes an unnecessary O(n) complexity in setting the constant core_atoms_with_user_labels set
- provides a better and more general fix to github #1705
- makes sure that the best R group permutation is chosen based on a stable sorting criterion
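
One way to make the choice of the best permutation deterministic when several candidates have the same score is to add an explicit tie-breaker. The sketch below assumes that a higher score is better and that ties go to the lexicographically smallest permutation. Both the `ScoredPermutation` struct and this particular tie-breaking rule are illustrative assumptions, not necessarily the criterion used in the code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical candidate: a permutation of match indices plus its score.
struct ScoredPermutation {
  std::vector<std::size_t> permutation;
  double score;
};

// Returns the highest-scoring candidate. Equal scores are resolved by
// preferring the lexicographically smaller permutation, so the result
// does not depend on the order in which candidates were generated.
inline const ScoredPermutation &pickBest(
    const std::vector<ScoredPermutation> &candidates) {
  assert(!candidates.empty());
  return *std::max_element(
      candidates.begin(), candidates.end(),
      [](const ScoredPermutation &a, const ScoredPermutation &b) {
        if (a.score != b.score) {
          return a.score < b.score;
        }
        // On a tie, a ranks below b when its permutation is larger.
        return a.permutation > b.permutation;
      });
}
```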
* - changes in response to review
* reverted from compute_heavy_rgroup_counts() to compute_num_added_rgroups()
* Suppress warning in removeHs due to having a lot of dummy atoms
Co-authored-by: Brian Kelley <fustigator@gmail.com>