Files
rdkit/Code/GraphMol/RGroupDecomposition/RGroupDecomp.h
Paolo Tosco 786393beb1 Enable RGD highlights as in blog post (#7322)
* Code/GraphMol/Depictor/RDDepictor.h
- fixed typo in docstring
Code/GraphMol/RGroupDecomposition/RGroupCore.cpp
- added a missing const; formatting changes
Code/GraphMol/RGroupDecomposition/RGroupData.cpp, Code/GraphMol/RGroupDecomposition/RGroupData.h
- moved the code which merges disconnected R groups sharing the same attachment point into a single combined molecule to a private method, RGroupData::mergeIntoCombinedMol(). The method also includes logic to merge atom and bond highlights, if present.
- modernized a for loop
- isMolHydrogen is now a static function since it does not actually require any instance data
- implemented three static function to return the R group, Core and Mol labels, respectively
Code/GraphMol/RGroupDecomposition/RGroupDecomp.cpp, Code/GraphMol/RGroupDecomposition/RGroupDecomp.h
- implemented two private methods, RGroupDecomposition::labelAtomBondIndices() and RGroupDecomposition::setTargetAtomBondIndices(). The first method tags all atoms and bonds in the target molecule such that they can be tracked following core removal by RDKit::replaceCore(). The second method sets common_properties::_rgroupTargetAtoms and common_properties::_rgroupTargetBonds properties on core and R groups. These are vectors of atom and bond indices in the target molecule corresponding to core and R group atom/bonds, respectively, and can be used for color-coding the target molecule according to the R group decomposition it was subjected to, similarly to https://greglandrum.github.io/rdkit-blog/posts/2021-08-07-rgd-and-highlighting.html
Code/GraphMol/RGroupDecomposition/RGroupDecompData.cpp
- formatting changes and for loop modernization
Code/GraphMol/RGroupDecomposition/RGroupDecompParams.cpp, Code/GraphMol/RGroupDecomposition/RGroupDecompParams.h
- implemented updateRGroupDecompositionParametersFromJSON()
- added includeTargetMolInResults boolean parameter
Code/GraphMol/RGroupDecomposition/RGroupMatch.h
- implemented RGroupMatch::setTargetMoleculeForHighlights() and RGroupMatch::getTargetMoleculeForHighlights() methods to, respectively set and get the target molecule where R group decomposition can be color-coded with highlights. This molecule includes the explicit H atoms corresponding to extracted R groups, if any.
Code/GraphMol/RGroupDecomposition/Wrap/rdRGroupComposition.cpp
- use a std::unique_ptr to store the pointer to the C++ RGroupDecomposition instance
- fixed typos in docstrings
Code/GraphMol/RGroupDecomposition/Wrap/test_rgroups.py
- added test for the new includeTargetMolInResults parameter
Code/GraphMol/RGroupDecomposition/catch_rgd.cpp
- added test for the new includeTargetMolInResults parameter
Code/GraphMol/RGroupDecomposition/testRGroupDecomp.cpp
- formatting changes
Code/GraphMol/RGroupDecomposition/testRGroupInternals.cpp
- do not use deprecated constant
Code/MinimalLib/CMakeLists.txt
- added RDK_BUILD_MINIMAL_LIB_RGROUPDECOMP CMake flag to optionally expose R group decomposition functionality into MinimalLib
Code/MinimalLib/common.h
- added makeDummiesQueries flag to mol_from_input() (defaults to false)
- implemented parse_highlight_multi_colors() function to parse multi-color atom and bond highlights
- enable multi-color atom and bond highlighting
Code/MinimalLib/demo/rgd_demo.html
- added HTML page showcasing the multi-color highlights similarly to https://greglandrum.github.io/rdkit-blog/posts/2021-08-07-rgd-and-highlighting.html
Code/MinimalLib/jswrapper.cpp
- removed checks for non-nullness of d_mol as d_mol cannot be directly accessed anymore
- replaced all instances of d_mol with get()
- implemented support for multi-color atom and bond highlights
- implemented optional support for R group decomposition
- added JSMol::copy() convenience method with same functionality as get_mol_copy() to duplicate a molecule
Code/MinimalLib/minilib.cpp, Code/MinimalLib/minilib.h
- replaced all occurrences of d_mol with get(), as d_mol is now private
- removed all occurrences of assert(d_mol) as non-nullness is checked at construction time and whenever get() is called
- JSMol is now split into two subbclasses, JSMolUnique and JSMolShared, which both inherit from the JSMol base class. JSMolUnique can be constructed from a RWMol* (as the old JSMol), while JSMolShared can be constructed from a ROMOL_SPTR. This avoids unnecessary copies when wrapping a ROMOL_SPTR (e.g., from subtructure library, JSMolList or R group decomposition) into a JSMol to pass it to JS. This also avoids that modifications done in the JS layer on a molecule stored in a MolList (e.g., adding a property) are not persisted because they are carried out on a volatile copy of the molecule rather than on the actual molecule.
Code/MinimalLib/tests/tests.js
- added a test for pesistence of modifications made to JSSharedMol
- added tests for RGD
- added test for JSMol::copy()
Code/RDGeneral/RDValue.h
- removed trailing comma from vector properties such that they can be deserialized as syntactically correct JSON
Code/RDGeneral/types.cpp, Code/RDGeneral/types.h
- added _rgroupTargetAtoms and _rgroupTargetBonds common_properties

* Code/GraphMol/Depictor/RDDepictor.h
- fixed typo in docstring
Code/GraphMol/RGroupDecomposition/RGroupCore.cpp
- added a missing const; formatting changes
Code/GraphMol/RGroupDecomposition/RGroupData.cpp, Code/GraphMol/RGroupDecomposition/RGroupData.h
- moved the code which merges disconnected R groups sharing the same attachment point into a single combined molecule to a private method, RGroupData::mergeIntoCombinedMol(). The method also includes logic to merge atom and bond highlights, if present.
- modernized a for loop
- isMolHydrogen is now a static function since it does not actually require any instance data
- implemented three static function to return the R group, Core and Mol labels, respectively
Code/GraphMol/RGroupDecomposition/RGroupDecomp.cpp, Code/GraphMol/RGroupDecomposition/RGroupDecomp.h
- implemented two private methods, RGroupDecomposition::labelAtomBondIndices() and RGroupDecomposition::setTargetAtomBondIndices(). The first method tags all atoms and bonds in the target molecule such that they can be tracked following core removal by RDKit::replaceCore(). The second method sets common_properties::_rgroupTargetAtoms and common_properties::_rgroupTargetBonds properties on core and R groups. These are vectors of atom and bond indices in the target molecule corresponding to core and R group atom/bonds, respectively, and can be used for color-coding the target molecule according to the R group decomposition it was subjected to, similarly to https://greglandrum.github.io/rdkit-blog/posts/2021-08-07-rgd-and-highlighting.html
Code/GraphMol/RGroupDecomposition/RGroupDecompData.cpp
- formatting changes and for loop modernization
Code/GraphMol/RGroupDecomposition/RGroupDecompParams.cpp, Code/GraphMol/RGroupDecomposition/RGroupDecompParams.h
- implemented updateRGroupDecompositionParametersFromJSON()
- added includeTargetMolInResults boolean parameter
Code/GraphMol/RGroupDecomposition/RGroupMatch.h
- implemented RGroupMatch::setTargetMoleculeForHighlights() and RGroupMatch::getTargetMoleculeForHighlights() methods to, respectively set and get the target molecule where R group decomposition can be color-coded with highlights. This molecule includes the explicit H atoms corresponding to extracted R groups, if any.
Code/GraphMol/RGroupDecomposition/Wrap/rdRGroupComposition.cpp
- use a std::unique_ptr to store the pointer to the C++ RGroupDecomposition instance
- fixed typos in docstrings
Code/GraphMol/RGroupDecomposition/Wrap/test_rgroups.py
- added test for the new includeTargetMolInResults parameter
Code/GraphMol/RGroupDecomposition/catch_rgd.cpp
- added test for the new includeTargetMolInResults parameter
Code/GraphMol/RGroupDecomposition/testRGroupDecomp.cpp
- formatting changes
Code/GraphMol/RGroupDecomposition/testRGroupInternals.cpp
- do not use deprecated constant
Code/MinimalLib/CMakeLists.txt
- added RDK_BUILD_MINIMAL_LIB_RGROUPDECOMP CMake flag to optionally expose R group decomposition functionality into MinimalLib
Code/MinimalLib/common.h
- added makeDummiesQueries flag to mol_from_input() (defaults to false)
- implemented parse_highlight_multi_colors() function to parse multi-color atom and bond highlights
- enable multi-color atom and bond highlighting
Code/MinimalLib/demo/rgd_demo.html
- added HTML page showcasing the multi-color highlights similarly to https://greglandrum.github.io/rdkit-blog/posts/2021-08-07-rgd-and-highlighting.html
Code/MinimalLib/jswrapper.cpp
- removed checks for non-nullness of d_mol as d_mol cannot be directly accessed anymore
- replaced all instances of d_mol with get()
- implemented support for multi-color atom and bond highlights
- implemented optional support for R group decomposition
- JSMol is now split into two subbclasses, JSMol and JSMolShared, which both inherit from the JSMolBase class. While JSMol can be constructed from a RWMol* as usual, JSMolShared can be constructed from a ROMOL_SPTR. This avoids unnecessary copies when wrapping a ROMOL_SPTR (e.g., from subtructure library, JSMolList or R group decomposition) into a JSMol to pass it to JS. This also avoids that modifications done in the JS layer on a molecule stored in a MolList (e.g., adding a property) are not persisted because they are carried out on a volatile copy of the molecule rather than on the actual molecule.
- added JSMolBase::copy() convenience method with same functionality as get_mol_copy() to duplicate a molecule
Code/MinimalLib/minilib.cpp, Code/MinimalLib/minilib.h
- replaced all occurrences of d_mol with get(), as d_mol is now private
- removed all occurrences of assert(d_mol) as non-nullness is checked at construction time and whenever get() is called
Code/MinimalLib/tests/tests.js
- added a test for pesistence of modifications made to JSMolShared
- added tests for RGD
- added test for JSMolBase::copy()
Code/RDGeneral/RDValue.h
- removed trailing comma from vector properties such that they can be deserialized as syntactically correct JSON
Code/RDGeneral/types.cpp, Code/RDGeneral/types.h
- added _rgroupTargetAtoms and _rgroupTargetBonds common_properties

* added assignChiralTypesFromMolParity flag

* added test for makeDummiesQueries

* added CFFI tests

* reordered tests

* re-added piece of code that had gone accidentally lost while merging conflicts

* Removed CHECK_INVARIANT following code review

---------

Co-authored-by: ptosco <paolo.tosco@novartis.com>
2024-09-16 16:14:13 +02:00

144 lines
5.0 KiB
C++

//
// Copyright (c) 2017-2021, Novartis Institutes for BioMedical Research Inc.
// and other RDKit contributors
//
// @@ All Rights Reserved @@
// This file is part of the RDKit.
// The contents are covered by the terms of the BSD license
// which is included in the file license.txt, found at the root
// of the RDKit source tree.
//
#include <RDGeneral/export.h>
#ifndef RDKIT_RGROUPDECOMP_H
#define RDKIT_RGROUPDECOMP_H
#include "../RDKitBase.h"
#include "RGroupDecompParams.h"
#include <GraphMol/Substruct/SubstructMatch.h>
#include <chrono>
namespace RDKit {
struct RDKIT_RGROUPDECOMPOSITION_EXPORT RGroupDecompositionProcessResult {
const bool success;
const double score;
RGroupDecompositionProcessResult(const bool success, const double score)
: success(success), score(score) {}
};
struct RGroupMatch;
typedef std::map<std::string, ROMOL_SPTR> RGroupRow;
typedef std::vector<ROMOL_SPTR> RGroupColumn;
typedef std::vector<RGroupRow> RGroupRows;
typedef std::map<std::string, RGroupColumn> RGroupColumns;
class UsedLabelMap {
public:
UsedLabelMap(const std::map<int, int> &mapping) {
for (const auto &rl : mapping) {
d_map[rl.second] = std::make_pair(false, (rl.first > 0));
}
}
bool has(int label) const { return d_map.find(label) != d_map.end(); }
bool getIsUsed(int label) const { return d_map.at(label).first; }
void setIsUsed(int label) { d_map[label].first = true; }
bool isUserDefined(int label) const { return d_map.at(label).second; }
private:
std::map<int, std::pair<bool, bool>> d_map;
};
struct RGroupDecompData;
struct RCore;
class RDKIT_RGROUPDECOMPOSITION_EXPORT RGroupDecomposition {
private:
RGroupDecompData *data; // implementation details
RGroupDecomposition(const RGroupDecomposition &); // no copy construct
RGroupDecomposition &operator=(
const RGroupDecomposition &); // Prevent assignment
RWMOL_SPTR outputCoreMolecule(const RGroupMatch &match,
const UsedLabelMap &usedRGroupMap) const;
int getMatchingCoreInternal(RWMol &mol, const RCore *&rcore,
std::vector<MatchVectType> &matches);
static void labelAtomBondIndices(RWMol &mol);
void setTargetAtomBondIndices(ROMol &mol, bool includeBondsToRLabels) const;
public:
RGroupDecomposition(const ROMol &core,
const RGroupDecompositionParameters &params =
RGroupDecompositionParameters());
RGroupDecomposition(const std::vector<ROMOL_SPTR> &cores,
const RGroupDecompositionParameters &params =
RGroupDecompositionParameters());
~RGroupDecomposition();
//! Returns the index of the core matching the passed molecule
//! and optionally the matching atom indices
/*!
\param mol Molecule to check for matches
\param matches Optional pointer to std::vector<MatchVectType>
where core matches will be stored
\return the index of the matching core, or -1 if none matches
*/
int getMatchingCoreIdx(const ROMol &mol,
std::vector<MatchVectType> *matches = nullptr);
//! Returns the index of the added molecule in the RGroupDecomposition
//! or a negative error code
/*!
\param mol Molecule to add to the decomposition
\return index of the molecule or
-1 if none of the core matches
-2 if the matched molecule has no sidechains, i.e. is the
same as the scaffold
*/
int add(const ROMol &mol);
RGroupDecompositionProcessResult processAndScore();
bool process();
const RGroupDecompositionParameters &params() const;
//! return the current group labels
std::vector<std::string> getRGroupLabels() const;
//! return rgroups in row order group[row][attachment_point] = ROMol
RGroupRows getRGroupsAsRows() const;
//! return rgroups in column order group[attachment_point][row] = ROMol
RGroupColumns getRGroupsAsColumns() const;
};
RDKIT_RGROUPDECOMPOSITION_EXPORT unsigned int RGroupDecompose(
const std::vector<ROMOL_SPTR> &cores, const std::vector<ROMOL_SPTR> &mols,
RGroupRows &rows, std::vector<unsigned int> *unmatched = nullptr,
const RGroupDecompositionParameters &options =
RGroupDecompositionParameters());
RDKIT_RGROUPDECOMPOSITION_EXPORT unsigned int RGroupDecompose(
const std::vector<ROMOL_SPTR> &cores, const std::vector<ROMOL_SPTR> &mols,
RGroupColumns &columns, std::vector<unsigned int> *unmatched = nullptr,
const RGroupDecompositionParameters &options =
RGroupDecompositionParameters());
inline bool checkForTimeout(const std::chrono::steady_clock::time_point &t0,
double timeout, bool throwOnTimeout = true) {
if (timeout <= 0) {
return false;
}
auto t1 = std::chrono::steady_clock::now();
std::chrono::duration<double> elapsed = t1 - t0;
if (elapsed.count() >= timeout) {
if (throwOnTimeout) {
throw std::runtime_error("operation timed out");
}
return true;
}
return false;
}
} // namespace RDKit
#endif