Multi conf gaussian shape (#9265)

* First import of GaussianShape.

* Tidying.

* Custom features.

* Optimise.

* Optimise.

* Return 3 scores rather than 2 including combo score.

* Rename useFeatures to useColors.

* Python wrappers.

* Python tests.

* Take out big test.

* Add new start mode, as PubChem does it.

* Doh!

* Fix MolTransforms eigenvalue return.

* Two cycle optimisation, mostly working.

* Take out bestSoFar score from SCA.

* Take out DTYPE.

* Tidy out redundant variables.

* Optimisation in 2 parts.

* More fiddling in pursuit of speed.

* Update Python wrapper.

* Tweak.

* Atom subsets and different radii.

* Fix test.

* Revert pubchem_shape's test.cpp.

* Serialize ShapeInput.

* Trigger build

* Remove pointers to std::arrays in ShapeInput.

* ShapeInput virtual d'tor.

* Precondition - ShapeInput needs a molecule with at least 1 conformer.

* Rename ShapeInput::d_centroid to ShapeInput::d_canonTrans.

* Fix normalization bugs.

* Select start mode using moments of inertia rather than eigenvalues of canonical transformation.

* Include color features in moments of inertia.

* Smidge faster.

* Tversky similarity.

* Tidy tests.

* Tests working on Linux.

* Revert force of right handed axes in MolTransforms::computePrincipalAxesAndMomentsFromGyrationMatrix replacing with a comment in the code.

* Response to review.

* Sneaky allCarbon bug.

* add multithreaded test

* Response to review.

* Doh! Don't recalculate normalization after every transformation.

* Re-instate d_normalizationOK.

* Re-name functions for fetching canonical transformations.

* Separate alpha from coords.

* MultiConf works with single conf extraction.

* Extract all conformations.
Max and best similarities.

* Renames d_currConformer to d_activeShape.

* Update shapeToMol.

* Update shapeToMol.

* Changes from synthon shape searching.

* Fix normalization of multiple confs.

* Update Python wrappers.

* Fix shape merge.

* Improve bestSimilarity.

* Fix python wrapper.

* Pull in changes from SynthonShapeSearch:
make pruneShapes public.
function to negate Alpha values.

* clang-tidy suggestions.

* clang-tidy suggestions.

* Bug in quaternion gradients - we now have only 3 coordinates.

* Tidy tests.

* Mac result slightly different.

* Multi conformer molecule alignment.

* Optionally return raw overlap volumes in score functions.

* Python wrappers for raw overlap volumes.

* Update Python wrapper ShapeInputOptions.

* Tidy for PR.

* Extra include file.

* Extra library

* Tidy forward declarations.

* Don't prune if threshold < 0.0.

* Windows exporty thing.

* Check SMILES on merge of ShapeInputs.

* PRECONDITION of SMILES on merge of ShapeInputs.

* Response to review - rename some functions.

* change how overlapVols is passed
add a test for it

* API suggestions

* Response to review.

* Remove debugging writes.

* Fix Python wrappers.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
Co-authored-by: greg landrum <greg.landrum@gmail.com>
David Cosgrove
2026-06-03 05:09:09 +01:00
committed by GitHub
parent b854399558
commit 9f551aedbe
14 changed files with 1854 additions and 578 deletions


@@ -1,12 +1,12 @@
 rdkit_library(GaussianShape
         GaussianShape.cpp ShapeInput.cpp SingleConformerAlignment.cpp
-        SHARED LINK_LIBRARIES SmilesParse SubstructMatch MolTransforms)
+        SHARED LINK_LIBRARIES SmilesParse SubstructMatch MolTransforms SimDivPickers)
 target_compile_definitions(GaussianShape PRIVATE RDKIT_GAUSSIANSHAPE_BUILD)
 rdkit_headers(GaussianShape.h ShapeInput.h ShapeOverlayOptions.h)
 rdkit_catch_test(testGaussianShape catch_tests.cpp LINK_LIBRARIES GaussianShape
-        FileParsers MolAlign MolTransforms)
+        FileParsers MolAlign MolTransforms DistGeomHelpers DistGeometry)
 if(RDK_BUILD_PYTHON_WRAPPERS)
     add_subdirectory(Wrap)


@@ -17,6 +17,7 @@
 // https://github.com/ncbi/pubchem-align3d/blob/main/shape_neighbor.cpp.
 #include <cmath>
+#include <numbers>
 #include <Geometry/Transform3D.h>
 #include <GraphMol/ROMol.h>
@@ -26,8 +27,6 @@
 #include <GraphMol/GaussianShape/ShapeInput.h>
 #include <GraphMol/GaussianShape/SingleConformerAlignment.h>
-#include <GraphMol/MolTransforms/MolTransforms.h>
-#include <GraphMol/SmilesParse/SmilesParse.h>
 namespace RDKit {
 namespace GaussianShape {
@@ -40,7 +39,7 @@ RDGeom::Transform3D computeFinalTransform(
     const std::array<double, 3> &inRefTrans,
     const std::array<double, 9> &inRefRot,
     const std::array<double, 3> &inFitTrans,
-    const std::array<double, 9> &inFitRot, RDGeom::Transform3D &ovXform) {
+    const std::array<double, 9> &inFitRot, const RDGeom::Transform3D &ovXform) {
   // Move to fitShape's initial centroid and principal axes
   RDGeom::Transform3D transform0;
   transform0.SetTranslation(
@@ -85,7 +84,7 @@ std::array<double, 4> getInitialRotationPlain(
     int index, const ShapeInput &refShape, const ShapeInput &fitShape,
     const RDGeom::Point3D &refDisp, const ShapeOverlayOptions &overlayOpts,
     double &score) {
-  static const double sinpi_4 = std::sin(std::atan(1.0));
+  static const double sinpi_4 = std::sin(std::numbers::pi / 4.0);
   const static std::vector<std::array<double, 4>> quats{
       {1.0, 0.0, 0.0, 0.0}, {0.0, 1.0, 0.0, 0.0},
       {0.0, 0.0, 1.0, 0.0}, {0.0, 0.0, 0.0, 1.0},
@@ -95,22 +94,23 @@ std::array<double, 4> getInitialRotationPlain(
       {sinpi_4, 0.0, 0.0, sinpi_4}, {0.0, -sinpi_4, sinpi_4, 0.0},
       {sinpi_4, 0.0, sinpi_4, 0.0}, {0.0, sinpi_4, 0.0, sinpi_4},
       {0.0, -sinpi_4, 0.0, sinpi_4}, {sinpi_4, 0.0, -sinpi_4, 0.0}};
-  bool useColor = overlayOpts.optimMode != OptimMode::SHAPE_ONLY;
-  std::array<double, 7> quatTrans{
+  const bool useColor = overlayOpts.optimMode != OptimMode::SHAPE_ONLY;
+  const std::array<double, 7> quatTrans{
       quats[index][0], quats[index][1], quats[index][2], quats[index][3],
       refDisp[0], refDisp[1], refDisp[2]};
   SingleConformerAlignment sca(
-      refShape.getCoords(), refShape.getTypes().data(),
-      refShape.getCarbonRadii(), refShape.getNumAtoms(),
-      refShape.getNumFeatures(), refShape.getShapeVolume(),
-      refShape.getColorVolume(), fitShape.getCoords(),
-      fitShape.getTypes().data(), fitShape.getCarbonRadii(),
+      refShape.getCoords(), refShape.getAlphas(),
+      refShape.getFeatureTypes().data(), refShape.getCarbonRadii(),
+      refShape.getNumAtoms(), refShape.getNumFeatures(),
+      refShape.getShapeVolume(), refShape.getColorVolume(),
+      fitShape.getCoords(), fitShape.getAlphas(),
+      fitShape.getFeatureTypes().data(), fitShape.getCarbonRadii(),
       fitShape.getNumAtoms(), fitShape.getNumFeatures(),
       fitShape.getShapeVolume(), fitShape.getColorVolume(), quatTrans,
       overlayOpts.optimMode, overlayOpts.simAlpha, overlayOpts.simBeta,
       overlayOpts.optParam, overlayOpts.useDistCutoff, overlayOpts.distCutoff,
       overlayOpts.shapeConvergenceCriterion, overlayOpts.nSteps);
-  auto scores = sca.calcScores(useColor);
+  const auto scores = sca.calcScores(useColor);
   score = scores[0];
   return quats[index];
 }
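The start rotations in `getInitialRotationPlain` are quaternions whose non-zero components are either 1 or sin(pi/4), so each should have unit length. The standalone sketch below is not part of the commit; `quatNorm`, `normalizeQuat`, `sinPi4Old` and `sinPi4New` are hypothetical helpers. It checks that property and that the old and new ways of computing `sinpi_4` agree. The commit uses `std::numbers::pi` (C++20); `std::acos(-1.0)` stands in for it here so the sketch also builds with older standards.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical helper: Euclidean norm of a quaternion (w, x, y, z).
double quatNorm(const std::array<double, 4> &q) {
  return std::sqrt(q[0] * q[0] + q[1] * q[1] + q[2] * q[2] + q[3] * q[3]);
}

// Hypothetical helper: scale a non-zero quaternion to unit length.
std::array<double, 4> normalizeQuat(std::array<double, 4> q) {
  const double n = quatNorm(q);
  assert(n > 0.0);  // a zero quaternion cannot be normalized
  for (auto &c : q) {
    c /= n;
  }
  return q;
}

// sin(pi/4) as the old code computed it, using atan(1) == pi/4.
double sinPi4Old() { return std::sin(std::atan(1.0)); }

// sin(pi/4) computed from pi directly; acos(-1) is a stand-in for
// std::numbers::pi, which the commit itself uses.
double sinPi4New() { return std::sin(std::acos(-1.0) / 4.0); }
```

A quaternion such as `{s, 0, 0, s}` with `s = sin(pi/4)` has squared norm `2 * s * s`, which is 1, so no renormalization should be needed for the tabulated start rotations.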
@@ -148,11 +148,12 @@ std::array<double, 4> getInitialRotationWiggle(
   bool useColor = overlayOpts.optimMode != OptimMode::SHAPE_ONLY;
   std::array<double, 7> tmpQuatTrans{1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
   SingleConformerAlignment sca(
-      refShape.getCoords(), refShape.getTypes().data(),
-      refShape.getCarbonRadii(), refShape.getNumAtoms(),
-      refShape.getNumFeatures(), refShape.getShapeVolume(),
-      refShape.getColorVolume(), fitShape.getCoords(),
-      fitShape.getTypes().data(), fitShape.getCarbonRadii(),
+      refShape.getCoords(), refShape.getAlphas(),
+      refShape.getFeatureTypes().data(), refShape.getCarbonRadii(),
+      refShape.getNumAtoms(), refShape.getNumFeatures(),
+      refShape.getShapeVolume(), refShape.getColorVolume(),
+      fitShape.getCoords(), fitShape.getAlphas(),
+      fitShape.getFeatureTypes().data(), fitShape.getCarbonRadii(),
       fitShape.getNumAtoms(), fitShape.getNumFeatures(),
       fitShape.getShapeVolume(), fitShape.getColorVolume(), tmpQuatTrans,
       overlayOpts.optimMode, overlayOpts.simAlpha, overlayOpts.simBeta,
@@ -180,7 +181,7 @@ RDGeom::Point3D getInitialTranslation(int index, ShapeInput &refShape,
                                       ShapeInput fitShape) {
   auto getDisp = [](ShapeInput &shape, size_t i) -> RDGeom::Point3D {
     const double *coord =
-        shape.getCoords().data() + shape.calcExtremes()[i] * 4;
+        shape.getCoords().data() + shape.calcExtremes()[i] * 3;
     return RDGeom::Point3D(coord[0], coord[1], coord[2]);
   };
   RDGeom::Point3D disp;
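The stride change from `* 4` to `* 3` in `getDisp` fits the commit message entry "Separate alpha from coords": each atom presumably now occupies three consecutive doubles instead of four. A minimal sketch of that indexing, assuming a flat x, y, z layout (`pointAt` is a hypothetical helper, not an RDKit function):

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: fetch point i from a flat {x0, y0, z0, x1, ...} array.
std::array<double, 3> pointAt(const std::vector<double> &coords,
                              std::size_t i) {
  constexpr std::size_t stride = 3;  // assumed layout: x, y, z only
  assert(stride * i + 2 < coords.size());
  const double *c = coords.data() + stride * i;
  return {c[0], c[1], c[2]};
}
```

With a stride of 4 on such an array, every index after the first would read from the wrong atom, which is why the multiplier has to change together with the storage layout.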
@@ -230,17 +231,17 @@ unsigned int calculateQrat(const std::array<double, 3> &eigenValues) {
       eigenValues[0] + eigenValues[1] - eigenValues[2]};
   std::sort(double_ev_oe, double_ev_oe + 3, std::greater<double>());
-  const static double qrat_threshold = 0.7225; // 0.85*0.85;
+  constexpr static double qrat_threshold = 0.7225; // 0.85*0.85;
   unsigned int qrat = 1000;
-  unsigned int u_rqyx, u_rqzy;
   if (double_ev_oe[1] > 0) {
-    if (qrat_threshold < (double_ev_oe[1] / double_ev_oe[0])) {
+    unsigned int u_rqyx, u_rqzy;
+    if (qrat_threshold < double_ev_oe[1] / double_ev_oe[0]) {
       u_rqyx = 1;
     } else {
       u_rqyx = 0;
     }
-    if (qrat_threshold < (double_ev_oe[2] / double_ev_oe[1])) {
+    if (qrat_threshold < double_ev_oe[2] / double_ev_oe[1]) {
       u_rqzy = 1;
     } else {
       u_rqzy = 0;
@@ -251,12 +252,12 @@ unsigned int calculateQrat(const std::array<double, 3> &eigenValues) {
   return qrat;
 }
-StartMode decideStartModeFromEigenValues(ShapeInput &refShape,
-                                         ShapeInput &fitShape) {
+StartMode decideStartModeFromEigenValues(const ShapeInput &refShape,
+                                         const ShapeInput &fitShape) {
   // The PubChem code uses the moments of inertia for this, rather than the
   // canonical transformation.
-  auto rqratwf = calculateQrat(refShape.calcMomentsOfInertia(true));
-  auto fqratwf = calculateQrat(fitShape.calcMomentsOfInertia(true));
+  const auto rqratwf = calculateQrat(refShape.calcMomentsOfInertia(true));
+  const auto fqratwf = calculateQrat(fitShape.calcMomentsOfInertia(true));
   StartMode startModeWF{StartMode::ROTATE_180_WIGGLE};
   if (rqratwf > 0 || fqratwf > 0) {
     startModeWF = StartMode::ROTATE_45;
@@ -319,11 +320,12 @@ std::array<double, 3> alignShape(ShapeInput &refShape, ShapeInput &fitShape,
       std::array<double, 7> initQuat{quat[0], quat[1], quat[2], quat[3],
                                      refDisp.x, refDisp.y, refDisp.z};
       aligners.emplace_back(std::make_unique<SingleConformerAlignment>(
-          refShape.getCoords(), refShape.getTypes().data(),
-          refShape.getCarbonRadii(), refShape.getNumAtoms(),
-          refShape.getNumFeatures(), refShape.getShapeVolume(),
-          refShape.getColorVolume(), fitShape.getCoords(),
-          fitShape.getTypes().data(), fitShape.getCarbonRadii(),
+          refShape.getCoords(), refShape.getAlphas(),
+          refShape.getFeatureTypes().data(), refShape.getCarbonRadii(),
+          refShape.getNumAtoms(), refShape.getNumFeatures(),
+          refShape.getShapeVolume(), refShape.getColorVolume(),
+          fitShape.getCoords(), fitShape.getAlphas(),
+          fitShape.getFeatureTypes().data(), fitShape.getCarbonRadii(),
           fitShape.getNumAtoms(), fitShape.getNumFeatures(),
           fitShape.getShapeVolume(), fitShape.getColorVolume(), initQuat,
           overlayOpts.optimMode, overlayOpts.simAlpha, overlayOpts.simBeta,
@@ -343,20 +345,20 @@ std::array<double, 3> alignShape(ShapeInput &refShape, ShapeInput &fitShape,
               });
     std::vector<std::pair<double, unsigned int>> nextBestScoreForStart;
     nextBestScoreForStart.reserve(finalTransIndex * finalRotIndex);
-    for (const auto &[bssf, k] : bestScoreForStart) {
+    for (const auto &[bssf, m] : bestScoreForStart) {
       if (cycle == 1) {
         if (bssf < 0.7 * bestScore[0]) {
           continue;
         }
       }
       std::array<double, 20> outScores;
-      aligners[k]->doOverlay(outScores, cycle);
-      nextBestScoreForStart.emplace_back(outScores[0], k);
+      aligners[m]->doOverlay(outScores, cycle);
+      nextBestScoreForStart.emplace_back(outScores[0], m);
       if (outScores[0] > bestTotal) {
         bestTotal = outScores[0];
         bestScore =
             std::array<double, 3>{outScores[0], outScores[1], outScores[2]};
-        aligners[k]->getFinalQuatTrans(bestXform);
+        aligners[m]->getFinalQuatTrans(bestXform);
       }
     }
     bestScoreForStart = nextBestScoreForStart;
@@ -373,45 +375,40 @@ std::array<double, 3> AlignShape(const ShapeInput &refShape,
   // example) but they might need to be.
   auto workingRefShape = std::make_unique<ShapeInput>(refShape);
   auto workingFitShape = std::make_unique<ShapeInput>(fitShape);
-  auto inRefTrans = workingRefShape->calcCanonicalTranslation();
-  auto inRefRot = workingRefShape->calcCanonicalRotation();
-  auto inFitTrans = workingFitShape->calcCanonicalTranslation();
-  auto inFitRot = workingFitShape->calcCanonicalRotation();
+  const auto inRefTrans = workingRefShape->calcCanonicalTranslation();
+  const auto inRefRot = workingRefShape->calcCanonicalRotation();
+  const auto inFitTrans = workingFitShape->calcCanonicalTranslation();
+  const auto inFitRot = workingFitShape->calcCanonicalRotation();
   // If we're not normalizing, translate both shapes so that the fit
   // is at the origin, so the rotations work.
   RDGeom::Transform3D moveToOrigin;
   RDGeom::Transform3D moveFromOrigin;
   if (overlayOpts.normalize) {
-    if (!workingRefShape->getNormalized()) {
+    if (!workingRefShape->getIsNormalized()) {
       workingRefShape->normalizeCoords();
     }
-    if (!workingFitShape->getNormalized()) {
+    if (!workingFitShape->getIsNormalized()) {
       workingFitShape->normalizeCoords();
     }
   } else {
+    const auto &canonTrans = workingFitShape->calcCanonicalTranslation();
     moveToOrigin.SetTranslation(
-        RDGeom::Point3D{workingFitShape->calcCanonicalTranslation()[0],
-                        workingFitShape->calcCanonicalTranslation()[1],
-                        workingFitShape->calcCanonicalTranslation()[2]});
+        RDGeom::Point3D{canonTrans[0], canonTrans[1], canonTrans[2]});
     moveFromOrigin.SetTranslation(
-        RDGeom::Point3D{-workingFitShape->calcCanonicalTranslation()[0],
-                        -workingFitShape->calcCanonicalTranslation()[1],
-                        -workingFitShape->calcCanonicalTranslation()[2]});
+        RDGeom::Point3D{-canonTrans[0], -canonTrans[1], -canonTrans[2]});
     workingFitShape->transformCoords(moveToOrigin);
     workingRefShape->transformCoords(moveToOrigin);
   }
   RDGeom::Transform3D bestXform;
-  auto scores =
+  const auto scores =
       alignShape(*workingRefShape, *workingFitShape, bestXform, overlayOpts);
   if (!overlayOpts.normalize) {
     // Shove it back again.
-    auto finalXform = moveFromOrigin * bestXform * moveToOrigin;
-    bestXform = finalXform;
+    bestXform = moveFromOrigin * bestXform * moveToOrigin;
   } else {
-    auto finalXform = computeFinalTransform(inRefTrans, inRefRot, inFitTrans,
-                                            inFitRot, bestXform);
-    bestXform = finalXform;
+    bestXform = computeFinalTransform(inRefTrans, inRefRot, inFitTrans,
+                                      inFitRot, bestXform);
   }
   fitShape.transformCoords(bestXform);
   if (xform) {
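When normalization is off, `AlignShape` brackets the optimised transform between two translations (`moveFromOrigin * bestXform * moveToOrigin`). The general idea of moving to the origin, rotating, and moving back can be sketched without RDKit. The 2D `Vec2` type and `rotateAbout` function below are illustrative assumptions and do not use the `RDGeom::Transform3D` API:

```cpp
#include <cassert>
#include <cmath>

// Illustrative 2D point; the real code works on 3D transforms.
struct Vec2 {
  double x;
  double y;
};

// Rotate p by angle (radians) about pivot, i.e. apply
// T(pivot) * R(angle) * T(-pivot) to p.
Vec2 rotateAbout(const Vec2 &p, const Vec2 &pivot, double angle) {
  assert(std::isfinite(angle));
  // Move the pivot to the origin.
  const double dx = p.x - pivot.x;
  const double dy = p.y - pivot.y;
  // Rotate about the origin, then move back.
  const double c = std::cos(angle);
  const double s = std::sin(angle);
  return {pivot.x + c * dx - s * dy, pivot.y + s * dx + c * dy};
}
```

The pivot itself is a fixed point of the composed transform, which is a convenient check that the two translations are true inverses of each other.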
@@ -428,7 +425,7 @@ std::array<double, 3> AlignMolecule(const ShapeInput &refShape, ROMol &fit,
                                     int fitConfId) {
   auto fitShape = ShapeInput(fit, fitConfId, fitOpts, overlayOpts);
   RDGeom::Transform3D tmpXform;
-  auto scores = AlignShape(refShape, fitShape, &tmpXform, overlayOpts);
+  const auto scores = AlignShape(refShape, fitShape, &tmpXform, overlayOpts);
   MolTransforms::transformConformer(fit.getConformer(fitConfId), tmpXform);
   if (xform) {
     *xform = tmpXform;
@@ -442,32 +439,76 @@ std::array<double, 3> AlignMolecule(const ROMol &ref, ROMol &fit,
                                     RDGeom::Transform3D *xform,
                                     const ShapeOverlayOptions &overlayOpts,
                                     int refConfId, int fitConfId) {
-  auto refShape = ShapeInput(ref, refConfId, refOpts, overlayOpts);
-  auto scores =
+  const auto refShape = ShapeInput(ref, refConfId, refOpts, overlayOpts);
+  const auto scores =
       AlignMolecule(refShape, fit, fitOpts, xform, overlayOpts, fitConfId);
   return scores;
 }
+void ScoreMoleculeAllConformers(const ROMol &ref, const ROMol &fit,
+                                int &refConfId, int &fitConfId,
+                                std::vector<std::vector<double>> &combScores,
+                                const ShapeInputOptions &refOpts,
+                                const ShapeInputOptions &fitOpts,
+                                const ShapeOverlayOptions &overlayOpts,
+                                RDGeom::Transform3D *xform) {
+  // Pruning the shapes wastes time and obviously removes the correspondence
+  // between conformers and shapes.
+  auto refOptsCp = refOpts;
+  refOptsCp.shapePruneThreshold = -1;
+  refOptsCp.sortShapes = false;
+  auto fitOptsCp = fitOpts;
+  fitOptsCp.shapePruneThreshold = -1;
+  fitOptsCp.sortShapes = false;
+  auto refShape = ShapeInput(ref, -1, refOptsCp, overlayOpts);
+  auto fitShape = ShapeInput(fit, -1, fitOptsCp, overlayOpts);
+  combScores = std::vector<std::vector<double>>(
+      refShape.getNumShapes(), std::vector<double>(fitShape.getNumShapes()));
+  double bestScore = -1.0;
+  for (unsigned int i = 0; i < refShape.getNumShapes(); i++) {
+    refShape.setActiveShape(i);
+    for (unsigned int j = 0; j < fitShape.getNumShapes(); j++) {
+      fitShape.setActiveShape(j);
+      RDGeom::Transform3D thisXform;
+      auto scores = AlignShape(refShape, fitShape, &thisXform, overlayOpts);
+      combScores[i][j] = scores[0];
+      if (scores[0] > bestScore) {
+        bestScore = scores[0];
+        refConfId = i;
+        fitConfId = j;
+        if (xform) {
+          *xform = thisXform;
+        }
+      }
+    }
+  }
+}
 std::array<double, 3> ScoreShape(const ShapeInput &refShape,
                                  const ShapeInput &fitShape,
-                                 const ShapeOverlayOptions &overlayOpts) {
+                                 const ShapeOverlayOptions &overlayOpts,
+                                 std::pair<double, double> *overlapVols) {
   auto refWorking = refShape.getCoords();
   auto fitWorking = fitShape.getCoords();
   std::array<double, 7> quatTrans{1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
   SingleConformerAlignment sca(
-      refShape.getCoords(), refShape.getTypes().data(),
-      refShape.getCarbonRadii(), refShape.getNumAtoms(),
-      refShape.getNumFeatures(), refShape.getShapeVolume(),
-      refShape.getColorVolume(), fitShape.getCoords(),
-      fitShape.getTypes().data(), fitShape.getCarbonRadii(),
+      refShape.getCoords(), refShape.getAlphas(),
+      refShape.getFeatureTypes().data(), refShape.getCarbonRadii(),
+      refShape.getNumAtoms(), refShape.getNumFeatures(),
+      refShape.getShapeVolume(), refShape.getColorVolume(),
+      fitShape.getCoords(), fitShape.getAlphas(),
+      fitShape.getFeatureTypes().data(), fitShape.getCarbonRadii(),
       fitShape.getNumAtoms(), fitShape.getNumFeatures(),
       fitShape.getShapeVolume(), fitShape.getColorVolume(), quatTrans,
       overlayOpts.optimMode, overlayOpts.simAlpha, overlayOpts.simBeta,
       overlayOpts.optParam, overlayOpts.useDistCutoff, overlayOpts.distCutoff,
       overlayOpts.shapeConvergenceCriterion, overlayOpts.nSteps);
-  bool includeColor = overlayOpts.optimMode != OptimMode::SHAPE_ONLY;
-  auto scores = sca.calcScores(refShape.getCoords().data(),
-                               fitShape.getCoords().data(), includeColor);
+  const bool includeColor = overlayOpts.optimMode != OptimMode::SHAPE_ONLY;
+  const auto scores = sca.calcScores(refShape.getCoords().data(),
+                                     fitShape.getCoords().data(), includeColor);
+  if (overlapVols) {
+    (*overlapVols) = std::make_pair(scores[3], scores[4]);
+  }
   return std::array{scores[0], scores[1], scores[2]};
 }
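`ScoreShape` now takes an optional `overlapVols` pointer that is written only when it is non-null, so existing callers are unaffected. A standalone sketch of that pattern follows; `scoreWithVolumes` is a made-up function, and its arithmetic is a placeholder rather than the real score calculation:

```cpp
#include <array>
#include <cassert>
#include <utility>

// Hypothetical stand-in for a scoring call. The returned numbers are
// placeholders: {sum, shape overlap, color overlap}.
std::array<double, 3> scoreWithVolumes(
    double shapeOverlap, double colorOverlap,
    std::pair<double, double> *overlapVols = nullptr) {
  assert(shapeOverlap >= 0.0 && colorOverlap >= 0.0);
  if (overlapVols) {
    // Only touch the output when the caller asked for it.
    *overlapVols = std::make_pair(shapeOverlap, colorOverlap);
  }
  return {shapeOverlap + colorOverlap, shapeOverlap, colorOverlap};
}
```

Defaulting the pointer to `nullptr` keeps the extra output opt-in, which matches how the new parameter is declared with a default in the header.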
@@ -475,16 +516,18 @@ std::array<double, 3> ScoreMolecule(const ShapeInput &refShape,
                                     const ROMol &fit,
                                     const ShapeInputOptions &fitOpts,
                                     const ShapeOverlayOptions &overlayOpts,
-                                    int fitConfId) {
-  auto fitShape = ShapeInput(fit, fitConfId, fitOpts, overlayOpts);
-  return ScoreShape(refShape, fitShape, overlayOpts);
+                                    int fitConfId,
+                                    std::pair<double, double> *overlapVols) {
+  const auto fitShape = ShapeInput(fit, fitConfId, fitOpts, overlayOpts);
+  return ScoreShape(refShape, fitShape, overlayOpts, overlapVols);
 }
 std::array<double, 3> ScoreMolecule(const ROMol &ref, const ROMol &fit,
                                     const ShapeInputOptions &refOpts,
                                     const ShapeInputOptions &fitOpts,
                                     const ShapeOverlayOptions &overlayOpts,
-                                    int refConfId, int fitConfId) {
+                                    int refConfId, int fitConfId,
+                                    std::pair<double, double> *overlapVols) {
   ShapeOverlayOptions tmpOpts = overlayOpts;
   tmpOpts.normalize = false;
   tmpOpts.startMode = StartMode::ROTATE_0;
@@ -492,9 +535,9 @@ std::array<double, 3> ScoreMolecule(const ROMol &ref, const ROMol &fit,
   auto refShape = ShapeInput(ref, refConfId, refOpts, tmpOpts);
   ShapeInputOptions tmpFitOpts = fitOpts;
-  auto fitShape = ShapeInput(fit, fitConfId, fitOpts, tmpOpts);
-  return ScoreShape(refShape, fitShape, tmpOpts);
+  const auto fitShape = ShapeInput(fit, fitConfId, fitOpts, tmpOpts);
+  return ScoreShape(refShape, fitShape, tmpOpts, overlapVols);
 }
 } // namespace GaussianShape
 } // namespace RDKit
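The score functions in this file report Tversky values controlled by `simAlpha` and `simBeta`, and can also hand back the raw overlap volumes. The formula itself lives in `SingleConformerAlignment` and is not shown in this diff. As an assumption about its form, the sketch below uses the standard set-style Tversky index on volumes, which reduces to Tanimoto when both weights are 1. The function name `tversky` and its parameters are illustrative only:

```cpp
#include <cassert>

// Standard Tversky index on volumes (an assumption about the form used):
//   T = O / (O + alpha * (Va - O) + beta * (Vb - O))
// where O is the overlap volume and Va, Vb are the self volumes.
double tversky(double overlap, double volA, double volB, double alpha,
               double beta) {
  const double denom =
      overlap + alpha * (volA - overlap) + beta * (volB - overlap);
  assert(denom > 0.0);
  return overlap / denom;
}
```

Having the raw overlap volumes available means a caller could recompute a similarity with different weights without repeating the overlay, provided the self volumes are also known.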


@@ -92,12 +92,44 @@ RDKIT_GAUSSIANSHAPE_EXPORT std::array<double, 3> AlignMolecule(
     const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions(),
     int refConfId = -1, int fitConfId = -1);
+//! Calculate scores for the alignment of all conformers of one molecule
+//! onto another. Returns a matrix of the combination scores, the conformer
+//! numbers of the two molecules that gave the best overlay and the
+//! transformation matrix for that overlay if requested. The molecules
+//! themselves are not altered. scores[0][1] is the score of aligning
+//! fit conformer 1 onto ref conformer 0
+/*!
+  \param ref the reference molecule
+  \param fit the molecule to align
+  \param refConfId returns the reference conformer for the best scoring
+                   overlay
+  \param fitConfId returns the fit conformer for the best scoring overlay
+  \param combScores the scores for all the overlays. Will be returned sized
+                    by the number of conformers of the ref and fit molecules.
+                    combScores[i][j] will be the score for the jth fit
+                    conformer onto the ith ref conformer.
+  \param refOpts the options for creating the ref shape
+  \param fitOpts the options for creating the fit shape
+  \param overlayOpts options for setting up and running the overlay
+  \param xform if passed in as non-null, will be populated with the
+               transformation matrix that gives the best-scoring
+               overlay.
+ */
+RDKIT_GAUSSIANSHAPE_EXPORT void ScoreMoleculeAllConformers(
+    const ROMol &ref, const ROMol &fit, int &refConfId, int &fitConfId,
+    std::vector<std::vector<double>> &combScores,
+    const ShapeInputOptions &refOpts = ShapeInputOptions(),
+    const ShapeInputOptions &fitOpts = ShapeInputOptions(),
+    const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions(),
+    RDGeom::Transform3D *xform = nullptr);
 //! Score the overlap of a shape to a reference shape without moving
 // either.
 /*!
   \param refShape the reference shape
   \param fitShape the shape to score
   \param overlayOpts options for controlling the volume calculation
+  \param overlapVols if not-null, is filled with the raw overlap volumes
   \return an array of the combination score of the shape Tversky value and the
   color Tversky value (zero if colors not used) and the individual values. If
@@ -105,17 +137,19 @@ RDKIT_GAUSSIANSHAPE_EXPORT std::array<double, 3> AlignMolecule(
 */
 RDKIT_GAUSSIANSHAPE_EXPORT std::array<double, 3> ScoreShape(
     const ShapeInput &refShape, const ShapeInput &fitShape,
-    const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions());
+    const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions(),
+    std::pair<double, double> *overlapVols = nullptr);
 //! Score the overlap of a molecule to a reference shape without moving
 // either.
 /*!
-  \param ref the reference shape
+  \param refShape the reference shape
   \param fit the molecule to score
   \param fitOpts the options for creating the fit shape
   \param overlayOpts options for controlling the volume calculation
   \param fitConfId (optional) the conformer to use for the fit
   molecule
+  \param overlapVols if not-null, is filled with the raw overlap volumes
   \return an array of the combination score of the shape Tversky value and the
   color Tversky value (zero if colors not used) and the individual values. If
@@ -125,7 +159,8 @@ RDKIT_GAUSSIANSHAPE_EXPORT std::array<double, 3> ScoreMolecule(
     const ShapeInput &refShape, const ROMol &fit,
     const ShapeInputOptions &fitOpts = ShapeInputOptions(),
     const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions(),
-    int fitConfId = -1);
+    int fitConfId = -1,
+    std::pair<double, double> *overlapVols = nullptr);
 //! Score the overlap of a molecule to a reference molecule without moving
 // either.
@@ -139,6 +174,7 @@ RDKIT_GAUSSIANSHAPE_EXPORT std::array<double, 3> ScoreMolecule(
   molecule
   \param fitConfId (optional) the conformer to use for the fit
   molecule
+  \param overlapVols if not-null, is filled with the raw overlap volumes
   \return an array of the combination score of the shape Tverksy value and the
   color Tversky value (zero if colors not used) and the individual values. If
@@ -149,7 +185,8 @@ RDKIT_GAUSSIANSHAPE_EXPORT std::array<double, 3> ScoreMolecule(
     const ShapeInputOptions &refOpts = ShapeInputOptions(),
     const ShapeInputOptions &fitOpts = ShapeInputOptions(),
     const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions(),
-    int refConfId = -1, int fitConfId = -1);
+    int refConfId = -1, int fitConfId = -1,
+    std::pair<double, double> *overlapVols = nullptr);
 } // namespace GaussianShape
 } // namespace RDKit
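`ScoreMoleculeAllConformers`, declared in the header diff above, fills `combScores[i][j]` with the score of fit conformer `j` on ref conformer `i` and reports the best-scoring pair. That bookkeeping can be sketched on a plain score matrix. `bestPair` is a hypothetical helper, not part of the RDKit API, and it assumes a non-empty rectangular matrix:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: return (refIdx, fitIdx) of the highest entry in a
// non-empty score matrix where scores[i][j] is fit j onto ref i.
std::pair<std::size_t, std::size_t> bestPair(
    const std::vector<std::vector<double>> &scores) {
  assert(!scores.empty() && !scores.front().empty());
  std::pair<std::size_t, std::size_t> best{0, 0};
  double bestScore = scores[0][0];
  for (std::size_t i = 0; i < scores.size(); ++i) {
    for (std::size_t j = 0; j < scores[i].size(); ++j) {
      if (scores[i][j] > bestScore) {
        bestScore = scores[i][j];
        best = {i, j};
      }
    }
  }
  return best;
}
```

Because the comparison is strict, ties resolve to the first pair encountered in row-major order, which is also what a strict `>` test in a nested ref/fit loop would give.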

File diff suppressed because it is too large


@@ -16,6 +16,7 @@
 #include <array>
 #include <vector>
+#include <GraphMol/RWMol.h>
 #include <RDGeneral/export.h>
 #include <Geometry/Transform3D.h>
@@ -39,7 +40,7 @@ namespace boost {
 namespace serialization {
 template <class Archive, typename Block, typename Allocator>
-void serialize(Archive &ar, boost::dynamic_bitset<Block, Allocator> &bs,
+void serialize(Archive &ar, dynamic_bitset<Block, Allocator> &bs,
                const unsigned int /*version*/) {
   size_t num_bits = bs.size();
   ar & num_bits;
@@ -64,14 +65,26 @@ void serialize(Archive &ar, boost::dynamic_bitset<Block, Allocator> &bs,
 namespace RDKit {
 class ROMol;
-class RWMol;
+class Conformer;
 namespace GaussianShape {
+constexpr double CARBON_RAD = 1.70;
+constexpr double DUMMY_RAD = 2.16; // same as Xe
 // From Grant et al.
 constexpr double P = 2.7;
 constexpr double KAPPA = 2.41798793102;
-using CustomFeatures =
-    std::vector<std::tuple<unsigned int, RDGeom::Point3D, double>>;
+struct CustomFeature {
+  CustomFeature(
+      const unsigned int t, const RDGeom::Point3D &p, const double r,
+      const std::vector<unsigned int> &a = std::vector<unsigned int>())
+      : type(t), pos(p), rad(r), atoms(a) {}
+  unsigned int type;
+  RDGeom::Point3D pos;
+  double rad;
+  std::vector<unsigned int>
+      atoms; // That the feature was derived from. May be left empty.
+};
 struct ShapeInputOptions {
   ShapeInputOptions() = default;
@@ -82,26 +95,33 @@ struct ShapeInputOptions {
   ~ShapeInputOptions() = default;
+  // By default, it will create features using the RDKit pharmacophore
+  // definitions.
   bool useColors{
       true}; //! Whether to build the color features. By default, it will
//! create features using the RDKit pharmacophore definitions. //! create features using the RDKit pharmacophore definitions.
CustomFeatures customFeatures; //! Custom color features used verbatim. A std::vector<std::vector<CustomFeature>>
//! vector of tuples of integer type, Point3D customFeatures; //! Custom color features used verbatim. One outer
//! coords, double radius. //! vector for each conformation in the molecule.
std::vector<unsigned int> std::vector<unsigned int>
atomSubset; //! If not empty, use just these atoms in the molecule to atomSubset; //! If not empty, use just these atoms in the molecule to
//! form the ShapeInput object. //! form the ShapeInput object.
std::vector<std::pair<unsigned int, double>> std::vector<std::pair<unsigned int, double>>
atomRadii; //! Use these non-standard radii for these atoms. The int is atomRadii; //! Use these non-standard radii for these atoms. The int is
//! for the atom index in the molecule, not the atomic number. //! for the atom index in the molecule, not the atomic
//! Not all atoms need be specified, just some radii can be //! number. Not all atoms need be specified; some radii
//! over-ridden, with the rest left as standard. //! can be over-ridden, with the rest left as standard.
bool allCarbonRadii{ bool allCarbonRadii{
true}; //! Whether to use carbon radii for all atoms (which is quicker true}; //! Whether to use carbon radii for all atoms (which is quicker
//! but less accurate) or vdw radii appropriate for the elements. //! but less accurate) or vdw radii appropriate for the elements.
double shapePruneThreshold{-1.0}; //! If there is more than 1 conformer for
//! the input molecule, prune the shapes so
//! that none of them are more similar to
//! each other than the threshold. Default
//! -1.0 means no pruning.
bool sortShapes{true}; //! If true, the shapes are sorted in descending order
//! of total volume.
bool includeDummies{true}; //! Whether to include dummy atoms in the shape
//! or not.
}; };
// Data for shape alignment code // Data for shape alignment code
@@ -109,12 +129,20 @@ class RDKIT_GAUSSIANSHAPE_EXPORT ShapeInput {
public: public:
//! Create the ShapeInput object. //! Create the ShapeInput object.
//! @param mol: The molecule of interest //! @param mol: The molecule of interest
//! @param confId: The conformer to use //! @param confId: The conformer to use. If -1, uses all conformers.
//! @param opts: Options for setting up the shape //! @param opts: Options for setting up the shape
ShapeInput(const ROMol &mol, int confId = -1, //! @param overlayOpts: Options for controlling overlays. The distance cutoff
const ShapeInputOptions &opts = ShapeInputOptions(), //! elements are used in the self-overlap calculations.
const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions()); explicit ShapeInput(
ShapeInput(const std::string &str) { const ROMol &mol, int confId = -1,
const ShapeInputOptions &opts = ShapeInputOptions(),
const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions());
//! Create a ShapeInput object with a single shape copied from
//! other.
//! @param other: the ShapeInput that supplies the shape
//! @param shapeNum: the number of the shape of interest.
ShapeInput(const ShapeInput &other, unsigned int shapeNum);
explicit ShapeInput(const std::string &str) {
#ifndef RDK_USE_BOOST_SERIALIZATION #ifndef RDK_USE_BOOST_SERIALIZATION
PRECONDITION(0, "Boost SERIALIZATION is not enabled") PRECONDITION(0, "Boost SERIALIZATION is not enabled")
#else #else
@@ -127,7 +155,12 @@ class RDKIT_GAUSSIANSHAPE_EXPORT ShapeInput {
ShapeInput(ShapeInput &&other) = default; ShapeInput(ShapeInput &&other) = default;
ShapeInput &operator=(const ShapeInput &other); ShapeInput &operator=(const ShapeInput &other);
ShapeInput &operator=(ShapeInput &&other) = default; ShapeInput &operator=(ShapeInput &&other) = default;
virtual ~ShapeInput() = default; ~ShapeInput() = default;
//! Merge the other ShapeInput, assuming it has the correct number
//! of atoms etc. Empties other, unless they can't be merged in which case
//! it returns unscathed.
void merge(ShapeInput &other);
std::string toString() const { std::string toString() const {
#ifndef RDK_USE_BOOST_SERIALIZATION #ifndef RDK_USE_BOOST_SERIALIZATION
@@ -140,108 +173,207 @@ class RDKIT_GAUSSIANSHAPE_EXPORT ShapeInput {
#endif #endif
} }
// Note that the coords returned is a vector size 4*getNumAtoms() const std::string getSmiles() const { return d_smiles; }
// with the 4th value per atom being the alpha parameter. unsigned int getActiveShape() const { return d_activeShape; }
const std::vector<double> &getCoords() const { return d_coords; } //! Set the currently active conformation to the new value.
//! @param newShape: the number of the conformation to be used
//! for future calculations. Counts from 0,
//! obviously. If invalid, throws a runtime
//! error.
void setActiveShape(unsigned int newShape);
//! Return the coordinates of the currently active shape.
//! Note that the coords are returned as a vector of size 3*getNumAtoms()
const std::vector<double> &getCoords() const {
return d_coords[d_activeShape];
}
//! Get the alpha values for the atoms and color features in the shape.
const std::vector<double> &getAlphas() const { return d_alphas; }
//! Multiply the alpha value for the given atom/feature by -1.0
//! which will toggle whether the atom/feature is used in the volume
//! calculation or not. For temporarily "turning off" an atom or feature.
void negateAlpha(unsigned int alphaNum);
//! Fetch the coordinates of the atoms and optionally features. //! Fetch the coordinates of the atoms and optionally features.
std::vector<RDGeom::Point3D> getAtomPoints(bool includeColors = false) const; std::vector<RDGeom::Point3D> getAtomPoints(bool includeColors = false) const;
bool getNormalized() const { return d_normalized; } //! Return whether the coordinates for the current active shape are
const std::vector<int> &getTypes() const { return d_types; } //! normalized.
bool getIsNormalized() const { return d_normalizeds[d_activeShape]; }
//! Return the feature types of all atoms/features in the shape. Atoms
//! have type 0.
const std::vector<int> &getFeatureTypes() const { return d_types; }
//! Get the number of atoms in the shape.
unsigned int getNumAtoms() const { return d_numAtoms; } unsigned int getNumAtoms() const { return d_numAtoms; }
//! Get the number of color features in the shape.
unsigned int getNumFeatures() const { return d_numFeats; } unsigned int getNumFeatures() const { return d_numFeats; }
double getShapeVolume() const { return d_selfOverlapVol; } //! Get the number of shapes/conformations in the shape object. This may
double getColorVolume() const { return d_selfOverlapColor; } //! be smaller than the number of conformations in the input molecule if
//! shape pruning was performed.
unsigned int getNumShapes() const { return d_coords.size(); }
//! Get the volume of the atoms in the current active shape.
double getShapeVolume() const {
return d_selfOverlapShapeVols[d_activeShape];
}
//! Get the volume for the atoms for the given shape number.
double getShapeVolume(unsigned int shapeNum) const;
//! Get the volume of the color features in the current active shape.
double getColorVolume() const {
return d_selfOverlapColorVols[d_activeShape];
}
//! Get the volume of the color features for the given shape number.
double getColorVolume(unsigned int shapeNum) const;
//! Get the flags for which atoms have a carbon radius.
const boost::dynamic_bitset<> *getCarbonRadii() const { const boost::dynamic_bitset<> *getCarbonRadii() const {
return d_carbonRadii.get(); return d_carbonRadii.get();
} }
// These functions use cached values if available. // These functions use cached values if available.
//! Get the canonical rotation for the current active shape.
const std::array<double, 9> &calcCanonicalRotation(); const std::array<double, 9> &calcCanonicalRotation();
//! Get the canonical translation for the current active shape.
const std::array<double, 3> &calcCanonicalTranslation(); const std::array<double, 3> &calcCanonicalTranslation();
//! Get the eigenvalues for the coordinates matrix.
const std::array<double, 3> &calcEigenValues(); const std::array<double, 3> &calcEigenValues();
//! Get the numbers of the points at the extremes of x, y and z for the
//! current active shape. In the order minimum x, minimum y, minimum z,
//! then the maxima.
const std::array<size_t, 6> &calcExtremes(); const std::array<size_t, 6> &calcExtremes();
// Return the principal moments of inertia, if Eigen3 is available, and the //! Return the principal moments of inertia, if Eigen3 is available, and the
// eigenvalues of the canonical transformation if not. //! eigenvalues of the canonical transformation if not, for the current
//! active shape.
std::array<double, 3> calcMomentsOfInertia(bool includeColors = false) const; std::array<double, 3> calcMomentsOfInertia(bool includeColors = false) const;
// Align the principal axes to the cartesian axes and centre on the origin. //! Align the principal axes to the cartesian axes and centre on the origin
// Doesn't require that the shape was created from a molecule. Creates //! for the current active shape.
// the necessary transformation if not already done. //! Doesn't require that the shape was created from a molecule. Creates
//! the necessary transformation if not already done.
void normalizeCoords(); void normalizeCoords();
//! Applies the given transformation to the current active shape.
void transformCoords(RDGeom::Transform3D &xform); void transformCoords(RDGeom::Transform3D &xform);
// Mock a molecule up from the shape for visual inspection and sometimes //! Make a molecule from the current active shape. If required, features
// calculation of the normalization matrices. No bonds. //! are added as xenon atoms. If withBonds is false, just makes a molecule
// Atoms are C, features are N. //! from the atoms, otherwise builds a full molecule.
virtual std::unique_ptr<RWMol> shapeToMol(bool includeColors = true) const; std::unique_ptr<RWMol> shapeToMol(bool includeColors = false,
bool withBonds = true) const;
//! Find the best similarity score between all shapes in this shape and the
//! other one. Stops as soon as it gets something above the threshold.
//! The score runs between 0.0 and 1.0, so the default threshold of -1.0
//! means no threshold. Fills in the shape numbers of the two that were
//! responsible if there is something above the threshold, and the
//! transformation that did it. Returns -1.0 for the similarity if there was
//! nothing above the threshold. Note that the shape numbers are not
//! necessarily the same as the original molecule conformation numbers.
std::array<double, 3> bestSimilarity(
const ShapeInput &fitShape, unsigned int &bestThisShape,
unsigned int &bestFitShape, RDGeom::Transform3D &bestXform,
double threshold = -1.0,
const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions());
//! Return the maximum similarity achievable between the 2 shapes. The
//! maximum similarity is when one shape is entirely inside the other. This
//! returns the similarity in that case, which is the upper bound on what
//! is achievable between these 2 shapes.
double maxPossibleSimilarity(
const ShapeInput &fitShape,
const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions()) const;
//! Prune the shapes so that none are more similar to each other than
//! the threshold.
void pruneShapes(double simThreshold);
#ifdef RDK_USE_BOOST_SERIALIZATION #ifdef RDK_USE_BOOST_SERIALIZATION
template <class Archive> template <class Archive>
void serialize(Archive &ar, const unsigned int) { void serialize(Archive &ar, unsigned int);
ar & d_coords;
ar & d_types;
ar & d_numAtoms;
ar & d_numFeats;
ar & d_selfOverlapVol;
ar & d_selfOverlapColor;
ar & d_extremePoints;
ar & d_carbonRadii;
ar & d_normalized;
ar & d_normalizationOK;
ar & d_canonRot;
ar & d_canonTrans;
ar & d_eigenValues;
}
#endif #endif
private: private:
void extractAtoms(const ROMol &mol, int confId, void extractAtoms(const Conformer &conf, const ShapeInputOptions &shapeOpts,
const ShapeInputOptions &opts); bool fillAlphas);
// Extract the features for the color scores, using RDKit pphore features // Extract the features for the color scores, using RDKit pphore features
// for now. Other options to be added later. // for now. Other options to be added later.
void extractFeatures(const ROMol &mol, int confId, void extractFeatures(const Conformer &conf, unsigned int confNum,
const ShapeInputOptions &shapeOpts); const ShapeInputOptions &shapeOpts, bool fillAlphas);
// Calculate the rotation and translation that will align the principal axes // Calculate the rotation and translation that will align the principal axes
// to the cartesian axes and centre on the origin. // to the cartesian axes and centre on the origin.
void calcNormalization(); void calcNormalization();
void calculateExtremes(); void calculateExtremes();
std::vector<double> d_coords; // The coordinates and alpha parameter for the unsigned int d_activeShape;
// atoms and features, packed as 4 floats per
// item - x, y, z and alpha. alpha is KAPPA / (r * r) where r is the radius std::vector<std::vector<double>>
d_coords; // The coordinates for the atoms and features,
// packed as 3 floats per item - x, y, z
std::vector<double> d_alphas; // The alpha values for the atoms and features.
// alpha is KAPPA / (r * r) where r is the radius
// of the atom. This is not used if using all_atoms_carbon mode. // of the atom. This is not used if using all_atoms_carbon mode.
std::vector<int> d_types; // The feature types. The size is the same std::vector<int> d_types; // The feature types. The size is the same
// as the number of coordinates, padded with 0 // as the number of coordinates, padded with 0
// for the atoms. // for the atoms.
unsigned int d_numAtoms; // The number of atoms unsigned int d_numAtoms; // The number of atoms
unsigned int d_numFeats; // The number of features unsigned int d_numFeats; // The number of features
double d_selfOverlapVol{0.0}; // Shape volume std::vector<double> d_selfOverlapShapeVols; // Shape volume
double d_selfOverlapColor{0.0}; // Color volume std::vector<double> d_selfOverlapColorVols; // Color volume
// These are the points at the extremes of the x, y and z axes. // These are the points at the extremes of the x, y and z axes.
// they are min_x, min_y, min_z and max_x, max_y, max_z. // they are min_x, min_y, min_z and max_x, max_y, max_z.
std::array<size_t, 6> d_extremePoints; std::vector<std::array<size_t, 6>> d_extremePointss;
std::unique_ptr<boost::dynamic_bitset<>> std::unique_ptr<boost::dynamic_bitset<>>
d_carbonRadii; // Flags those atoms with a carbon radius, for faster d_carbonRadii; // Flags those atoms with a carbon radius, for faster
// calculation later. // calculation later.
std::string d_smiles; // The SMILES string of the input molecule
// This is the rotation and translation to align the principal axes of the // These are the rotation and translation matrices to align the principal
// shape with cartesian axes. If d_normalized is true, it has been applied // axes of the shape with cartesian axes. If d_normalized is true, it has
// to the coordinates. // been applied to the coordinates.
bool d_normalized{false}; boost::dynamic_bitset<> d_normalizeds;
// If the shape is moved, the normalization matrices are no longer valid. // If the shape is moved, the normalization matrices are no longer valid.
// This flags that so it is re-computed as required. // This flags that so it is re-computed as required.
bool d_normalizationOK{false}; boost::dynamic_bitset<> d_normalizationOKs;
std::array<double, 9> d_canonRot; std::vector<std::array<double, 9>> d_canonRots;
std::array<double, 3> d_canonTrans; std::vector<std::array<double, 3>> d_canonTranss;
// The sorted eigenvalues of the principal axes. // The sorted eigenvalues of the principal axes.
std::array<double, 3> d_eigenValues; std::vector<std::array<double, 3>> d_eigenValuess;
void selectConformations(const std::vector<int> &picks);
void calculateSelfOverlaps(const ShapeOverlayOptions &overlayOpts);
// Sort the shapes in descending order of the sum of the shape
// and color volumes.
void sortShapesByVolumes();
}; };
#ifdef RDK_USE_BOOST_SERIALIZATION
template <class Archive>
void ShapeInput::serialize(Archive &ar, const unsigned int) {
ar & d_activeShape;
ar & d_coords;
ar & d_alphas;
ar & d_types;
ar & d_numAtoms;
ar & d_numFeats;
ar & d_selfOverlapShapeVols;
ar & d_selfOverlapColorVols;
ar & d_extremePointss;
ar & d_carbonRadii;
ar & d_smiles;
ar & d_normalizeds;
ar & d_normalizationOKs;
ar & d_canonRots;
ar & d_canonTranss;
ar & d_eigenValuess;
}
#endif
// Extract the features from the molecule, optionally just for the subset
// of atoms.
RDKIT_GAUSSIANSHAPE_EXPORT void findFeatures(
const Conformer &conf, std::vector<CustomFeature> &features,
const std::optional<std::vector<unsigned int>> &atomSubset = std::nullopt);
// Calculate the mean position of the given atoms. // Calculate the mean position of the given atoms.
RDKIT_GAUSSIANSHAPE_EXPORT RDGeom::Point3D computeFeaturePos( RDKIT_GAUSSIANSHAPE_EXPORT RDGeom::Point3D computeFeaturePos(
const ROMol &mol, int confId, const std::vector<unsigned int> &ats); const Conformer &conf, const std::vector<unsigned int> &ats);
RDKIT_GAUSSIANSHAPE_EXPORT RDGeom::Transform3D quatTransToTransform( RDKIT_GAUSSIANSHAPE_EXPORT RDGeom::Transform3D quatTransToTransform(
const double *quat, const double *trans); const double *quat, const double *trans);
@@ -249,16 +381,21 @@ RDKIT_GAUSSIANSHAPE_EXPORT RDGeom::Transform3D quatTransToTransform(
// Apply the transformation to the coordinates assumed to be in // Apply the transformation to the coordinates assumed to be in
// ShapeInput.d_coords form. // ShapeInput.d_coords form.
RDKIT_GAUSSIANSHAPE_EXPORT void applyTransformToShape( RDKIT_GAUSSIANSHAPE_EXPORT void applyTransformToShape(
std::vector<double> &shape, RDGeom::Transform3D &xform); std::vector<double> &shape, const RDGeom::Transform3D &xform);
RDKIT_GAUSSIANSHAPE_EXPORT void applyTransformToShape( RDKIT_GAUSSIANSHAPE_EXPORT void applyTransformToShape(
const double *inShape, double *outShape, size_t numPoints, const double *inShape, double *outShape, size_t numPoints,
RDGeom::Transform3D &xform); const RDGeom::Transform3D &xform);
RDKIT_GAUSSIANSHAPE_EXPORT void translateShape( RDKIT_GAUSSIANSHAPE_EXPORT void translateShape(
std::vector<double> &shape, const RDGeom::Point3D &translation); std::vector<double> &shape, const RDGeom::Point3D &translation);
RDKIT_GAUSSIANSHAPE_EXPORT void translateShape( RDKIT_GAUSSIANSHAPE_EXPORT void translateShape(
const double *inShape, double *outShape, size_t numPoints, const double *inShape, double *outShape, size_t numPoints,
const RDGeom::Point3D &translation); const RDGeom::Point3D &translation);
// Maximum possible score of the 2 shape (v[12]) and color (c[12]) volumes
RDKIT_GAUSSIANSHAPE_EXPORT double maxScore(
double v1, double v2, double c1, double c2,
const ShapeOverlayOptions &overlayOpts);
} // namespace GaussianShape } // namespace GaussianShape
} // namespace RDKit } // namespace RDKit
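
Reviewer note: this header now stores coordinates as 3 doubles per point with the alpha values in a separate vector, and `negateAlpha` flips the sign of an alpha to switch a point off. The sketch below illustrates that layout under stated assumptions. `KAPPA`, `CARBON_RAD` and the relation alpha = KAPPA / (r * r) come from the header; `alphaFromRadius` and `kernelSum` are hypothetical names. The loop is modelled on the atom overload of `calcVolAndGrads` and accumulates only the pair kernel `exp(-(ai * aj) / (ai + aj) * d2)`; the volume prefactor and the gradients are left out because they are not shown in the hunks.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

constexpr double KAPPA = 2.41798793102;  // value from ShapeInput.h
constexpr double CARBON_RAD = 1.70;      // value from ShapeInput.h

// alpha is KAPPA / (r * r), as documented for d_alphas.
inline double alphaFromRadius(double r) { return KAPPA / (r * r); }

// Hypothetical helper: sums the Gaussian pair kernels between two point
// sets. Coordinates are packed x, y, z (3 doubles per point); alphas are
// held separately, one per point. A negative alpha marks a point that has
// been switched off, so any pair involving it is skipped.
double kernelSum(const std::vector<double> &ref,
                 const std::vector<double> &refAlphas,
                 const std::vector<double> &fit,
                 const std::vector<double> &fitAlphas) {
  double total = 0.0;
  for (std::size_t i = 0; i < refAlphas.size(); ++i) {
    const double ai = refAlphas[i];
    if (ai < 0.0) {
      continue;
    }
    for (std::size_t j = 0; j < fitAlphas.size(); ++j) {
      const double aj = fitAlphas[j];
      if (aj < 0.0) {
        continue;
      }
      const double dx = ref[3 * i] - fit[3 * j];
      const double dy = ref[3 * i + 1] - fit[3 * j + 1];
      const double dz = ref[3 * i + 2] - fit[3 * j + 2];
      const double d2 = dx * dx + dy * dy + dz * dz;
      total += std::exp(-(ai * aj) / (ai + aj) * d2);
    }
  }
  return total;
}
```

Two coincident carbon-radius points give a kernel sum of 1; multiplying either alpha by -1.0 drops the pair and the sum becomes 0. Keeping the alphas in their own vector means they can be shared by every conformation while only the coordinates differ per shape, which appears to be the motivation for the split in this PR.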

@@ -17,7 +17,6 @@
#include <RDGeneral/export.h> #include <RDGeneral/export.h>
namespace RDKit { namespace RDKit {
class ROMol;
namespace GaussianShape { namespace GaussianShape {
enum class RDKIT_GAUSSIANSHAPE_EXPORT StartMode { enum class RDKIT_GAUSSIANSHAPE_EXPORT StartMode {

@@ -22,21 +22,21 @@
#include <GraphMol/GaussianShape/ShapeOverlayOptions.h> #include <GraphMol/GaussianShape/ShapeOverlayOptions.h>
#include <GraphMol/GaussianShape/SingleConformerAlignment.h> #include <GraphMol/GaussianShape/SingleConformerAlignment.h>
constexpr int D = 4;
namespace RDKit { namespace RDKit {
namespace GaussianShape { namespace GaussianShape {
SingleConformerAlignment::SingleConformerAlignment( SingleConformerAlignment::SingleConformerAlignment(
const std::vector<double> &ref, const int *refTypes, const std::vector<double> &ref, const std::vector<double> &refAlphas,
const boost::dynamic_bitset<> *refCarbonRadii, int nRefShape, int nRefColor, const int *refTypes, const boost::dynamic_bitset<> *refCarbonRadii,
double refShapeVol, double refColorVol, const std::vector<double> &fit, int nRefShape, int nRefColor, double refShapeVol, double refColorVol,
const std::vector<double> &fit, const std::vector<double> &fitAlphas,
const int *fitTypes, const boost::dynamic_bitset<> *fitCarbonRadii, const int *fitTypes, const boost::dynamic_bitset<> *fitCarbonRadii,
int nFitShape, int nFitColor, double fitShapeVol, double fitColorVol, int nFitShape, int nFitColor, double fitShapeVol, double fitColorVol,
const std::array<double, 7> &initQuatTrans, OptimMode optimMode, const std::array<double, 7> &initQuatTrans, OptimMode optimMode,
double simAlpha, double simBeta, double mixingParam, bool useCutoff, double simAlpha, double simBeta, double mixingParam, bool useCutoff,
double distCutoff, double shapeConvergenceCriterion, unsigned int maxIts) double distCutoff, double shapeConvergenceCriterion, unsigned int maxIts)
: d_ref(ref), : d_ref(ref),
d_refAlphas(refAlphas),
d_refTypes(refTypes), d_refTypes(refTypes),
d_refCarbonRadii(refCarbonRadii), d_refCarbonRadii(refCarbonRadii),
d_nRefShape(nRefShape), d_nRefShape(nRefShape),
@@ -44,6 +44,7 @@ SingleConformerAlignment::SingleConformerAlignment(
d_refShapeVol(refShapeVol), d_refShapeVol(refShapeVol),
d_refColorVol(refColorVol), d_refColorVol(refColorVol),
d_fit(fit), d_fit(fit),
d_fitAlphas(fitAlphas),
d_fitTypes(fitTypes), d_fitTypes(fitTypes),
d_fitCarbonRadii(fitCarbonRadii), d_fitCarbonRadii(fitCarbonRadii),
d_nFitShape(nFitShape), d_nFitShape(nFitShape),
@@ -88,16 +89,18 @@ void SingleConformerAlignment::getFinalQuatTrans(
std::array<double, 5> SingleConformerAlignment::calcScores( std::array<double, 5> SingleConformerAlignment::calcScores(
const double *ref, const double *fit, const bool includeColor) { const double *ref, const double *fit, const bool includeColor) {
std::array<double, 5> scores{0.0, 0.0, 0.0, 0.0, 0.0}; std::array<double, 5> scores{0.0, 0.0, 0.0, 0.0, 0.0};
scores[3] = calcVolAndGrads(ref, d_nRefShape, d_refCarbonRadii, fit, scores[3] = calcVolAndGrads(ref, d_refAlphas.data(), d_nRefShape,
d_refCarbonRadii, fit, d_fitAlphas.data(),
d_nFitShape, d_fitCarbonRadii, d_gradConverters, d_nFitShape, d_fitCarbonRadii, d_gradConverters,
d_useCutoff, d_distCutoff2, nullptr, nullptr); d_useCutoff, d_distCutoff2, nullptr, nullptr);
if (d_nRefColor && d_nFitColor && if (d_nRefColor && d_nFitColor &&
(d_optimMode == OptimMode::SHAPE_PLUS_COLOR || includeColor)) { (d_optimMode == OptimMode::SHAPE_PLUS_COLOR || includeColor)) {
scores[4] = calcVolAndGrads(ref + d_nRefShape * D, d_nRefColor, scores[4] = calcVolAndGrads(
d_refTypes + d_nRefShape, fit + d_nFitShape * D, ref + d_nRefShape * 3, d_refAlphas.data() + d_nRefShape, d_nRefColor,
d_nFitColor, d_fitTypes + d_nFitShape, d_refTypes + d_nRefShape, fit + d_nFitShape * 3,
d_nFitShape, d_gradConverters, d_useCutoff, d_fitAlphas.data() + d_nFitShape, d_nFitColor, d_fitTypes + d_nFitShape,
d_distCutoff2, nullptr, nullptr); d_nFitShape, d_gradConverters, d_useCutoff, d_distCutoff2, nullptr,
nullptr);
} }
scores = calcScores(scores[3], scores[4], includeColor); scores = calcScores(scores[3], scores[4], includeColor);
return scores; return scores;
@@ -144,8 +147,8 @@ void cartToQuatGrads(const double *quat, const double *mol, const int numBPts,
const auto s = quat[2]; const auto s = quat[2];
const auto u = quat[3]; const auto u = quat[3];
const auto coef = 1.0 / (q * q + r * r + s * s + u * u); const auto coef = 1.0 / (q * q + r * r + s * s + u * u);
for (int i = 0, j = gradConvOffset, k = 12 * gradConvOffset; i < 4 * numBPts; for (int i = 0, j = gradConvOffset, k = 12 * gradConvOffset; i < 3 * numBPts;
i += 4, ++j, k += 12) { i += 3, ++j, k += 12) {
const auto x = mol[i]; const auto x = mol[i];
const auto y = mol[i + 1]; const auto y = mol[i + 1];
const auto z = mol[i + 2]; const auto z = mol[i + 2];
@@ -178,9 +181,11 @@ void cartToQuatGrads(const double *quat, const double *mol, const int numBPts,
} // namespace } // namespace
// atoms/shape features // atoms/shape features
double calcVolAndGrads(const double *ref, const int numRefPts, double calcVolAndGrads(const double *ref, const double *refAlphas,
const int numRefPts,
const boost::dynamic_bitset<> *refCarbonRadii, const boost::dynamic_bitset<> *refCarbonRadii,
const double *fit, const int numFitPts, const double *fit, const double *fitAlphas,
const int numFitPts,
const boost::dynamic_bitset<> *fitCarbonRadii, const boost::dynamic_bitset<> *fitCarbonRadii,
std::vector<double> &gradConverters, std::vector<double> &gradConverters,
const bool useCutoff, const double distCutoff2, const bool useCutoff, const double distCutoff2,
@@ -197,10 +202,13 @@ double calcVolAndGrads(const double *ref, const int numRefPts,
// both as being all carbon. There isn't enough information to do // both as being all carbon. There isn't enough information to do
// otherwise. // otherwise.
const bool allCarbon = !refCarbonRadii || !fitCarbonRadii; const bool allCarbon = !refCarbonRadii || !fitCarbonRadii;
for (int i = 0, i_idx = 0; i < numRefPts * 4; i += 4, i_idx++) { for (int i = 0, i_idx = 0; i < numRefPts * 3; i += 3, i_idx++) {
const auto ai = ref[i + 3]; const auto ai = refAlphas[i_idx];
for (int j = 0, j_idx = 0, k = 0; j < numFitPts * 4; if (ai < 0.0) {
j += 4, j_idx++, k += 12) { continue;
}
for (int j = 0, j_idx = 0, k = 0; j < numFitPts * 3;
j += 3, j_idx++, k += 12) {
const auto dx = ref[i] - fit[j]; const auto dx = ref[i] - fit[j];
const auto dy = ref[i + 1] - fit[j + 1]; const auto dy = ref[i + 1] - fit[j + 1];
const auto dz = ref[i + 2] - fit[j + 2]; const auto dz = ref[i + 2] - fit[j + 2];
@@ -208,7 +216,10 @@ double calcVolAndGrads(const double *ref, const int numRefPts,
if (useCutoff && d2 > distCutoff2) { if (useCutoff && d2 > distCutoff2) {
continue; continue;
} }
const auto aj = fit[j + 3]; const auto aj = fitAlphas[j_idx];
if (aj < 0.0) {
continue;
}
const auto mult = -(ai * aj) / (ai + aj); const auto mult = -(ai * aj) / (ai + aj);
const auto kij = exp(mult * d2); const auto kij = exp(mult * d2);
if (allCarbon || ((*refCarbonRadii)[i_idx] && (*fitCarbonRadii)[j_idx])) { if (allCarbon || ((*refCarbonRadii)[i_idx] && (*fitCarbonRadii)[j_idx])) {
@@ -247,8 +258,9 @@ double calcVolAndGrads(const double *ref, const int numRefPts,
} }
// color features // color features
double calcVolAndGrads(const double *ref, const int numRefPts, double calcVolAndGrads(const double *ref, const double *refAlphas,
const int *refTypes, const double *fit, const int numRefPts, const int *refTypes,
const double *fit, const double *fitAlphas,
const int numFitPts, const int *fitTypes, const int numFitPts, const int *fitTypes,
const int numFitShape, const int numFitShape,
std::vector<double> &gradConverters, std::vector<double> &gradConverters,
@@ -259,11 +271,11 @@ double calcVolAndGrads(const double *ref, const int numRefPts,
cartToQuatGrads(quat, fit, numFitPts, gradConverters, numFitShape); cartToQuatGrads(quat, fit, numFitPts, gradConverters, numFitShape);
} }
for (int i = 0, i_idx = 0; i < numRefPts * 4; i += 4, i_idx++) { for (int i = 0, i_idx = 0; i < numRefPts * 3; i += 3, i_idx++) {
const auto ai = ref[i + 3]; const auto ai = refAlphas[i_idx];
const auto aType = refTypes[i_idx]; const auto aType = refTypes[i_idx];
for (int j = 0, j_idx = 0, k = 0; j < numFitPts * 4; for (int j = 0, j_idx = 0, k = 0; j < numFitPts * 3;
j += 4, j_idx++, k += 12) { j += 3, j_idx++, k += 12) {
const auto bType = fitTypes[j_idx]; const auto bType = fitTypes[j_idx];
if (aType != bType) { if (aType != bType) {
continue; continue;
@@ -275,7 +287,7 @@ double calcVolAndGrads(const double *ref, const int numRefPts,
if (useCutoff && d2 > distCutoff2) { if (useCutoff && d2 > distCutoff2) {
continue; continue;
} }
const auto aj = fit[j + 3]; const auto aj = fitAlphas[j_idx];
const auto mult = -(ai * aj) / (ai + aj); const auto mult = -(ai * aj) / (ai + aj);
const auto kij = exp(mult * d2); const auto kij = exp(mult * d2);
@@ -333,14 +345,16 @@ void SingleConformerAlignment::calcVolumeAndGradients(
gradients[0] = gradients[1] = gradients[2] = gradients[3] = gradients[4] = gradients[0] = gradients[1] = gradients[2] = gradients[3] = gradients[4] =
gradients[5] = gradients[6] = 0.0; gradients[5] = gradients[6] = 0.0;
shapeOvlpVol = calcVolAndGrads( shapeOvlpVol = calcVolAndGrads(
d_refTemp.data(), d_nRefShape, d_refCarbonRadii, d_fitTemp.data(), d_refTemp.data(), d_refAlphas.data(), d_nRefShape, d_refCarbonRadii,
d_nFitShape, d_fitCarbonRadii, d_gradConverters, d_useCutoff, d_fitTemp.data(), d_fitAlphas.data(), d_nFitShape, d_fitCarbonRadii,
d_distCutoff2, quatTrans.data(), gradients.data()); d_gradConverters, d_useCutoff, d_distCutoff2, quatTrans.data(),
gradients.data());
if (d_optimMode == OptimMode::SHAPE_PLUS_COLOR) { if (d_optimMode == OptimMode::SHAPE_PLUS_COLOR) {
std::array<double, 7> colorGrads{0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0}; std::array<double, 7> colorGrads{0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
colorOvlpVol = calcVolAndGrads( colorOvlpVol = calcVolAndGrads(
d_refTemp.data() + 4 * d_nRefShape, d_nRefColor, d_refTemp.data() + 3 * d_nRefShape, d_refAlphas.data() + d_nRefShape,
d_refTypes + d_nRefShape, d_fitTemp.data() + 4 * d_nFitShape, d_nRefColor, d_refTypes + d_nRefShape,
d_fitTemp.data() + 3 * d_nFitShape, d_fitAlphas.data() + d_nFitShape,
d_nFitColor, d_fitTypes + d_nFitShape, d_nFitShape, d_gradConverters, d_nFitColor, d_fitTypes + d_nFitShape, d_nFitShape, d_gradConverters,
d_useCutoff, d_distCutoff2, quatTrans.data(), colorGrads.data()); d_useCutoff, d_distCutoff2, quatTrans.data(), colorGrads.data());
// The color gradients are normally dwarfed by the shape gradients, so // The color gradients are normally dwarfed by the shape gradients, so

@@ -25,6 +25,7 @@
#include <RDGeneral/BoostEndInclude.h> #include <RDGeneral/BoostEndInclude.h>
#include <RDGeneral/export.h> #include <RDGeneral/export.h>
#include <GraphMol/MolTransforms/MolTransforms.h>
#include <GraphMol/GaussianShape/ShapeOverlayOptions.h> #include <GraphMol/GaussianShape/ShapeOverlayOptions.h>
namespace RDKit { namespace RDKit {
@@ -34,16 +35,18 @@ struct RDKIT_GAUSSIANSHAPE_EXPORT SingleConformerAlignment {
/// @brief Do the overlay for a single conformer of fit against a single /// @brief Do the overlay for a single conformer of fit against a single
/// conformer of ref. The output in scores is the rotation and translation /// conformer of ref. The output in scores is the rotation and translation
/// that moves fit to optimise its score with ref. /// that moves fit to optimise its score with ref.
/// @param ref - the query molecule as 1D array of 4 * N entries. Each /// @param ref - the reference molecule as 1D array of 3 * N entries. Each
/// block of 4 is the coords and atom radius /// block of 3 is the coords
/// @param refAlphas - the alpha values for the reference
/// @param refTypes - the feature types for molecule ref /// @param refTypes - the feature types for molecule ref
/// @param refCarbonRadii - whether each atom has a carbon radius /// @param refCarbonRadii - whether each atom has a carbon radius
/// @param nRefShape - the number of atoms in ref /// @param nRefShape - the number of atoms in ref
/// @param nRefColor - the number of features in ref /// @param nRefColor - the number of features in ref
/// @param refShapeVol - overlap volume of ref with itself /// @param refShapeVol - overlap volume of ref with itself
/// @param refColorVol - color overlap of ref with itself /// @param refColorVol - color overlap of ref with itself
/// @param fit - the fit molecule as 1D array of 4 * N entries. Each /// @param fit - the fit molecule as 1D array of 3 * N entries. Each
/// block of 4 is the coords and atom radius. /// block of 3 is the coords.
/// @param fitAlphas - the alpha values for the fit
/// @param fitTypes - the feature types for fit molecule /// @param fitTypes - the feature types for fit molecule
/// @param fitCarbonRadii - whether each atom has a carbon radius /// @param fitCarbonRadii - whether each atom has a carbon radius
/// @param nFitShape - the number of atoms in fit /// @param nFitShape - the number of atoms in fit
@@ -60,12 +63,12 @@ struct RDKIT_GAUSSIANSHAPE_EXPORT SingleConformerAlignment {
/// @param maxIts - maximum number of iterations for optimiser /// @param maxIts - maximum number of iterations for optimiser
/// of optimiser /// of optimiser
SingleConformerAlignment( SingleConformerAlignment(
const std::vector<double> &ref, const int *refTypes, const std::vector<double> &ref, const std::vector<double> &refAlphas,
const boost::dynamic_bitset<> *refCarbonRadii, int nRefShape, const int *refTypes, const boost::dynamic_bitset<> *refCarbonRadii,
int nRefColor, double refShapeVol, double refColorVol, int nRefShape, int nRefColor, double refShapeVol, double refColorVol,
const std::vector<double> &fit, const int *fitTypes, const std::vector<double> &fit, const std::vector<double> &fitAlphas,
const boost::dynamic_bitset<> *fitCarbonRadii, int nFitShape, const int *fitTypes, const boost::dynamic_bitset<> *fitCarbonRadii,
int nFitColor, double fitShapeVol, double fitColorVol, int nFitShape, int nFitColor, double fitShapeVol, double fitColorVol,
const std::array<double, 7> &initQuatTrans, OptimMode optimMode, const std::array<double, 7> &initQuatTrans, OptimMode optimMode,
double simAlpha, double simBeta, double mixingParam, bool useCutoff, double simAlpha, double simBeta, double mixingParam, bool useCutoff,
double distCutoff, double shapeConvergenceCriterion, unsigned int maxIts); double distCutoff, double shapeConvergenceCriterion, unsigned int maxIts);
@@ -141,6 +144,7 @@ struct RDKIT_GAUSSIANSHAPE_EXPORT SingleConformerAlignment {
bool optimise(unsigned int maxIters); bool optimise(unsigned int maxIters);
std::vector<double> d_ref; std::vector<double> d_ref;
std::vector<double> d_refAlphas;
std::vector<double> d_refTemp; std::vector<double> d_refTemp;
const int *d_refTypes; const int *d_refTypes;
const boost::dynamic_bitset<> *d_refCarbonRadii; const boost::dynamic_bitset<> *d_refCarbonRadii;
@@ -149,6 +153,7 @@ struct RDKIT_GAUSSIANSHAPE_EXPORT SingleConformerAlignment {
const double d_refShapeVol; const double d_refShapeVol;
const double d_refColorVol; const double d_refColorVol;
std::vector<double> d_fit; std::vector<double> d_fit;
std::vector<double> d_fitAlphas;
std::vector<double> d_fitTemp; std::vector<double> d_fitTemp;
const int *d_fitTypes; const int *d_fitTypes;
const boost::dynamic_bitset<> *d_fitCarbonRadii; const boost::dynamic_bitset<> *d_fitCarbonRadii;
@@ -181,22 +186,23 @@ struct RDKIT_GAUSSIANSHAPE_EXPORT SingleConformerAlignment {
// by the quaternion we're using to optimise the overlap volume. If // by the quaternion we're using to optimise the overlap volume. If
// gradients is null, they won't be calculated. They are assumed to be // gradients is null, they won't be calculated. They are assumed to be
// initialised correctly. // initialised correctly.
// This is for the atoms/shape features. // This is for the atoms/shape features. Negative alphas are skipped.
double calcVolAndGrads(const double *ref, int numRefPts, RDKIT_GAUSSIANSHAPE_EXPORT double calcVolAndGrads(
const boost::dynamic_bitset<> *refCarbonRadii, const double *ref, const double *refAlphas, int numRefPts,
const double *fit, int numFitPts, const boost::dynamic_bitset<> *refCarbonRadii, const double *fit,
const boost::dynamic_bitset<> *fitCarbonRadii, const double *fitAlphas, int numFitPts,
std::vector<double> &gradConverters, const boost::dynamic_bitset<> *fitCarbonRadii,
const bool useCutoff, const double distCutoff2, std::vector<double> &gradConverters, const bool useCutoff,
const double *quat = nullptr, const double distCutoff2, const double *quat = nullptr,
double *gradients = nullptr); double *gradients = nullptr);
// This one is for the features, and only calculates values if the types // This one is for the features, and only calculates values if the types
// of 2 features match. // of 2 features match. Negative alphas are skipped.
double calcVolAndGrads(const double *ref, int numRefPts, const int *refTypes, RDKIT_GAUSSIANSHAPE_EXPORT double calcVolAndGrads(
const double *fit, int numFitPts, const int *fitTypes, const double *ref, const double *refAlphas, int numRefPts,
int numFitShape, std::vector<double> &gradConverters, const int *refTypes, const double *fit, const double *fitAlphas,
const bool useCutoff, const double distCutoff2, int numFitPts, const int *fitTypes, int numFitShape,
const double *quat, double *gradients); std::vector<double> &gradConverters, const bool useCutoff,
const double distCutoff2, const double *quat, double *gradients);
} // namespace GaussianShape } // namespace GaussianShape
} // namespace RDKit } // namespace RDKit
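
The header changes above split each shape into a flat array of 3 * N coordinates and a separate array of N alpha values, and note that negative alphas are skipped. A minimal Python sketch of that layout follows. It sums only the exponential pair term `exp(-(ai * aj) / (ai + aj) * d2)` visible in the diff. The function name `pair_overlap_terms` is hypothetical, and prefactors, carbon radii, cutoffs and gradients are deliberately left out, so this is not the library's volume calculation.

```python
import math


def pair_overlap_terms(ref, ref_alphas, fit, fit_alphas):
    """Sum exp(-(ai*aj)/(ai+aj) * d2) over all ref/fit point pairs.

    ref and fit are flat lists of 3 * N coordinates; the alphas live in
    separate lists of length N. Points with a negative alpha are skipped.
    Assumption: all remaining alphas are strictly positive.
    """
    total = 0.0
    for i, ai in enumerate(ref_alphas):
        if ai < 0.0:
            continue
        xi, yi, zi = ref[3 * i:3 * i + 3]
        for j, aj in enumerate(fit_alphas):
            if aj < 0.0:
                continue
            xj, yj, zj = fit[3 * j:3 * j + 3]
            # Squared distance between the two Gaussian centres.
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
            mult = -(ai * aj) / (ai + aj)
            total += math.exp(mult * d2)
    return total
```

Keeping the alphas in their own array means a point can be switched off by setting its alpha negative, without touching the coordinate array.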


@@ -16,6 +16,7 @@
#include <Geometry/point.h> #include <Geometry/point.h>
#include <GraphMol/ROMol.h> #include <GraphMol/ROMol.h>
#include <GraphMol/RWMol.h>
#include <GraphMol/GaussianShape/GaussianShape.h> #include <GraphMol/GaussianShape/GaussianShape.h>
#include <GraphMol/GaussianShape/ShapeInput.h> #include <GraphMol/GaussianShape/ShapeInput.h>
#include <GraphMol/GaussianShape/ShapeOverlayOptions.h> #include <GraphMol/GaussianShape/ShapeOverlayOptions.h>
@@ -29,26 +30,45 @@ namespace helpers {
void set_customFeatures(GaussianShape::ShapeInputOptions &shp, void set_customFeatures(GaussianShape::ShapeInputOptions &shp,
const python::object &s) { const python::object &s) {
shp.customFeatures.clear(); shp.customFeatures.clear();
auto len = python::len(s); auto numVecs = python::len(s);
shp.customFeatures.reserve(len); shp.customFeatures.reserve(numVecs);
for (auto i = 0u; i < len; ++i) { for (auto i = 0u; i < numVecs; ++i) {
const auto elem = s[i]; const auto outVec = s[i];
unsigned int featType = python::extract<unsigned int>(elem[0]); auto numFeats = python::len(outVec);
RDGeom::Point3D pos = python::extract<RDGeom::Point3D>(elem[1]); std::vector<GaussianShape::CustomFeature> feats;
double radius = python::extract<double>(elem[2]); feats.reserve(numFeats);
shp.customFeatures.emplace_back(featType, pos, radius); for (auto j = 0u; j < numFeats; ++j) {
const auto feat = outVec[j];
unsigned int featType = python::extract<unsigned int>(feat[0]);
RDGeom::Point3D pos = python::extract<RDGeom::Point3D>(feat[1]);
double radius = python::extract<double>(feat[2]);
std::vector<unsigned int> atoms;
if (len(feat) == 4) {
for (unsigned int k = 0; k < len(feat[3]); ++k) {
atoms.push_back(python::extract<unsigned int>(feat[3][k]));
}
}
feats.emplace_back(featType, pos, radius, atoms);
}
shp.customFeatures.emplace_back(std::move(feats));
} }
} }
python::tuple get_customFeatures(const GaussianShape::ShapeInputOptions &shp) { python::tuple get_customFeatures(const GaussianShape::ShapeInputOptions &shp) {
python::list py_list; python::list allFeatLists;
for (const auto &val : shp.customFeatures) { for (const auto &feats : shp.customFeatures) {
python::list elem; python::list featList;
elem.append(static_cast<int>(std::get<0>(val))); for (const auto &feat : feats) {
elem.append(std::get<1>(val)); python::list elem;
elem.append(std::get<2>(val)); elem.append(static_cast<int>(feat.type));
py_list.append(python::tuple(elem)); elem.append(feat.pos);
elem.append(feat.rad);
elem.append(feat.atoms);
featList.append(elem);
}
allFeatLists.append(featList);
} }
return python::tuple(py_list); return python::tuple(allFeatLists);
} }
python::tuple alignMol1(const ROMol &ref, ROMol &fit, python::tuple alignMol1(const ROMol &ref, ROMol &fit,
@@ -127,9 +147,11 @@ python::tuple scoreMol1(const ROMol &ref, const ROMol &fit,
overlayOpts = overlayOpts =
python::extract<GaussianShape::ShapeOverlayOptions>(py_overlayOpts); python::extract<GaussianShape::ShapeOverlayOptions>(py_overlayOpts);
} }
std::pair<double, double> ovVols;
auto results = GaussianShape::ScoreMolecule( auto results = GaussianShape::ScoreMolecule(
ref, fit, refOpts, fitOpts, overlayOpts, refConfId, fitConfId); ref, fit, refOpts, fitOpts, overlayOpts, refConfId, fitConfId, &ovVols);
return python::make_tuple(results[0], results[1], results[2]); return python::make_tuple(results[0], results[1], results[2], ovVols.first,
ovVols.second);
} }
python::tuple scoreMol2(const GaussianShape::ShapeInput &refShape, python::tuple scoreMol2(const GaussianShape::ShapeInput &refShape,
@@ -144,9 +166,11 @@ python::tuple scoreMol2(const GaussianShape::ShapeInput &refShape,
overlayOpts = overlayOpts =
python::extract<GaussianShape::ShapeOverlayOptions>(py_overlayOpts); python::extract<GaussianShape::ShapeOverlayOptions>(py_overlayOpts);
} }
std::pair<double, double> ovVols;
auto results = GaussianShape::ScoreMolecule(refShape, fit, fitOpts, auto results = GaussianShape::ScoreMolecule(refShape, fit, fitOpts,
overlayOpts, fitConfId); overlayOpts, fitConfId, &ovVols);
return python::make_tuple(results[0], results[1], results[2]); return python::make_tuple(results[0], results[1], results[2], ovVols.first,
ovVols.second);
} }
python::tuple scoreShape(const GaussianShape::ShapeInput &refShape, python::tuple scoreShape(const GaussianShape::ShapeInput &refShape,
@@ -157,8 +181,11 @@ python::tuple scoreShape(const GaussianShape::ShapeInput &refShape,
overlayOpts = overlayOpts =
python::extract<GaussianShape::ShapeOverlayOptions>(py_overlayOpts); python::extract<GaussianShape::ShapeOverlayOptions>(py_overlayOpts);
} }
auto results = GaussianShape::ScoreShape(refShape, fitShape, overlayOpts); std::pair<double, double> ovVols;
return python::make_tuple(results[0], results[1], results[2]); auto results =
GaussianShape::ScoreShape(refShape, fitShape, overlayOpts, &ovVols);
return python::make_tuple(results[0], results[1], results[2], ovVols.first,
ovVols.second);
} }
void set_atomSubset(GaussianShape::ShapeInputOptions &opts, void set_atomSubset(GaussianShape::ShapeInputOptions &opts,
@@ -193,6 +220,101 @@ python::tuple get_atomRadii(const GaussianShape::ShapeInputOptions &opts) {
return python::tuple(py_list); return python::tuple(py_list);
} }
double getShapeVolume_helper(const GaussianShape::ShapeInput &shape) {
return shape.getShapeVolume();
}
double getColorVolume_helper(const GaussianShape::ShapeInput &shape) {
return shape.getColorVolume();
}
python::tuple bestSimilarity_helper(GaussianShape::ShapeInput &refShape,
const GaussianShape::ShapeInput &fitShape,
double threshold,
const python::object &py_overlayOpts) {
GaussianShape::ShapeOverlayOptions overlayOpts;
if (!py_overlayOpts.is_none()) {
overlayOpts =
python::extract<GaussianShape::ShapeOverlayOptions>(py_overlayOpts);
}
unsigned int bestFitShape, bestThisShape;
RDGeom::Transform3D bestXform;
auto bestSim = refShape.bestSimilarity(fitShape, bestThisShape, bestFitShape,
bestXform, threshold, overlayOpts);
python::list results;
results.append(python::make_tuple(bestSim[0], bestSim[1], bestSim[2]));
results.append(bestThisShape);
results.append(bestFitShape);
python::list pyMatrix;
for (int i = 0; i < 4; ++i) {
for (int j = 0; j < 4; ++j) {
pyMatrix.append(bestXform.getValUnchecked(i, j));
}
}
results.append(pyMatrix);
return python::tuple(results);
}
double maxPossibleSimilarity_helper(GaussianShape::ShapeInput &refShape,
GaussianShape::ShapeInput &fitShape,
const python::object &py_overlayOpts) {
GaussianShape::ShapeOverlayOptions overlayOpts;
if (!py_overlayOpts.is_none()) {
overlayOpts =
python::extract<GaussianShape::ShapeOverlayOptions>(py_overlayOpts);
}
return refShape.maxPossibleSimilarity(fitShape, overlayOpts);
}
ROMol *shapeToMol_helper(GaussianShape::ShapeInput &shape, bool includeColors,
bool withBonds) {
auto mol = shape.shapeToMol(includeColors, withBonds);
return static_cast<ROMol *>(mol.release());
}
python::tuple scoreMolAllConfs_helper(const ROMol &ref, const ROMol &fit,
const python::object &py_refOpts,
const python::object &py_fitOpts,
const python::object &py_overlayOpts) {
python::list results;
GaussianShape::ShapeInputOptions refOpts, fitOpts;
if (!py_refOpts.is_none()) {
refOpts = python::extract<GaussianShape::ShapeInputOptions>(py_refOpts);
}
if (!py_fitOpts.is_none()) {
fitOpts = python::extract<GaussianShape::ShapeInputOptions>(py_fitOpts);
}
GaussianShape::ShapeOverlayOptions overlayOpts;
if (!py_overlayOpts.is_none()) {
overlayOpts =
python::extract<GaussianShape::ShapeOverlayOptions>(py_overlayOpts);
}
std::vector<std::vector<double>> combScores;
int bestRefConf, bestFitConf;
RDGeom::Transform3D bestXform;
GaussianShape::ScoreMoleculeAllConformers(ref, fit, bestRefConf, bestFitConf,
combScores, refOpts, fitOpts,
overlayOpts, &bestXform);
python::list pyScores;
for (const auto &scores : combScores) {
python::list s;
for (const auto &score : scores) {
s.append(score);
}
pyScores.append(s);
}
results.append(pyScores);
results.append(bestRefConf);
results.append(bestFitConf);
python::list pyMatrix;
for (int i = 0; i < 4; ++i) {
for (int j = 0; j < 4; ++j) {
pyMatrix.append(bestXform.getValUnchecked(i, j));
}
}
results.append(pyMatrix);
return python::tuple(results);
}
} // namespace helpers } // namespace helpers
void wrap_rdGaussianShape() { void wrap_rdGaussianShape() {
@@ -240,13 +362,27 @@ void wrap_rdGaussianShape() {
.add_property( .add_property(
"customFeatures", &helpers::get_customFeatures, "customFeatures", &helpers::get_customFeatures,
&helpers::set_customFeatures, &helpers::set_customFeatures,
"Custom features for the shape. Requires a list of tuples of" "Custom features for the shape. Requires a list of lists of tuples of"
" int (the feature type), Point3D (the coordinates) and float (the radius).") " int (the feature type), Point3D (the coordinates), float (the radius)"
" and optionally a list of indices of the atoms that the feature was derived from.")
.add_property( .add_property(
"atomRadii", &helpers::get_atomRadii, &helpers::set_atomRadii, "atomRadii", &helpers::get_atomRadii, &helpers::set_atomRadii,
"Non-standard radii to use for the atoms specified by their indices" "Non-standard radii to use for the atoms specified by their indices"
" in the molecule. Not all atoms need have a radius specified." " in the molecule. Not all atoms need have a radius specified."
" A list of tuples of [int, float].") " A list of tuples of [int, float].")
.def_readwrite(
"shapePruneThreshold",
&GaussianShape::ShapeInputOptions::shapePruneThreshold,
"If there is more than 1 conformer for the input molecule, prune the"
" shapes so that none of them are more similar to each other than the"
" threshold. Default -1.0 means no pruning.")
.def_readwrite(
"sortShapes", &GaussianShape::ShapeInputOptions::sortShapes,
"If True (the default), the shapes are sorted into descending order"
" of total volume.")
.def_readwrite(
"includeDummies", &GaussianShape::ShapeInputOptions::includeDummies,
"Whether to include dummy atoms in the shape or not. Default=True.")
.def("__setattr__", &safeSetattr); .def("__setattr__", &safeSetattr);
python::class_<GaussianShape::ShapeOverlayOptions, boost::noncopyable>( python::class_<GaussianShape::ShapeOverlayOptions, boost::noncopyable>(
@@ -322,14 +458,70 @@ void wrap_rdGaussianShape() {
python::init<const ROMol &, int, const GaussianShape::ShapeInputOptions &, python::init<const ROMol &, int, const GaussianShape::ShapeInputOptions &,
const GaussianShape::ShapeOverlayOptions &>( const GaussianShape::ShapeOverlayOptions &>(
python::args("self", "confId", "shapeOpt", "overlayOpts"))) python::args("self", "confId", "shapeOpt", "overlayOpts")))
.add_property(
"GetSmiles", &GaussianShape::ShapeInput::getSmiles,
"Get the SMILES string for the molecule that the shape relates to.")
.add_property(
"setActiveShape", &GaussianShape::ShapeInput::setActiveShape,
"Set the active shape, the one that will be used for overlays etc.")
.add_property("getActiveShape",
&GaussianShape::ShapeInput::getActiveShape,
"Return the number of the active shape.")
.add_property("NumAtoms", &GaussianShape::ShapeInput::getNumAtoms, .add_property("NumAtoms", &GaussianShape::ShapeInput::getNumAtoms,
"Get the number of atoms defining the shape.") "Get the number of atoms defining the shape.")
.add_property("NumFeatures", &GaussianShape::ShapeInput::getNumFeatures, .add_property("NumFeatures", &GaussianShape::ShapeInput::getNumFeatures,
"Get the number of features in the shape.") "Get the number of features in the shape.")
.add_property("ShapeVolume", &GaussianShape::ShapeInput::getShapeVolume, .add_property(
"Get the shape's volume due to the atoms.") "NumShapes", &GaussianShape::ShapeInput::getNumShapes,
.add_property("ColorVolume", &GaussianShape::ShapeInput::getColorVolume, "Get the number of shapes. There will be a shape for each conformation "
"Get the volume of the shape's color features.") "of the input molecule, unless shape pruning was carried out in which case"
" there may be fewer.")
.add_property("ShapeVolume", &helpers::getShapeVolume_helper,
"Get the volume due to the atoms for the active shape.")
.add_property(
"ColorVolume", &helpers::getColorVolume_helper,
"Get the volume of the shape's color features for the active shape.")
.def(
"NormalizeCoords", &GaussianShape::ShapeInput::normalizeCoords,
"Align the principal axes to the cartesian axes and centre on the origin."
" Doesn't require that the shape was created from a molecule. Creates"
" the necessary transformation if not already done.")
.def(
"ShapeToMol", &helpers::shapeToMol_helper,
(python::arg("self"), python::arg("includeColors") = false,
python::arg("withBonds") = true),
"Return a molecule with coordinates of the current active shape."
" If includeColors is True, (default is False) the color features"
" will be added as xenon atoms. If withBonds is True (the default)"
" a molecule with bonds will be created, if not then just atoms at the"
" appropriate positions will be produced.",
python::return_value_policy<python::manage_new_object>())
.def(
"BestSimilarity", &helpers::bestSimilarity_helper,
(python::arg("self"), python::arg("fitShape"),
python::arg("threshold") = -1.0,
python::arg("overlayOpts") = python::object()),
"Find the best similarity score between all shapes in this shape and the"
" other one. Stops as soon as it gets something above the threshold."
" The score runs between 0.0 and 1.0, so the default threshold of -1.0"
" means no threshold. Fills in the shape numbers of the two that were"
" responsible if there is something above the threshold, and the"
" transformation that did it. Returns a tuple of the similarity scores"
" ((-1.0, -1.0, -1.0) if there was nothing above the threshold), the number of the"
" shape for this object and the shape number of the fitShape that gave"
" the best similarity and the transformation matrix (as a list of 16 floats)"
" that will reproduce the best overlay. The shapes won't necessarily"
" be left in the state that gave the best similarity. Note that the"
" shape numbers are not necessarily the same as the original molecule"
" conformation numbers.")
.def("MaxPossibleSimilarity", &helpers::maxPossibleSimilarity_helper,
(python::arg("self"), python::arg("fitShape"),
python::arg("overlayOpts") = python::object()),
"Get the maximum possible similarity score between all shapes in"
" this shape and all shapes in the fitShape. The maximum similarity"
" is when one shape is entirely inside the other. This returns"
" the similarity in that case, which is the upper bound on what"
" is achievable between these 2 shapes.")
.def("__setattr__", &safeSetattr); .def("__setattr__", &safeSetattr);
python::def( python::def(
@@ -506,6 +698,40 @@ Returns
0.0 if color features not used, in which case combo_score and shape_score will 0.0 if color features not used, in which case combo_score and shape_score will
be the same. be the same.
)DOC"); )DOC");
python::def("ScoreMoleculeAllConformers", &helpers::scoreMolAllConfs_helper,
(python::arg("ref"), python::arg("fit"),
python::arg("refOpts") = python::object(),
python::arg("fitOpts") = python::object(),
python::arg("overlayOpts") = python::object()),
R"DOC(Calculate the scores for the alignment of all conformers
of the fit molecule onto the reference. The molecules themselves are not
altered.
Parameters
----------
ref: RDKit.ROMol
Reference molecule
fit: RDKit.ROMol
Fit molecule that will be scored
refOpts: ShapeInputOptions, optional
Options for building the ref shape
fitOpts: ShapeInputOptions, optional
Options for building the fit shape
overlayOpts: ShapeOverlayOptions, optional
Options for controlling the volume calculation
Returns
-------
A complex tuple containing:
A tuple of tuples containing the scores from aligning the fit conformations
onto the reference conformations. scores[0][1] is the score of aligning
fit conformation 1 onto ref conformation 0.
The ID of the ref conformer from the best-scoring alignment
The ID of the fit conformer from the best-scoring alignment
The transformation that gives the best-scoring alignment for those
conformers as a 16-float tuple.
)DOC");
} }
BOOST_PYTHON_MODULE(rdGaussianShape) { wrap_rdGaussianShape(); } BOOST_PYTHON_MODULE(rdGaussianShape) { wrap_rdGaussianShape(); }
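
The helpers above return the best transformation as a flat list of 16 floats, filled row by row from the 4x4 matrix (outer loop over i, inner loop over j). The sketch below shows one way to turn that list back into a matrix and apply it to a point. The names `to_matrix` and `apply_transform` are hypothetical, and it assumes the translation sits in the last column, so check that against the RDGeom::Transform3D convention before relying on it.

```python
def to_matrix(flat):
    """Rebuild a 4x4 row-major matrix from a flat sequence of 16 floats."""
    if len(flat) != 16:
        raise ValueError("expected 16 values, got %d" % len(flat))
    return [list(flat[4 * i:4 * i + 4]) for i in range(4)]


def apply_transform(flat, point):
    """Apply the transform to an (x, y, z) point.

    Assumption: rotation in the upper-left 3x3 block, translation in the
    last column of the first three rows.
    """
    m = to_matrix(flat)
    x, y, z = point
    return tuple(m[r][0] * x + m[r][1] * y + m[r][2] * z + m[r][3]
                 for r in range(3))
```

The same function can be fed the fourth element of the tuples returned by `BestSimilarity` or `ScoreMoleculeAllConformers`.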


@@ -2,12 +2,12 @@ import unittest
import numpy as np import numpy as np
from rdkit import Chem from rdkit import Chem
from rdkit.Chem import rdGaussianShape, rdMolTransforms from rdkit.Chem import rdGaussianShape, rdMolTransforms, rdDistGeom
from rdkit import RDConfig from rdkit import RDConfig
from rdkit.Geometry import Point3D from rdkit.Geometry import Point3D
datadir = RDConfig.RDBaseDir + '/External/pubchem_shape/test_data' datadir = RDConfig.RDBaseDir + '/Code/GraphMol/GaussianShape/test_data'
class TestCase(unittest.TestCase): class TestCase(unittest.TestCase):
@@ -46,50 +46,63 @@ class TestCase(unittest.TestCase):
self.assertAlmostEqual(tpl[1], 0.760, places=3) self.assertAlmostEqual(tpl[1], 0.760, places=3)
self.assertAlmostEqual(tpl[2], 0.233, places=3) self.assertAlmostEqual(tpl[2], 0.233, places=3)
mol = shp.ShapeToMol()
self.assertEqual(Chem.MolToSmiles(mol), "CC(=O)Oc1ccccc1C(=O)O")
mol = shp.ShapeToMol(True)
self.assertEqual(Chem.MolToSmiles(mol), "CC(=O)Oc1ccccc1C(=O)O.[Xe].[Xe].[Xe].[Xe].[Xe].[Xe]")
mol = shp.ShapeToMol(False, False)
self.assertEqual(Chem.MolToSmiles(mol), "C.C.C.C.C.C.C.C.C.C.C.C.C")
mol = shp.ShapeToMol(True, False)
self.assertEqual(Chem.MolToSmiles(mol), "C.C.C.C.C.C.C.C.C.C.C.C.C.[Xe].[Xe].[Xe].[Xe].[Xe].[Xe]")
def test4_customFeatures(self): def test4_customFeatures(self):
m1 = Chem.MolFromSmiles( m1 = Chem.MolFromSmiles(
"O=CC=O |(-1.75978,0.148897,0;-0.621382,-0.394324,0;0.624061,0.3656,.1;1.7571,-0.120174,.1)|") "O=CC=O |(-1.75978,0.148897,0;-0.621382,-0.394324,0;0.624061,0.3656,.1;1.7571,-0.120174,.1)|")
opts = rdGaussianShape.ShapeInputOptions() opts = rdGaussianShape.ShapeInputOptions()
opts.customFeatures = ((1, Point3D(-1.75978, 0.148897, opts.customFeatures = [[(1, Point3D(-1.75978, 0.148897, 0), 1.0),
0), 1.0), (2, Point3D(1.7571, -0.120174, 0.1), 1.0)) (2, Point3D(1.7571, -0.120174, 0.1), 1.0)]]
ovOpts = rdGaussianShape.ShapeOverlayOptions() ovOpts = rdGaussianShape.ShapeOverlayOptions()
shp = rdGaussianShape.ShapeInput(m1, -1, opts, ovOpts) shp = rdGaussianShape.ShapeInput(m1, -1, opts, ovOpts)
self.assertEqual(shp.NumAtoms, 4) self.assertEqual(shp.NumAtoms, 4)
self.assertEqual(shp.NumFeatures, 2) self.assertEqual(shp.NumFeatures, 2)
m2 = Chem.Mol(m1) m2 = Chem.Mol(m1)
opts2 = rdGaussianShape.ShapeInputOptions() opts2 = rdGaussianShape.ShapeInputOptions()
opts2.customFeatures = ((2, Point3D(-1.75978, 0.148897, opts2.customFeatures = [[(2, Point3D(-1.75978, 0.148897, 0), 1.0, [1, 2, 3]),
0), 1.0), (1, Point3D(1.7571, -0.120174, 0.1), 1.0)) (1, Point3D(1.7571, -0.120174, 0.1), 1.0, [4, 5, 6])]]
shp2 = rdGaussianShape.ShapeInput(m2, -1, opts2, ovOpts) shp2 = rdGaussianShape.ShapeInput(m2, -1, opts2, ovOpts)
tpl = rdGaussianShape.AlignShapes(shp, shp2, ovOpts) tpl = rdGaussianShape.AlignShapes(shp, shp2, ovOpts)
self.assertAlmostEqual(tpl[0], 0.999, places=3) self.assertAlmostEqual(tpl[0], 0.999, places=3)
self.assertAlmostEqual(tpl[1], 1.000, places=3) self.assertAlmostEqual(tpl[1], 1.000, places=3)
self.assertAlmostEqual(tpl[2], 0.999, places=3) self.assertAlmostEqual(tpl[2], 0.998, places=3)
tf = tpl[3] tf = tpl[3]
self.assertGreater(0.0, tf[0]) self.assertGreater(0.0, tf[0])
self.assertEqual(1.0, tf[15]) self.assertEqual(1.0, tf[15])
# check the getter: # check the getter:
cfs = opts2.customFeatures cfs = opts2.customFeatures
self.assertEqual(len(cfs), 2) self.assertEqual(len(cfs), 1)
self.assertEqual(cfs[0][0], 2) self.assertEqual(len(cfs[0]), 2)
self.assertEqual(cfs[1][0], 1) self.assertEqual(len(cfs[0][0]), 4)
self.assertEqual(cfs[0][0][0], 2)
self.assertEqual(cfs[0][1][3][0], 4)
self.assertEqual(cfs[0][1][3][1], 5)
self.assertEqual(cfs[0][1][3][2], 6)
def test5_customFeatures(self): def test5_customFeatures(self):
m1 = Chem.MolFromSmiles( m1 = Chem.MolFromSmiles(
"O=CC=O |(-1.75978,0.148897,0;-0.621382,-0.394324,0;0.624061,0.3656,.1;1.7571,-0.120174,.1)|") "O=CC=O |(-1.75978,0.148897,0;-0.621382,-0.394324,0;0.624061,0.3656,.1;1.7571,-0.120174,.1)|")
opts = rdGaussianShape.ShapeInputOptions() opts = rdGaussianShape.ShapeInputOptions()
opts.customFeatures = ((1, Point3D(-1.75978, 0.148897, opts.customFeatures = [[(1, Point3D(-1.75978, 0.148897,
0), 1.0), (2, Point3D(1.7571, -0.120174, 0.1), 1.0)) 0), 1.0), (2, Point3D(1.7571, -0.120174, 0.1), 1.0)]]
m2 = Chem.Mol(m1) m2 = Chem.Mol(m1)
opts2 = rdGaussianShape.ShapeInputOptions() opts2 = rdGaussianShape.ShapeInputOptions()
opts2.customFeatures = ((2, Point3D(-1.75978, 0.148897, opts2.customFeatures = [[(2, Point3D(-1.75978, 0.148897,
0), 1.0), (1, Point3D(1.7571, -0.120174, 0.1), 1.0)) 0), 1.0), (1, Point3D(1.7571, -0.120174, 0.1), 1.0)]]
ovOpts = rdGaussianShape.ShapeOverlayOptions() ovOpts = rdGaussianShape.ShapeOverlayOptions()
tpl = rdGaussianShape.AlignMol(m1, m2, opts, opts2, ovOpts) tpl = rdGaussianShape.AlignMol(m1, m2, opts, opts2, ovOpts)
self.assertAlmostEqual(tpl[0], 0.999, places=3) self.assertAlmostEqual(tpl[0], 0.999, places=3)
self.assertAlmostEqual(tpl[1], 1.000, places=3) self.assertAlmostEqual(tpl[1], 1.000, places=3)
self.assertAlmostEqual(tpl[2], 0.999, places=3) self.assertAlmostEqual(tpl[2], 0.998, places=3)
def test6_FixedScore(self): def test6_FixedScore(self):
ovOpts = rdGaussianShape.ShapeOverlayOptions() ovOpts = rdGaussianShape.ShapeOverlayOptions()
@@ -116,6 +129,8 @@ class TestCase(unittest.TestCase):
self.assertAlmostEqual(tpl[0], 1.0, places=3) self.assertAlmostEqual(tpl[0], 1.0, places=3)
self.assertAlmostEqual(tpl[1], 1.0, places=3) self.assertAlmostEqual(tpl[1], 1.0, places=3)
self.assertAlmostEqual(tpl[2], 1.0, places=3) self.assertAlmostEqual(tpl[2], 1.0, places=3)
self.assertAlmostEqual(tpl[3], 751.0, places=1)
self.assertAlmostEqual(tpl[4], 42.5, places=1)
def test7_customAtomRadii(self): def test7_customAtomRadii(self):
ovOpts = rdGaussianShape.ShapeOverlayOptions() ovOpts = rdGaussianShape.ShapeOverlayOptions()
@@ -155,7 +170,55 @@ class TestCase(unittest.TestCase):
self.assertAlmostEqual(fit_tversky[0], 0.557, places=3) self.assertAlmostEqual(fit_tversky[0], 0.557, places=3)
self.assertAlmostEqual(fit_tversky[1], 0.780, places=3) self.assertAlmostEqual(fit_tversky[1], 0.780, places=3)
self.assertAlmostEqual(fit_tversky[2], 0.335, places=3) self.assertAlmostEqual(fit_tversky[2], 0.335, places=3)
def test10_multipleConformers(self):
def loadFile(fileName):
mol = None
with open(fileName, "r") as f:
for line in f:
line = line.strip()
if not line:
continue
m = Chem.MolFromSmiles(line)
if mol is None:
mol = m
else:
mol.AddConformer(m.GetConformer())
return mol
esofile = datadir + "/esomeprazole_multi.smi"
esomeprazole = loadFile(esofile)
self.assertEqual(esomeprazole.GetNumConformers(), 10)
ovOpts = rdGaussianShape.ShapeOverlayOptions()
opts = rdGaussianShape.ShapeInputOptions()
# Set them to their default values, just to show that they can be set.
opts.sortShapes = True
opts.includeDummies = True
opts.shapePruneThreshold = -1.0
shapes1 = rdGaussianShape.ShapeInput(esomeprazole, -1, opts, ovOpts)
self.assertEqual(shapes1.NumShapes, 10)
ranfile = datadir + "/ranitidine_multi.smi"
ranitidine = loadFile(ranfile)
shapes2 = rdGaussianShape.ShapeInput(ranitidine, -1, opts, ovOpts)
self.assertEqual(shapes2.NumShapes, 10)
bestSim, best1, best2, xform = shapes1.BestSimilarity(shapes2)
self.assertAlmostEqual(bestSim[0], 0.449, places=3)
self.assertEqual(best1, 2)
self.assertEqual(best2, 6)
self.assertEqual(len(xform), 16)
stuff = rdGaussianShape.ScoreMoleculeAllConformers(esomeprazole, ranitidine)
self.assertEqual(len(stuff), 4)
self.assertEqual(len(stuff[0]), 10)
self.assertEqual(len(stuff[0][0]), 10)
self.assertEqual(stuff[1], 8)
self.assertEqual(stuff[2], 3)
self.assertAlmostEqual(stuff[0][8][3], 0.449, places=3)
if __name__ == '__main__': if __name__ == '__main__':
unittest.main() unittest.main()
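
In test10 above, the first element of the `ScoreMoleculeAllConformers` result is indexed as `scores[ref][fit]`, and the reported best pair (8, 3) matches `stuff[0][8][3]`. The following sketch shows how that best pair could be recovered from such a nested score list. The name `best_pair` is hypothetical, and it assumes list indices coincide with conformer IDs, which holds in this test but need not hold in general.

```python
def best_pair(scores):
    """Return (score, ref_index, fit_index) for the largest entry.

    scores[i][j] is taken to be the score of fit conformer j aligned onto
    ref conformer i. Ties keep the first entry found.
    """
    best = None
    for ref_idx, row in enumerate(scores):
        for fit_idx, score in enumerate(row):
            if best is None or score > best[0]:
                best = (score, ref_idx, fit_idx)
    if best is None:
        raise ValueError("empty score matrix")
    return best
```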


@@ -20,12 +20,15 @@
#include <catch2/matchers/catch_matchers_floating_point.hpp> #include <catch2/matchers/catch_matchers_floating_point.hpp>
#include <GraphMol/MolOps.h> #include <GraphMol/MolOps.h>
#include <GraphMol/DistGeomHelpers/Embedder.h>
#include <GraphMol/FileParsers/MolWriters.h> #include <GraphMol/FileParsers/MolWriters.h>
#include <GraphMol/FileParsers/MolSupplier.h> #include <GraphMol/FileParsers/MolSupplier.h>
#include <GraphMol/GaussianShape/GaussianShape.h> #include <GraphMol/GaussianShape/GaussianShape.h>
#include <GraphMol/GaussianShape/ShapeInput.h> #include <GraphMol/GaussianShape/ShapeInput.h>
#include <GraphMol/MolTransforms/MolTransforms.h> #include <GraphMol/MolTransforms/MolTransforms.h>
#include <GraphMol/SmilesParse/SmilesWrite.h> #include <GraphMol/SmilesParse/SmilesWrite.h>
#include <RDGeneral/RDLog.h>
#include <GraphMol/Substruct/SubstructMatch.h>
using namespace RDKit; using namespace RDKit;
@@ -48,7 +51,7 @@ bool checkMolsHaveRoughlySameCoords(const ROMol &m1, const ROMol &m2,
 TEST_CASE("basic alignment") {
 std::string dirName = getenv("RDBASE");
-dirName += "/External/pubchem_shape/test_data";
+dirName += "/Code/GraphMol/GaussianShape/test_data";
 auto suppl = v2::FileParsers::SDMolSupplier(dirName + "/test1.sdf");
 auto refT = suppl[0];
@@ -198,7 +201,6 @@ TEST_CASE("bulk") {
 CHECK_THAT(rescores[0], Catch::Matchers::WithinAbs(scores[0], 0.005));
 CHECK_THAT(rescores[1], Catch::Matchers::WithinAbs(scores[1], 0.005));
 CHECK_THAT(rescores[2], Catch::Matchers::WithinAbs(scores[2], 0.005));
 writer.write(*probe);
 break;
 }
@@ -207,7 +209,7 @@ TEST_CASE("bulk") {
 TEST_CASE("shape alignment") {
 std::string dirName = getenv("RDBASE");
-dirName += "/External/pubchem_shape/test_data";
+dirName += "/Code/GraphMol/GaussianShape/test_data";
 auto suppl = v2::FileParsers::SDMolSupplier(dirName + "/test1.sdf");
 auto ref = suppl[0];
@@ -224,11 +226,15 @@ TEST_CASE("shape alignment") {
 CHECK_THAT(scores[1], Catch::Matchers::WithinAbs(0.760, 0.005));
 CHECK_THAT(scores[2], Catch::Matchers::WithinAbs(0.235, 0.005));
 // This effectively checks that xform is correct.
-auto rescores = GaussianShape::ScoreShape(refShape, probeShape);
+std::pair<double, double> overlapVols;
+auto rescores = GaussianShape::ScoreShape(
+refShape, probeShape, GaussianShape::ShapeOverlayOptions(), &overlapVols);
 CHECK_THAT(rescores[0], Catch::Matchers::WithinAbs(scores[0], 0.001));
 CHECK_THAT(rescores[1], Catch::Matchers::WithinAbs(scores[1], 0.001));
 CHECK_THAT(rescores[2], Catch::Matchers::WithinAbs(scores[2], 0.001));
+CHECK_THAT(rescores[2], Catch::Matchers::WithinAbs(scores[2], 0.001));
+CHECK_THAT(overlapVols.first, Catch::Matchers::WithinAbs(579.649, 0.005));
+CHECK_THAT(overlapVols.second, Catch::Matchers::WithinAbs(14.0792, 0.005));
 SmilesWriteParams params;
 params.canonical = false;
 // The input structure being from an SDF doesn't have the atoms in an order
@@ -271,8 +277,8 @@ TEST_CASE("Overlay onto shape bug (Github8462)") {
 CHECK_THAT(scores1[1], Catch::Matchers::WithinAbs(1.0, 0.005));
 CHECK_THAT(scores1[2], Catch::Matchers::WithinAbs(1.0, 0.005));
 for (unsigned int i = 0; i < m3.getNumAtoms(); ++i) {
-RDGeom::Point3D pos1(s1.getCoords()[4 * i], s1.getCoords()[4 * i + 1],
-s1.getCoords()[4 * i + 2]);
+RDGeom::Point3D pos1(s1.getCoords()[3 * i], s1.getCoords()[3 * i + 1],
+s1.getCoords()[3 * i + 2]);
 auto pos2 = m3.getConformer().getAtomPos(i);
 CHECK_THAT((pos1 - pos2).length(), Catch::Matchers::WithinAbs(0.0, 0.01));
 }
@@ -303,10 +309,16 @@ TEST_CASE("handling molecules with Hs") {
 CHECK((pos.x > -10 && pos.x < 10));
 }
 // Check the rescore
-auto rescores = GaussianShape::ScoreMolecule(*ref, cp);
+std::pair<double, double> overlapVols;
+auto rescores = GaussianShape::ScoreMolecule(
+*ref, cp, GaussianShape::ShapeInputOptions(),
+GaussianShape::ShapeInputOptions(),
+GaussianShape::ShapeOverlayOptions(), -1, -1, &overlapVols);
 CHECK_THAT(rescores[0], Catch::Matchers::WithinAbs(scores[0], 0.005));
 CHECK_THAT(rescores[1], Catch::Matchers::WithinAbs(scores[1], 0.005));
 CHECK_THAT(rescores[2], Catch::Matchers::WithinAbs(scores[2], 0.005));
+CHECK_THAT(overlapVols.first, Catch::Matchers::WithinAbs(355.566, 0.005));
+CHECK_THAT(overlapVols.second, Catch::Matchers::WithinAbs(8.261, 0.005));
 }
 }
@@ -407,10 +419,15 @@ TEST_CASE("Score No Overlay") {
 }
 {
 auto shape = GaussianShape::ShapeInput(*pdb_trp_3tmn, -1, shapeOpts);
-auto scores = ScoreMolecule(shape, *pdb_0zn_1tmn, shapeOpts);
+std::pair<double, double> overlapVols;
+auto scores =
+ScoreMolecule(shape, *pdb_0zn_1tmn, shapeOpts,
+GaussianShape::ShapeOverlayOptions(), -1, &overlapVols);
 CHECK_THAT(scores[0], Catch::Matchers::WithinAbs(0.307, 0.005));
 CHECK_THAT(scores[1], Catch::Matchers::WithinAbs(0.349, 0.005));
 CHECK_THAT(scores[2], Catch::Matchers::WithinAbs(0.265, 0.005));
+CHECK_THAT(overlapVols.first, Catch::Matchers::WithinAbs(590.831, 0.005));
+CHECK_THAT(overlapVols.second, Catch::Matchers::WithinAbs(26.736, 0.005));
 }
 {
 auto shape1 = GaussianShape::ShapeInput(*pdb_trp_3tmn, -1, shapeOpts);
@@ -560,9 +577,9 @@ TEST_CASE("Fragment Mode") {
 // Use the smaller molecule as the probe
 auto scores = GaussianShape::AlignShape(refShape, probeShape, &xform, opts);
 // These are close to the values above for starting from the xtal structures.
-CHECK_THAT(scores[0], Catch::Matchers::WithinAbs(0.332, 0.005));
+CHECK_THAT(scores[0], Catch::Matchers::WithinAbs(0.315, 0.005));
 CHECK_THAT(scores[1], Catch::Matchers::WithinAbs(0.413, 0.005));
-CHECK_THAT(scores[2], Catch::Matchers::WithinAbs(0.251, 0.005));
+CHECK_THAT(scores[2], Catch::Matchers::WithinAbs(0.220, 0.005));
 }
 TEST_CASE("custom feature points") {
@@ -571,45 +588,48 @@ TEST_CASE("custom feature points") {
 SECTION("using shapes") {
 auto shape1 = GaussianShape::ShapeInput(*m1, -1);
 // each carbonyl O gets one feature:
-CHECK(shape1.getCoords().size() == 24);
+CHECK(shape1.getCoords().size() == 18);
 GaussianShape::ShapeInputOptions opts2;
-opts2.customFeatures = GaussianShape::CustomFeatures{
-{1, RDGeom::Point3D(-1.75978, 0.148897, 0), 1.0},
-{2, RDGeom::Point3D(1.7571, -0.120174, 0.1), 1.0}};
+opts2.customFeatures = {{{1, RDGeom::Point3D(-1.75978, 0.148897, 0), 1.0,
+std::vector<unsigned int>{}},
+{2, RDGeom::Point3D(1.7571, -0.120174, 0.1), 1.0,
+std::vector<unsigned int>{}}}};
 auto shape2 = GaussianShape::ShapeInput(*m1, -1, opts2);
-CHECK(shape2.getCoords().size() == 24);
+CHECK(shape2.getCoords().size() == 18);
 {
 // confirm that we don't add the features if not requested.
 GaussianShape::ShapeInputOptions topts;
-topts.customFeatures = GaussianShape::CustomFeatures{
-{1, RDGeom::Point3D(-1.75978, 0.148897, 0), 1.0},
-{2, RDGeom::Point3D(1.7571, -0.120174, 0.1), 1.0}};
+topts.customFeatures = {{{1, RDGeom::Point3D(-1.75978, 0.148897, 0), 1.0,
+std::vector<unsigned int>{}},
+{2, RDGeom::Point3D(1.7571, -0.120174, 0.1), 1.0,
+std::vector<unsigned int>{}}}};
 topts.useColors = false;
 auto tshape = GaussianShape::ShapeInput(*m1, -1, topts);
-CHECK(tshape.getCoords().size() == 16);
+CHECK(tshape.getCoords().size() == 12);
 }
 // we'll swap the features on the second shape so that the alignment has to
 // be inverted
 GaussianShape::ShapeInputOptions opts3;
-opts3.customFeatures = GaussianShape::CustomFeatures{
-{2, RDGeom::Point3D(-1.75978, 0.148897, 0), 1.0},
-{1, RDGeom::Point3D(1.7571, -0.120174, 0.1), 1.0}};
+opts3.customFeatures = {{{2, RDGeom::Point3D(-1.75978, 0.148897, 0), 1.0,
+std::vector<unsigned int>{}},
+{1, RDGeom::Point3D(1.7571, -0.120174, 0.1), 1.0,
+std::vector<unsigned int>{}}}};
 auto m2 = ROMol(*m1);
 auto shape3 = GaussianShape::ShapeInput(m2, -1, opts3);
-CHECK(shape3.getCoords().size() == 24);
+CHECK(shape3.getCoords().size() == 18);
 GaussianShape::ShapeOverlayOptions overlayOpts;
 overlayOpts.optParam = 0.5;
 RDGeom::Transform3D xform;
 auto scores = AlignShape(shape2, shape3, &xform, overlayOpts);
-CHECK_THAT(scores[0], Catch::Matchers::WithinAbs(1.000, 0.001));
-CHECK_THAT(scores[1], Catch::Matchers::WithinAbs(1.000, 0.001));
-CHECK_THAT(scores[2], Catch::Matchers::WithinAbs(0.999, 0.001));
+CHECK_THAT(scores[0], Catch::Matchers::WithinAbs(0.999, 0.005));
+CHECK_THAT(scores[1], Catch::Matchers::WithinAbs(1.000, 0.005));
+CHECK_THAT(scores[2], Catch::Matchers::WithinAbs(0.998, 0.005));
 CHECK(shape3.getCoords()[0] > 0); // x coord of first atom
-CHECK(shape3.getCoords()[3 * 4] < 0); // x coord of fourth atom
+CHECK(shape3.getCoords()[3 * 3] < 0); // x coord of fourth atom
 auto conf = m2.getConformer(-1);
 MolTransforms::transformConformer(conf, xform);
@@ -618,25 +638,27 @@ TEST_CASE("custom feature points") {
 }
 SECTION("using molecules") {
 GaussianShape::ShapeInputOptions opts2;
-opts2.customFeatures = GaussianShape::CustomFeatures{
-{1, RDGeom::Point3D(-1.75978, 0.148897, 0), 1.0},
-{2, RDGeom::Point3D(1.7571, -0.120174, 0.1), 1.0}};
+opts2.customFeatures = {{{1, RDGeom::Point3D(-1.75978, 0.148897, 0), 1.0,
+std::vector<unsigned int>{}},
+{2, RDGeom::Point3D(1.7571, -0.120174, 0.1), 1.0,
+std::vector<unsigned int>{}}}};
 auto m2 = ROMol(*m1);
 // we'll swap the features on the second shape so that the alignment has to
 // be inverted
 GaussianShape::ShapeInputOptions opts3;
-opts3.customFeatures = GaussianShape::CustomFeatures{
-{2, RDGeom::Point3D(-1.75978, 0.148897, 0), 1.0},
-{1, RDGeom::Point3D(1.7571, -0.120174, 0.1), 1.0}};
+opts3.customFeatures = {{{2, RDGeom::Point3D(-1.75978, 0.148897, 0), 1.0,
+std::vector<unsigned int>{}},
+{1, RDGeom::Point3D(1.7571, -0.120174, 0.1), 1.0,
+std::vector<unsigned int>{}}}};
 GaussianShape::ShapeOverlayOptions overlayOpts;
 overlayOpts.optParam = 0.5;
 std::vector<float> matrix(12, 0.0);
 auto scores = AlignMolecule(*m1, m2, opts2, opts3, nullptr, overlayOpts);
-CHECK_THAT(scores[0], Catch::Matchers::WithinAbs(1.000, 0.001));
-CHECK_THAT(scores[1], Catch::Matchers::WithinAbs(1.000, 0.001));
-CHECK_THAT(scores[2], Catch::Matchers::WithinAbs(0.999, 0.001));
+CHECK_THAT(scores[0], Catch::Matchers::WithinAbs(0.999, 0.005));
+CHECK_THAT(scores[1], Catch::Matchers::WithinAbs(1.000, 0.005));
+CHECK_THAT(scores[2], Catch::Matchers::WithinAbs(0.998, 0.005));
 auto conf = m2.getConformer(-1);
 CHECK(conf.getAtomPos(0).x > 0);
 CHECK(conf.getAtomPos(3).x < 0);
@@ -651,7 +673,7 @@ TEST_CASE("Non-standard radii") {
 shapeOpts.allCarbonRadii = false;
 auto shape1 = GaussianShape::ShapeInput(*m1, -1, shapeOpts);
-CHECK(shape1.getCoords().size() == 28);
+CHECK(shape1.getCoords().size() == 21);
 CHECK_THAT(shape1.getShapeVolume(),
 Catch::Matchers::WithinAbs(387.396, 0.005));
 // mol1 with atom 4 with an N radius and a bigger Xe.
@@ -675,16 +697,22 @@ TEST_CASE("Shape subset") {
 REQUIRE(m1);
 GaussianShape::ShapeInputOptions shapeOpts;
 shapeOpts.atomSubset = std::vector<unsigned int>{0, 1, 2, 3, 10, 11};
-auto partShape = GaussianShape::ShapeInput(*m1, -1, shapeOpts);
-CHECK(partShape.getCoords().size() == 28);
+GaussianShape::ShapeInput partShape(*m1, -1, shapeOpts);
+CHECK(partShape.getSmiles() == "c1ccccc1");
+CHECK(partShape.getNumAtoms() == 6);
+CHECK(partShape.getNumFeatures() == 1);
+CHECK(partShape.getCoords().size() == 21);
 CHECK_THAT(partShape.getShapeVolume(),
 Catch::Matchers::WithinAbs(261.166, 0.005));
 CHECK_THAT(partShape.getColorVolume(),
 Catch::Matchers::WithinAbs(5.316, 0.005));
 shapeOpts.atomSubset.clear();
-auto wholeShape = GaussianShape::ShapeInput(*m1, -1, shapeOpts);
-CHECK(wholeShape.getCoords().size() == 56);
+GaussianShape::ShapeInput wholeShape(*m1, -1, shapeOpts);
+CHECK(wholeShape.getSmiles() == "c1ccc(-c2ccccc2)cc1");
+CHECK(wholeShape.getNumAtoms() == 12);
+CHECK(wholeShape.getNumFeatures() == 2);
+CHECK(wholeShape.getCoords().size() == 42);
 CHECK_THAT(wholeShape.getShapeVolume(),
 Catch::Matchers::WithinAbs(556.266, 0.005));
 CHECK_THAT(wholeShape.getColorVolume(),
@@ -765,10 +793,10 @@ TEST_CASE("Serialization") {
 GaussianShape::ShapeInput shape2(istr);
 CHECK(shape2.getCoords() == shape.getCoords());
-CHECK(shape2.getTypes() == shape.getTypes());
+CHECK(shape2.getFeatureTypes() == shape.getFeatureTypes());
 CHECK(shape2.getNumAtoms() == shape.getNumAtoms());
 CHECK(shape2.getNumFeatures() == shape.getNumFeatures());
-CHECK(shape2.getNormalized() == shape.getNormalized());
+CHECK(shape2.getIsNormalized() == shape.getIsNormalized());
 CHECK(shape2.calcExtremes() == shape.calcExtremes());
 CHECK(shape2.calcCanonicalRotation() == shape.calcCanonicalRotation());
 CHECK(shape2.calcCanonicalTranslation() == shape.calcCanonicalTranslation());
@@ -857,3 +885,171 @@ TEST_CASE("multithreaded") {
 CHECK(test == ref);
 }
 #endif
+std::unique_ptr<ROMol> loadConformers(const std::string &fileName) {
+std::ifstream ifs(fileName.c_str());
+std::string nextLine;
+std::unique_ptr<ROMol> retMol;
+while (!ifs.eof()) {
+std::getline(ifs, nextLine);
+if (!retMol) {
+retMol = v2::SmilesParse::MolFromSmiles(nextLine);
+} else {
+auto m = v2::SmilesParse::MolFromSmiles(nextLine);
+if (!m || !m->getNumConformers()) {
+continue;
+}
+retMol->addConformer(new Conformer(m->getConformer()), true);
+}
+}
+return retMol;
+}
+TEST_CASE("Multiple Conformers") {
+std::string dirName = getenv("RDBASE");
+dirName += "/Code/GraphMol/GaussianShape/test_data";
+auto esomeprazole = loadConformers(dirName + "/esomeprazole_multi.smi");
+CHECK(esomeprazole->getNumConformers() == 10);
+{
+GaussianShape::ShapeInput shape1(*esomeprazole);
+CHECK(shape1.getNumShapes() == 10);
+shape1.setActiveShape(0);
+auto firstVol = shape1.getShapeVolume() + shape1.getColorVolume();
+shape1.setActiveShape(9);
+auto lastVol = shape1.getShapeVolume() + shape1.getColorVolume();
+CHECK(firstVol > lastVol);
+GaussianShape::ShapeInputOptions shapeOptions;
+shapeOptions.shapePruneThreshold = 0.75;
+GaussianShape::ShapeInput shape2(*esomeprazole, -1, shapeOptions);
+CHECK(shape2.getNumShapes() == 6);
+shape2.setActiveShape(0);
+firstVol = shape2.getShapeVolume() + shape2.getColorVolume();
+shape2.setActiveShape(5);
+lastVol = shape2.getShapeVolume() + shape2.getColorVolume();
+CHECK(firstVol > lastVol);
+RDLog::InitLogs();
+{
+RDLog::CaptureLog capture{rdErrorLog};
+CHECK_THROWS(shape2.setActiveShape(8));
+}
+// The shapes in 2 are a subset of 1, so the maximum similarity
+// should be 1.0.
+auto maxSim = shape1.maxPossibleSimilarity(shape2);
+CHECK_THAT(maxSim, Catch::Matchers::WithinAbs(1.0, 0.001));
+unsigned int bestFitShape, bestRefShape;
+RDGeom::Transform3D bestXform;
+auto maxSimBF =
+shape1.bestSimilarity(shape2, bestRefShape, bestFitShape, bestXform);
+CHECK_THAT(maxSimBF[0], Catch::Matchers::WithinAbs(1.0, 0.001));
+CHECK(bestRefShape == 2);
+CHECK(bestFitShape == 0);
+}
+{
+// Single conformations
+GaussianShape::ShapeInput shape1(*esomeprazole, 1);
+CHECK(shape1.getNumShapes() == 1);
+GaussianShape::ShapeInput shape2(*esomeprazole, 8);
+CHECK(shape2.getNumShapes() == 1);
+// Demonstrate that the shapes are different:
+auto scores = GaussianShape::AlignShape(shape1, shape2);
+CHECK(scores[0] < 0.5);
+}
+{
+auto ranit = loadConformers(dirName + "/ranitidine_multi.smi");
+CHECK(ranit->getNumConformers() == 10);
+int best1, best2;
+std::vector<std::vector<double>> sims;
+RDGeom::Transform3D bestXform;
+GaussianShape::ScoreMoleculeAllConformers(
+*esomeprazole, *ranit, best1, best2, sims,
+GaussianShape::ShapeInputOptions(), GaussianShape::ShapeInputOptions(),
+GaussianShape::ShapeOverlayOptions(), &bestXform);
+auto bestRanit = ROMol(*ranit, false, best2);
+MolTransforms::transformConformer(bestRanit.getConformer(), bestXform);
+CHECK(best1 == 8);
+CHECK(best2 == 3);
+CHECK_THAT(sims[best1][best2], Catch::Matchers::WithinAbs(0.449, 0.005));
+auto bestRanitOvly =
+"CN/C(=C\[N+](=O)[O-])NCCSCc1ccc(CN(C)C)o1 |(6.68611,0.863597,0.97613;5.95586,-0.334349,0.711421;4.78328,-0.243678,-0.125471;4.84653,-0.396906,-1.43692;6.12872,-0.659266,-2.0185;6.4838,-0.151668,-3.10944;7.03627,-1.48835,-1.39831;3.56378,0.0228591,0.572233;2.49642,-0.947668,0.69362;1.39542,-0.282981,1.51715;0.820454,1.20438,0.676443;-0.0207479,0.813676,-0.869585;-1.28323,0.0840072,-0.51181;-1.47056,-1.27554,-0.36285;-2.81128,-1.45606,-0.0285064;-3.36164,-0.193138,0.00641597;-4.76099,0.220968,0.30873;-5.51751,0.268459,-0.927648;-6.86716,0.680208,-0.603162;-4.89123,1.12287,-1.90674;-2.42652,0.691609,-0.284809)|"_smiles;
+CHECK(checkMolsHaveRoughlySameCoords(bestRanit, *bestRanitOvly, 0.02));
+}
+}
+namespace {
+bool checkBondLengths(const ROMol &mol) {
+// DetermineBonds::connectivityVdw uses a covalent factor of 1.3.
+static constexpr double radFactor = 1.3;
+const auto conf = mol.getConformer();
+for (const auto bond : mol.bonds()) {
+if (!bond->getBeginAtom()->getAtomicNum() ||
+!bond->getEndAtom()->getAtomicNum()) {
+continue;
+}
+auto bondlen = MolTransforms::getBondLength(conf, bond->getBeginAtomIdx(),
+bond->getEndAtomIdx());
+auto rad1 = PeriodicTable::getTable()->getRcovalent(
+bond->getBeginAtom()->getAtomicNum());
+auto rad2 = PeriodicTable::getTable()->getRcovalent(
+bond->getEndAtom()->getAtomicNum());
+if (bondlen > radFactor * (rad1 + rad2)) {
+std::cout << bond->getIdx() << " : " << bond->getBeginAtomIdx() << " -> "
+<< bond->getEndAtomIdx() << " len = " << bondlen << " vs "
+<< radFactor * (rad1 + rad2) << std::endl;
+return false;
+}
+}
+return true;
+}
+} // namespace
+TEST_CASE("Different atom orders for ShapeInput") {
+// Make sure that different atom orders always produce a ShapeInput that gives
+// a correct molecule from shapeToMol. This wasn't always the case.
+auto fullMol =
+"C[C@@H]1C[C@H](NC(=O)NC2COC2)CN(C(=O)c2nccnc2F)C1 |(-0.346914,-0.986206,-4.28744;-0.686863,-0.0357247,-3.13265;0.429505,-0.1946,-2.14134;0.21099,0.659676,-0.907145;1.06526,0.0812473,0.104663;2.29297,0.75201,0.373712;2.5837,1.80373,-0.246936;3.23325,0.27478,1.33777;4.47647,0.99197,1.57602;4.94347,1.01294,2.99117;5.59284,-0.21541,2.82613;5.71049,0.107583,1.47157;-1.19052,0.721766,-0.417623;-2.14086,-0.0964663,-1.12904;-3.25312,-0.745252,-0.540367;-3.95877,-1.43507,-1.38825;-3.7227,-0.763533,0.803759;-4.9481,-0.204581,1.05395;-5.52492,-0.18107,2.24654;-4.86554,-0.748644,3.30451;-3.63585,-1.32731,3.13734;-3.08234,-1.32608,1.89052;-1.84839,-1.9059,1.69441;-2.02702,-0.329978,-2.57998),wD:1.0,wU:3.3|"_smiles;
+REQUIRE(fullMol);
+CHECK(checkBondLengths(*fullMol));
+std::vector<unsigned int> atomOrder(fullMol->getNumAtoms());
+std::iota(atomOrder.begin(), atomOrder.end(), 0);
+auto rng = std::default_random_engine{};
+for (unsigned int i = 0; i < 100; ++i) {
+std::ranges::shuffle(atomOrder, rng);
+std::unique_ptr<ROMol> renumMol(MolOps::renumberAtoms(*fullMol, atomOrder));
+CHECK(checkBondLengths(*renumMol));
+GaussianShape::ShapeInput shape(*renumMol);
+auto outMol = shape.shapeToMol(false);
+CHECK(checkBondLengths(*outMol));
+}
+// And the same for the shape from a subset.
+GaussianShape::ShapeInputOptions shapeOptions;
+auto bitToGo = "c1nccnc1F"_smarts;
+REQUIRE(bitToGo);
+for (unsigned int i = 0; i < 100; ++i) {
+std::ranges::shuffle(atomOrder, rng);
+std::unique_ptr<ROMol> renumMol(MolOps::renumberAtoms(*fullMol, atomOrder));
+CHECK(checkBondLengths(*renumMol));
+auto match = SubstructMatch(*renumMol, *bitToGo);
+boost::dynamic_bitset<> toGo(fullMol->getNumAtoms());
+for (auto mp : match.front()) {
+toGo[mp.second] = true;
+}
+shapeOptions.atomSubset.clear();
+shapeOptions.atomSubset.reserve(renumMol->getNumAtoms());
+for (unsigned int j = 0; j < renumMol->getNumAtoms(); ++j) {
+if (!toGo[j]) {
+shapeOptions.atomSubset.push_back(j);
+}
+}
+GaussianShape::ShapeInput shape(*renumMol, -1, shapeOptions);
+auto outMol = shape.shapeToMol(false);
+CHECK(MolToSmiles(*outMol) == "C[C@@H]1C[C@H](NC(=O)NC2COC2)CN(C=O)C1");
+CHECK(checkBondLengths(*outMol));
+}
+}


@@ -0,0 +1,10 @@
COc1ccc2[n-]c([S@@+]([O-])Cc3ncc(C)c(OC)c3C)nc2c1 |(7.76279,0.35859,-1.79983;7.53284,0.543911,-0.412224;6.23406,0.342998,0.059063;6.04426,0.531464,1.4141;4.8049,0.361309,1.99058;3.74498,-0.00802913,1.16245;2.44177,-0.247558,1.40977;1.82578,-0.57871,0.238989;0.133254,-0.97614,0.0466529;-0.0292098,-2.63143,0.274352;-0.906512,-0.0893407,1.18338;-2.33278,-0.453729,0.991447;-2.98618,-1.27889,1.8428;-4.26215,-1.55812,1.60297;-4.97156,-1.04913,0.520132;-6.4143,-1.41646,0.322105;-4.31709,-0.212654,-0.347233;-4.97671,0.320991,-1.43986;-5.66291,1.53654,-1.38935;-2.97364,0.0866439,-0.101326;-2.3116,0.99437,-1.06769;2.76165,-0.539624,-0.723264;3.95693,-0.188733,-0.174452;5.1937,-0.0252622,-0.787456),wU:8.8|
COc1ccc2[n-]c([S@@+]([O-])Cc3ncc(C)c(OC)c3C)nc2c1 |(7.35181,-0.955914,-0.621074;6.28507,-1.86782,-0.521315;4.97699,-1.43464,-0.357253;3.96771,-2.3979,-0.265735;2.64159,-2.00456,-0.100574;2.33901,-0.644613,-0.0284915;1.18055,0.0401624,0.124973;1.44727,1.35906,0.12956;0.358085,2.71465,0.29292;-0.179464,3.19596,-1.19798;-0.887532,2.39995,1.47656;-1.6807,1.21767,1.07535;-1.4337,0.00597518,1.65106;-2.07771,-1.11213,1.35103;-3.07939,-1.12656,0.400029;-3.80174,-2.38461,0.0642386;-3.3657,0.0780431,-0.2108;-4.37059,0.0737521,-1.16773;-5.71993,0.301119,-0.890862;-2.67373,1.235,0.124516;-3.05619,2.48943,-0.565999;2.7917,1.53354,-0.022281;3.36466,0.302456,-0.122353;4.68788,-0.0836175,-0.287005),wD:8.8|
COc1ccc2[n-]c([S@@+]([O-])Cc3ncc(C)c(OC)c3C)nc2c1 |(7.39497,-0.361838,1.79104;7.3996,0.046751,0.434628;6.16859,0.195626,-0.193925;6.22741,0.600845,-1.53194;5.05242,0.78061,-2.25681;3.84443,0.544841,-1.60638;2.60275,0.643232,-2.07132;1.72763,0.316321,-1.08407;-0.0285132,0.289233,-1.1919;-0.557131,-1.19417,-1.6913;-0.690432,0.811836,0.370378;-2.1558,0.824094,0.387096;-2.79602,2.01738,0.247135;-4.13566,2.13418,0.2371;-4.91329,0.992953,0.374886;-6.38623,1.18973,0.354436;-4.31469,-0.240381,0.519929;-5.13293,-1.35276,0.654232;-5.56482,-2.08157,-0.50572;-2.937,-0.303103,0.523686;-2.30517,-1.63349,0.693218;2.47757,0.0119843,0.00933457;3.78711,0.144155,-0.282888;4.96473,-0.0288726,0.41984),wU:8.8|
COc1ccc2[n-]c([S@@+]([O-])Cc3ncc(C)c(OC)c3C)nc2c1 |(-7.30663,1.6042,1.34604;-7.38184,0.681969,0.264301;-6.19619,0.184262,-0.25949;-6.26612,-0.719726,-1.32;-5.12331,-1.25807,-1.89342;-3.86967,-0.884051,-1.39452;-2.61997,-1.21438,-1.71967;-1.73995,-0.559854,-0.911137;-0.00638954,-0.731935,-1.0281;0.63296,0.396008,-2.06679;0.668118,-0.662434,0.610029;2.15631,-0.801132,0.61989;2.68042,-2.01362,0.952857;4.01852,-2.23748,0.973572;4.82071,-1.16913,0.638722;6.30373,-1.40378,0.655719;4.3324,0.0739359,0.296125;5.15417,1.14325,-0.0394303;5.58344,1.42729,-1.35934;2.96643,0.261805,0.286896;2.43545,1.60126,-0.0698277;-2.46155,0.205977,-0.0508191;-3.78127,0.0206498,-0.332244;-4.95462,0.538939,0.216195),wD:8.8|
COc1ccc2[n-]c([S@@+]([O-])Cc3ncc(C)c(OC)c3C)nc2c1 |(8.44478,-0.37723,0.596492;7.2654,-0.580275,-0.155497;6.06924,-0.0698148,0.322764;5.98988,0.641429,1.52529;4.74765,1.10257,1.90604;3.61735,0.884103,1.14923;2.32278,1.22443,1.30991;1.60231,0.756995,0.264138;-0.141546,0.980355,0.0855554;-0.385,2.46285,-0.713367;-0.797062,-0.258103,-1.0128;-2.26511,-0.00959642,-1.11262;-2.76396,0.603626,-2.22114;-4.09186,0.855836,-2.35789;-4.96252,0.478987,-1.34221;-6.42001,0.754746,-1.4823;-4.43753,-0.1489,-0.2118;-5.29505,-0.543217,0.833838;-5.80331,-1.87705,0.677847;-3.09,-0.405085,-0.0718154;-2.53726,-1.08102,1.15415;2.4354,0.107418,-0.584315;3.69161,0.173798,-0.0539904;4.92968,-0.290211,-0.440767),wD:8.8|
COc1ccc2[n-]c([S@@+]([O-])Cc3ncc(C)c(OC)c3C)nc2c1 |(8.46379,-1.05094,0.890213;7.21183,-1.16805,1.50804;6.0787,-0.685299,0.859082;6.1466,-0.0866102,-0.385617;4.94662,0.361921,-0.943967;3.73748,0.221501,-0.29541;2.46019,0.563658,-0.615352;1.66664,0.180758,0.401873;-0.0660968,0.349359,0.58095;-0.298354,1.79944,1.44456;-0.829807,0.645701,-1.02424;-2.29755,0.784757,-0.869117;-2.85558,2.00151,-0.90078;-4.17795,2.21833,-0.761503;-5.03128,1.15041,-0.574816;-6.49902,1.42805,-0.422147;-4.51975,-0.130216,-0.532847;-5.38723,-1.1689,-0.347656;-5.79751,-1.73049,0.864803;-3.13789,-0.292823,-0.683435;-2.59445,-1.64958,-0.640581;2.42904,-0.403031,1.36868;3.70293,-0.378955,0.941823;4.8803,-0.83812,1.52914),wU:8.8|
COc1ccc2[n-]c([S@@+]([O-])Cc3ncc(C)c(OC)c3C)nc2c1 |(8.57217,-0.16887,0.287836;7.54849,0.755066,-0.0225839;6.22613,0.367838,0.0319823;5.80551,-0.900006,0.378794;4.47451,-1.26864,0.427659;3.55885,-0.274222,0.102235;2.22298,-0.346305,0.0660394;1.72821,0.858366,-0.301653;0.0244018,1.24074,-0.481864;-0.513761,0.965755,-2.01621;-0.823922,0.326633,0.770099;-2.28146,0.521574,0.776842;-2.852,1.34627,1.68268;-4.18165,1.52864,1.67914;-5.03924,0.918868,0.792467;-6.51338,1.13443,0.805542;-4.45665,0.0644511,-0.147024;-5.25976,-0.574254,-1.05911;-5.86409,-1.83108,-0.886879;-3.07479,-0.127986,-0.146582;-2.5466,-1.05993,-1.16973;2.79155,1.68491,-0.494271;3.9392,1.0023,-0.248956;5.28473,1.34748,-0.29046),wD:8.8|
COc1ccc2[n-]c([S@@+]([O-])Cc3ncc(C)c(OC)c3C)nc2c1 |(8.41978,-0.586137,-0.366336;7.07447,-0.841345,-0.590299;6.01744,-0.094299,-0.129259;6.11067,1.03795,0.63536;4.99466,1.7528,1.07335;3.72348,1.31218,0.727873;2.4669,1.71094,0.948745;1.59004,0.861609,0.344851;-0.130785,0.878912,0.298238;-0.850941,1.7115,-0.894062;-0.94071,-0.469937,1.00684;-2.40924,-0.397152,0.931047;-3.12589,-0.0264934,2.03463;-4.45996,0.0712964,2.03668;-5.11986,-0.223005,0.85046;-6.60869,-0.111396,0.852147;-4.42617,-0.604729,-0.296353;-5.18139,-0.8757,-1.42621;-5.44703,0.22881,-2.29872;-3.05163,-0.692554,-0.253126;-2.34068,-1.1114,-1.4897;2.34025,-0.0997441,-0.273803;3.65464,0.171853,-0.0411892;4.74278,-0.534631,-0.473867),wD:8.8|
COc1ccc2[n-]c([S@@+]([O-])Cc3ncc(C)c(OC)c3C)nc2c1 |(8.00568,1.10139,1.32099;7.71643,0.0695959,0.369002;6.40249,-0.0593623,-0.0257689;6.05125,-1.04299,-0.949608;4.74755,-1.20529,-1.37091;3.73517,-0.397168,-0.892691;2.40822,-0.309169,-1.10318;1.90591,0.702705,-0.339193;0.250339,1.2396,-0.233509;-0.14194,2.42306,-1.32618;-0.812932,-0.148039,-0.320382;-2.25744,0.246731,-0.234733;-2.53517,1.55862,-0.107667;-3.81389,2.00104,-0.0234843;-4.84553,1.09541,-0.0685337;-6.24555,1.61513,0.0267932;-4.55871,-0.243502,-0.198336;-5.61172,-1.17856,-0.245268;-5.96922,-1.65576,1.06226;-3.2668,-0.699427,-0.284188;-2.9003,-2.12263,-0.424321;2.94246,1.24778,0.354434;4.0764,0.580173,0.0235633;5.3812,0.750509,0.452031),wD:8.8|
COc1ccc2[n-]c([S@@+]([O-])Cc3ncc(C)c(OC)c3C)nc2c1 |(-7.28487,-0.650857,0.253721;-6.25841,-1.47535,-0.269635;-4.95748,-1.00727,-0.181334;-3.9588,-1.81971,-0.695916;-2.6295,-1.45103,-0.660178;-2.33012,-0.229377,-0.0863033;-1.12506,0.388295,0.0866016;-1.32515,1.57843,0.699993;-0.0722802,2.7185,1.12613;0.553365,2.47631,2.6509;1.15079,2.80443,-0.133088;1.76505,1.49546,-0.330039;1.32131,0.738378,-1.37065;1.80666,-0.485482,-1.6251;2.78725,-1.08227,-0.866056;3.30899,-2.4386,-1.16658;3.2678,-0.33483,0.211092;4.24421,-0.911314,0.979308;5.60027,-0.730433,0.664585;2.74585,0.93754,0.454397;3.31697,1.67538,1.59436;-2.65389,1.69039,0.903275;-3.3007,0.608449,0.437767;-4.61869,0.204602,0.383371),wU:8.8|


@@ -0,0 +1,11 @@
CN/C(=C\[N+](=O)[O-])NCCSCc1ccc(CN(C)C)o1 |(-5.75638,1.64389,0.339481;-5.29962,0.375706,0.88353;-4.80142,-0.679132,0.0704169;-5.57222,-1.69807,-0.281357;-6.9327,-1.8123,0.120372;-7.81675,-1.00769,-0.233715;-7.24154,-2.88261,0.943388;-3.45634,-0.711221,-0.415137;-2.39705,-1.26208,0.406128;-1.10115,-1.14534,-0.399887;-0.818204,0.610732,-0.754202;0.728638,0.79049,-1.71784;1.84691,0.385321,-0.84412;2.29848,-0.937124,-0.784727;3.33995,-0.967307,0.117124;3.47748,0.349381,0.570429;4.51293,0.716493,1.56801;5.77053,1.1047,0.959975;5.61644,2.25078,0.091636;6.32536,-0.0256005,0.224903;2.58283,1.10456,-0.0195067)|
CN/C(=C\[N+](=O)[O-])NCCSCc1ccc(CN(C)C)o1 |(4.51284,1.33313,1.92564;4.20597,0.0825839,1.26359;4.40781,-0.136005,-0.12875;5.47643,0.281129,-0.783021;6.5463,0.994651,-0.175899;6.87972,2.13417,-0.605961;7.26531,0.496946,0.89138;3.42665,-0.837789,-0.885599;2.2844,-0.187578,-1.49902;1.37904,0.283177,-0.367628;0.884921,-1.18597,0.57728;-0.148781,-2.22382,-0.491889;-1.40445,-1.4809,-0.764455;-1.69942,-0.597892,-1.78355;-3.00423,-0.190622,-1.57561;-3.44127,-0.841117,-0.445475;-4.76559,-0.762903,0.215971;-4.8438,0.270869,1.20702;-4.58844,1.54655,0.586489;-6.15485,0.235814,1.80606;-2.46731,-1.60819,0.0239325)|
CN/C(=C\[N+](=O)[O-])NCCSCc1ccc(CN(C)C)o1 |(5.74542,-2.50628,0.674713;5.54601,-1.08718,0.465206;4.34229,-0.526556,-0.0371992;4.41214,0.331472,-1.04291;5.70223,0.667491,-1.59251;6.16961,1.7683,-1.17081;6.35936,-0.124866,-2.4808;3.05197,-0.861361,0.51642;2.25408,0.2543,0.966464;0.92643,-0.113942,1.53823;0.122211,1.46059,2.02576;-0.217723,2.39156,0.482409;-1.25818,1.5762,-0.240872;-1.0572,0.567077,-1.16029;-2.34272,0.157568,-1.50991;-3.22049,0.926832,-0.797014;-4.70742,0.912544,-0.784737;-5.26475,0.0294569,0.207516;-6.72363,0.159602,0.0653491;-4.95826,-1.36274,0.000804639;-2.53615,1.77862,-0.0355077)|
CN/C(=C\[N+](=O)[O-])NCCSCc1ccc(CN(C)C)o1 |(6.54305,-0.909422,0.625898;5.79299,-0.448044,-0.497913;4.56789,0.272778,-0.24673;4.54369,1.5885,-0.121558;5.78371,2.29471,-0.242738;6.06745,3.27599,0.485597;6.72763,1.92262,-1.17346;3.39811,-0.543506,-0.145087;2.33798,-0.528492,-1.13089;1.29607,-1.54317,-0.664981;0.671558,-1.07204,0.959097;-0.271399,0.463124,0.891462;-1.5098,0.184306,0.0896425;-1.69127,0.311447,-1.27291;-3.0075,-0.064632,-1.5337;-3.55023,-0.399187,-0.311977;-4.92515,-0.872382,0.0147464;-5.76169,0.274447,0.311718;-7.08556,-0.216636,0.63203;-5.19882,1.09171,1.35868;-2.63357,-0.239186,0.624061)|
CN/C(=C\[N+](=O)[O-])NCCSCc1ccc(CN(C)C)o1 |(6.82876,0.0744723,-0.0859065;5.5418,-0.515567,0.206674;4.49281,0.339321,0.690171;4.54824,0.896363,1.8835;5.67963,0.636371,2.70157;6.00568,-0.545152,2.98036;6.43414,1.69833,3.19909;3.40105,0.535631,-0.21547;2.38885,-0.479244,-0.445064;1.41646,0.0958672,-1.47756;0.67902,1.60017,-0.820159;-0.370923,1.1148,0.594409;-1.48403,0.304418,0.0353544;-1.51226,-1.05375,-0.12346;-2.72978,-1.36856,-0.674528;-3.39981,-0.157349,-0.828012;-4.76718,-0.0608357,-1.39908;-5.72342,-0.159378,-0.345975;-5.66029,0.844452,0.661487;-7.04209,-0.556759,-0.72793;-2.61715,0.814468,-0.390408)|
CN/C(=C\[N+](=O)[O-])NCCSCc1ccc(CN(C)C)o1 |(5.79708,-1.3169,0.129351;5.60097,0.0902381,0.384918;5.06872,1.00488,-0.561195;5.65973,2.18362,-0.656847;6.77689,2.39881,0.214227;7.6581,1.51897,0.30947;6.87766,3.57274,0.94428;3.96151,0.735912,-1.39388;2.6309,0.609107,-0.841933;2.46076,-0.606132,0.00601144;0.761367,-0.642772,0.636231;-0.373712,-0.894168,-0.764989;-1.75807,-0.931712,-0.218857;-2.34884,-2.10079,0.228566;-3.62121,-1.791,0.668466;-3.7565,-0.422233,0.467626;-4.91771,0.45116,0.760659;-5.81205,0.549851,-0.344116;-6.36906,-0.756763,-0.627795;-6.88915,1.46946,-0.0314991;-2.63173,0.0186918,-0.0538856)|
CN/C(=C\[N+](=O)[O-])NCCSCc1ccc(CN(C)C)o1 |(6.45571,-0.234524,1.40588;5.10302,0.222237,1.05949;4.24146,-0.640116,0.291936;4.48663,-0.786632,-1.00009;5.59921,-0.0822584,-1.58487;6.48707,-0.711357,-2.23303;5.72065,1.29095,-1.451;3.16985,-1.28945,0.963102;2.02393,-1.8224,0.210888;1.30759,-0.585161,-0.327492;0.802714,0.420145,1.07151;-0.051951,1.93474,0.565937;-1.35779,1.61859,-0.0663397;-1.63803,1.36901,-1.39512;-2.99796,1.13556,-1.43818;-3.49986,1.24694,-0.158583;-4.89005,1.09146,0.335502;-5.14776,-0.293716,0.686027;-5.00291,-1.17411,-0.445312;-6.46152,-0.462733,1.26963;-2.47721,1.53484,0.611095)|
CN/C(=C\[N+](=O)[O-])NCCSCc1ccc(CN(C)C)o1 |(-5.13876,2.14859,-0.926944;-4.70605,0.801038,-1.22147;-4.28486,-0.0284512,-0.149175;-4.91077,-1.19646,-0.00799895;-5.95924,-1.60685,-0.888934;-5.90898,-2.66305,-1.55247;-7.04721,-0.742153,-0.956174;-3.23858,0.34065,0.754838;-1.96373,0.856627,0.317836;-1.13473,1.10377,1.58481;-1.00314,-0.52844,2.39876;-0.0163036,-1.59329,1.3211;1.37446,-1.05826,1.28283;2.33362,-1.49465,2.20356;3.47155,-0.793657,1.86218;3.16111,0.015246,0.778694;4.1336,0.943889,0.0973721;4.74116,0.229556,-0.967234;5.72992,0.908513,-1.7298;4.90116,-1.17253,-0.800592;1.90597,-0.176361,0.471236)|
CN/C(=C\[N+](=O)[O-])NCCSCc1ccc(CN(C)C)o1 |(-6.15067,-0.546919,-1.70149;-5.50653,0.134178,-0.588546;-4.28228,-0.367703,-0.0288948;-4.25726,-1.02344,1.10507;-5.48868,-1.23308,1.78495;-5.55489,-1.49545,3.00343;-6.68608,-1.15302,1.11035;-3.10343,-0.0951529,-0.804491;-2.94068,1.25241,-1.2853;-1.63874,1.40897,-1.97806;-0.231909,1.17666,-0.903468;-0.0522882,-0.464912,-0.214144;1.27799,-0.513995,0.523292;1.51434,-0.211783,1.83712;2.8769,-0.412909,2.01854;3.38105,-0.824744,0.801954;4.76355,-1.17807,0.418005;5.5174,-0.0610087,-0.058567;5.65997,1.01229,0.896743;6.83932,-0.537069,-0.424422;2.37584,-0.867898,-0.0605732)|
CN/C(=C\[N+](=O)[O-])NCCSCc1ccc(CN(C)C)o1 |(-5.92487,-1.32841,-0.0091739;-5.35979,0.0191155,-0.00931145;-3.98455,0.186799,0.351258;-3.6065,1.10461,1.20867;-4.62374,1.93945,1.7759;-5.7576,2.00654,1.28744;-4.36155,2.70576,2.90398;-3.06275,-0.713844,-0.289638;-2.97888,-0.660361,-1.70961;-1.9068,-1.58599,-2.20196;-0.289943,-1.11464,-1.65293;0.00466957,-1.19041,0.0941783;1.45989,-0.916719,0.426239;2.41322,-1.93765,0.492299;3.5977,-1.31267,0.813084;3.31997,0.0448138,0.927475;4.33142,1.09321,1.26657;4.87703,1.55407,0.0276043;5.51859,0.559194,-0.767917;5.52328,2.82356,0.071238;2.03302,0.213064,0.685665)|

@@ -0,0 +1,91 @@
2244
RDKit 3D
13 13 0 0 0 0 0 0 0 0999 V2000
-17.2334 -5.4951 2.3053 O 0 0 0 0 0 0 0 0 0 0 0 0
-14.0938 -3.8235 0.2824 O 0 0 0 0 0 0 0 0 0 0 0 0
-15.9621 -3.2015 1.3992 O 0 0 0 0 0 0 0 0 0 0 0 0
-17.3818 -6.7019 0.3049 O 0 0 0 0 0 0 0 0 0 0 0 0
-15.9632 -5.9632 2.4646 C 0 0 0 0 0 0 0 0 0 0 0 0
-14.8724 -5.2643 1.9474 C 0 0 0 0 0 0 0 0 0 0 0 0
-15.7617 -7.1573 3.1570 C 0 0 0 0 0 0 0 0 0 0 0 0
-13.5802 -5.7595 2.1226 C 0 0 0 0 0 0 0 0 0 0 0 0
-14.4695 -7.6525 3.3323 C 0 0 0 0 0 0 0 0 0 0 0 0
-13.3789 -6.9536 2.8151 C 0 0 0 0 0 0 0 0 0 0 0 0
-15.0605 -4.0152 1.2226 C 0 0 0 0 0 0 0 0 0 0 0 0
-17.8696 -5.9525 1.1405 C 0 0 0 0 0 0 0 0 0 0 0 0
-19.2553 -5.3841 1.0571 C 0 0 0 0 0 0 0 0 0 0 0 0
1 5 1 0
1 12 1 0
2 11 1 0
3 11 2 0
4 12 2 0
5 6 2 0
5 7 1 0
6 8 1 0
6 11 1 0
7 9 2 0
8 10 2 0
9 10 1 0
12 13 1 0
M END
> <PUBCHEM_PHARMACOPHORE_FEATURES>
5
1 2 acceptor
1 3 acceptor
1 4 acceptor
3 2 3 11 anion
6 5 6 7 8 9 10 rings
$$$$
166295140
RDKit 3D
16 17 0 0 1 0 0 0 0 0999 V2000
-2.8293 0.2150 1.4768 F 0 0 0 0 0 0 0 0 0 0 0 0
-4.0626 0.7806 -0.2297 F 0 0 0 0 0 0 0 0 0 0 0 0
0.1990 -2.7308 -0.6547 O 0 0 0 0 0 0 0 0 0 0 0 0
2.1655 0.9098 0.9439 O 0 0 0 0 0 0 0 0 0 0 0 0
0.2386 -1.4812 1.2511 O 0 0 0 0 0 0 0 0 0 0 0 0
-0.5537 0.6911 -0.1687 N 0 0 0 0 0 0 0 0 0 0 0 0
3.9396 -0.0288 -0.0987 N 0 0 0 0 0 0 0 0 0 0 0 0
-0.8484 -0.6331 -0.7303 C 0 0 2 0 0 0 0 0 0 0 0 0
-2.3589 -0.8260 -0.5838 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.8195 0.4211 0.1395 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.7941 1.4679 -0.2225 C 0 0 0 0 0 0 0 0 0 0 0 0
0.5493 1.3657 -0.8388 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.0875 -1.6304 0.0829 C 0 0 0 0 0 0 0 0 0 0 0 0
1.8856 0.8393 -0.3872 C 0 0 0 0 0 0 0 0 0 0 0 0
2.9644 0.2690 -1.0161 C 0 0 0 0 0 0 0 0 0 0 0 0
3.4121 0.3708 1.0364 C 0 0 0 0 0 0 0 0 0 0 0 0
1 10 1 0
2 10 1 0
3 13 1 0
4 14 1 0
4 16 1 0
5 13 2 0
6 8 1 0
6 11 1 0
6 12 1 0
7 15 1 0
7 16 2 0
8 9 1 0
8 13 1 1
9 10 1 0
10 11 1 0
12 14 1 0
14 15 2 0
M END
> <PUBCHEM_PHARMACOPHORE_FEATURES>
6
1 3 acceptor
1 5 acceptor
1 6 cation
3 3 5 13 anion
5 4 7 14 15 16 rings
5 6 8 9 10 11 rings
$$$$