rdkit/Code/GraphMol/GaussianShape/GaussianShape.h
David Cosgrove 9f551aedbe Multi conf gaussian shape (#9265)
* First import of GaussianShape.

* Tidying.

* Custom features.

* Optimise.

* Optimise.

* Return 3 scores rather than 2 including combo score.

* Rename useFeatures to useColors.

* Python wrappers.

* Python tests.

* Take out big test.

* Add new start mode, as PubChem does it.

* Doh!

* Fix MolTransforms eigenvalue return.

* Two cycle optimisation, mostly working.

* Take out bestSoFar score from SCA.

* Take out DTYPE.

* Tidy out redundant variables.

* Optimisation in 2 parts.

* More fiddling in pursuit of speed.

* Update Python wrapper.

* Tweak.

* Atom subsets and different radii.

* Fix test.

* Revert pubchem_shape's test.cpp.

* Serialize ShapeInput.

* Trigger build

* Remove pointers to std::arrays in ShapeInput.

* ShapeInput virtual d'tor.

* Precondition - ShapeInput needs a molecule with at least 1 conformer.

* Rename ShapeInput::d_centroid to ShapeInput::d_canonTrans.

* Fix normalization bugs.

* Select start mode using moments of inertia rather than eigenvalues of canonical transformation.

* Include color features in moments of inertia.

* Smidge faster.

* Tversky similarity.

* Tidy tests.

* Tests working on Linux.

* Revert force of right handed axes in MolTransforms::computePrincipalAxesAndMomentsFromGyrationMatrix replacing with a comment in the code.

* Response to review.

* Sneaky allCarbon bug.

* add multithreaded test

* Response to review.

* Doh! Don't recalculate normalization after every transformation.

* Re-instate d_normalizationOK.

* Re-name functions for fetching canonical transformations.

* Separate alpha from coords.

* MultiConf works with single conf extraction.

* Extract all conformations.
Max and best similarities.

* Renames d_currConformer to d_activeShape.

* Update shapeToMol.

* Update shapeToMol.

* Changes from synthon shape searching.

* Fix normalization of multiple confs.

* Update Python wrappers.

* Fix shape merge.

* Improve bestSimilarity.

* Fix python wrapper.

* Pull in changes from SynthonShapeSearch:
make pruneShapes public.
function to negate Alpha values.

* clang-tidy suggestions.

* clang-tidy suggestions.

* Bug in quaternion gradients - we now have only 3 coordinates.

* Tidy tests.

* Mac result slightly different.

* Multi conformer molecule alignment.

* Optionally return raw overlap volumes in score functions.

* Python wrappers for raw overlap volumes.

* Update Python wrapper ShapeInputOptions.

* Tidy for PR.

* Extra include file.

* Extra library

* Tidy forward declarations.

* Don't prune if threshold < 0.0.

* Windows exporty thing.

* Check SMILES on merge of ShapeInputs.

* PRECONDITION of SMILES on merge of ShapeInputs.

* Response to review - rename some functions.

* change how overlapVols is passed
add a test for it

* API suggestions

* Response to review.

* Remove debugging writes.

* Fix Python wrappers.

---------

Co-authored-by: David Cosgrove <david@cozchemix.co.uk>
Co-authored-by: greg landrum <greg.landrum@gmail.com>
2026-06-03 06:09:09 +02:00


//
// Copyright (C) 2026 David Cosgrove and other RDKit contributors
//
// @@ All Rights Reserved @@
// This file is part of the RDKit.
// The contents are covered by the terms of the BSD license
// which is included in the file license.txt, found at the root
// of the RDKit source tree.
//
// Original author: David Cosgrove (CozChemIx Limited)
//
// This is the interface for the functions to perform shape-based molecule
// alignments and scoring. It is experimental code and the API and/or
// results may change in future releases.
#ifndef RDKIT_GAUSSIANSHAPE_GUARD
#define RDKIT_GAUSSIANSHAPE_GUARD
#include <array>
#include <utility>
#include <vector>
#include <RDGeneral/export.h>
#include <Geometry/Transform3D.h>
#include <GraphMol/GaussianShape/ShapeInput.h>
#include <GraphMol/GaussianShape/ShapeOverlayOptions.h>
namespace RDKit {
class ROMol;
class Conformer;
namespace GaussianShape {
//! Align a shape onto a reference shape.
/*!
\param refShape the reference shape
\param fitShape the shape to align
\param xform if passed in as non-null, will be populated with the
transformation matrix that aligns fit onto ref.
\param overlayOpts options for the overlay
\return an array of three scores: the combination score of the shape Tversky
value and the color Tversky value (zero if colors are not used), together with
the two individual values. If color features are used, they default to the
RDKit pharmacophore types.
*/
RDKIT_GAUSSIANSHAPE_EXPORT std::array<double, 3> AlignShape(
const ShapeInput &refShape, ShapeInput &fitShape,
RDGeom::Transform3D *xform = nullptr,
const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions());
//! Align a molecule to a reference shape
/*!
\param refShape the reference shape
\param fit the molecule to align
\param fitOpts the options for creating the fit shape
\param xform if passed in as non-null, will be populated with the
transformation matrix that aligns fit onto ref.
\param overlayOpts options for setting up and running the overlay
\param fitConfId (optional) the conformer to use for the fit
molecule
\return an array of three scores: the combination score of the shape Tversky
value and the color Tversky value (zero if colors are not used), together with
the two individual values. If color features are used, they default to the
RDKit pharmacophore types.
*/
RDKIT_GAUSSIANSHAPE_EXPORT std::array<double, 3> AlignMolecule(
const ShapeInput &refShape, ROMol &fit,
const ShapeInputOptions &fitOpts = ShapeInputOptions(),
RDGeom::Transform3D *xform = nullptr,
const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions(),
int fitConfId = -1);
//! Align a molecule to a reference molecule
/*!
\param ref the reference molecule
\param fit the molecule to align
\param refOpts the options for creating the ref shape
\param fitOpts the options for creating the fit shape
\param xform if passed in as non-null, will be populated with the
transformation matrix that aligns fit onto ref.
\param overlayOpts options for setting up and running the overlay
\param refConfId (optional) the conformer to use for the reference
molecule
\param fitConfId (optional) the conformer to use for the fit
molecule
\return an array of three scores: the combination score of the shape Tversky
value and the color Tversky value (zero if colors are not used), together with
the two individual values. If color features are used, they default to the
RDKit pharmacophore types.
*/
RDKIT_GAUSSIANSHAPE_EXPORT std::array<double, 3> AlignMolecule(
const ROMol &ref, ROMol &fit,
const ShapeInputOptions &refOpts = ShapeInputOptions(),
const ShapeInputOptions &fitOpts = ShapeInputOptions(),
RDGeom::Transform3D *xform = nullptr,
const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions(),
int refConfId = -1, int fitConfId = -1);
//! Calculate scores for the alignment of all conformers of one molecule
//! onto another. Returns a matrix of the combination scores, the conformer
//! numbers of the two molecules that gave the best overlay and the
//! transformation matrix for that overlay if requested. The molecules
//! themselves are not altered. combScores[0][1] is the score of aligning
//! fit conformer 1 onto ref conformer 0.
/*!
\param ref the reference molecule
\param fit the molecule to align
\param refConfId returns the reference conformer for the best scoring
overlay
\param fitConfId returns the fit conformer for the best scoring overlay
\param combScores the scores for all the overlays. Will be returned sized
by the number of conformers of the ref and fit molecules.
combScores[i][j] will be the score for the jth fit
conformer onto the ith ref conformer.
\param refOpts the options for creating the ref shape
\param fitOpts the options for creating the fit shape
\param overlayOpts options for setting up and running the overlay
\param xform if passed in as non-null, will be populated with the
transformation matrix that gives the best-scoring
overlay.
*/
RDKIT_GAUSSIANSHAPE_EXPORT void ScoreMoleculeAllConformers(
const ROMol &ref, const ROMol &fit, int &refConfId, int &fitConfId,
std::vector<std::vector<double>> &combScores,
const ShapeInputOptions &refOpts = ShapeInputOptions(),
const ShapeInputOptions &fitOpts = ShapeInputOptions(),
const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions(),
RDGeom::Transform3D *xform = nullptr);
//! Score the overlap of a shape to a reference shape without moving
//! either.
/*!
\param refShape the reference shape
\param fitShape the shape to score
\param overlayOpts options for controlling the volume calculation
\param overlapVols if non-null, is filled with the raw overlap volumes
\return an array of three scores: the combination score of the shape Tversky
value and the color Tversky value (zero if colors are not used), together with
the two individual values. If color features are used, they default to the
RDKit pharmacophore types.
*/
RDKIT_GAUSSIANSHAPE_EXPORT std::array<double, 3> ScoreShape(
const ShapeInput &refShape, const ShapeInput &fitShape,
const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions(),
std::pair<double, double> *overlapVols = nullptr);
//! Score the overlap of a molecule to a reference shape without moving
//! either.
/*!
\param refShape the reference shape
\param fit the molecule to score
\param fitOpts the options for creating the fit shape
\param overlayOpts options for controlling the volume calculation
\param fitConfId (optional) the conformer to use for the fit
molecule
\param overlapVols if non-null, is filled with the raw overlap volumes
\return an array of three scores: the combination score of the shape Tversky
value and the color Tversky value (zero if colors are not used), together with
the two individual values. If color features are used, they default to the
RDKit pharmacophore types.
*/
RDKIT_GAUSSIANSHAPE_EXPORT std::array<double, 3> ScoreMolecule(
const ShapeInput &refShape, const ROMol &fit,
const ShapeInputOptions &fitOpts = ShapeInputOptions(),
const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions(),
int fitConfId = -1,
std::pair<double, double> *overlapVols = nullptr);
//! Score the overlap of a molecule to a reference molecule without moving
//! either.
/*!
\param ref the reference molecule
\param fit the molecule to score
\param refOpts the options for creating the ref shape
\param fitOpts the options for creating the fit shape
\param overlayOpts options for controlling the volume calculation
\param refConfId (optional) the conformer to use for the reference
molecule
\param fitConfId (optional) the conformer to use for the fit
molecule
\param overlapVols if non-null, is filled with the raw overlap volumes
\return an array of three scores: the combination score of the shape Tversky
value and the color Tversky value (zero if colors are not used), together with
the two individual values. If color features are used, they default to the
RDKit pharmacophore types.
*/
RDKIT_GAUSSIANSHAPE_EXPORT std::array<double, 3> ScoreMolecule(
const ROMol &ref, const ROMol &fit,
const ShapeInputOptions &refOpts = ShapeInputOptions(),
const ShapeInputOptions &fitOpts = ShapeInputOptions(),
const ShapeOverlayOptions &overlayOpts = ShapeOverlayOptions(),
int refConfId = -1, int fitConfId = -1,
std::pair<double, double> *overlapVols = nullptr);
} // namespace GaussianShape
} // namespace RDKit
#endif // RDKIT_GAUSSIANSHAPE_GUARD
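
The documentation for ScoreMoleculeAllConformers describes the score matrix as combScores[i][j], the score for the jth fit conformer onto the ith ref conformer. The sketch below shows one way a caller might scan a matrix with that layout for its best entry. It is illustrative only: BestOverlay and findBestOverlay are hypothetical names, not part of RDKit, and the indices they return are row and column positions in the matrix, which may or may not match the conformer numbers reported through refConfId and fitConfId.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of RDKit: position and value of the best
// entry in a score matrix laid out as combScores[refRow][fitColumn].
struct BestOverlay {
  std::size_t refIdx;  // row index (reference conformer position)
  std::size_t fitIdx;  // column index (fit conformer position)
  double score;        // highest score found
};

inline BestOverlay findBestOverlay(
    const std::vector<std::vector<double>> &combScores) {
  // Precondition: at least one ref row containing at least one fit column.
  assert(!combScores.empty() && !combScores.front().empty());
  BestOverlay best{0, 0, combScores[0][0]};
  for (std::size_t i = 0; i < combScores.size(); ++i) {
    for (std::size_t j = 0; j < combScores[i].size(); ++j) {
      // Strict comparison keeps the first entry seen when scores tie.
      if (combScores[i][j] > best.score) {
        best = BestOverlay{i, j, combScores[i][j]};
      }
    }
  }
  return best;
}
```

Since the function itself returns the best-scoring conformer pair through refConfId and fitConfId, a helper like this would only be needed for further processing of the matrix, for example ranking all pairs or cross-checking the reported best.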