rdkit/Code/GraphMol/StructChecker/Wrap/structchecker.cpp
Brian Kelley 8609cd4883 Add StructChecker functionality
* StructChecker changes. Initial commit. First implementation. Added some tests.

* StructChecker: add  GoodAtoms and AcidicAtoms. new updates

* StructChecker: add new tests

* StructChecker: added TransformAugmentedAtoms()

* StructCheck: add structCheck to GraphMol. Fix compilation errors.

* StructChecker: add stereo verification and some utilities.

* StructChecker: function FixDubious3DMolecule was added

* StructChecker: checkStereo added. done with stereo.

* StructChecker: add StripSmallFragments()

* StructChecker: add AtomClash() function. Some cosmetic + tests

* StructChecker: checkAtoms() was started

* StructChecker: checkAtoms is ready

* StructChecker: use RingInfo from RDKit. Start recharge

* StructChecker: ReCharge molecule method prototype

* StructChecker: updates for ReCharge. Almost finished

* StructChecker: all ReCharge is done except external data tables loading

* StructChecker: add path tables into API. ReCharge completed

* Adds augmented atom data

Signed-off-by: Brian Kelley <brian.kelley@novartis.com>

* Removes extra files

Signed-off-by: Brian Kelley <brian.kelley@novartis.com>

* Adds path to test data via RDBASE environment

Signed-off-by: Brian Kelley <brian.kelley@novartis.com>

* Revert "Struct checker apr15"

* StructChecker: add missing tautomer tests

* Updates test to use RDBASE

* Adds initialization of data from data section

* Adds Python API and tests

* Fixes namespace for enum

* StructChecker: update/improve strip small fragments

* StructChecker: fix acidic atoms (but logic does not work)

* StructChecker: fix match issue for CheckAtoms

* Adds macro guards

* Adds loading API and proper constructor

* Fixes tests, adds stereo test

* Fixes crash bug, matches[0] was being accessed from an empty match vector

* Reverts crash fix - conflicts with previous

* Adds the rest of the structure checker options

* StructChecker: fix atom matching for aromatic rings

* StructChecker: add tautomers checks. Update some tests

* StructChecker: stereo fixes. Add some tests

* StructChecker: fix check atoms. Start ligand symbol list

* StructChecker: fix some check atoms validation. Add Transform to query lists. Start correct loading of augmented atoms

* update

* another set of fixes

* StructChecker: fix loadDefaultAugmentedAtoms. Some changes in CheckAtom + tests + debug conditional breakpoints (TEMP operators)

* StructChecker: rewrote RecMatch() to be sequential. Changed bond matching algorithm. Small bug fixes

* Adds better logging of mismatched atoms

* Removes duplicated negative charge

* Fixes charges

* Adds nitro group test

* StructChecker: add better logging

* remove double logging

* Reformats code using RDKit's clang-format style

* StructChecker: Fix charge reformat using RDKit format.

* StructChecker: compilation restore after merge

* restore bond matching

* Removes the same fragments that strucheck does in case of ties

* Don't resanitize - this adds aromaticity which mucks things up

* Adds empty molecule checks

* Fixes atom clashes.

* Removes debug printing

* Removes debug logging info

* First pass at stereo fixes

* Fixes off by one error for dubious stereo fix

* Fixes more off by one errors

* Fixes more off by one errors

* More off by one fixes.

* Another off by one

* Fixes chiral flag set in molfile check

* Copies chiral flag over to largest fragment if necessary

* Poor man’s parity check.

* Find unspecified chiral centers ala Avalon.

* StructChecker: fix recursive match. Fix transformations

* StructChecker: fix transformation for atom list (using query atoms)

* Fixes checks && to &

* StructChecker: fix carboxylic acids transform issue. Atom list is changed only if different

* StructChecker: documentation was updated

* Fixes snprintf and silences some warnings

* Adds Get/Set StructCheckerOptions

* Adds default AugmentedAtomTransforms
2016-10-24 08:00:07 +02:00

160 lines
7.3 KiB
C++

// Copyright (c) 2016, Novartis Institutes for BioMedical Research Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following
// disclaimer in the documentation and/or other materials provided
// with the distribution.
// * Neither the name of Novartis Institutes for BioMedical Research Inc.
// nor the names of its contributors may be used to endorse or promote
// products derived from this software without specific prior written
// permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
#include <RDBoost/python.h>
#include <RDBoost/Wrap.h>
#include <GraphMol/StructChecker/StructChecker.h>
#include <GraphMol/RDKitBase.h>
namespace python = boost::python;
namespace RDKit {
namespace StructureCheck {
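// Python passes the molecule in as an ROMol, while checkMolStructure() takes
// an RWMol, so the helper bridges the two types.
unsigned int checkMolStructureHelper(const StructChecker &checker, ROMol &m) {
// NOTE: this downcast assumes the wrapped object really is an RWMol.
RWMol &fixer = static_cast<RWMol &>(m);
return checker.checkMolStructure(fixer);
}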
}
}
struct struct_wrapper {
static void wrap() {
python::enum_<RDKit::StructureCheck::StructChecker::StructureFlags>(
"StructureFlags")
.value("NO_CHANGE", RDKit::StructureCheck::StructChecker::NO_CHANGE)
.value("BAD_MOLECULE",
RDKit::StructureCheck::StructChecker::BAD_MOLECULE)
.value("ALIAS_CONVERSION_FAILED",
RDKit::StructureCheck::StructChecker::ALIAS_CONVERSION_FAILED)
.value("STEREO_ERROR",
RDKit::StructureCheck::StructChecker::STEREO_ERROR)
.value("STEREO_FORCED_BAD",
RDKit::StructureCheck::StructChecker::STEREO_FORCED_BAD)
.value("ATOM_CLASH", RDKit::StructureCheck::StructChecker::ATOM_CLASH)
.value("ATOM_CHECK_FAILED",
RDKit::StructureCheck::StructChecker::ATOM_CHECK_FAILED)
.value("SIZE_CHECK_FAILED",
RDKit::StructureCheck::StructChecker::SIZE_CHECK_FAILED)
.value("TRANSFORMED", RDKit::StructureCheck::StructChecker::TRANSFORMED)
.value("FRAGMENTS_FOUND",
RDKit::StructureCheck::StructChecker::FRAGMENTS_FOUND)
.value("EITHER_WARNING",
RDKit::StructureCheck::StructChecker::EITHER_WARNING)
.value("DUBIOUS_STEREO_REMOVED",
RDKit::StructureCheck::StructChecker::DUBIOUS_STEREO_REMOVED)
.value("RECHARGED", RDKit::StructureCheck::StructChecker::RECHARGED)
.value("STEREO_TRANSFORMED",
RDKit::StructureCheck::StructChecker::STEREO_TRANSFORMED)
.value("TEMPLATE_TRANSFORMED",
RDKit::StructureCheck::StructChecker::TEMPLATE_TRANSFORMED)
.value("TAUTOMER_TRANSFORMED",
RDKit::StructureCheck::StructChecker::TAUTOMER_TRANSFORMED);
python::class_<RDKit::StructureCheck::StructCheckerOptions,
RDKit::StructureCheck::StructCheckerOptions *>(
"StructCheckerOptions", python::init<>())
.def_readwrite(
"AcidityLimit",
&RDKit::StructureCheck::StructCheckerOptions::AcidityLimit)
.def_readwrite(
"RemoveMinorFragments",
&RDKit::StructureCheck::StructCheckerOptions::RemoveMinorFragments)
.def_readwrite(
"DesiredCharge",
&RDKit::StructureCheck::StructCheckerOptions::DesiredCharge)
.def_readwrite(
"CheckCollisions",
&RDKit::StructureCheck::StructCheckerOptions::CheckCollisions)
.def_readwrite(
"CollisionLimitPercent",
&RDKit::StructureCheck::StructCheckerOptions::CollisionLimitPercent)
.def_readwrite("MaxMolSize",
&RDKit::StructureCheck::StructCheckerOptions::MaxMolSize)
.def_readwrite(
"ConvertSText",
&RDKit::StructureCheck::StructCheckerOptions::ConvertSText)
.def_readwrite("StripZeros",
&RDKit::StructureCheck::StructCheckerOptions::StripZeros)
.def_readwrite(
"CheckStereo",
&RDKit::StructureCheck::StructCheckerOptions::CheckStereo)
.def_readwrite(
"ConvertAtomTexts",
&RDKit::StructureCheck::StructCheckerOptions::ConvertAtomTexts)
.def_readwrite(
"GroupsToSGroups",
&RDKit::StructureCheck::StructCheckerOptions::GroupsToSGroups)
.def_readwrite("Verbose",
&RDKit::StructureCheck::StructCheckerOptions::Verbose)
.def(
"LoadGoodAugmentedAtoms",
&RDKit::StructureCheck::StructCheckerOptions::
loadGoodAugmentedAtoms,
(python::arg("path")),
"Load the set of good augmented atoms from the specified file path")
.def("LoadAcidicAugmentedAtoms",
&RDKit::StructureCheck::StructCheckerOptions::
loadAcidicAugmentedAtoms,
(python::arg("path")),
"Load the set of acidic augmented atoms from the specified file "
"path")
.def("LoadAugmentedAtomTranslations",
&RDKit::StructureCheck::StructCheckerOptions::
loadAugmentedAtomTranslations,
(python::arg("path")),
"Load the set of augmented atom translations from the specified "
"file path");
python::class_<RDKit::StructureCheck::StructChecker>("StructChecker",
python::init<>())
.def(
python::init<const RDKit::StructureCheck::StructCheckerOptions &>())
.def("CheckMolStructure",
&RDKit::StructureCheck::checkMolStructureHelper,
(python::arg("mol")),
"Check the structure and return a set of structure flags")
.def("StructureFlagsToString",
&RDKit::StructureCheck::StructChecker::StructureFlagsToString,
(python::arg("flags")),
"Return the structure flags as a human readable string")
.staticmethod("StructureFlagsToString")
.def("StringToStructureFlags",
&RDKit::StructureCheck::StructChecker::StringToStructureFlags,
(python::arg("str")),
"Convert a comma separated string to the appropriate structure "
"flags")
.staticmethod("StringToStructureFlags");
}
};
BOOST_PYTHON_MODULE(rdStructChecker) { struct_wrapper::wrap(); }
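
The wrapper above returns structure flags as a single unsigned int and exposes StructureFlagsToString / StringToStructureFlags for converting to and from a comma separated string. The following is a minimal, self-contained sketch of how such a bitmask round trip can work. It does not use RDKit: the names DemoFlags, demoFlagNames, demoFlagsToString and demoStringToFlags are hypothetical, and the numeric flag values are assumptions chosen for illustration (the real values live in GraphMol/StructChecker/StructChecker.h and may differ, as may the real string format).

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical flag values, for illustration only. The real StructureFlags
// enum is declared in GraphMol/StructChecker/StructChecker.h.
enum DemoFlags : unsigned int {
  DEMO_NO_CHANGE = 0,
  DEMO_BAD_MOLECULE = 1u << 0,
  DEMO_STEREO_ERROR = 1u << 1,
  DEMO_ATOM_CLASH = 1u << 2,
  DEMO_TRANSFORMED = 1u << 3
};

// Table mapping each single-bit flag to its printable name.
static const std::vector<std::pair<unsigned int, const char *>> &
demoFlagNames() {
  static const std::vector<std::pair<unsigned int, const char *>> names = {
      {DEMO_BAD_MOLECULE, "BAD_MOLECULE"},
      {DEMO_STEREO_ERROR, "STEREO_ERROR"},
      {DEMO_ATOM_CLASH, "ATOM_CLASH"},
      {DEMO_TRANSFORMED, "TRANSFORMED"}};
  return names;
}

// Build a comma separated list of the names of all bits set in `flags`.
std::string demoFlagsToString(unsigned int flags) {
  if (flags == DEMO_NO_CHANGE) {
    return "NO_CHANGE";
  }
  std::string out;
  for (const auto &entry : demoFlagNames()) {
    if (flags & entry.first) {
      if (!out.empty()) {
        out += ",";
      }
      out += entry.second;
    }
  }
  return out;
}

// Parse a comma separated list of names back into a bitmask.
// Unknown tokens (and "NO_CHANGE") contribute no bits.
unsigned int demoStringToFlags(const std::string &text) {
  unsigned int flags = DEMO_NO_CHANGE;
  std::istringstream stream(text);
  std::string token;
  while (std::getline(stream, token, ',')) {
    for (const auto &entry : demoFlagNames()) {
      if (token == entry.second) {
        flags |= entry.first;
      }
    }
  }
  return flags;
}
```

In this sketch, a caller would test an individual condition with a bitwise AND, for example `flags & DEMO_ATOM_CLASH`, which mirrors how a combined return value from a check can carry several independent results at once.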