rdkit/Code/GraphMol/StructChecker/StructureFlags.cpp
Brian Kelley 8609cd4883 Add StructChecker functionality
* StructChecker changes. Initial commit. First implementation. Added some tests.

* StructChecker: add  GoodAtoms and AcidicAtoms. new updates

* StructChecker: add new tests

* StructChecker: added TransformAugmentedAtoms()

* StructCheck: add structCheck to GraphMol. Fix compilation errors.

* StructChecker: add stereo verification and some utilities.

* StructChecker: function FixDubious3DMolecule was added

* StructChecker: checkStereo added. done with stereo.

* StructChecker: add StripSmallFragments()

* StructChecker: add AtomClash() function. Some cosmetic + tests

* StructChecker: checkAtoms() was started

* StructChecker: checkAtoms is ready

* StructChecker: use RingInfo from RDKit. Start recharge

* StructChecker: ReCharge molecule method prototype

* StructChecker: updates for ReCharge. Almost finished

* StructChecker: all ReCharge is done except external data tables loading

* StructChecker: add path tables into API. ReCharge completed

* Adds augmented atom data

Signed-off-by: Brian Kelley <brian.kelley@novartis.com>

* Removes extra files

Signed-off-by: Brian Kelley <brian.kelley@novartis.com>

* Adds path to test data via RDBASE environment

Signed-off-by: Brian Kelley <brian.kelley@novartis.com>

* Revert "Struct checker apr15"

* StructChecker: add missing tautomer tests

* Updates test to use RDBASE

* Adds initialization of data from data section

* Adds Python API and tests

* Fixes namespace for enum

* StructChecker: update/improve strip small fragments

* StructChecker: fix acidic atoms (but logic does not work)

* StructChecker: fix match issue for CheckAtoms

* Adds macro guards

* Adds loading API and proper constructor

* Fixes tests, adds stereo test

* Fixes crash bug, matches[0] was being accessed from an empty match vector

* Reverts crash fix - conflicts with previous

* Adds the rest of the structure checker options

* StructChecker: fix atom matching for aromatic rings

* StructChecker: add tautomers checks. Update some tests

* StructChecker: stereo fixes. Add some tests

* StructChecker: fix check atoms. Start ligand symbol list

* StructChecker: fix some check atoms validation. Add Transform to query lists. Start correct loading of augmented atoms

* update

* another set of fixes

* StructChecker: fix loadDefaultAugmentedAtoms. Some changes in CheckAtom + tests + debug conditional breakpoints (TEMP operators)

* StructChecker: rewrote RecMatch() to be sequential. Changed bond matching algorithm. Small bug fixes

* Adds better logging of mismatched atoms

* Removes duplicated negative charge

* Fixes charges

* Adds nitro group test

* StructChecker: add better logging

* remove double logging

* Reformats code using RDKit's clang-format style

* StructChecker: Fix charge reformat using RDKit format.

* StructChecker: compilation restore after merge

* restore bond matching

* Removes the same fragments that strucheck does in case of ties

* Don't resanitize - this adds aromaticity which mucks things up

* Adds empty molecule checks

* Fixes atom clashes.

* Removes debug printing

* Removes debug logging info

* First pass at stereo fixes

* Fixes off by one error for dubious stereo fix

* Fixes more off by one errors

* Fixes more off by one errors

* More off by one fixes.

* Another off by one

* Fixes chiral flag set in molfile check

* Copies chiral flag over to largest fragment if necessary

* Poor man’s parity check.

* Find unspecified chiral centers ala Avalon.

* StructChecker: fix recursive match. Fix transformations

* StructChecker: fix transformation for atom list (using query atoms)

* Fixes checks && to &

* StructChecker: fix carboxylic acids transform issue. Atom list is changed only if different

* StructChecker: documentation was updated

* Fixes snprintf and silences some warnings

* Adds Get/Set StructCheckerOptions

* Adds default AugmentedAtomTransforms
2016-10-24 08:00:07 +02:00


//
// Copyright (C) 2016 Novartis Institutes for BioMedical Research
//
// @@ All Rights Reserved @@
// This file is part of the RDKit.
// The contents are covered by the terms of the BSD license
// which is included in the file license.txt, found at the root
// of the RDKit source tree.
//
#include <map>
#include <string>
#include "StructChecker.h"
namespace RDKit {
namespace StructureCheck {
static const char* flags[] = {
    "BAD_MOLECULE",
    "ALIAS_CONVERSION_FAILED",
    "STEREO_ERROR",
    "STEREO_FORCED_BAD",
    "ATOM_CLASH",
    "ATOM_CHECK_FAILED",
    "SIZE_CHECK_FAILED",
    "",  // reserved error = 0x0080,
    "TRANSFORMED",
    "FRAGMENTS_FOUND",
    "EITHER_WARNING",
    "DUBIOUS_STEREO_REMOVED",
    "RECHARGED",
    "STEREO_TRANSFORMED",
    "TEMPLATE_TRANSFORMED",
    "TAUTOMER_TRANSFORMED",
};
// Converts structure property flags to a comma-separated string
std::string StructChecker::StructureFlagsToString(unsigned f) {
  std::string s;
  for (unsigned bit = 0; bit < 16; bit++) {
    if (0 != (f & (1u << bit))) {
      if (!s.empty()) s += ",";
      s += flags[bit];
    }
  }
  return s;
}
// Lookup table from flag name to its bit value; the reserved (unnamed) bit
// is skipped
class FMap : public std::map<std::string, unsigned> {
 public:
  FMap() {
    for (unsigned bit = 0; bit < 16; bit++)
      if (*flags[bit]) (*this)[std::string(flags[bit])] = (1u << bit);
  }
};
// Converts a comma-separated string of flag names to a StructureFlags
// unsigned integer. Unknown names are ignored: there is no way to return a
// syntax error for the input string.
unsigned StructChecker::StringToStructureFlags(const std::string& str) {
  static const FMap fmap;  // map name string to StructureFlags enum value
  unsigned int f = 0;
  const char* token = str.c_str();
  while (*token) {
    // skip leading delimiters: commas and whitespace (<tab>|<space>...).
    // Skipping commas here as well avoids looping forever on input that
    // starts with a comma.
    while (*token && (*token == ',' || *token <= ' ')) token++;
    unsigned len = 0;
    while (token[len] && !(token[len] == ',' || token[len] <= ' ')) len++;
    if (0 == len) break;  // only delimiters were left
    std::string name(token, len);
    std::map<std::string, unsigned>::const_iterator it = fmap.find(name);
    if (fmap.end() != it) f |= it->second;
    token += len;  // trailing delimiters are skipped on the next pass
  }
  return f;
}
} // namespace StructureCheck
} // namespace RDKit
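
The two conversions above are meant to be inverses of each other for named bits. The following is a minimal standalone sketch of that round trip. It does not use RDKit: the `sketch` namespace and the functions `toString`, `fromString` and `selfCheck` are hypothetical names introduced only for illustration, and the name table is copied from `flags[]` above. Unlike `StructureFlagsToString`, the sketch skips the unnamed reserved bit (0x0080) when building the string, which is an assumption about the desired output rather than the file's behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {  // hypothetical namespace, not part of RDKit

// Same 16 names, in bit order, as the flags[] table in the file above.
static const char* kNames[16] = {
    "BAD_MOLECULE",      "ALIAS_CONVERSION_FAILED", "STEREO_ERROR",
    "STEREO_FORCED_BAD", "ATOM_CLASH",              "ATOM_CHECK_FAILED",
    "SIZE_CHECK_FAILED", "",  // reserved bit 0x0080 has no name
    "TRANSFORMED",       "FRAGMENTS_FOUND",         "EITHER_WARNING",
    "DUBIOUS_STEREO_REMOVED", "RECHARGED",          "STEREO_TRANSFORMED",
    "TEMPLATE_TRANSFORMED",   "TAUTOMER_TRANSFORMED",
};

// Bit mask -> comma-separated names. Assumption: the unnamed reserved bit
// is skipped so that no empty field appears in the output.
std::string toString(unsigned f) {
  std::string s;
  for (unsigned bit = 0; bit < 16; ++bit) {
    if ((f & (1u << bit)) && *kNames[bit]) {
      if (!s.empty()) s += ",";
      s += kNames[bit];
    }
  }
  return s;
}

// Comma- or whitespace-separated names -> bit mask. Unknown names are
// ignored, mirroring the "no way to return syntax error" note above.
unsigned fromString(const std::string& str) {
  const std::string delims = ", \t\r\n";
  unsigned f = 0;
  std::size_t pos = 0;
  while ((pos = str.find_first_not_of(delims, pos)) != std::string::npos) {
    std::size_t end = str.find_first_of(delims, pos);
    if (end == std::string::npos) end = str.size();
    const std::string name = str.substr(pos, end - pos);
    for (unsigned bit = 0; bit < 16; ++bit)
      if (*kNames[bit] && name == kNames[bit]) f |= 1u << bit;
    pos = end;
  }
  return f;
}

// Round-trip check over every mask that leaves the reserved bit clear.
void selfCheck() {
  for (unsigned f = 0; f < 0x10000u; ++f) {
    if (f & 0x0080u) continue;
    assert(fromString(toString(f)) == f);
  }
}

}  // namespace sketch
```

Calling `sketch::selfCheck()` from a test driver exercises all 32768 masks without the reserved bit. A linear scan over 16 names stands in for the `std::map` lookup here to keep the sketch short; the file's `FMap` approach is the better fit if the table grows.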