Greg Landrum cd74dc2207 Initial support for non-tetrahedral stereochemistry (#5084)
* very basics: actually parsing the new atom stereochem features

* add some input verification for the chiral permutations

* fix a typo
add quadruple bond SMILES/SMARTS extension

* add forgotten files

* patch from Roger

* add Roger's parsing examples

* typo

* new tests

* adjusted version of next PR from Roger:
- add SP2D hybridization for square planar (this may change)
- some modernization of Chirality.cpp
- stop using < HybridizationType in Chirality.cpp (should probably do this elsewhere too)
- improved handling of hybridization assignment for new stereochem
- handle new stereo/hybridization in UFF
- tests for the above

* perception of non-tetrahedral stereo from 3D (from Roger S)
Basic testing of SP and TB based on opensmiles docs

* potential fixes for octahedral assignment
more tests

* docs update
need way more!

* map the TH tags directly to @ tags

* very basics of SMILES writing
this does not work with anything that changes the permutation order
like canonicalization or writing things in rings.

* start to support the getChiralAcross API

* more testing

* consistency

* add hasNonTetrahedralStereo() and getIdealAngleBetweenLigands()

* assignStereochemistry should only remove non-tetrahedral stereo

* re-simplify those tests

* cleanup matrix stream output

* initial pass at supporting nontet stereo in distgeom

* backup

* start on the reference docs

* TBP reference

* first pass at Oh finished

* update SP section

* more doc updates

* fix a typo

* add param to not remove Hs connected to non-tetrahedral atoms

* VERY basic coord generation for square planar

* TBP basics

* basic OH depiction

* start testing missing ligands
allow non-tet stereo in rings (ugly, but correct)

* add new TBP functions from Roger

* update depiction code for new API

* backup, the new tests work so far

* Finish the TB tests

* OH tests pass too

* cleanup

* first pass at getting correct SMILES with reordering
need way more testing than this

* ensure permutation 0 is correctly preserved

* some progress towards adding non-tetrahedral stereo to StereoInfo

* doc update

* add non-tet chiral classes to python wrappers

* make sure removeAllHs also gets neighbors of non-tetrahedral centers
more testing

* a bit of depictor cleanup

* make the assignment from 3D more tolerant
more testing

* improve the bulk testing

* cleanup

* remove a bit of redundant code

* ensure we don't write bogus permutation values to SMILES

* fix some rebase problems

* allow assignStereochemistryFrom3D() to be called without sanitization

* allow disabling the non-tetrahedral stereo when it's not explicit

* get that working on windows too
2022-05-20 09:07:16 +02:00

RDKit


RDKit is a collection of cheminformatics and machine-learning software written in C++ and Python.

  • BSD license - a business-friendly license for open source
  • Core data structures and algorithms in C++
  • Python 3.x wrapper generated using Boost.Python
  • Java and C# wrappers generated with SWIG
  • 2D and 3D molecular operations
  • Descriptor and Fingerprint generation for machine learning
  • Molecular database cartridge for PostgreSQL supporting substructure and similarity searches as well as many descriptor calculators
  • Cheminformatics nodes for KNIME
  • Contrib folder with useful community-contributed software harnessing the power of the RDKit
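
As a rough illustration of what fingerprint-based similarity searching involves, the sketch below ranks a small library against a query using the Tanimoto coefficient. It is plain Python and does not use the RDKit API: the fingerprints are hand-written sets of "on" bit indices, and the names `tanimoto`, `rank_by_similarity`, `library`, and `query` are hypothetical, chosen only for this example.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient between two sets of 'on' bit indices."""
    if not a and not b:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(
    query: Set[int], library: Dict[str, Set[int]]
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from most to least similar."""
    scores = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Made-up fingerprints; a real toolkit would derive these from molecules.
library = {
    "mol_a": {1, 4, 7, 9},
    "mol_b": {1, 4, 8},
    "mol_c": {2, 3},
}
query = {1, 4, 7}

ranked = rank_by_similarity(query, library)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

In practice the fingerprints would come from the toolkit itself rather than being written by hand; the ranking step is the same idea.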

Community

Code

Web presence

Materials from user group meetings

Documentation

Available on the RDKit page and in the Docs folder on GitHub.

Installation

Installation instructions are available in Docs/Book/Install.md.

Binary distributions, Anaconda, Homebrew

  • binaries for conda python; if you are using the conda-forge stack, the RDKit is also available from conda-forge.
  • RPMs for Red Hat Enterprise Linux, CentOS, and Fedora. Contributed by Gianluca Sforna.
  • debs for Ubuntu and other Debian-derived Linux distros. Contributed by the Debichem team.
  • homebrew formula for building on the Mac. Contributed by Eddie Cao.
  • recipes for building using the excellent conda package manager. Contributed by Riccardo Vianello.
  • APKs for Alpine Linux. Contributed by da Verona.

Projects using RDKit

  • ChEMBL Structure Pipeline - ChEMBL protocols used to standardise and salt strip molecules.
  • FPSim2 - Simple package for fast molecular similarity searches.
  • Datamol (docs, repo) - A Python library to intuitively manipulate molecules.
  • Scopy (docs, paper) - an integrated negative design Python library for desirable HTS/VS database design
  • stk (docs, paper) - a Python library for building, manipulating, analyzing and automatic design of molecules.
  • gpusimilarity - A CUDA/Thrust implementation of fingerprint similarity searching
  • Samson Connect - Software for adaptive modeling and simulation of nanosystems
  • mol_frame - Chemical Structure Handling for Dask and Pandas DataFrames
  • RDKitjs - port of RDKit functionality to JavaScript
  • DeepChem - Python library for deep learning for chemistry
  • mmpdb - Matched molecular pair database generation and analysis
  • CheTo (paper) - Chemical topic modeling
  • OCEAN (paper) - Optimized cross reactivity estimation
  • ChEMBL Beaker - standalone web server wrapper for RDKit and OSRA
  • ZINC - Free database of commercially-available compounds for virtual screening
  • sdf_viewer.py - an interactive SDF viewer
  • sdf2ppt - Reads an SDFile and displays molecules as an image grid in a PowerPoint/OpenOffice presentation.
  • MolGears - A cheminformatics tool for bioactive molecules
  • PYPL - Simple cartridge that lets you call Python scripts from Oracle PL/SQL.
  • shape-it-rdkit - Gaussian molecular overlap code shape-it (from silicos it) ported to RDKit backend
  • WONKA - Tool for analysis and interrogation of protein-ligand crystal structures
  • OOMMPPAA - Tool for directed synthesis and data analysis based on protein-ligand crystal structures
  • OCEAN - web-tool for target-prediction of chemical structures which uses ChEMBL as datasource
  • chemfp - very fast fingerprint searching
  • rdkit_ipynb_tools - RDKit Tools for the IPython Notebook
  • Vernalis KNIME nodes
  • Erlwood KNIME nodes
  • AZOrange

License

Code released under the BSD license.
