
RDKit


What is it?

The RDKit is a collection of cheminformatics and machine-learning software written in C++ and Python.

  • BSD license - a business friendly license for open source
  • Core data structures and algorithms in C++
  • Python 3.x wrapper generated using Boost.Python
  • Java and C# wrappers generated with SWIG
  • JavaScript (generated with emscripten) and CFFI wrappers around important functionality
  • 2D and 3D molecular operations
  • Descriptor and Fingerprint generation for machine learning
  • Molecular database cartridge for PostgreSQL supporting substructure and similarity searches as well as many descriptor calculators
  • Cheminformatics nodes for KNIME
  • Contrib folder with useful community-contributed software harnessing the power of the RDKit

Installation and getting started

If you are working in Python and using conda (our recommendation), installation is super easy:

$ conda install -c conda-forge rdkit

You can then take a look at our Getting Started in Python guide.

More detailed installation instructions are available in Docs/Book/Install.md.
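Once the install has finished, a short script along these lines can confirm that the Python wrappers are importable. This is only a minimal sketch: the helper names `rdkit_available` and `heavy_atom_count` are made up for illustration and are not part of the RDKit, while `Chem.MolFromSmiles` and `GetNumHeavyAtoms` are the RDKit calls being exercised.

```python
import importlib.util


def rdkit_available() -> bool:
    """Return True if the rdkit package can be found in this environment."""
    return importlib.util.find_spec("rdkit") is not None


def heavy_atom_count(smiles: str) -> int:
    """Parse a SMILES string and count its non-hydrogen atoms.

    Hypothetical helper for illustration; it only wraps standard RDKit calls.
    """
    from rdkit import Chem  # imported lazily so the check above works without RDKit

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        # MolFromSmiles returns None when the input cannot be parsed
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return mol.GetNumHeavyAtoms()


if __name__ == "__main__":
    if rdkit_available():
        # Ethanol has two carbons and one oxygen
        print("ethanol heavy atoms:", heavy_atom_count("CCO"))
    else:
        print("RDKit is not installed in this environment")
```

If the script reports that the RDKit is missing, the usual cause is that a different conda environment is active than the one the package was installed into.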

Documentation

Documentation is available on the RDKit page and in the Docs folder on GitHub.

The RDKit blog often has useful tips and tricks.

Support and Community

If you have questions, comments, or suggestions, the best places for those are:

If you've found a bug or would like to request a feature, please create an issue.

We also have a LinkedIn group.

We have a yearly user group meeting (the UGM) where members of the community give presentations and lightning talks on things they've done with the RDKit. Materials from past UGMs, which can be quite useful, are also online.

License

Code released under the BSD license.
