Description of data files in this folder
Solubility dataset
- solubility.test.sdf (257 records)
- solubility.train.sdf (1025 records)
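The record counts listed above can be checked without any cheminformatics toolkit, because every record in an SDF file ends with a line containing only `$$$$`. The following is a minimal sketch using only the Python standard library; the helper name `count_sdf_records` is hypothetical, and the script assumes the two files are in the current working directory.

```python
from pathlib import Path


def count_sdf_records(path):
    """Count records in an SDF file by counting '$$$$' delimiter lines.

    Hypothetical helper for illustration; it does not parse or validate
    the molecules themselves.
    """
    count = 0
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            # Each SDF record is terminated by a line containing only '$$$$'.
            if line.strip() == "$$$$":
                count += 1
    return count


if __name__ == "__main__":
    # Assumption: the data files are in the current working directory.
    for name in ("solubility.test.sdf", "solubility.train.sdf"):
        if Path(name).exists():
            print(name, count_sdf_records(name))
        else:
            print(name, "not found")
```

Counting delimiters only confirms the number of records; it says nothing about whether each structure can be parsed.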
The two SDF files (hereafter the "solubility dataset") originate from the Huuskonen dataset. The Huuskonen dataset contains a training set of 884 compounds and a randomly chosen test set of 413 compounds.
- Reference: Huuskonen, J. (2000). Estimation of Aqueous Solubility for a Diverse Set of Organic Compounds Based on Molecular Topology. Journal of Chemical Information and Computer Sciences, 40(3), 773–777. https://doi.org/10.1021/ci9901338
This solubility dataset was originally downloaded from
Although cheminformatics.org no longer exists, the supplementary file at https://doi.org/10.1021/ci9901338 lists all the structures and the corresponding data in PDF format.