* First pass at port
Mostly auto-converted using Claude Sonnet 4
Things are a bit slower in this initial port. Here is some timing data for 50000 molecules read from SMILES (no coords) and from SDF (with coords):
# MASTER
## smiles
read: 50000 mols.
9.260000s wall, 8.650000s user + 0.600000s system = 9.250000s CPU (99.9%)
serialize
3.060000s wall, 2.400000s user + 0.660000s system = 3.060000s CPU (100.0%)
deserialize
1.350000s wall, 1.250000s user + 0.090000s system = 1.340000s CPU (99.3%)
## SDF
read: 50000 mols.
9.340000s wall, 8.930000s user + 0.400000s system = 9.330000s CPU (99.9%)
serialize
6.630000s wall, 5.960000s user + 0.680000s system = 6.640000s CPU (100.2%)
deserialize
1.450000s wall, 1.450000s user + 0.000000s system = 1.450000s CPU (100.0%)
# Boost::JSON
## smiles
read: 50000 mols.
9.250000s wall, 8.830000s user + 0.420000s system = 9.250000s CPU (100.0%)
serialize
4.770000s wall, 4.410000s user + 0.350000s system = 4.760000s CPU (99.8%)
deserialize
2.320000s wall, 2.100000s user + 0.230000s system = 2.330000s CPU (100.4%)
## SDF
read: 50000 mols.
9.500000s wall, 9.100000s user + 0.400000s system = 9.500000s CPU (100.0%)
serialize
8.760000s wall, 8.330000s user + 0.420000s system = 8.750000s CPU (99.9%)
deserialize
2.540000s wall, 2.330000s user + 0.210000s system = 2.540000s CPU (100.0%)
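To put the numbers above in perspective, the wall times can be turned into slowdown ratios. The following is only a small sketch of that arithmetic; the `Timing` struct and the function names are made up for illustration and are not part of the port.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One row of the comparison: wall-clock seconds for 50000 molecules.
struct Timing {
  std::string label;
  double masterSeconds;
  double boostJsonSeconds;
};

// Ratio of the Boost::JSON wall time to the master wall time.
// Values above 1.0 mean the port is slower than master.
double slowdown(const Timing &t) {
  assert(t.masterSeconds > 0.0);
  return t.boostJsonSeconds / t.masterSeconds;
}

// Wall times copied from the serialize/deserialize lines above.
std::vector<Timing> wallTimes() {
  return {
      {"SMILES serialize", 3.06, 4.77},
      {"SMILES deserialize", 1.35, 2.32},
      {"SDF serialize", 6.63, 8.76},
      {"SDF deserialize", 1.45, 2.54},
  };
}
```

With these inputs serialization comes out roughly 1.3x to 1.6x slower and deserialization roughly 1.7x to 1.8x slower, while the read step is essentially unchanged.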
* some json parser optimization
* around the edges
* optimizations for the writer
* hopefully get things compiling
* convert the MinimalLib stuff to use boost::json
Again, a lot of the lifting here was done using Claude Sonnet 4 in VS Code Copilot
* fix Windows DLL build
* response to review
Co-authored-by: Paolo Tosco <paolo.tosco.mail@gmail.com>
* better not to blindly accept suggestions
* fix the problems in MinimalLib
---------
Co-authored-by: Paolo Tosco <paolo.tosco.mail@gmail.com>
- implement get_v2Kmolblock() in MinimalLib
- add the possibility to specify the MDL version preference as a get_molblock() forceMDLVersion JSON parameter, which is ignored by get_v3Kmolblock() and get_v2Kmolblock()
- changes in response to review
Co-authored-by: ptosco <paolo.tosco@novartis.com>
* get SynthonSpace.cpp to build also when RDK_USE_BOOST_SERIALIZATION is
not defined
* test should not fail when RDK_USE_BOOST_SERIALIZATION is not defined
* - expose reading/writing PNG metadata to CFFI and MinimalLib
- add relevant CFFI and MinimalLib unit tests
- add RDK_USE_BOOST_PROGRAM_OPTIONS CMake option
- enable using standalone zlib in the absence of boost::iostreams for parsing PNG files
- enable linking against maeparser in the absence of boost::iostreams also on Windows
- enable building RDKit in the absence of boost::program_options
* add check for boost::program_options
* change size_t into std::uint64_t in SearchResults for consistency with doTheSearch() which uses std::uint64_t
* change size_t into std::uint64_t in SearchResults for consistency with
SynthonSpaceSearcher::doTheSearch()
* set CMake policy to allow YAeHMOP to require a version which is not
actually supported
* reverted External/YAeHMOP/CMakeLists.txt to master version
* check if Windows build will work
* fix build
* configure zlib install location
* build zlib dependency
* include zlib header directory
* explicitly set PropertyFlags.AllProps so the test does not fail on
static builds
---------
Co-authored-by: ptosco <paolo.tosco@novartis.com>
- str_to_c() should check that the pointer returned by malloc is not null before using it
- change has_coords() mol_pkl parameter to const
- use assert.equal in JS tests where possible
Co-authored-by: ptosco <paolo.tosco@novartis.com>
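The malloc check mentioned above could look roughly like the sketch below. `copy_to_c_string()` is a hypothetical helper written for illustration; it is not the actual `str_to_c()` from MinimalLib, whose signature and behavior may differ.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper, NOT the real str_to_c(): copy a std::string into a
// malloc'ed, NUL-terminated buffer that the caller must release with free().
// Returns nullptr (and reports a length of 0) if the allocation fails.
char *copy_to_c_string(const std::string &str, std::size_t *len = nullptr) {
  if (len) {
    *len = 0;
  }
  char *res = static_cast<char *>(std::malloc(str.size() + 1));
  if (res) {
    // Only write to the buffer once we know malloc succeeded.
    std::memcpy(res, str.data(), str.size());
    res[str.size()] = '\0';
    if (len) {
      *len = str.size();
    }
  }
  return res;
}
```

Callers are then expected to test the returned pointer before using it and to `free()` it when done.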
* make sure that loggers can be enabled, disabled, captured and tee'd from MinimalLib without issues
* changes in response to review
* change in response to review
---------
Co-authored-by: ptosco <paolo.tosco@novartis.com>
* - moved SMILES and RGroupDecomp JSON parsers to their own translation units
- added missing DLL export decorators that had been previously forgotten
- changed the signatures of MolToCXSmiles and updateCXSmilesFieldsFromJSON
to replace enum parameters with the underlying types
- updated ReleaseNotes.md
* make sure cxSmilesFields is only updated if the JSON string contains keys
belonging to the CXSmilesFields enum
* added missing copyright notices
---------
Co-authored-by: ptosco <paolo.tosco@novartis.com>
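The idea of only touching the flags when a recognized key is present can be sketched as follows. Everything here is hypothetical: the `ExampleFields` enum stands in for an enum such as CXSmilesFields, a `std::map` stands in for the parsed JSON, and the real function may combine the values differently.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical bit flags, standing in for an enum such as CXSmilesFields.
enum ExampleFields : std::uint32_t {
  EX_NONE = 0,
  EX_COORDS = 1u << 0,
  EX_LABELS = 1u << 1,
  EX_ALL = EX_COORDS | EX_LABELS
};

// Update `fields` from already-parsed boolean options. Keys that do not name
// a flag are skipped, so `fields` is left untouched unless at least one
// recognized key is present. Returns true if a recognized key was seen.
bool updateFieldsFromOptions(const std::map<std::string, bool> &options,
                             std::uint32_t &fields) {
  static const std::map<std::string, std::uint32_t> known = {
      {"EX_COORDS", EX_COORDS}, {"EX_LABELS", EX_LABELS}};
  bool sawKnownKey = false;
  for (const auto &option : options) {
    auto it = known.find(option.first);
    if (it == known.end()) {
      continue;  // unrelated key: ignore it
    }
    sawKnownKey = true;
    if (option.second) {
      fields |= it->second;   // switch the flag on
    } else {
      fields &= ~it->second;  // switch the flag off
    }
  }
  return sawKnownKey;
}
```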
* - renamed getParamsFromJSON() to updateSmilesWriteParamsFromJSON() and moved it from the anonymous namespace to the RDKit namespace such that it is publicly available
- implemented updateCXSmilesFieldsAndRestoreBondDirOptionFromJSON()
- added CFFI and JS tests
- get_smiles(), get_smarts(), get_cxsmiles() and get_cxsmarts() are now available in MinimalLib in both CFFI and JS layers and they can be passed JSON parameters
- CFFI get_qmol() now returns NULL if it fails to generate a RWMol rather than returning the "Error!" const char[] string, for consistency with what get_mol() and get_rxn() do. This is documented in the release notes
* suggestions
---------
Co-authored-by: ptosco <paolo.tosco@novartis.com>
Co-authored-by: greg landrum <greg.landrum@gmail.com>
* Compile time and runtime deprecation warnings
* Used [[deprecated]] attribute to mark deprecation on cpp side
* Used RDLog to escalate deprecation warnings to python
* deprecated non fingerprint generator fingerprint generation functions
* Address build errors
* suppress deprecation warnings in cpps and tests
* experiment with new SWIG versions in the mac azure pipeline
* More deprecation suppression
* revert mac java experiment
* Fix SWIG syntax errors
* Attempt to fix windows unit test
* Remove test because of logging behavior
* Change linux java build to SWIG 4.1
- removes the need for preprocessor interaction
* Change mac java build to SWIG 4.1
* try updating the CI build
* lock cmake version
needed to find JNI correctly
* update compiler versions
needed for the boost
* Fix typo and unavailable version
* Fix version conflict
* update mac build
* get linux build working?
---------
Co-authored-by: Patrick Penner <patrick@ppenner.com>
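The combination of compile-time and runtime deprecation warnings described in this commit can be sketched as below. The function names are invented for illustration, and `logDeprecation()` is a stand-in that writes to `std::cerr`; the commit itself routes the runtime warning through RDLog, which is not reproduced here.

```cpp
#include <iostream>
#include <string>

// Hypothetical stand-in for a logger call.
void logDeprecation(const std::string &message) {
  std::cerr << "DEPRECATION WARNING: " << message << '\n';
}

// Hypothetical replacement function.
int countBitsNew(unsigned int value) {
  int count = 0;
  while (value) {
    count += static_cast<int>(value & 1u);
    value >>= 1;
  }
  return count;
}

// Compile time: the attribute makes the compiler warn at every call site.
// Runtime: the log message also reaches users of wrapped (e.g. Python) APIs.
[[deprecated("use countBitsNew() instead")]]
int countBitsOld(unsigned int value) {
  logDeprecation("countBitsOld() is deprecated; use countBitsNew() instead");
  return countBitsNew(value);
}

// Code that must keep calling the old function (such as its own tests) can
// silence the compile-time warning locally. These pragmas are specific to
// GCC and Clang; other compilers need their own mechanism.
int countBitsOldQuietly(unsigned int value) {
#if defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#endif
  int res = countBitsOld(value);
#if defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic pop
#endif
  return res;
}
```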
1. tee: the log is captured in a buffer but also sent to the native output channel
2. capture: the log is captured in a buffer without being sent to the native output channel
- removed duplicate logging_needs_init and needs_init atomic bool variables from cffiwrapper.cpp and consolidated them into a static LoggingFlag d_loggingNeedsInit class variable
- added relevant C and JS tests
Co-authored-by: ptosco <paolo.tosco@novartis.com>
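The difference between the tee and capture modes can be illustrated with a small stream buffer. This is a sketch under assumptions, not the MinimalLib implementation: `CapturingBuf` and `logThrough()` are hypothetical names, and a `std::ostringstream` plays the role of the native output channel.

```cpp
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical sketch: always store what is written; in tee mode also
// forward every character to the original ("native") stream buffer.
class CapturingBuf : public std::streambuf {
 public:
  CapturingBuf(std::streambuf *native, bool tee)
      : d_native(native), d_tee(tee) {}
  const std::string &captured() const { return d_captured; }

 protected:
  // No put area is set up, so every character written arrives here.
  int_type overflow(int_type ch) override {
    if (traits_type::eq_int_type(ch, traits_type::eof())) {
      return traits_type::not_eof(ch);
    }
    d_captured.push_back(traits_type::to_char_type(ch));
    if (d_tee && d_native) {
      return d_native->sputc(traits_type::to_char_type(ch));
    }
    return ch;
  }

 private:
  std::streambuf *d_native;
  bool d_tee;
  std::string d_captured;
};

// Write `msg` through a CapturingBuf and return {captured, native} contents.
std::pair<std::string, std::string> logThrough(const std::string &msg,
                                               bool tee) {
  std::ostringstream native;  // stands in for the native output channel
  CapturingBuf buf(native.rdbuf(), tee);
  std::ostream log(&buf);
  log << msg;
  return {buf.captured(), native.str()};
}
```

In tee mode both strings hold the message; in capture mode the native string stays empty.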
- setAromaticity (defaults to true)
- fastFindRings (defaults to true; only used when sanitize=false)
- assignStereo (defaults to true)
These options avoid doing unnecessary work when the molecule is only used for specific purposes (e.g., computing FPs or doing substructure searches)
Co-authored-by: Tosco, Paolo <paolo.tosco@novartis.com>
* - expose [sg]etUseLegacyStereo()
- In MolToSmiles() doIsomericSmiles should default to true as in C++ and Python
- added missing parameters to MolToSmiles() and MolToMolBlock()
- added SmilesWriteParams MolToSmiles() overload
- added and updated Java tests
* - changes in response to review
- exposed the same functionality also in MinimalLib and CFFI and added tests
---------
Co-authored-by: Tosco, Paolo <paolo.tosco@novartis.com>
* - fix indentation
- fix regex check (which currently always fails)
* wip
* - added clearMolBlockWedgingInfo()
- added invertMolBlockWedgingInfo()
- MinimalLib::generate_aligned_coords() now inverts stereochemistry if a rigid-body alignment transformation caused chirality inversion
- MinimalLib::generate_aligned_coords() now clears stereochemistry if coordinates changed
- added JSMol::clear_prop() to the already existing JSMol::get_prop() and JSMol::set_prop()
- renamed commonchem to rdkitjson in JS unit test
- added relevant unit tests
* fixed mistake in logic
* - added add_hs_in_place() and remove_hs_in_place() to the JS MinimalLib
- added relevant tests
* - removed check for existence of a property ahead of clearing it as it is not necessary; updated the clearProp docstring to reflect this
- updated the MolFileStereochem.h docstrings based on review comments and fixed a typo
- fixed two (legitimate) compiler warnings as get_molblock() and get_v3kmolblock() should return nullptr and not a pointer to an empty string; added tests for this as there were none
- in MinimalLib/common.h, moved the check of whether a molecule has undergone a flip around the Z axis to a function in the anonymous namespace
- in MinimalLib/common.h, added logic to preserve original wedging (and eventually invert it) also when alignOnly is set to false, in case the wedging is all within the constrained scaffold
- added thorough testing of the wedging logic on both CFFI and JS sides
* - added equality operator to CXXAtomIter and CXXBondIter classes so that they can be used with STL algorithms that loop implicitly
- added relevant unit tests
* fix Windows build
Co-authored-by: Tosco, Paolo <paolo.tosco@novartis.com>
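Why the equality operator matters: a range-based for loop only needs `operator!=`, but the standard input-iterator requirements also include `operator==`, and algorithms are free to rely on it. The toy `IndexRange` class below is hypothetical and only illustrates the pattern; the real CXXAtomIter and CXXBondIter classes wrap RDKit's own graph iterators.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Hypothetical toy range yielding the indices 0 .. n-1.
class IndexRange {
 public:
  class Iter {
   public:
    using iterator_category = std::input_iterator_tag;
    using value_type = std::size_t;
    using difference_type = std::ptrdiff_t;
    using pointer = const std::size_t *;
    using reference = std::size_t;

    explicit Iter(std::size_t pos) : d_pos(pos) {}
    reference operator*() const { return d_pos; }
    Iter &operator++() {
      ++d_pos;
      return *this;
    }
    Iter operator++(int) {
      Iter tmp(*this);
      ++d_pos;
      return tmp;
    }
    // Enough for "for (auto i : range)" ...
    bool operator!=(const Iter &other) const { return d_pos != other.d_pos; }
    // ... but algorithms may compare iterators with == as well.
    bool operator==(const Iter &other) const { return d_pos == other.d_pos; }

   private:
    std::size_t d_pos;
  };

  explicit IndexRange(std::size_t n) : d_n(n) {}
  Iter begin() const { return Iter(0); }
  Iter end() const { return Iter(d_n); }

 private:
  std::size_t d_n;
};

// Example of an algorithm that loops implicitly over the range.
std::size_t countEven(const IndexRange &range) {
  return static_cast<std::size_t>(
      std::count_if(range.begin(), range.end(),
                    [](std::size_t i) { return i % 2 == 0; }));
}
```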
* - enable get_molblock(details_json) from MinimalLib as it is already enabled in CFFI
- enable useMolBlockWedging on get_molblock() in both CFFI and JS MinimalLib
- add tests
* - expose also addChiralHs
Co-authored-by: Tosco, Paolo <paolo.tosco@novartis.com>
- removed details from get_maccs_fp calls since there are no adjustable parameters
- exposed get_maccs_fp to JS
- added tests and adjusted existing ones since some deprecated functions were removed and do not need testing anymore
Co-authored-by: Tosco, Paolo <paolo.tosco@novartis.com>
* update AvalonTools to version 2.0.1
* Improvements to 2D depiction and alignment/RMSD calculation
- Refactored the straightenDepiction code which is now much simpler and more readable and supports a minimizeRotation parameter
- added C++, Python and JS tests for the new minimizeRotation parameter
- refactored tests to use CalcRMS rather than an internal implementation to compute RMS deviations
- Removed duplicated code in CalcRMS() and getBestRMS() and made their APIs consistent with respect to supported parameters
IMPORTANT NOTE: for backwards compatibility I set the CalcRMS() default for the new symmetrizeConjugatedTerminalGroups
to false as this parameter was not originally supported. @greg: I would be very much in favor of setting this to true instead
if you agree, even though it might change results for existing scripts, as I think it is a much more sensible default.
- Improved documentation to clarify the difference between CalcRMS() and getBestRMS()
- Added unit tests for CalcRMS() as there were none previously
- Added tests for the additional CalcRMS() and getBestRMS() parameters
- Added a new getBestAlignmentTransform() function
- The CFFI function set_2d_coords_aligned() now returns the matching atoms similarly to the C++, Python and JS counterparts
IMPORTANT NOTE: this required an API change for the additional char ** parameter used to return the match.
Existing code using set_2d_coords_aligned() will fail to compile and will require a last NULL parameter to be added to compile again
- Removed duplicated code between CFFI set_2d_coords_aligned() and JS generate_aligned_coords()
- Added has_2d_coords() to the CFFI library
- generate_aligned_coords() now supports JSON parameters and the previous versions are deprecated
- set_2d_coords_aligned() and generate_aligned_coords() both support an alignOnly parameter (which defaults to false).
If set to true, rather than re-generating a fresh 2D layout around templateMol, the existing coordinates (if any) are simply aligned
to the provided templateMol. If the molecule has no coordinates, a set of 2D coordinates is generated independently of templateMol
and then aligned to the provided templateMol
- prevent set_2d_coords_aligned() and generate_aligned_coords() from overwriting existing coordinates when acceptFailure is false
* - explicitly link testDepictor to MolAlign library
* - add MolAlign dependency to testDepictor (rather than to the catch test as in the previous commit)
- add a couple of tweaks
* suppress compiler warnings (1st pass)
* warnings: 2nd pass
* warnings: 3rd pass
* - alignOnly mode should also support allowRGroups
* - fixed C++ build
- added tests for allowRGroups+alignOnly combination
* changes in response to review
* added an entry to backward incompatible changes regarding set_2d_coords_aligned()
Co-authored-by: Tosco, Paolo <paolo.tosco@novartis.com>
* - exposed reaction drawing in MinimalLib
- fixed a typo in the error message "JSON doesn't contain 'atoms' field, or it is not an array"
- replaced RapidJson HasMember() with FindMember() to avoid a duplicate lookup in case the member exists and can be accessed
- some cosmetic style changes (avoid multiple variable declarations on a single line, use curly brackets also for one-liner if clauses, use auto where possible)
- capitalized "greg Landrum" to "Greg Landrum" (well deserved)
- exposed other FPs in addition to Morgan and Pattern FPs in MinimalLib
- added relevant tests
* - update CXSMARTS test in MinimalLib
* Changes in response to review:
- exposed reaction drawing functionality to CFFI and added relevant tests
- refactored fingerprint code to use JSON details and deprecated the Morgan/pattern fingerprint functions that used multiple parameters
- all fingerprints are now exposed to both JS and CFFI with no code duplication
- fixed a potential crash bug in the CFFI library where calling get_morgan_fp(), get_rdkit_fp() or get_pattern_fp() with a NULL mol_pkl would result in dereferencing a nullptr
* removed debugging printouts committed accidentally
Co-authored-by: Tosco, Paolo <paolo.tosco@novartis.com>
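The HasMember() to FindMember() change mentioned in this commit avoids looking up the same key twice. RapidJSON's FindMember() returns a member iterator that is compared against MemberEnd(); since RapidJSON is not assumed to be available here, the sketch below shows the same pattern on a `std::map`, which is an analogy only. The `Object` alias and both function names are hypothetical.

```cpp
#include <map>
#include <string>

using Object = std::map<std::string, int>;

// Two lookups: one to test for the key, a second one to fetch the value
// (analogous to HasMember() followed by operator[]).
int getOrDefaultTwoLookups(const Object &obj, const std::string &key,
                           int fallback) {
  if (obj.count(key)) {
    return obj.at(key);
  }
  return fallback;
}

// One lookup: the iterator both answers "is it there?" and gives access to
// the value (analogous to FindMember() compared against MemberEnd()).
int getOrDefaultOneLookup(const Object &obj, const std::string &key,
                          int fallback) {
  auto it = obj.find(key);
  if (it != obj.end()) {
    return it->second;
  }
  return fallback;
}
```

Both functions return the same results; the second simply does less work when the key exists.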
* revert duplicate chunk in release notes
* replace deprecated ifdefs
This one gets rid of USE_BUILTIN_POPCNT and RDK_THREADSAFE_SS;
use RDK_OPTIMIZE_POPCNT or RDK_BUILD_THREADSAFE_SSS instead
* get rid of BUILD_COORDGEN_SUPPORT from ROMol.i
* fix a stupid typo
* update release notes
* add SmilesWriteParameters
* rename that
* teach the CFFI code about it
* add python wrappers
* basic python testing
* Fixes #4320
support toggling the various CXSMILES output fields
* win64 builds
* it helps to actually commit everything
* I admit that I am just guessing at this point
* Remove accidentally tracked files and unset x flag
* Ignore ComicNeue
* Unify test tag to `reader`
* Trivial destructors
* Bump CMAKE_CXX_STANDARD to 14 (#4165)
* set all source code files to have native line endings
* normalized all source code line endings
Co-authored-by: Paolo Tosco <paolo.tosco@novartis.com>
* hello world works
* more
* more
minimallib needs to be tested
* parse substructure parameters from JSON
* add substruct search and parameters
* add descriptors
* register more descriptors
* fingerprints, first pass
* stop outputting tiny coord vals
* support generating 2d coords
* coordgen testing
* return nulls
* initial 3d support; add/removeHs; cleanup
* Embedding parameters from JSON
* update
* pattern fp, fps as bytes
* use json to configure MFP
* use json to configure rdkit and pattern fps
* aligned 2d coords
* parsing options
* options for writers
* rename remove_hs
* get this working on windows (kind of)
* silence some msvc warnings
* cmake updates
* update python tests
* add the CFFI code to CI builds
* cleanup line ending mess?
* a couple small fixes
* make this work with URF
* support coordMap in the 3D coordinate generation
* updates in response to review