p2rank

mirror of https://github.com/rdk/p2rank.git synced 2026-06-04 12:44:24 +08:00

Author	SHA1	Message	Date
rdk	6d47285116	Add kdtree_implementation param and fix quickselect duplicate-key hang. Add runtime parameter to switch between KdTree3D (default) and v1 AtomKdTree. Fix O(N²) quickselect degeneration on duplicate coordinates by adding post-partition equal-range scan.	2026-03-02 22:20:59 +01:00
rdk	24b9f5f709	Optimize KdTree3D build: bottom-up bounds, eliminate redundant traversals - Bounding boxes computed bottom-up from leaf scans instead of scanning full data range at every tree level (O(N) vs O(N log N)) - Approximate parent bounds passed down for split-axis selection (O(1) per node instead of O(range) scan) - Remove findNodeCount() and dead code; buildNode returns max index - Resolve split-axis array once in quickselect inner loop	2026-03-02 21:15:41 +01:00
rdk	76026b9297	Refactor Dataset item cache clearing and fix processItem typo. Rename processssItem to processItem. Add per-item conditional cache clearing after processing to reduce peak memory. Refactor cleanCaches into clearCache/clearPrimaryCache/clearSecondaryCache with null-safety.	2026-03-02 20:52:07 +01:00
rdk	7f4d37b5c4	Add comparative benchmark test for v1 vs v2 KdTree. Parametrized test generates random points, builds both trees, verifies identical results for all query types, and measures relative performance. Skipped during normal test runs; invoked via kdtree-benchmark.sh script.	2026-03-02 20:52:05 +01:00
rdk	6cce0eb016	Rewrite KdTree as immutable, hardcoded 3D implementation in v2 package. New KdTree3D.java uses SoA storage, linearized implicit-heap layout, balanced quickselect build, and stack-based traversal. Immutable design eliminates mutable node state, enabling thread-safe concurrent queries. AtomKdTree.groovy provides drop-in API wrapper. Atoms.java switched to v2 with invalidate-on-add pattern and periodic-rebuild consolidate().	2026-03-02 20:13:47 +01:00
rdk	5d9ec9eb58	Fix bugs and add error reporting to analyze subcommands - Fix integer division in BinCounter.getPosRatio() (long/long → double) - Fix broken NaN check in ConservationCloudFeature (== → Double.isNaN) - Fix wrong variable in apo_protein error message (proteinFile → apoProteinFile) - Fix outerLater typo → outerLayer in Atoms.SphereLayers and usages - Fix xenegy_cloud2_layered typo → xenergy_cloud2_layered in Params and usages - Add error reporting (writeErrorCsvs) to all analyze subcommands - Add ignoreLigandsSwitch to doCmdFasta (doesn't need ligands)	2026-03-02 09:59:36 +01:00
rdk	5a38f8f1de	Avoid unnecessary allocations in hot paths. Cache aggregated errors in Dataset.Result to avoid recomputing. Use direct x/y/z field access instead of getCoords() in Atoms.copyPoints and PointExportData to avoid double[3] allocations.	2026-03-02 04:47:14 +01:00
rdk	1bbdcbc196	Split dataset by protein chain presence in analyze proteins command. Add Dataset.Item.getRow() to reconstruct dataset row strings. In cmdProteins(), collect items into with/without protein chains using ConcurrentLinkedQueue and write split .ds files when any structures lack protein chains.	2026-03-02 02:23:30 +01:00
rdk	22a7dec4bc	Bump version to 2.6.0-dev.3 and update xz dependency to 1.12	2026-03-01 21:04:37 +01:00
rdk	3a8e985eb4	Skip ligand loading in parse-proteins command	2026-03-01 20:48:00 +01:00
rdk	582d5ebf1f	Optimize distance calculations to avoid getCoords() array allocations. Add Atom-based sqrDist/dist overloads in PerfUtils that use getX/getY/getZ directly instead of allocating double[] via getCoords(). Refactor Point to store x/y/z as individual fields instead of a double[] array. Fix Point.setCoords() which was previously a no-op. Pre-build KD tree in Ligands.makeLigands() before the ligand loop. Simplify KD tree usage in Atoms.dist/sqrDist by removing redundant size threshold check.	2026-03-01 20:47:56 +01:00
rdk	4240d9e5c8	Add aggregated error reporting and writeErrorCsvs convenience method. Add writeAggregatedItemErrorsToCsv to Dataset.Result that groups errors by message and outputs count/error sorted by frequency. Update getErrorSummary to display an aggregated error table instead of just the count. Add writeErrorCsvs(outdir) that writes all three error files (per-item, aggregated, full stack traces) and use it in AnalyzeRoutine.	2026-03-01 19:28:12 +01:00
rdk	a8ab7e97a2	Add analyze proteins and parse-proteins commands with DataTable utility. Add 'analyze proteins' command that outputs per-protein stats CSV (chain counts, residues, atoms, ligands, peptides) and a summary table with min/max/avg/median. Add 'analyze parse-proteins' for parsing all dataset items and reporting errors only. Introduce DataTable, a lock-free, pre-registered-column table for structured data collection across threads, with CSV and summary output.	2026-03-01 18:17:07 +01:00
rdk	6434a097f8	Clean up unused imports and sort import order across codebase	2026-02-26 03:46:34 +01:00
rdk	bab04a2a5e	Avoid duplicate console output: skip stdout write when log_to_console is enabled	2026-02-26 01:22:27 +01:00
rdk	e923d199e6	Add external conservation provider with cache, health check, and documentation	2026-02-26 00:07:55 +01:00
rdk	34a742cd1b	Add tests for ResidueSite and site-based evaluation	2026-02-26 00:07:55 +01:00
rdk	347d4e38d6	Implement site-metrics criteria and evaluation - Add ResidueSite and SiteLoader - Update pocket criteria (DCA, DCC, DPA, DSA, DSO, DSWO) for site evaluation - Extend Evaluation with site-metrics support - Bump version to 2.6.0-dev.1	2026-02-26 00:07:55 +01:00
rdk	bfdc87f55b	replace ArrayList<PPred> with PredictedScores parallel-array structure. Memory optimization: per-prediction cost reduced from ~40 bytes (PPred object) to ~9 bytes (parallel double[] + boolean[] arrays). For large datasets this reduces prediction storage by ~77%. PredictedScores provides: ArrayList-style growth, bulk addAll via arraycopy, cached observedPositiveCount, stable descending merge sort (required for reproducible metrics with tied RF scores), and direct backing array access for hot loops in Metrics/Curves.	2026-02-26 00:07:55 +01:00
rdk	65fc8f3676	update FasterForest to 2.10.2, add Weka RandomForest conversion support	2026-02-26 00:07:55 +01:00
rdk	5ec88309ef	update FasterForest to 2.10.1	2026-02-26 00:07:55 +01:00
rdk	1f19bdd2a4	fix ModelConverterTest failing on macOS CI: skip NativePanamaFloat forest types when native library unavailable	2026-02-26 00:07:52 +01:00
rdk	2bf6bfa270	update FasterForest to 2.10.0, bump version to 2.5.2-dev.11	2026-02-23 02:18:41 +01:00
rdk	9fcce6156f	add UseCompactObjectHeaders note to local-env.sh template	2026-02-23 02:11:29 +01:00
rdk	57fb214881	update local-env.sh template with throughput-oriented JVM options	2026-02-23 01:17:32 +01:00
rdk	40c7638bc2	implement ModelConverterTest with comprehensive forest conversion tests	2026-02-23 00:54:11 +01:00
rdk	b1a05d3097	bump version to 2.5.2-dev.10	2026-02-23 00:39:54 +01:00
rdk	f3fc9329bc	update FasterForest to 2.9.1, bump JUnit Jupiter to 6.0.3, and add NativePanama flattened eval tests	2026-02-23 00:35:18 +01:00
rdk	4aaf212b9b	update FasterForest to 2.8.1 with NativePanama support and improve eval time tracking - Add NativePanamaForest/NativePanamaForestAvx2 availability checks in ModelConverter - Refactor flattening logic to separate trainable forest preparation from conversion - Track all eval times and compute average excluding first run (caching warmup) - Rename TIME_M to TIME_TRAINEVAL_AVG_M, add TIME_EVAL_AVG_M stat	2026-02-22 21:11:16 +01:00
rdk	b5a8edc377	track last evaluation time in EvalResults for seed loop benchmarks	2026-02-22 17:44:56 +01:00
rdk	3ad261645c	update FasterForest to 2.8.0 and support flattening of FlatBinaryForest models	2026-02-22 17:33:27 +01:00
rdk	8f7d71ffb3	update FasterForest to 2.7.0	2026-02-20 12:58:14 +01:00
rdk	b8f802b145	refactor model flattening to use FasterForestConverter API with configurable target types. Generalize Model classifier from Classifier to Object to support both trainable classifiers and flat BinaryForest models. Add rf_flatten_target parameter for selecting forest type (FlatBinaryForest, LegacyFlatBinaryForest, InterleavedBfsForest, etc). Deprecate rf_flatten_as_legacy in favor of the new target type selection.	2026-02-16 01:00:55 +01:00
rdk	de75ac6be1	upgrade FasterForest to 2.6.0 and fix GString compilation errors. Replace flat jar with local Maven repo dependency at correct path (groupId/artifactId/version/). Fix GString-to-String type errors in AnalyzeRoutine that broke compilation with @CompileStatic.	2026-02-14 07:35:29 +01:00
rdk	27caa5fe46	sort CSV output rows as strings in analyze commands (chains, chains-residues, labeled-residues)	2026-02-14 06:04:13 +01:00
rdk	93fd8e953a	add experimental rescoring model section to rescoring docs	2026-02-11 18:44:04 +01:00
rdk	ad946de45e	rephrase Requirements section in README	2026-02-11 18:31:13 +01:00
rdk	9711cc7192	fix aa-mapping docs: broken csv link, replace special characters, cleanup	2026-02-11 18:14:03 +01:00
rdk	4a42f664e2	update aa-mapping documentation: add links to pdbfixer source	2026-02-11 18:05:11 +01:00
rdk	652442a8d2	add JVM compatibility flags to run scripts, document all flags	2026-02-11 15:27:39 +01:00
rdk	e9f530ce37	make --sun-misc-unsafe-memory-access conditional on Java 23+	2026-02-11 15:24:09 +01:00
rdk	6e35db0390	update build instructions in README	2026-02-11 15:03:40 +01:00
rdk	752a645937	exclude unavailable openchart transitive dep from biojava-alignment	2026-02-11 14:36:38 +01:00
rdk	126a0653f0	move tutorials to documentation/, update rescoring tutorial and README. Move misc/tutorials/ to documentation/ and add index readme. Update rescoring.md: add quick-start examples, paper links for all methods, add Pocketeer to supported methods list. Fix stale links in README.md (tutorials path, local-env.sh typo).	2026-02-11 10:52:20 +01:00
rdk	7634c57749	add pocketeer prediction loader and rescoring tutorial. Add PocketeerLoader that parses pockets.json output from Pocketeer, including alpha spheres, residues, centroids, and surface atom mapping. Register "pocketeer" as a prediction method in Dataset. Add unit tests covering all 7 available datasets (CIF and PDB). Add rescoring tutorial documenting all supported methods with examples.	2026-02-11 10:22:46 +01:00
rdk	8614bed9c5	add pocketeer output examples and schema	2026-02-11 08:38:02 +01:00
rdk	65aee4cc84	add aa-mapping tutorial documenting non-canonical residue mapping feature	2026-02-11 02:01:49 +01:00
rdk	ed048ecf83	add non-canonical residue mapping (default, pdbfixer, custom CSV modes) #79	2026-02-11 01:41:55 +01:00
rdk	9c35bd542c	update export-points tutorial to document new export-points command	2026-02-10 22:17:11 +01:00
rdk	26af252659	bump version to 2.5.2-dev.6	2026-02-08 23:43:22 +01:00
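
Two of the fixes listed in commit 5d9ec9eb58 (the long/long integer division in BinCounter.getPosRatio() and the == NaN check in ConservationCloudFeature) are common Java pitfalls. The sketch below reproduces both bug classes in isolation. The class and method names are hypothetical and are not taken from the p2rank source; it only illustrates why the changes described in the commit message matter.

```java
public class BugFixSketch {

    // Hypothetical stand-in for a ratio of two long counters.
    // Buggy: the division is done in long arithmetic, so the fraction
    // is truncated before the result is widened to double.
    static double posRatioBuggy(long positives, long total) {
        return positives / total;
    }

    // Fixed: cast one operand so the division is done in double arithmetic.
    static double posRatioFixed(long positives, long total) {
        return (double) positives / total;
    }

    // Buggy: NaN is not equal to anything, including itself,
    // so this comparison is always false.
    static boolean isMissingBuggy(double value) {
        return value == Double.NaN;
    }

    // Fixed: Double.isNaN is the reliable way to detect NaN.
    static boolean isMissingFixed(double value) {
        return Double.isNaN(value);
    }

    public static void main(String[] args) {
        System.out.println(posRatioBuggy(1, 4));          // prints 0.0
        System.out.println(posRatioFixed(1, 4));          // prints 0.25
        System.out.println(isMissingBuggy(Double.NaN));   // prints false
        System.out.println(isMissingFixed(Double.NaN));   // prints true
    }
}
```

In both cases the buggy version compiles without error and returns a plausible-looking value, which is why such defects tend to surface only as skewed statistics downstream.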

1871 Commits