p2rank

mirror of https://github.com/rdk/p2rank.git synced 2026-06-04 12:44:24 +08:00

Author	SHA1	Message	Date
rdk	1c636757d6	update CI Java version matrix: drop 23/24, add 26	2026-03-21 17:54:56 +01:00
rdk	b58726c27e	bump arrow and parquet-floor dependencies	2026-03-21 17:52:37 +01:00
rdk	0a51f504d0	bump gradle	2026-03-21 16:04:31 +01:00
rdk	a66bea74be	Add eval_output_prediction_files param to output per-protein prediction CSVs in eval commands	2026-03-17 18:59:13 +01:00
rdk	faddcfb70f	Lazy-init EnergyCalculator and LJEnergyCalculator in energy features 2.6.0-dev.7	2026-03-16 07:55:16 +01:00
rdk	48cb681aaa	Refactor DSO/DSWO: replace Tuple2 with OverlapCounts, cache counts instead of Atoms, simplify CdkUtils	2026-03-16 03:20:48 +01:00
rdk	5b4613c3a4	Extract FpocketAdHocHelper, add run_fpocket_ad_hoc param for eval-rescore and rescore commands	2026-03-16 03:20:41 +01:00
rdk	ba53b97e90	Add per-method CSVs and grouped summary to binding-site-centers, add DataTable filter/distinctValues/formatGroupedSummaryTable	2026-03-16 01:06:44 +01:00
rdk	91987129fe	Bump version to 2.6.0-dev.7	2026-03-15 21:37:05 +01:00
rdk	8852739016	Add DCC_4 protein-centric success rate metrics	2026-03-15 21:35:53 +01:00
rdk	a814157e2b	Minor cleanups: fix typos, normalize loop syntax and imports in Evaluation	2026-03-15 21:32:23 +01:00
rdk	f3616da217	Unify Protein.sites to contain all binding sites, add predictedPocket to BindingSite interface Protein.sites now holds ground-truth binding sites for both ligand-defined and explicit (residue-based) evaluation modes. Sites are populated from ligands via populateSitesFromLigands() when no explicit sites are defined. - Add predictedPocket and setSasPoints to BindingSite interface - Add predictedPocket field to ResidueSite - Rename assignPocketsToLigands to assignPocketsToSites (works on BindingSite) - Update calcCoveragesProt to use BindingSite.predictedPocket - Determine isLigandMode via instanceof instead of sites.isEmpty() - Unify PymolRenderer sites/ligands branch into single BindingSite loop - Simplify AnalyzeRoutine.cmdBindingSiteCenters to use p.sites directly	2026-03-15 21:25:49 +01:00
rdk	829cf9b8be	Return typed result objects from calcConservationStats and calcOverlapStatsForPockets	2026-03-15 20:28:51 +01:00
rdk	8a516228e1	Fix @CompileStatic errors in Evaluation: destructuring assignment, int-to-Double casts	2026-03-15 19:59:15 +01:00
rdk	5ac9aab18a	Refactor Evaluation: simplify avg/div methods, use Function instead of Closure, extract writeScoresToFileIfRequested	2026-03-15 19:27:15 +01:00
rdk	20236ef092	Refactor conservation/chains analysis, add @CompileStatic to Evaluation, rename criterium to criterion	2026-03-15 17:59:53 +01:00
rdk	d9de1fba7e	Add contact_atoms_centroid site evaluation center method for ligand-defined sites	2026-03-15 17:09:04 +01:00
rdk	49a8430a7d	Add binding-site-centers command, refactor center methods, consolidate error reporting - Rename SiteCentroidMethod to SiteCenterMethod - Extract getCenterForMethod(SiteCenterMethod) into BindingSite interface for thread-safe, param-independent center calculation - Refactor Ligand/ResidueSite getCenterForEval() to delegate to getCenterForMethod() - Add analyze binding-site-centers command comparing all center methods per site - Add Dataset.Result.writeErrorsAndGetSummary() and use it across all AnalyzeRoutine commands for consistent error reporting to both console and CSV	2026-03-14 18:22:47 +01:00
rdk	0e0cb47907	Add ca_atoms_centroid site evaluation center method with tests	2026-03-14 15:57:41 +01:00
rdk	1ecb29f876	Add load_ligands_from_separate_files param for loading ligands from individual ligand_* files	2026-03-13 18:21:26 +01:00
rdk	0b5b61304d	Add legacy conservation file name format fallback (e.g. 2ed4_A.)	2026-03-13 17:22:27 +01:00
rdk	e7fc457f6a	Fix ligand detection for BioJava GroupType misclassifications BioJava assigns GroupType based on its Chemical Component Dictionary, not structural role. Ligands in non-polymer chains can get any GroupType: - GDP, GTP, ATP -> GroupType.NUCLEOTIDE - SHR and similar -> GroupType.AMINOACID - Most others -> GroupType.HETATM Previously only HETATM groups were detected as ligands, causing errors like "Ligand definition 'GDP' matches no ligands" for nucleotide and amino acid derivative ligands. Fix: any non-water group in a NONPOLYMER chain is now a ligand candidate, regardless of GroupType. Polymer chain groups (protein AA, DNA/RNA) are only included if they have GroupType.HETATM. Add test PDB files (1a2kC.pdb with GDP, 1e5qA.pdb with SHR) and comprehensive tests for all three GroupType cases. 2.6.0-dev.6	2026-03-10 14:34:28 +01:00
rdk	d78f80ee73	Extract writeCases() method, rename sites.csv to observed_sites.csv Consolidate case CSV writing into Evaluation.writeCases(). Remove duplicate DSO_0.1 criterion and stale TODO comments.	2026-03-10 03:24:44 +01:00
rdk	838b0a697f	Fix integer division bug in DSO criterion and clean up The Jaccard ratio was computed as int/int, always producing 0 or 1, making fractional thresholds ineffective. Cast to double for correct floating-point division. Also fix typo (cahe->cache), remove debug comments, and update javadoc.	2026-03-10 02:27:11 +01:00
rdk	2de315e9e0	Rename API: PocketCriterium->PocketCriterion, getLigandAtoms->getAtoms, centroid->center - Rename PocketCriterium to PocketCriterion (fix Latin spelling) - Revert getLigandAtoms() back to getAtoms() in BindingSite interface - Rename getCentroidForEval() to getCenterForEval() - Rename explicitCentroid to explicitCenter in ResidueSite - Rename SiteCentroidMethod values: explicit_centroid->explicit, sas_points_center_of_mass->sas_points_centroid - Rename site_centroid_method param to site_eval_center_method - Ligand.getCentroid() now delegates to getCenterForEval()	2026-03-10 02:02:47 +01:00
rdk	412c590dcb	Fix CSV spacing consistency: remove padding and trailing spaces Remove leading-space padding from fmt calls in getMiscStatsCSV and FeatureImportances, fix header/data spacing mismatch in toPocketsCSV, and remove trailing space in toLigandsCSV header.	2026-03-09 13:32:51 +01:00
rdk	fdebd71daf	Add example Jupyter notebook for analyzing P2Rank output Add notebook loading _predictions.csv and _residues.csv with example data from predict_1fbl. Clean up CSV formatting: remove padding from values, add fmtCsv() without leading spaces for CSV output.	2026-03-09 12:05:00 +01:00
rdk	61b8863c27	Simplify CSV output formatting and add null guard in CsvRow Remove fixed-width column padding from PredictionSummary, fix spacing in ResidueLabelings CSV output, and add null safety in CsvRow.add().	2026-03-09 11:17:59 +01:00
rdk	42ad4dfe9f	Move centerOfMass and calculateCentroid to PerfUtils to avoid array allocation Reimplements BioJava's centerOfMass and Atoms.calculateCentroid in PerfUtils accepting Collection directly, avoiding temporary array allocation. Adds delegate methods in Struct.	2026-03-09 02:22:48 +01:00
rdk	d9b34ffbde	Bump version to 2.6.0-dev.5 and update dependencies Update parquet-floor 1.60→1.62, CDK 2.11→2.12. Add dev config.	2026-03-07 23:15:17 +01:00
rdk	af2f68e7b9	Add sites.csv to eval output and rename getAtoms() to getLigandAtoms() in BindingSite Add unified sites.csv (alongside ligands.csv) containing site type, centroid coordinates, radius, and residue counts for both ligand-defined and explicit sites. Rename BindingSite.getAtoms() to getLigandAtoms() for clarity and update all callers.	2026-03-07 22:53:03 +01:00
rdk	228cd1ab18	Fix review issues: stale comments, null centroid in closestPocket, docs - DCC: remove stale comment, avoid double-call of centroidForEval - ResidueSite/Ligand: fix stale javadocs, reference SiteCentroidMethod - Ligand: add missing getLigandAtoms() for renamed BindingSite interface - Evaluation.closestPocket(): skip pockets with null centroid - Params: document site_centroid_method default semantics - PocketRescorer: document point labeling vs DCA site representation gap	2026-03-07 20:35:01 +01:00
rdk	60225e3f1f	Add null guards for centroids in DCC and DCA criteria Prevents NPE when site centroid is null (e.g. buried residues with no SAS points when using sas_points_center_of_mass) or pocket centroid is null (e.g. PUResNetPocket).	2026-03-05 05:06:22 +01:00
rdk	7adb080022	Write error files to outdir in finalizeDatasetResult Write errors.csv, errors_aggregated.csv, and errors_full.txt.gz with full stack traces when processing errors occur. Also rename pockets.csv to predicted_pockets.csv in eval results output.	2026-03-05 04:16:44 +01:00
rdk	ed8e9cabe9	Add configurable site centroid method and SAS-as-atoms option for evaluation Add SiteCentroidMethod enum with support flags for ligand/explicit sites. Rename ResidueSite.centroid to explicitCentroid, add getCentroidForEval() to BindingSite interface used by DCC. Add site_eval_sas_pts_as_atoms param to allow DCA to use SAS points instead of atoms for site representation.	2026-03-05 02:16:55 +01:00
rdk	ea0968816b	Render predicted pocket and explicit site centroids in PyMOL renderer Render predicted pocket centroids with individual pocket colors and explicit site centroids (or ligand centroids as fallback) as hotpink spheres, controlled by vis_site_centers param.	2026-03-04 21:37:56 +01:00
rdk	22ac1e51ee	Fix DCC criterion to use predefined site centroid for ResidueSites DCC was computing site.atoms.centroid (geometric center of resolved residue atoms) instead of site.getCentroid(). For ResidueSites this returns the predefined centroid from the input file, which is the authoritative binding site location. For Ligands this changes from geometric to mass-weighted center (negligible difference).	2026-03-04 04:00:18 +01:00
rdk	53500dd129	Fix SAS point classification stats for explicit-site datasets and improve cluster logging - PocketRescorer: fall back to explicit site residue atoms for point labeling when no ligand atoms are available, fixing 0-positives in binary classification stats for site-based eval-predict - SLinkClustererV2: log cluster count and sizes instead of full contents	2026-03-04 03:55:26 +01:00
rdk	c9ad8f71ff	Add vis_site_centers param for rendering site/pocket centroids in PyMOL - New vis_site_centers param (default false) renders centroids as hotpink pseudoatom spheres in both old (PymolRenderer) and new (NewPymolRenderer) - Pass site centroids via RenderingModel.siteCentroids for analyze command - Old renderer shows predicted pocket centroids and ligand centroids - Fix empty visualizations/ dir in eval-predict: create vis dir under predDir instead of top-level outdir	2026-03-04 02:47:43 +01:00
rdk	d5715d9797	Fix PyMol renderer: bulk selections, CIF-to-PDB conversion, site-based labeling - Use bulk atom ID selections instead of per-residue named selections to avoid exceeding PyMOL's object limit on large proteins - Convert CIF inputs to PDB format with correct .pdb extension (PyMOL can't reliably parse BioJava CIF and uses extension to pick parser) - Rename PyMOL object from "protein" to "prot" to avoid reserved keyword - Fix null interpolation in PML when no ligands or no labeling - Build BinaryLabeling from explicit site residues for visualization (item.binaryLabeling doesn't support site-based datasets)	2026-03-04 01:13:48 +01:00
rdk	026be7eae5	Improve analyze binding-sites: visualizations, site radius, eager loading - Add PyMol visualizations using dataset.binaryResidueLabeler - Add site_radius column (max distance from centroid to any site atom) - Add excludeFromSummary param to DataTable.formatSummaryTable to skip center coordinates from numeric summary stats - Load ExplicitSitesIndex eagerly during dataset loading (fail-fast) - Skip CSV rows with empty residue/coordinate fields in AhojUbsSiteParser - Write items without binding sites to separate file in outdir	2026-03-03 21:58:55 +01:00
rdk	9e9a500836	Bump version to 2.6.0-dev.4	2026-03-03 15:00:46 +01:00
rdk	c9fef83950	Use AtomKdTree interface in Atoms and minor cleanups Switch Atoms.kdTree field and buildKdTree() to use the AtomKdTree interface instead of AtomKdTreeV1 directly. Add @NonNull to iterator(), improve initial capacity estimates, and fix whitespace.	2026-03-03 15:00:36 +01:00
rdk	997727e878	Add explicit sites loading and analyze binding-sites command Implement ExplicitSitesIndex for loading binding site definitions from external CSV files (pluggable format system, first format: ahoj_ubs). Sites are resolved during item loading via DatasetItemLoader. Add 'analyze binding-sites' sub-command producing unified CSV and summary stats for both ligand-based and explicit site datasets, with unresolved residue/site tracking for explicit datasets. Remove unused SiteLoader stub.	2026-03-03 15:00:31 +01:00
rdk	8f5da9fdcd	Add fused addWeighted and O(N²) single-linkage clusterer Add GenericVector.addWeighted() for fused multiply-add, eliminating per- neighbor array allocation in feature vector aggregation. Add SLinkClustererV2 using union-find with path compression, reducing single-linkage clustering from O(N³) to O(N²). Wire V2 via factory methods on AtomClusterer and AtomGroupClusterer.	2026-03-03 05:13:05 +01:00
rdk	261dae09c9	Rename consolidate() to sparsify() and add surface_sparsify param Rename Atoms.consolidate() to Atoms.sparsify() for clarity. Use mutable V1 KdTree for O(N log N) incremental insertion instead of periodic rebuilds. Add surface_sparsify runtime param (default true) to allow disabling surface point sparsification. Hardcode AtomKdTreeV1 in Atoms.buildKdTree() and delegate Dataset cache clearing to item methods.	2026-03-03 04:01:54 +01:00
rdk	a66a973e1c	Refactor KdTree into AtomKdTree interface with V1/V2 implementations Rewrite AtomKdTreeV1 from Groovy to Java to eliminate Groovy IndyInterface monitor contention that serialized 16 parallel threads down to ~2. Move V1 KdTree into v1/ subpackage, extract AtomKdTree as a Java interface with factory method dispatching by kdtree_implementation param, and rename the old v2 wrapper to AtomKdTreeV2 implementing the same interface.	2026-03-03 00:17:38 +01:00
rdk	6d47285116	Add kdtree_implementation param and fix quickselect duplicate-key hang Add runtime parameter to switch between KdTree3D (default) and v1 AtomKdTree. Fix O(N²) quickselect degeneration on duplicate coordinates by adding post-partition equal-range scan.	2026-03-02 22:20:59 +01:00
rdk	24b9f5f709	Optimize KdTree3D build: bottom-up bounds, eliminate redundant traversals - Bounding boxes computed bottom-up from leaf scans instead of scanning full data range at every tree level (O(N) vs O(N log N)) - Approximate parent bounds passed down for split-axis selection (O(1) per node instead of O(range) scan) - Remove findNodeCount() and dead code; buildNode returns max index - Resolve split-axis array once in quickselect inner loop	2026-03-02 21:15:41 +01:00
rdk	76026b9297	Refactor Dataset item cache clearing and fix processItem typo Rename processssItem to processItem. Add per-item conditional cache clearing after processing to reduce peak memory. Refactor cleanCaches into clearCache/clearPrimaryCache/clearSecondaryCache with null-safety.	2026-03-02 20:52:07 +01:00
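Commit 838b0a697f above describes a Jaccard ratio computed as int/int, which can only yield 0 or 1 and so makes fractional thresholds ineffective. The sketch below reproduces that class of bug in plain Java and shows the cast-to-double fix the commit message names. The class and method names are hypothetical and not taken from the P2Rank source, and returning 0 for an empty union is an assumption made only for this example.

```java
// Illustrative only: JaccardRatioDemo and its methods are hypothetical names,
// not part of the P2Rank code base.
final class JaccardRatioDemo {

    /** Buggy variant: int / int truncates before the result is widened to double. */
    static double jaccardTruncated(int intersection, int union) {
        return intersection / union; // integer division: 0 unless intersection == union
    }

    /** Fixed variant: cast one operand to double so the division is floating-point. */
    static double jaccard(int intersection, int union) {
        if (union == 0) {
            return 0.0; // assumption for this sketch: empty union means no overlap
        }
        return (double) intersection / union;
    }

    public static void main(String[] args) {
        System.out.println(jaccardTruncated(3, 10)); // prints 0.0
        System.out.println(jaccard(3, 10));          // prints 0.3
    }
}
```

With the truncating variant, a threshold such as 0.1 or 0.5 is never reached unless the two sets are identical, which matches the symptom described in the commit message.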

1918 Commits
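
Commit 8f5da9fdcd above mentions single-linkage clustering built on union-find with path compression, which brings the cost down to O(N²). The following is a minimal sketch of that general technique, not the actual SLinkClustererV2: the class name, the `double[][]` point representation, and the distance cutoff parameter are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a generic cutoff-based single-linkage clusterer.
// Names and signatures are hypothetical, not the P2Rank API.
final class SingleLinkageSketch {

    private final int[] parent;

    SingleLinkageSketch(int n) {
        parent = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i; // every point starts as its own cluster
        }
    }

    /** Finds the root of x, then points every node on the path directly at it. */
    int find(int x) {
        int root = x;
        while (parent[root] != root) {
            root = parent[root];
        }
        while (parent[x] != root) { // path compression pass
            int next = parent[x];
            parent[x] = root;
            x = next;
        }
        return root;
    }

    void union(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        if (ra != rb) {
            parent[rb] = ra;
        }
    }

    /** Groups 3D points so that any two points within the cutoff share a cluster. */
    static List<List<Integer>> cluster(double[][] points, double cutoff) {
        int n = points.length;
        SingleLinkageSketch uf = new SingleLinkageSketch(n);
        double cutoffSq = cutoff * cutoff;
        for (int i = 0; i < n; i++) {          // each pair is examined once: O(N^2)
            for (int j = i + 1; j < n; j++) {
                double dx = points[i][0] - points[j][0];
                double dy = points[i][1] - points[j][1];
                double dz = points[i][2] - points[j][2];
                if (dx * dx + dy * dy + dz * dz <= cutoffSq) {
                    uf.union(i, j);
                }
            }
        }
        Map<Integer, List<Integer>> byRoot = new LinkedHashMap<>();
        for (int i = 0; i < n; i++) {
            byRoot.computeIfAbsent(uf.find(i), k -> new ArrayList<>()).add(i);
        }
        return new ArrayList<>(byRoot.values());
    }
}
```

Each pair of points is compared exactly once and merged through the union-find structure, so no cluster ever needs to be rescanned after a merge. Indices within each returned cluster are in ascending order, and clusters appear in order of their smallest member.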