PLAN.md and SPEC.md were pre-implementation design docs for the pocket-grid
feature. The feature has shipped, so they're frozen artifacts in the active
todo/ namespace. Delete them and strip the three "see SPEC.md" comments
from Main.groovy and the predict/rescore routines.
Also reassess the PyMOL rank-gap entry in the audit: P2Rank ranks pockets
contiguously throughout the predict path and all in-tree loaders (except
SiteHoundLoader), so the previously listed "renderer ignores rank gaps" item is
cosmetic-only (empty objects in the Models panel for small pockets whose
filled BitSet ended up empty). Downgrade to a parity nit under
Inconsistencies; promote the PUResNet surfaceAtoms re-linking to the Top-5.
Pure technical cleanup, not a perf win — savings are microseconds per
protein. The useful artifact is item 2 below: bytecode verification that
the existing Groovy/BitSet workaround is still needed.
- MorphologicalCloser: pre-allocate the two per-iteration BitSets and
reuse via swap+clear. Zero BitSet allocations inside the loop (vs
two per iter previously).
- PocketGridRows: tried replacing the manual BitSet-OR loop with a
direct .or(bs) call. Bytecode inspection showed Groovy dispatches it
under @CompileStatic to DefaultGroovyMethods.or(BitSet, BitSet) which
RETURNS a new BitSet rather than mutating in place — test failed.
Reverted; updated the comment with the verification and the escape
hatch (move the block into a Java helper if we ever want BitSet#or).
- PocketGridChimeraXRenderer: palette color loop iterates the present
rank set (perPocketBasenames.keySet()) instead of dense 1..maxRank,
matching the layer loops below and avoiding unreferenced color
definitions for missing ranks.
- PocketDescriptorsRows: replaced `pockets.any { ... }` Groovy closure
with manual loop under @CompileStatic — consistent with the rest of
the constructor and one fewer per-protein closure allocation.
- DescriptorListValidator: HashSet → LinkedHashSet for the dedup
tracker. Tiny UX improvement (deterministic order in any future
multi-duplicate debug output).
Output byte-identical end-to-end; full test suite green.
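A minimal Java sketch of the swap+clear pattern described for
MorphologicalCloser above. The class name, the neighbors[][] adjacency
array and the fill rule are assumptions made for illustration; this is
not the shipped code. It shows both buffers being allocated before the
loop and reused on every iteration, and it relies on
java.util.BitSet#or mutating its receiver in place.

```java
import java.util.BitSet;

/** Hypothetical sketch of an iterative fill that reuses two pre-allocated BitSets. */
final class SwapClearFill {

    /**
     * @param seed         initially filled points (left unmodified)
     * @param neighbors    neighbors[i] lists the indices adjacent to point i (assumed layout)
     * @param minNeighbors a point is filled once at least this many neighbors are filled
     * @param maxIters     upper bound on fill iterations; 0 disables the fill
     */
    static BitSet fill(BitSet seed, int[][] neighbors, int minNeighbors, int maxIters) {
        int n = neighbors.length;
        BitSet current = (BitSet) seed.clone();
        BitSet next = new BitSet(n);              // allocated once, outside the loop

        for (int iter = 0; iter < maxIters; iter++) {
            next.clear();                         // reuse the buffer instead of allocating
            next.or(current);                     // in-place union
            for (int i = current.nextClearBit(0); i < n; i = current.nextClearBit(i + 1)) {
                int filled = 0;
                for (int j : neighbors[i]) {
                    if (current.get(j)) filled++;
                }
                if (filled >= minNeighbors) next.set(i);
            }
            if (next.equals(current)) break;      // converged, nothing new was filled
            BitSet tmp = current;                 // swap the two buffers
            current = next;
            next = tmp;
        }
        return current;
    }
}
```

With five points on a line and seed {0, 2, 4}, minNeighbors=2 fills
points 1 and 3 in the first iteration; maxIters=0 returns the seed
unchanged.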
Tier 3 reuse refactor: collapse ~120 lines of duplication across the
descriptor framework. Composition over inheritance throughout — no
public API change, no behavior change (smoke run output byte-identical).
NamedRegistryHelper<T> (new, generic):
- Composition helper for name-keyed registries. Both descriptor
registries (per-pocket and per-grid-point) now delegate register/
unregister/get/knownNames to one shared helper, keeping their public
static API. Per-registry invariants (the size/dup-cols check) stay
in each registry's private validate() and plug in via a Consumer<T>
hook. PocketDescriptorRegistry shrinks ~80→55 lines;
PocketGridPointDescriptorRegistry shrinks ~75→55.
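The delegation shape described for NamedRegistryHelper<T> could look
roughly like the sketch below. All names here are hypothetical, the
duplicate-name rejection is an assumption, and IllegalArgumentException
stands in for whatever exception type the codebase actually throws. The
point is only that per-registry invariants arrive as a Consumer<T> hook
while the name-keyed bookkeeping is shared.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical stand-in for a name-keyed registry helper; not the shipped class. */
final class NamedRegistrySketch<T> {
    private final String kind;
    private final Function<T, String> nameOf;
    private final Consumer<T> validator;
    private final Map<String, T> byName = new LinkedHashMap<>();

    NamedRegistrySketch(String kind, Function<T, String> nameOf, Consumer<T> validator) {
        this.kind = kind;
        this.nameOf = nameOf;
        this.validator = validator;
    }

    synchronized void register(T item) {
        validator.accept(item);                   // per-registry invariants; throws on violation
        String name = nameOf.apply(item);
        if (byName.containsKey(name)) {           // assumption: duplicate names are rejected
            throw new IllegalArgumentException("Duplicate " + kind + " '" + name + "'");
        }
        byName.put(name, item);
    }

    synchronized boolean unregister(String name) {
        return byName.remove(name) != null;
    }

    synchronized T get(String name) {
        T item = byName.get(name);
        if (item == null) {
            throw new IllegalArgumentException(
                "Unknown " + kind + " '" + name + "'. Known: " + byName.keySet());
        }
        return item;
    }

    /** Returns a read-only snapshot of the registered names in registration order. */
    synchronized Set<String> knownNames() {
        return Collections.unmodifiableSet(new LinkedHashSet<>(byName.keySet()));
    }
}
```

A registry class would keep its public static methods and forward each
one to a private static instance of the helper.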
DescriptorSchemaHelper.appendColumns (new):
- Single point where the "{name}.{col}" multi-column header rule lives.
Both PocketDescriptorsRows and PocketGridRows route schema build
through it. Interface-agnostic (takes name + colNames + colTypes
directly), so it works for both descriptor types without coupling.
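As an illustration of the header rule, here is a hedged Java sketch.
It assumes a descriptor with exactly one column emits its bare name and
anything wider emits "{name}.{col}"; how the real helper distinguishes
the scalar case is not stated here, and the class name and generic type
parameter are hypothetical.

```java
import java.util.List;

/** Hypothetical sketch of a shared "{name}.{col}" header builder. */
final class SchemaSketch {

    /**
     * Appends one descriptor's columns to the running header and type lists.
     * Assumption: a single-column descriptor uses its bare name, a multi-column
     * descriptor prefixes every column with "name.".
     */
    static <T> void appendColumns(String name, List<String> colNames, List<T> colTypes,
                                  List<String> headerOut, List<T> typesOut) {
        if (colNames.size() != colTypes.size()) {
            throw new IllegalArgumentException(
                "Descriptor '" + name + "': " + colNames.size() + " column names but "
                + colTypes.size() + " column types");
        }
        for (int i = 0; i < colNames.size(); i++) {
            String header = (colNames.size() == 1) ? name : name + "." + colNames.get(i);
            headerOut.add(header);
            typesOut.add(colTypes.get(i));
        }
    }
}
```

Because the method takes plain lists rather than a descriptor
interface, the same call works for per-pocket and per-grid-point
descriptors.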
GridPointStats.centroid (new):
- Static helper for the centroid loop duplicated across
SphericityDescriptor, RadiusOfGyrationDescriptor, and
PrincipalMomentsDescriptor. Three descriptors each had the same
BitSet → allPoints centroid pass; now one method call.
Skipped from the same plan (per Tier-3+4 reconsideration):
- vis_renderers validator merge (item 13): semantic mismatch
(null handling, error wording) makes the abstraction lossy.
- AbstractVolsiteGridPointDescriptor base (item 16): two impls is
below the threshold where a shared base earns its keep.
- Pre-classify protein atoms, per-point cache, Params hoist
(items 18-20): real wins on the volsite hot path but speculative
without a benchmarked workload. Defer until someone reports
volsite descriptor compute as a bottleneck.
Bug fix:
- PrincipalMomentsDescriptor.clampNonNegative now also clamps NaN. The
v<0 check was false for NaN, so a NaN eigenvalue (possible if a future
code path bypasses GridGenerator.isFiniteBox) would have propagated
to the CSV output.
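Why the old check let NaN through, as a small Java sketch. The method
bodies are assumptions (the fix is assumed to map NaN to 0.0, like
negatives); only the comparison semantics are certain: every ordered
comparison with NaN evaluates to false.

```java
/** Hypothetical before/after of a non-negative clamp; not the shipped method bodies. */
final class ClampSketch {

    /** Old shape: NaN < 0 is false, so NaN is returned unchanged. */
    static double clampNegativeOnly(double v) {
        return v < 0 ? 0.0 : v;
    }

    /** New shape: NaN is handled explicitly before the sign check. */
    static double clampNonNegative(double v) {
        return (Double.isNaN(v) || v < 0) ? 0.0 : v;
    }
}
```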
Doc refresh:
- breaking-changes.md: 2.6 entry for the multi-column descriptor
migration + the -vis_pocket_grid / pocket_grid_vis_* renames.
- export-pocket-descriptors.md: step 4 rewrites a self-contradicting
rationale — adding to the default list IS a breaking change for
index-based parsers; recommends parse-by-name + breaking-changes.md
note for future additions.
- export-pocket-grid.md: added "Adding a new per-grid-point descriptor"
recipe (parallel to the per-pocket one); unified √3/2 precision to
0.866 across docs and Params.groovy.
- README.md: added an "Opt-in tabular exports" subsection mentioning
-export_pocket_descriptors, -export_pocket_grid, -vis_pocket_grid.
- testsets.sh "Full descriptor menu" now lists all seven shipped
descriptors (was six).
Exception taxonomy:
- PocketDescriptorsRows.groovy and PocketGridBuilder.java now throw
PrankException (was IllegalArgumentException) for user-facing config
errors, matching the rest of the codebase.
Registry hardening:
- Both PocketDescriptorRegistry and PocketGridPointDescriptorRegistry
now assert columnNames.size() == columnTypes.size() in register().
A future descriptor with mismatched lists fails fast at class-load.
Quality fixes:
- PocketGridRows.getColumn uses BASE_COLS-1 instead of literal 3 for
the pocket column. Removed dead 2-arg PocketGridRows constructor
(only 3 test sites used it; now inlined).
- PocketGridPointContext gets a compact-constructor validator that
rejects negative pointIndex/pocketRank, limiting blast radius of an
int-arg swap.
Test hardening:
- VolsiteSmoothGridPointDescriptorTest + VolsiteGridPointDescriptorTest
now pin sigma/radius in @BeforeEach AND restore in @AfterEach, so
the Params singleton is clean for subsequent test classes.
- New tests: HIS ND1 double-flag (single atom setting donor+acceptor),
PrincipalMoments at cardinality=2, PrincipalMoments two coincident
points, GridGenerator NaN-box throw, PocketDescriptorRegistry
register/unregister round-trip, MorphologicalCloser maxIters=1.
- Renamed respectsMaxIters → maxItersZeroIsNoOp (the test only covered
the maxIters=0 case despite the general name); added maxIters=1
companion that verifies one iteration of fill actually runs.
- Extracted RendererTestFixtures.tinyGrid (was byte-identical in both
renderer test files); unified the volsite atomAt signatures so the
parameter order can't get swapped between the two volsite tests.
- Params.groovy: pocket_descriptors javadoc now lists all 7 shipped
descriptors (was: 6); softens the "essentially free" rationale to
acknowledge principal_moments' small eigendecomposition cost.
- PocketDescriptorsTest.groovy: class javadoc "six shipped descriptors"
→ "seven", names principal_moments alongside the rest.
- export-pocket-descriptors.md: "6 base shipped descriptors use this
adapter" → "6 of 7 use the adapter; principal_moments (multi-column)
implements PocketDescriptor directly". Removes a misleading count.
- export-pocket-{grid,descriptors}.md: default-list rationale no longer
claims adding descriptors is "essentially free" — clarifies that
grid-derived scalars are cheap once the grid is built but
principal_moments adds a small per-pocket compute on top, still
negligible vs the grid build.
Caught by deep audit of 60220d7a..73e7c9df focused on doc/comment drift
after the recent multi-column interface migration.
Unifies the per-pocket descriptor framework with the per-grid-point
framework: same shape (name + columnNames + columnTypes + double[]
compute), same multi-column "{name}.{col}" header convention, same
public register / unregister / dup-column-check registry. Shipped as a
breaking change behind the same -pocket_descriptors knob.
Interface change:
String name();
List<String> columnNames();
List<ColumnType> columnTypes();
double[] compute(PocketGridContext);
boolean needsGrid(); // unchanged
Scalar descriptors stay one-liners via the new
AbstractScalarPocketDescriptor adapter (name + scalarType +
computeScalar). The 6 existing descriptors migrated; behavior and
output byte-identical to before.
New descriptor: PrincipalMomentsDescriptor (3 × DOUBLE) — the three
eigenvalues of the pocket grid points' gyration tensor, sorted
descending. Implementation uses Apache Commons Math 3
EigenDecomposition. Shape signature complement to sphericity /
radius_of_gyration; sum equals radius_of_gyration² (verified in test).
Added to the default -pocket_descriptors list.
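The "sum equals radius_of_gyration²" property follows from the trace:
the eigenvalues of a symmetric tensor sum to its trace, and the trace of
the gyration tensor is the mean squared distance to the centroid. The
Java sketch below checks the trace side of that identity without an
eigendecomposition. It assumes the tensor is normalized by the number of
points N; class and method names are hypothetical.

```java
/** Hypothetical sketch: the trace of the gyration tensor equals Rg^2. */
final class GyrationSketch {

    static double[] centroid(double[][] pts) {
        double[] c = new double[3];
        for (double[] p : pts) {
            for (int a = 0; a < 3; a++) c[a] += p[a] / pts.length;
        }
        return c;
    }

    /** S[a][b] = (1/N) * sum over points of (r_a - c_a) * (r_b - c_b). */
    static double[][] gyrationTensor(double[][] pts) {
        double[] c = centroid(pts);
        double[][] s = new double[3][3];
        for (double[] p : pts) {
            for (int a = 0; a < 3; a++) {
                for (int b = 0; b < 3; b++) {
                    s[a][b] += (p[a] - c[a]) * (p[b] - c[b]) / pts.length;
                }
            }
        }
        return s;
    }

    /** Sum of the diagonal, which equals the sum of the eigenvalues. */
    static double trace(double[][] m) {
        return m[0][0] + m[1][1] + m[2][2];
    }

    /** Mean squared distance of the points to their centroid. */
    static double radiusOfGyrationSquared(double[][] pts) {
        double[] c = centroid(pts);
        double sum = 0.0;
        for (double[] p : pts) {
            for (int a = 0; a < 3; a++) sum += (p[a] - c[a]) * (p[a] - c[a]);
        }
        return sum / pts.length;
    }
}
```

For four collinear points at x = 0, 1, 2, 3 both quantities come out as
1.25, and all of it sits in the xx component, which is the rod-shaped
case with one dominant eigenvalue.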
Default list reordered to put num_* (cheap, integer-valued) first,
then geometric scalars, then principal_moments:
num_residues, num_surface_atoms, num_grid_points,
volume, sphericity, radius_of_gyration,
principal_moments
Tests:
- 5 new PrincipalMomentsDescriptor tests (cube isotropy, rod-shape
eigenvalues, sort order, degenerate empty/single, sum=Rg²)
- PocketDescriptorsRowsTest +2 (multi-column prefix rule, mixed
scalar + multi ordering)
- existing 13 callsites updated for the double[] return signature
- columnType() registry test → columnTypes()
User-visible change: the default -pocket_descriptors output now has
three new columns (principal_moments.lambda1/2/3) and the existing
columns appear in a different order. Scripts parsing by column name
are unaffected; scripts parsing by column index need updating.
Bug fixes:
- MorphologicalCloser: gate the "didn't converge" warning on maxIters>0.
maxIters=0 is a valid "disable fill" config and would otherwise log
spuriously on every protein.
- GridGenerator: hoist the isFiniteBox NaN guard into the (Box, edge)
ctor so both sampleGridPointsBetween and sampleGridPointsAroundAtoms
are covered (the second sampler was previously unguarded — used by
the training/feature path).
- PocketGridPdbSidecar.writePerPocket: serial-wrap warning added for
parity with the combined write() path.
Test hardening:
- PocketGridPointDescriptorRegistry: add unregister() so tests can
clean up fixture registrations; PocketGridRowsTest now @AfterAll
unregisters its scalar fixture so it doesn't leak into the JVM-wide
registry.
- VolsiteSmoothGridPointDescriptorTest: pin sigma via @BeforeEach so
other tests mutating the Params singleton can't shift expectations;
new weightAtExactCutoffEqualsExpMinusEight test pins the 4σ-inclusive
cutoff semantic (cutoutSphere is inclusive; exp(-8) ≈ 3.354e-4).
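The exp(-8) figure is consistent with a kernel of the form
exp(-d² / (2σ²)) evaluated at d = 4σ. The Java sketch below assumes that
kernel form and an inclusive cutoff; the class and method names are
hypothetical and this is not the shipped descriptor code.

```java
/** Hypothetical sketch of a Gaussian weight truncated (inclusively) at 4 sigma. */
final class GaussianKernelSketch {

    static double weight(double dist, double sigma) {
        if (dist > 4.0 * sigma) {
            return 0.0;                       // strictly beyond the cutoff contributes nothing
        }
        // dist == 4*sigma still contributes: exp(-(16 sigma^2) / (2 sigma^2)) = exp(-8)
        return Math.exp(-(dist * dist) / (2.0 * sigma * sigma));
    }
}
```

With the default σ of 2.0 Å the cutoff sits at 8.0 Å, so an atom at
exactly 8.0 Å gets weight exp(-8) and one slightly farther gets 0.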
Docs / clarifications:
- Params.pocket_grid_point_descriptors javadoc: the silent-ignore when
-export_pocket_grid=false is intentional (symmetric with
-pocket_descriptors / -export_pocket_descriptors).
- PocketDescriptor javadoc: intentionally scalar-only; recommend
unifying with PocketGridPointDescriptor if multi-col is ever needed
rather than ad-hoc extending this one.
- PocketGridPointDescriptor javadoc: needsGrid() is intentionally
absent — every grid-point descriptor needs the grid by definition.
- documentation/export-pocket-grid.md: explain the default-empty
rationale (cost: per-row × per-atom, not backward-compat).
- VdwRadiusTable.resolveSymbol: comment that the name-prefix isotope
branch is a safety net, not a semantic mapping (e.g. "DA" in DNA
isn't deuterium).
Adds focused regression tests for the new framework: 10 tests in three
new files plus 4 added to PocketGridRowsTest.
PocketGridRowsTest +4
- descriptor schema uses "{name}.{col}" prefix for multi-col
- getRow appends descriptor values after the base 4 columns
- unknown descriptor name throws at construction
- scalar descriptor emits bare name() with no prefix (uses an
inline ScalarTestDescriptor registered via the now-public
registry hook — none of the shipped descriptors are scalar so
the branch was untested)
VolsiteGridPointDescriptorTest (new, 4 tests)
- covers indicator aggregation + radius cutoff
VolsiteSmoothGridPointDescriptorTest (new, 4 tests)
- covers Gaussian kernel arithmetic + 4σ cutoff
PocketGridPointDescriptorRegistryTest (new, 2 tests)
- shipped names resolve, unknown name throws helpful error
DescriptorListValidatorTest (new, 8 tests)
- null/empty/valid/unknown/duplicate/null-entry/blank/dash-prefix
Refactors Main.validateDescriptorList out to a self-contained Java
utility (DescriptorListValidator) under predict/output/. The two call
sites in Main.validatePocketGridParams now invoke the static helper;
the private helper in Main is removed (-37 lines).
PocketGridPointDescriptorRegistry.register is promoted from private to
public so tests (and future external descriptor plugins) can add
descriptors without touching the registry's static initializer. The
shipped registrations still happen at class-load.
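A hedged Java sketch of what a self-contained list validator covering
the tested cases (null, empty, valid, unknown, duplicate, null entry,
blank, dash prefix) might look like. The accept/reject decision for each
case is an assumption, IllegalArgumentException stands in for the
project's own exception type, and the class name is hypothetical.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of a descriptor-name list check; not the shipped utility. */
final class DescriptorListCheckSketch {

    static void validate(List<String> names, Collection<String> known, String paramName) {
        if (names == null || names.isEmpty()) {
            return;                               // assumption: selecting nothing is valid
        }
        Set<String> seen = new LinkedHashSet<>(); // insertion order keeps messages deterministic
        Set<String> duplicates = new LinkedHashSet<>();
        for (String name : names) {
            if (name == null || name.trim().isEmpty()) {
                throw fail(paramName, "contains a null or blank entry");
            }
            if (name.startsWith("-")) {
                throw fail(paramName, "entry '" + name + "' looks like a command-line flag");
            }
            if (!known.contains(name)) {
                throw fail(paramName, "unknown name '" + name + "'. Known: " + known);
            }
            if (!seen.add(name)) {
                duplicates.add(name);
            }
        }
        if (!duplicates.isEmpty()) {
            throw fail(paramName, "duplicate entries " + duplicates);
        }
    }

    private static IllegalArgumentException fail(String paramName, String message) {
        return new IllegalArgumentException("-" + paramName + ": " + message);
    }
}
```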
Adds an opt-in extension to the pocket-grid export — extra columns per
(point, pocket) row driven by a registry of per-grid-point descriptors.
Mirrors the existing per-pocket descriptor framework (interface, context
record, static registry, name-driven CLI selection).
CLI:
-pocket_grid_point_descriptors list, default []
-pocket_grid_volsite_radius 4.0 Å (volsite indicator cutoff)
-pocket_grid_volsite_sigma 2.0 Å (volsite_smooth Gaussian σ)
Shipped descriptors (both 6-column, prefixed `{name}.`):
volsite INT 0/1 per pharmacophore type within radius
volsite_smooth DOUBLE Gaussian-weighted sum, kernel truncated at 4σ
Atom-level pharmacophore classification reuses VolSitePharmacophore — a
1 in volsite.vsCation here matches a 1 in vsCation from VolsiteFeature.
The 6 VolSite column names now live as VolSitePharmacophore.COLUMN_NAMES
(single source of truth, also used by VolsiteFeature). VolSitePharmacophore
gains a getAtomProperties(Atom) overload that does the PdbUtils hop.
Validation: -pocket_grid_point_descriptors goes through a new shared
validateDescriptorList(names, known, paramName) helper in Main, which
also replaces the open-coded equivalent for -pocket_descriptors. The
two new numeric params are bounds-checked.
- ChimeraX renderer: surfaces-layer rename now iterates the actual rank
set (perPocketBasenames.keySet) instead of 1..maxRank. The previous
code assumed every rank produces a ChimeraX submodel; a rank-skip
would mis-target the rename. Latent today (P2Rank reorders pockets
contiguously) but the assumption is now explicit in the code.
- PdbSidecar: warn when total grid atoms exceed the PDB 5-digit serial
column (wrap still happens; the warning surfaces the limit so users
with very fine grids know why bond-inference tools might misbehave).
- MorphologicalCloser: warn when loop exits at maxIters without
converging, naming the param to raise. Previously silent.
- GridGenerator: throw early on non-finite SAS-point bounding box.
IEEEremainder(NaN, edge) = NaN would otherwise produce a NaN-everywhere
lattice from a broken PDB.
- VdwRadiusTable: map D/T isotopes to H before CDK lookup. Previously
fell through to carbon (1.7 Å instead of hydrogen's 1.2 Å); marginal
effect because of the atom_buffer cushion but no reason to be wrong.
- PocketDescriptorsRows: throw at construction if grid==null and any
selected descriptor declares needsGrid()=true, instead of NPEing
inside compute(). The upstream gate in PocketGridOutputs already
honors this; the guard catches programming errors elsewhere.
- testsets.sh: update 4 sites still invoking -export_pocket_grid_pml after
the rename; they were hard-failing at startup.
- PocketGridPymolRenderer javadoc: pocket_dens_N -> pocket_gauss_N (3
refs), pocket_vol_N default ON not OFF (changed long ago in 82daf58a).
- documentation/export-pocket-grid.md: vis_pocket_grid_volume_radius
default is the -1 sentinel, not the auto-scaled 1.02 Å; ChimeraX layers
doc now shows the #99 (spheres) + #100 (surfaces) split.
- Main.validatePocketGridParams: numeric range checks for spacing,
max_dist, atom_buffer, assign_cutoff, fill_min_neighbors (must lie in
the 26-neighborhood), fill_max_iters, vis_pocket_grid_volume_radius
(-1 sentinel or strictly positive), and gaussian_iso. Catches values
that would otherwise produce a NaN lattice, empty grid, or garbage
passed to PyMOL/ChimeraX.
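For the GridGenerator item above, a short Java sketch of the NaN
propagation and an early guard. Math.IEEEremainder returning NaN for a
NaN argument is documented JDK behaviour; the guard's name, its array
parameters and the exception type are assumptions made for
illustration.

```java
/** Hypothetical sketch of a finite-bounding-box guard; not the shipped GridGenerator. */
final class FiniteBoxSketch {

    /** True only if all six bounds are neither NaN nor infinite. */
    static boolean isFiniteBox(double[] min, double[] max) {
        for (int i = 0; i < 3; i++) {
            if (!Double.isFinite(min[i]) || !Double.isFinite(max[i])) {
                return false;
            }
        }
        return true;
    }

    /** Fails before any lattice arithmetic can spread NaN to every grid point. */
    static void requireFiniteBox(double[] min, double[] max) {
        if (!isFiniteBox(min, max)) {
            throw new IllegalStateException("Non-finite bounding box; check input coordinates");
        }
    }

    /** Demonstrates the propagation the guard prevents. */
    static double latticeOffset(double coord, double edge) {
        return Math.IEEEremainder(coord, edge);   // NaN in, NaN out
    }
}
```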
Both fpocket and Seq2Pocket loaders could previously produce a pocket
with a null centroid that NPEs downstream feature extraction:
- FPocketLoader: skip the pocket if its voronoi-centers het group is
empty (Atoms.centerOfMass returns null on empty list). Guard runs
before rank assignment so surviving ranks stay sequential.
- Seq2PocketLoader: skip the pocket if the input named atom serials
but none resolved against queryProtein.allAtoms (otherwise the
pocket would carry empty surfaceAtoms and null centroid). Real
inputs rarely trigger this; synthetic test covers it.
Neither path is expected with well-formed input; both fixes are
defensive.
Parses per-protein <ID>_predictions.txt (semicolon CSV) and resolves
atom_ids against queryProtein.allAtoms by PDB serial. Empty/header-only
files produce 0 pockets gracefully. Prediction is bound to the
caller-supplied queryProtein, avoiding the ConcavityLoader bug class.
- Dataset.groovy: new case "seq2pocket"
- README.md: list SwinSite and Seq2Pocket in rescoring methods;
cite pocketeer.ds + swinsite.ds in test_data/ examples
- CLAUDE.md: note that distro/README.md is a transient build artifact
- Test fixtures: 5 real predictions under distro/test_data/, plus
unsorted/header-only/path-independence variants under src/test/resources/
- Seq2PocketLoaderTest: 10 tests, all passing
- GenericVector.toList(): replace deprecated DefaultGroovyMethods.toList
(Groovy 5) with a plain Java loop; drop unused addTo() (no callers)
- Atoms(List<? extends Atom>): @SuppressWarnings("unchecked") for the
intentional wrap-without-copy
- KdNode.splitLeafNode: @SuppressWarnings("unchecked") for casts from
the Object[] backing store
- Drop dead mask_unknown_residues=true from default(_rescore).groovy
(param removed from Params.groovy in 1b7809a6, 2019; configs missed)
- Rewrite distro/models/readme.md to match models on disk (add rescore_2024,
rescore_conservation; remove nonexistent conservation.model)
- Remove broken documentation/rescoring.md link from distro/README.md
- distro/config/readme.md: drop nonexistent working.groovy reference,
fix github link master->develop
- Delete dead commented-out method bodies in PdbUtils, RPlotter,
PredictionVisualizer
- Fix typo in Main.groovy javadoc
Bumps faster-molecular-surface 1.0 -> 1.1, vendored in
lib/local-mvn-repo/. The 1.1 release adds a VdW radius fallback for
elements whose CDK Elements enum entry is null (Co, Ni, Cu, Rh, Os, Ir,
plus radioactive/synthetic). Without the fix, cobalamin-bearing
structures crashed surface computation under -cofactors.
PatchedCdkNumericalSurface wraps the default CDK NumericalSurface (used
when -use_optimized_surface 0) with the same fallback, via a Krypton
proxy for null-VdW atoms. Surface.groovy switched over to it. Unit tests
mirror the FMS-side regressions.
AnalyzeRoutine.cmdCofactors: replace Struct.getHetGroups with
Struct.getLigandGroups (2 call sites) so GDP/GTP/ATP and other groups
that BioJava classifies as NUCLEOTIDE/AMINOACID don't get falsely
reported as "name not in structure" in cofactor_matches.csv or omitted
from het_groups.csv. Mirrors the M1 fix applied earlier to
CofactorHandler.extractCofactorAtoms.
testsets.sh: new cofactors_full() function exercising the cofactor
demo + full datasets in p2rank-datasets2/other/cofactors/ (predict,
analyze cofactors, -aa_mapping composition, visualizations,
export-points). Uses -fail_fast 1 so per-structure errors surface as
test failures rather than silent skips.
The -cofactors flag and dataset cofactors column accept LigandDefinition
specifiers ("FAD", "FAD[atom_id:N]", "FAD[contact_res_ids:A_T259,A_D246]").
Matched HET groups merge into the protein surface (proteinAtoms) and are
excluded from ligand listings; per-item resolution lets a dataset column
override the global Params.cofactors.
New: analyze cofactors subcommand (HETATM survey + specifier dry-run),
PyMOL teal-stick visualization (vis_highlight_cofactors), distant-cofactor
and chain-excluded WARN diagnostics, aa_mapping collision WARN (R19),
drop-in safety benchmark with byte-equality on a never-present specifier.
Documentation in documentation/cofactors.md (user-facing) and
documentation/dev/cofactors.md (engineering record with R1-R24 design choices
and post-merge audit fixes). Tests in CofactorHandlerTest,
CofactorIntegrationTest, CofactorPipelineTest, CofactorAnalyzeTest,
DataTableCsvTest plus a Log4jCapture test helper.
Registers `swinsite` as a third-party predictor in Dataset.groovy. The
loader reads grid<N>_score_<float>.mol2 (raw voxel points) per pocket,
parses score from the filename, computes pocket centroid from the grid,
and derives surfaceAtoms via cutoutShell against queryProtein.exposedAtoms
(4.5 -> 10 A expanding shell), mirroring ConcavityLoader.
Reads grid mol2 instead of pocket mol2: pocket mol2 atoms are standalone
copies with chain reset to 'A' and synthetic residue names, so they break
P2Rank's residue/conservation/ASA feature lookups. Grid + cutoutShell
keeps surfaceAtoms bound to real queryProtein atoms.
Mol2 parsing is a small inline @<TRIPOS>ATOM scan rather than CDK's
Mol2Reader: CDK has a lazy-init race in AtomTypeFactory that NPEs under
parallel dataset processing.
Ships swinsite.ds plus 6 protein PDBs (1tjw_A from SwinSite's
test_protein_only example, plus 1a26A/1a2kC/1afkA/1atlA/1bqoB from
coach420) covering 1/2/3/4/6-pocket cases. 1atlA's on-disk N-order is
non-monotonic in score (0.7288, 0.0664, 0.3433), exercising the rerank.
SwinSiteLoaderTest covers all six fixtures plus the
predictionIsBoundToQueryProtein contract and empty-dir tolerance.
The points export (predict/rescore -export_points 1) now includes an
integer 'pocket' column matching newRank in *_predictions.csv, so users
can directly aggregate per-pocket descriptors without a spatial join.
Standalone 'export-points' (no prediction) omits the column.
Pocket-extension shells can overlap, so a single SAS point can sit in
multiple pocket.labeledPoints lists. Previously the assignment loop was
last-write-wins, so shared points got the worst rank, which was
counter-intuitive for both visualization (PredictionVisualizer PDB
output) and descriptor aggregation. PocketRescorer.setNewRanks now
iterates pockets best-first with a guard, so the lowest newRank wins;
the redundant lp.pocket write in PocketPredictor is removed.
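The best-first-with-guard idea can be sketched in Java as below. The
Pocket class, integer point ids and the map-based result are
hypothetical simplifications; the real code works on labeled SAS points.
The guard is putIfAbsent: once a point has a rank, a later (worse)
pocket cannot overwrite it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of "lowest newRank wins" for points shared between pockets. */
final class RankAssignSketch {

    static final class Pocket {
        final int newRank;
        final int[] pointIds;

        Pocket(int newRank, int... pointIds) {
            this.newRank = newRank;
            this.pointIds = pointIds;
        }
    }

    /** Maps each point id to the newRank of the best-ranked pocket that contains it. */
    static Map<Integer, Integer> assign(List<Pocket> pockets) {
        List<Pocket> bestFirst = new ArrayList<>(pockets);
        bestFirst.sort(Comparator.comparingInt(p -> p.newRank));

        Map<Integer, Integer> rankOfPoint = new HashMap<>();
        for (Pocket pocket : bestFirst) {
            for (int id : pocket.pointIds) {
                rankOfPoint.putIfAbsent(id, pocket.newRank);   // guard: keep the better rank
            }
        }
        return rankOfPoint;
    }
}
```

If pocket 2 holds points {10, 11} and pocket 1 holds {11, 12}, point 11
ends up with rank 1 regardless of the order in which the pockets are
passed in.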
TableData gains a per-column ColumnType (DOUBLE default, INT) so
TableExporter emits true integers in CSV (no decimals), Arrow (Int32),
and Parquet (INT32) for the pocket column.
Bump version to 2.6.0-dev.8.
ConcavityLoader.loadPrediction was ignoring its queryProtein parameter
and binding the returned Prediction to a Protein loaded from
*_residue.pdb (a pocket-touching residue subset, not the full protein).
Downstream features keyed on prediction.protein.fileName then resolved
against the wrong basename — most visibly conservation lookup, which
searched for "<ID>_<submethod>_residue_<chain>.hom" instead of
"<ID>_<chain>.hom" and silently produced zero conservation features.
Other feature extractors were similarly reading the truncated atom set.
The residue subset is still loaded and used to define the per-pocket
surface-atom shell (no behaviour change there), but the Prediction is
now bound to queryProtein, matching FPocketLoader and PUResNetLoader.
Add ConcavityLoaderTest plus a matching test in FPocketLoaderTest that
assert the loader-contract invariant prediction.protein === queryProtein.
PUResNet pocket PDBs occasionally left-shift the residue insertion code
into column 26 instead of column 27, breaking BioJava's strict resSeq
parser with NumberFormatException and silently dropping affected
predictions (216 of 9955 entries on holo4k+pdbbind2020).
Add PUResNetPdbRepair which detects the malformed pattern and rewrites
it in memory before parsing. Wire PUResNetLoader through it. PdbUtils
and the rest of the load path are unchanged.
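To make the column shift concrete: in the PDB ATOM/HETATM layout resSeq
occupies columns 23-26 and the insertion code column 27. The Java sketch
below shows one possible per-line repair under a simple assumed
heuristic (letter in column 26, digit in column 25, blank in column 27).
It is illustrative only; the detection rule in PUResNetPdbRepair may
differ, and the class and method names are hypothetical.

```java
/** Hypothetical per-line repair for an insertion code shifted into column 26. */
final class InsertionCodeRepairSketch {

    static String repairLine(String line) {
        if (line.length() < 27) {
            return line;
        }
        if (!(line.startsWith("ATOM") || line.startsWith("HETATM"))) {
            return line;
        }
        char col25 = line.charAt(24);   // 1-based column 25, third resSeq column
        char col26 = line.charAt(25);   // 1-based column 26, last resSeq column
        char col27 = line.charAt(26);   // 1-based column 27, insertion code column
        if (Character.isDigit(col25) && Character.isLetter(col26) && col27 == ' ') {
            // Shift columns 23-26 one place right: the digits end in column 26
            // and the letter lands in column 27. Line length is preserved.
            return line.substring(0, 22) + ' ' + line.substring(22, 26) + line.substring(27);
        }
        return line;
    }
}
```

Well-formed lines pass through unchanged, so the repair can be applied
to every line of a file before it is handed to the parser.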
- Replace manual line.split(",") with Apache Commons CSV (column-name access)
- Support both reduced (9-col) and full (59-col) ahoj_ubs CSV formats
- Add AhojSiteInfo: typed data class for 14 pocket metadata fields
- Add secondaryData map to ResidueSite for extensible metadata
- Export AhojSiteInfo columns in observed_sites.csv when available
- Add comprehensive parser tests for both CSV formats
- Add test data files and format documentation
Protein.sites now holds ground-truth binding sites for both ligand-defined
and explicit (residue-based) evaluation modes. Sites are populated from
ligands via populateSitesFromLigands() when no explicit sites are defined.
- Add predictedPocket and setSasPoints to BindingSite interface
- Add predictedPocket field to ResidueSite
- Rename assignPocketsToLigands to assignPocketsToSites (works on BindingSite)
- Update calcCoveragesProt to use BindingSite.predictedPocket
- Determine isLigandMode via instanceof instead of sites.isEmpty()
- Unify PymolRenderer sites/ligands branch into single BindingSite loop
- Simplify AnalyzeRoutine.cmdBindingSiteCenters to use p.sites directly
- Rename SiteCentroidMethod to SiteCenterMethod
- Extract getCenterForMethod(SiteCenterMethod) into BindingSite interface
for thread-safe, param-independent center calculation
- Refactor Ligand/ResidueSite getCenterForEval() to delegate to getCenterForMethod()
- Add analyze binding-site-centers command comparing all center methods per site
- Add Dataset.Result.writeErrorsAndGetSummary() and use it across all
AnalyzeRoutine commands for consistent error reporting to both console and CSV
BioJava assigns GroupType based on its Chemical Component Dictionary,
not structural role. Ligands in non-polymer chains can get any GroupType:
- GDP, GTP, ATP -> GroupType.NUCLEOTIDE
- SHR and similar -> GroupType.AMINOACID
- Most others -> GroupType.HETATM
Previously only HETATM groups were detected as ligands, causing errors
like "Ligand definition 'GDP' matches no ligands" for nucleotide and
amino acid derivative ligands.
Fix: any non-water group in a NONPOLYMER chain is now a ligand
candidate, regardless of GroupType. Polymer chain groups (protein AA,
DNA/RNA) are only included if they have GroupType.HETATM.
Add test PDB files (1a2kC.pdb with GDP, 1e5qA.pdb with SHR) and
comprehensive tests for all three GroupType cases.
The Jaccard ratio was computed as int/int, always producing 0 or 1,
making fractional thresholds ineffective. Cast to double for correct
floating-point division. Also fix typo (cahe->cache), remove debug
comments, and update javadoc.
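The int/int pitfall in isolation, as a Java sketch. Method names are
hypothetical and the sketch assumes a non-zero union size; it only
shows why integer division collapses a ratio in [0, 1] to 0 or 1 and
how a cast to double restores the fraction.

```java
/** Hypothetical before/after of a Jaccard ratio computed from two set sizes. */
final class JaccardSketch {

    /** Integer division truncates: any ratio below 1 becomes 0. */
    static double truncated(int intersectionSize, int unionSize) {
        return intersectionSize / unionSize;
    }

    /** Casting one operand forces floating-point division. */
    static double exact(int intersectionSize, int unionSize) {
        return (double) intersectionSize / unionSize;
    }
}
```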
- Rename PocketCriterium to PocketCriterion (use the standard English spelling)
- Revert getLigandAtoms() back to getAtoms() in BindingSite interface
- Rename getCentroidForEval() to getCenterForEval()
- Rename explicitCentroid to explicitCenter in ResidueSite
- Rename SiteCentroidMethod values: explicit_centroid->explicit,
sas_points_center_of_mass->sas_points_centroid
- Rename site_centroid_method param to site_eval_center_method
- Ligand.getCentroid() now delegates to getCenterForEval()
Remove leading-space padding from fmt calls in getMiscStatsCSV and
FeatureImportances, fix header/data spacing mismatch in toPocketsCSV,
and remove trailing space in toLigandsCSV header.
Add notebook loading _predictions.csv and _residues.csv with example
data from predict_1fbl. Clean up CSV formatting: remove padding from
values, add fmtCsv() without leading spaces for CSV output.