Drop frozen pocket-grid PLAN/SPEC; refine audit punch-list

PLAN.md and SPEC.md were pre-implementation design docs for the pocket-grid feature. The feature has shipped, so they are now frozen artifacts sitting in the active todo/ namespace. Delete them and strip the three "see SPEC.md" comments from Main.groovy and the predict/rescore routines. Also reassess the PyMOL rank-gap entry in the audit: P2Rank assigns pocket ranks contiguously throughout the predict path and in all in-tree loaders (except SiteHoundLoader), so the previously listed "renderer ignores rank gaps" entry is cosmetic-only (empty objects in the Models panel for small pockets whose filled BitSet ended up empty). Downgrade it to a parity nit under Inconsistencies, and promote the PUResNet surfaceAtoms re-linking to the Top-5.
2026-06-04 12:44:24 +08:00 · 2026-05-20 19:42:47 +02:00
parent 40d333f77f
commit af1d6eeb18
6 changed files with 19 additions and 822 deletions
--- a/misc/dev/audit-post-2.5.1.md
+++ b/misc/dev/audit-post-2.5.1.md
@@ -33,14 +33,6 @@ focused cleanups.
  validate. Fix: `volatile` + DCL, or do one-shot init under a synchronized guard
  in `preProcessProtein`; update the concurrency test to construct under contention.

- **PyMOL pocket-grid renderer ignores rank gaps.**
-  `src/main/groovy/cz/siret/prank/program/visualization/renderers/PocketGridPymolRenderer.groovy:167,201,242`
-  loop `for (rank = 1; rank <= maxRank; rank++)`. If a pocket rank is missing
-  (`{1,3}`), the script emits an empty `pocket_grid_2` and shifts palette indices.
-  ChimeraX was fixed in commit `a3efd084`; PyMOL still has the latent bug.
-  Comment at lines 131-132 even acknowledges it. Fix: iterate
-  `grid.pocketToPointIndices.keySet()` like ChimeraX does.
-
 - **Coulomb plumbing is dead code.**
  `EnergyCalculator.getAtomCharge` always returns 0
  (`src/main/groovy/cz/siret/prank/features/implementation/energy2/calc/EnergyCalculator.groovy:351-357`).
@@ -161,6 +153,17 @@ focused cleanups.
  everything into a `TreeMap` (`EvalResults.groovy:189`) — insertion order is
  lost. Either drop the misleading comment or use `LinkedHashMap` downstream.

+- **PyMOL pocket-grid renderer iterates `1..maxRank`; ChimeraX iterates
+  `perPocketBasenames.keySet()`.**
+  `PocketGridPymolRenderer.groovy:167,201,242`.
+  Cosmetic-only: P2Rank ranks pockets contiguously (every `predict`-path and
+  in-tree loader except `SiteHoundLoader` assign `i++`/`rank++`), and the
+  sidecar PDB strips ranks whose `filled` BitSet is empty. PyMOL therefore
+  emits empty `pocket_grid_N`/`pocket_vol_N`/`pocket_gauss_N`/`pocket_hull_N`
+  objects when the assigner produced no points for a small pocket — they
+  render as invisible but clutter the Models panel. Mirror the ChimeraX
+  iteration pattern (`a3efd084`) for parity; not a correctness fix.
+
 - **PyMOL grid `solvent_radius=0` vs ChimeraX non-zero probe.**
  `PocketGridPymolRenderer.groovy:189-190` vs `PocketGridChimeraXRenderer.groovy:264`.
  `vis_pocket_grid_volume_radius` means different things to the two renderers.
@@ -219,9 +222,6 @@ focused cleanups.
 - **`documentation/readme.md`** index misses `cofactors.md`, `conservation.md`,
  `export-pocket-grid.md`, `export-pocket-descriptors.md`.

- **`misc/todo/pocket_grid/{SPEC,PLAN}.md`** still mention
-  `export_pocket_grid_pml` (renamed to `vis_pocket_grid`).
-
 - **CI matrix is `17,21,25,26` only** (`.github/workflows/develop.yml:23`).
  README claims "Java 17 or later (tested up to Java 25)"; 18–20/22/23/24 not
  exercised; "tested up to Java 25" lags the now-present 26.
@@ -369,14 +369,15 @@ focused cleanups.
 1. **Fix `VoxelHashAssigner` cell-prune lower bound** (or drop it and rely on
   the post-fetch distance check). Restores the assigner-strategy equivalence
   the docs promise.
-2. **Apply the rank-gap fix to `PocketGridPymolRenderer`** — mirror what
-   commit `a3efd084` did for ChimeraX.
-3. **Make energy-feature lazy-init actually thread-safe**
+2. **Make energy-feature lazy-init actually thread-safe**
   (`MethylEnergyFeature`, `AbstractProbeEnergyFeature`); fix `ConcurrencyTest`
   to construct calculators under contention.
-4. **Guard `AhojSiteInfo.fromCsvRecord` with `record.isMapped(...)`** for the
+3. **Guard `AhojSiteInfo.fromCsvRecord` with `record.isMapped(...)`** for the
   new `rg`/`n_unp_pockets[_multichain]` columns, so the parser doesn't crash
   on older "full" CSVs.
+4. **Re-link `PUResNetLoader.surfaceAtoms` to `queryProtein`** by PDB serial
+   (mirror `FPocketLoader.groovy:137`); same identity-mismatch class as the
+   Concavity fix.
 5. **README/help.txt/`distro/prank.bat` trio**: bump the version badge, fix
   the `./make-disro.sh` typo, regenerate `help.txt` to list current commands,
   and bring Windows launcher JVM flags up to parity with the Bash launchers.
--- a/misc/todo/pocket_grid/PLAN.md
+++ b/misc/todo/pocket_grid/PLAN.md
@@ -1,446 +0,0 @@
-# Plan — Pocket grid points export + per-pocket descriptors
-
-Companion to `SPEC.md`. Ordered, atomic phases. Each phase is a single
-reviewable commit (or two if splitting tests helps). Compile + test must
-be green at the end of every phase.
-
-## Phase order rationale
-
-Layered, foundation-first. Each phase only depends on phases above it.
-
-```
-1. TableData STRING refactor          (foundation, no behavior change)
-2. VdW radius helper + grid generator (foundation)
-3. PocketGrid data class + fill strategies
-4. PocketGridBuilder (orchestration)
-5. Descriptors infrastructure + menu
-6. Export-data classes + exporters
-7. PyMOL renderer + PDB sidecar
-8. Params + Main-startup validation
-9. Wire into PredictPockets + RescorePockets routines
-10. Documentation (2 new MD files + cross-ref)
-11. Smoke test on real data
-```
-
---
-
-## Phase 1 — `TableData` STRING column-type refactor
-
-**Goal:** Extend the export infrastructure to support string columns. No
-behavioral change to existing SAS-points export.
-
-**Changes:**
- `TableData.groovy` — add `ColumnType.STRING`; new method
-  `default String getString(int rowIndex, int colIndex) { throw ... }`
-  for STRING columns; default `getColumn` only meaningful for numeric.
- `TableExporter.groovy`:
-  - `writeCsv` — string branch with RFC 4180 quoting (escape `,`, `"`, newline).
-  - `writeArrow` — `VarCharVector` for STRING columns; `buildSchema` updated.
-  - `writeParquet` — `BINARY` with `LogicalTypeAnnotation.stringType()`;
-    `RowDehydrator` updated.
- `PointExportData.groovy` — no functional change; verify
-  `getColumnType` doesn't accidentally return STRING (it currently can't —
-  all columns are DOUBLE/INT).
-
-**Tests:**
- `TableExporterTest` — new round-trip tests for a synthetic table with
-  one STRING column, one INT, one DOUBLE; csv, csv.gz, arrow, parquet.
- CSV quoting edge cases: value contains `,`, `"`, `\n`.
- Regression: existing `PointsExporterTest` still passes (no schema
-  changes to SAS export).
-
-**Commit:** `Extend TableData with STRING column type`
-
---
-
-## Phase 2 — VdW radius helper + `GridGenerator` extension
-
-**Goal:** Make per-atom VdW radii available; extend the existing grid
-sampler.
-
-**Changes:**
- New `src/main/groovy/cz/siret/prank/program/routines/predict/output/grid/VdwRadiusTable.groovy`:
-  - `static double get(Atom atom)` — looks up via CDK `Elements` by
-    element symbol; if `null`, falls back to Krypton's 2.02 Å (matches
-    the existing pattern in `PatchedCdkNumericalSurface.groovy:54-56`).
-  - Caches `String elementSymbol → double radius` in a
-    `ConcurrentHashMap` (predict runs multi-threaded via
-    `Dataset.process(...)`, so the cache is shared across threads;
-    `computeIfAbsent` is safe and avoids races).
- `GridGenerator.java` — extend
-  `sampleGridPointsAroundAtoms(Atoms, edge, radius)` into a new variant
-  `sampleGridPointsBetween(Atoms, edge, maxDist, double atomBuffer)`:
-  - Keep existing method unchanged.
-  - New method uses `Atoms.withKdTreeConditional()`, walks the lattice,
-    for each cell computes `nearest = atoms.nearestSqrDist(p)`,
-    `vdw = VdwRadiusTable.get(nearestAtom)`, drops if
-    `sqrt(nearest) < vdw + atomBuffer` or `sqrt(nearest) > maxDist`.
-  - Note: `nearestSqrDist` returns squared distance only; for the per-atom
-    VdW check we need the actual nearest **atom**, not just distance.
-    Use `Atoms.findNearest(point)` (`Atoms.java:244`) which returns the
-    Atom; then compute `dist` once.
-
-**Tests:**
- `VdwRadiusTableTest` — known elements (C, N, O, S, P, Fe, Cu, Co)
-  return non-null; Co/Ni/Cu use the Krypton fallback (2.02 Å); unknown
-  symbol → fallback.
- `GridGeneratorTest` (new file or existing if present) — synthetic
-  small `Atoms` set, verify min/max filtering on cubic lattice
-  produces expected count. Edge case: single-atom input.
-
-**Commit:** `Add VdwRadiusTable and GridGenerator min/max sampler`
-
---
-
-## Phase 3 — `PocketGrid` data class + fill strategies
-
-**Goal:** Pure data + algorithms, no orchestration.
-
-**Changes:**
- `PocketGrid.groovy`:
-  - Fields:
-    - `Atoms allPoints` — kept grid points after filtering, wrapped as
-      `Atoms` (since `Point implements Atom`). Reusing `Atoms` gives us
-      `cutoutShell`, `withKdTree`, `getByID` for free.
-    - `Map<Integer, Set<Integer>> pocketToPointIndices` (rank → indices
-      into `allPoints`).
-    - `Set<Integer> assignedIndices` (union of all per-pocket sets).
-    - `double spacing`.
-    - `Map<LatticeCoord, Integer> latticeIndex` — integer-lattice
-      coordinate `(i, j, k)` → point index. Computed from
-      `originX/Y/Z` + `spacing` during grid generation; **required by
-      `MorphologicalCloser`** for `O(1)` neighbor lookups (without it
-      morph closing degrades to all-pairs distance comparisons).
-    - `LatticeCoord` is a small immutable value class with proper
-      `equals`/`hashCode`.
-  - Provides: `Atoms pointsForPocket(int rank)`,
-    `Set<Integer> pocketsForPoint(int pointIndex)`,
-    `Set<Integer> neighborsOf(int pointIndex, int connectivity)` (where
-    `connectivity ∈ {6, 18, 26}` consults `latticeIndex`).
- `fill/PocketShapeFiller.groovy` — interface:
-  ```groovy
-  Set<Integer> fill(Set<Integer> rawShellIndices,
-                    List<Point> allPoints,
-                    double spacing,
-                    Params params)
-  ```
- `fill/NoOpFiller.groovy` — returns input unchanged.
- `fill/MorphologicalCloser.groovy`:
-  - Operates on a `Map<(int,int,int) → Integer>` lattice index built from
-    allPoints. For each iteration, scans candidate cells (immediate
-    neighbors of assigned cells) and promotes those whose neighbor count
-    ≥ `pocket_grid_fill_min_neighbors`. Stops at fixed-point or
-    `pocket_grid_fill_max_iters`.
-  - Neighborhood: 26-connectivity (configurable later if needed).
- `fill/ConvexHullFiller.groovy` — initial **stub** that throws
-  `UnsupportedOperationException("convex_hull fill not yet implemented")`
-  so users get a clear error if they select it. Real impl in a followup.
-
-**Tests:**
- `MorphologicalCloserTest` — synthetic shapes:
-  - Pure sphere shell (3-cell-thick) → fills to solid sphere within
-    `max_iters`.
-  - U-shape with concavity → concavity filled in.
-  - Disconnected components → not merged when far apart.
- `NoOpFillerTest` — identity.
-
-**Commit:** `Add PocketGrid data class and morph-closing fill strategy`
-
---
-
-## Phase 4 — `PocketGridBuilder` (orchestration)
-
-**Goal:** End-to-end grid generation + per-pocket assignment + fill.
-
-**Changes:**
- `PocketGridBuilder.groovy`:
-  - `static PocketGrid build(Protein protein, List<? extends Pocket> pockets, Params params)`
-  - Steps:
-    1. Call the new sampler from Phase 2 →
-       `Atoms allPoints` of kept lattice points + their lattice
-       coordinates. Store both in the resulting `PocketGrid`.
-    2. Build a KdTree on `allPoints` (`allPoints.withKdTree()`) — cheap
-       once, reused by callers downstream.
-    3. For each pocket `p`:
-       - `p.surfaceAtoms.withKdTreeConditional()` (small set, KdTree
-         built on demand).
-       - Iterate `allPoints` once; for each point at index `i`, keep
-         `i` in the **raw shell** set if
-         `p.surfaceAtoms.nearestDist(allPoints.list[i]) <= params.pocket_grid_assign_cutoff`.
-         O(|allPoints| × log|surfaceAtoms|) per pocket.
-       - Pass the raw shell set + `latticeIndex` to
-         `filler.fill(...)` → final per-pocket index set.
-    4. Aggregate into `PocketGrid.pocketToPointIndices`; derive
-       `assignedIndices` as the union.
-  - Filler selection: dispatch on `params.pocket_grid_fill` enum value.
-  - All `@CompileStatic` + `@Slf4j`.
-
-**Tests:**
- `PocketGridBuilderTest`:
-  - 1fbl.pdb fixture (small, fast). Predict pockets via existing
-    `PrankFacade`; build grid; assert:
-    - `allPoints` count is reasonable for the bounding box (sanity check).
-    - Each pocket has a non-empty point set after fill.
-    - Multi-pocket overlap can occur (count of `(point, pocket)` pairs
-      > count of distinct points).
- Edge case: protein with 0 predicted pockets → `PocketGrid` with
-  `allPoints` non-empty but `pocketToPointIndices` empty.
-
-**Commit:** `Add PocketGridBuilder orchestrating grid + assignment + fill`
-
---
-
-## Phase 5 — Descriptors infrastructure + initial 4
-
-**Goal:** Pluggable descriptors with default `["volume"]`.
-
-**Changes:**
- `descriptors/PocketGridContext.groovy` — data class: `pocket`, `protein`,
-  `gridPointsForPocket`, `pocketGrid`, `params`.
- `descriptors/PocketDescriptor.groovy` — interface:
-  ```groovy
-  String name()
-  ColumnType columnType()   // INT or DOUBLE
-  double compute(PocketGridContext ctx)
-  ```
-  (Return type `double` — INT descriptors cast at write time, mirroring
-  TableData's int-as-double convention.)
- `descriptors/PocketDescriptorRegistry.groovy`:
-  - `static Map<String, PocketDescriptor> REGISTRY` — populated at
-    classload with the 4 shipped descriptors.
-  - `static PocketDescriptor get(String name)` — throws `PrankException`
-    on unknown.
-  - `static Set<String> knownNames()`.
- `VolumeDescriptor.groovy` — `count(gridPoints) × spacing³`.
- `SphericityDescriptor.groovy` — bounding-sphere variant. **Centroid is
-  the centroid of the pocket's assigned grid points**, not
-  `pocket.centroid` (which is derived from surfaceAtoms and would give
-  misleading numbers for asymmetric pockets):
-  - `gridCentroid = mean(p for p in ctx.gridPointsForPocket)`
-  - `r = max(dist(p, gridCentroid))`
-  - `V_sphere = (4/3) · π · r³`
-  - `result = V_pocket / V_sphere` (≤ 1 by construction; clamp is
-    defensive)
- `NumResiduesDescriptor.groovy` — `pocket.residues.size()`.
- `NumSurfaceAtomsDescriptor.groovy` — `pocket.surfaceAtoms.count`.
-
-**Tests:**
- Per-descriptor unit tests using a synthetic small `PocketGridContext`:
-  - Volume: 8 grid cells @ 1Å spacing → V = 8 Å³.
-  - Sphericity: solid sphere of N cells → sphericity ≈ 1.0 (within
-    tolerance for lattice quantization); flat disc → sphericity << 1.
-  - num_residues / num_surface_atoms: stub pockets.
- `PocketDescriptorRegistryTest` — known names resolve; unknown throws.
-
-**Commit:** `Add pocket descriptor framework with 4 initial descriptors`
-
---
-
-## Phase 6 — Export-data classes + exporters
-
-**Goal:** Bridge `PocketGrid` and descriptor computations to `TableExporter`.
-
-**Changes:**
- `PocketGridExportData.groovy` (implements `TableData`):
-  - Constructor takes `PocketGrid` and `boolean includeUnassigned`.
-  - Materializes long-format rows during construction (point-pocket pairs);
-    sort by `(pocket, x, y, z)`.
-  - Columns: `x`, `y`, `z` (DOUBLE), `pocket` (INT).
- `PocketDescriptorsExportData.groovy` (implements `TableData`):
-  - Constructor takes pockets, descriptor results, `boolean includeProbability`.
-  - Columns: `name` (STRING — uses Phase 1 refactor), `rank` (INT),
-    `score` (DOUBLE), `probability` (DOUBLE, conditional),
-    `center_x/y/z` (DOUBLE), then one column per descriptor (INT or
-    DOUBLE per the descriptor's `columnType()`).
- `PocketGridExporter.groovy`:
-  - `static void tryExport(PocketGrid grid, String outdir, String label, Params params)`
-  - Gated by `params.export_pocket_grid`; uses `params.pocket_grid_format`.
-  - Writes `{outdir}/{label}_pocket_grid.{format}`.
- `PocketDescriptorsExporter.groovy`:
-  - `static void tryExport(List<? extends Pocket> pockets, PocketGrid grid, Protein protein, Params params, String outdir, String label)`
-  - Derives `includeProbability` from the data itself:
-    `pockets.any { !Double.isNaN(it.probaTP) }`. No extra parameter
-    threaded through the wiring.
-  - Iterates `params.pocket_descriptors`, computes each, builds
-    `PocketDescriptorsExportData`, writes to file.
-
-**Tests:**
- `PocketGridExportDataTest` — assert row count, sort order, column types
-  on a synthetic `PocketGrid`.
- `PocketDescriptorsExportDataTest` — STRING column round-trips through
-  CSV correctly (depends on Phase 1).
- Integration smoke: small fixture, export to all 7 formats, re-read with
-  the same reader paths used by `PointExportDataTest`.
-
-**Commit:** `Add pocket grid and descriptors exporters`
-
---
-
-## Phase 7 — PyMOL renderer + PDB sidecar
-
-**Goal:** Visualization of the grid in PyMOL.
-
-**Changes:**
- New util in `PocketGridPymolRenderer.groovy`:
-  - `static void render(PocketGrid grid, String outdir, String label, Params params)`
-  - Writes:
-    1. `{outdir}/visualizations/data/{label}_pocket_grid.pdb.gz` — one
-       HETATM per `(point, pocket)` pair; pocket rank in residue-sequence
-       column (cols 23-26); element column = `H` (or `D` for dummy).
-       Mirrors `PredictionVisualizer.writeLabeledPointsPdb:44-56`.
-    2. `{outdir}/visualizations/{label}_pocket_grid.pml`:
-       - `load data/{label}_pocket_grid.pdb.gz, pocket_grid`
-       - Per pocket rank N:
-         - `create pocket_grid_<N>, pocket_grid and resi <N>`
-         - `color <hex>, pocket_grid_<N>` (color via
-           `PredictionVisualizer.generatePocketColors(numPockets)`)
-       - `show spheres, pocket_grid_*`; `set sphere_scale, 0.3`
-       - `delete pocket_grid` (drop the bulk object).
- All paths via `Futils` for cross-platform safety.
-
-**Tests:**
- `PocketGridPymolRendererTest` — synthetic small `PocketGrid` (3 pockets,
-  ~20 points each); assert output files exist; spot-check PML contains
-  `load`, `create pocket_grid_1`, `color`, `show spheres`.
- Sanity: PDB output gzip-decompresses to valid HETATM records.
-
-**Commit:** `Add PocketGridPymolRenderer with PDB sidecar`
-
---
-
-## Phase 8 — Params + Main-startup validation
-
-**Goal:** All 12 new params wired and validated.
-
-**Changes:**
- `Params.groovy` — add 12 `@RuntimeParam` fields with javadoc, defaults
-  per spec table. Place near `export_points` / `export_points_format`.
- `Main.groovy` — extend the existing param-validation block (around
-  `:142-153`, same pattern used by cofactors):
-  - `pocket_grid_format` ∈ allowed enumeration.
-  - `pocket_grid_fill` ∈ {`morph_closing`, `convex_hull`, `none`}.
-  - Every name in `pocket_descriptors` ∈ `PocketDescriptorRegistry.knownNames()`.
-  - If `export_pocket_grid_pml` and `!export_pocket_grid` → throw
-    `PrankException("export_pocket_grid_pml requires export_pocket_grid=true")`.
-
-**Tests:**
- `ParamsTest` — defaults match spec.
- `MainTest` (or wherever cofactor validation is tested) — each of the 4
-  validation failures triggers a fail-fast with a clear message.
-
-**Commit:** `Add pocket grid params and startup validation`
-
---
-
-## Phase 9 — Wire into routines
-
-**Goal:** Call the new pipeline from prediction routines.
-
-**Changes:**
- `PredictPocketsRoutine.groovy`:
-  - After score transformation and the existing
-    `PointsExporter.tryExportPoints(...)` call, insert:
-    ```groovy
-    PocketGrid grid = null
-    if (params.export_pocket_grid || params.export_pocket_descriptors || params.export_pocket_grid_pml) {
-        grid = PocketGridBuilder.build(item.protein, prediction.pockets, params)
-    }
-    PocketGridExporter.tryExport(grid, outdir, item.label, params)
-    PocketDescriptorsExporter.tryExport(prediction.pockets, grid, item.protein, params, outdir, item.label)
-    if (params.visualizations && params.export_pocket_grid_pml) {
-        PocketGridPymolRenderer.render(grid, outdir, item.label, params)
-    }
-    ```
- `RescorePocketsRoutine.groovy` — identical hook at the analogous point.
- Order is critical: build → grid file → descriptors (needs grid for
-  volume) → PML (needs grid).
-
-**Tests:**
- `PredictPocketsRoutineTest` (extend existing) — run a small prediction
-  with `-export_pocket_grid 1 -export_pocket_descriptors 1
-  -export_pocket_grid_pml 1` on 1fbl.pdb; verify all four output files
-  appear at the right paths.
-
-**Commit:** `Wire pocket grid/descriptors/PML into prediction routines`
-
---
-
-## Phase 10 — Documentation
-
-**Goal:** User-facing docs.
-
-**Changes:**
- New `documentation/export-pocket-grid.md`:
-  - Sections: Overview, Output file format (long format, sort order,
-    formats), Algorithm summary (grid generation, assignment, fill),
-    Params table, CLI examples, PyMOL visualization, Notes.
-  - Mirrors the structure of `documentation/export-points.md`.
- New `documentation/export-pocket-descriptors.md`:
-  - Sections: Overview, Output file format, Descriptor catalog
-    (volume, sphericity, num_residues, num_surface_atoms — with
-    formulas), Extensibility (how to add a new descriptor), Params
-    relevant to descriptors.
- `documentation/export-points.md` — append a brief "See also" block at
-  the end pointing to the two new docs.
- `README.md` — single bullet in "What's new" for 2.7 (or whenever this
-  ships) referencing the two new docs.
-
-**Tests:** none (docs only).
-
-**Commit:** `Document pocket grid and descriptors export`
-
---
-
-## Phase 11 — Smoke test on real data
-
-**Goal:** End-to-end on real proteins; eyeball outputs.
-
-**Changes:** none.
-
-**Verification (manual):**
- Run on `distro/test_data/1fbl.pdb` with `-export_pocket_grid 1
-  -export_pocket_descriptors 1 -export_pocket_grid_pml 1`.
- Verify:
-  - Grid CSV row counts and centroid statistics look right (small protein
-    → maybe 5k-15k assigned point-rows).
-  - Descriptors CSV — volume in 50-2000 Å³ range per pocket; sphericity
-    in [0, 1]; residue/atom counts non-zero.
-  - PyMOL: open the PML; visually confirm grid points cluster near
-    predicted pockets, colored consistently with the main pocket PML.
- Run on one of the SwinSite test proteins (1tjw_A) for cross-method
-  sanity.
- No regressions in existing SAS-points export.
-
-**Commit:** none (or "Smoke test results: …" in a project log under `local/`).
-
---
-
-## Risks / clarifications
-
-Notes from the plan review that don't require code changes but are worth
-flagging:
-
- **Sphericity clamp is redundant** — `V_pocket ≤ V_bounding_sphere`
-  always (covering sphere by construction). The `[0, 1]` clamp is purely
-  defensive; keep it.
- **Heavy Phase 4 integration test** — `PocketGridBuilderTest` uses
-  `PrankFacade` to predict pockets, which is slow. Keep the integration
-  test but also add a fast unit test that constructs `Pocket` instances
-  manually with a synthetic `surfaceAtoms` set.
- **Empty `pocket_descriptors`** — `-pocket_descriptors ""` (empty list)
-  is supported: descriptors file emits only the base columns
-  (`name, rank, score, [probability,] center_x/y/z`). Add a regression
-  test in Phase 6.
- **PDB residue-sequence column** is 4 chars (cols 23-26) → pockets are
-  capped at rank 9999 in the PML output. Real pockets stay well under
-  100; document the limit in the PML renderer's javadoc.
- **CSV string quoting** added in Phase 1 fires only for STRING columns.
-  Existing DOUBLE/INT writes stay unquoted — no CSV-format drift for
-  SAS-points export. Mention this in the Phase 1 commit message.
-
-## Out-of-scope (followups noted in spec)
-
- Per-residue descriptors.
- `convex_hull` filler real implementation.
- Pocket overlap matrix output file.
- Long-format SAS-points export.
--- a/misc/todo/pocket_grid/SPEC.md
+++ b/misc/todo/pocket_grid/SPEC.md
@@ -1,353 +0,0 @@
-# Spec — Pocket grid points export + per-pocket descriptors
-
-Status: spec, not plan. Author decisions captured in two rounds:
-
- **Initial 6 Qs:** (1) long-format grid CSV, (2) morph-closing proxy with
-  strategy switch, (3) defaults OK, (4) separate descriptors file,
-  (5) no standalone command, (6) initial descriptor menu accepted.
- **20-audit cross-check vs. code:** see below; all 20 decisions are applied
-  in this revision.
-
-## Goals
-
-Two new opt-in outputs, both produced by any `predict` or `rescore` run,
-plus an optional PyMOL visualization:
-
-1. **`{outdir}/{name}_pocket_grid.{format}`** — regular 3D grid of points
-   covering the empty space around the protein, in **long format**: one row
-   per `(point, pocket)` pair. By default only **assigned** points are
-   written (one or more rows per point, one per pocket they belong to).
-   Unassigned points (`pocket = 0`) can be opted in with
-   `pocket_grid_include_unassigned`.
-2. **`{outdir}/{name}_pocket_descriptors.{format}`** — one row per predicted
-   pocket with score, rank, centroid, and an extensible list of
-   geometric/chemical descriptors (volume from grid-point count, plus
-   others).
-3. **`{outdir}/visualizations/{name}_pocket_grid.pml`** — optional PyMOL
-   visualization, produced by a new renderer.
-
-Both data files reuse the existing `TableExporter` (csv / csv.gz / csv.zst /
-arrow / arrow.gz / arrow.zst / parquet), matching the SAS-points export
-pattern documented at `documentation/export-points.md`. Decoupled from the
-prediction algorithm: P2Rank still scores SAS points exactly as today; the
-grid is a post-prediction geometric overlay used only for descriptor
-computation.
-
-## Prerequisite refactor
-
-**`TableData` and the three writers (`writeCsv`/`writeArrow`/`writeParquet`)
-must be extended to support a `STRING` column type** (audit #1). Currently
-`TableData` only accepts `DOUBLE` and `INT`
-(`src/main/groovy/cz/siret/prank/program/routines/predict/output/TableData.groovy:13-15`).
-Without this, the descriptors file's `name` column cannot be written.
-
-Scope of the refactor:
- Add `ColumnType.STRING` and a `String[] getStringColumn(int)` (or boxed
-  `Object` access path) to `TableData`.
- Extend `writeCsv` to emit strings with proper CSV quoting (escape `,`,
-  `"`, newlines per RFC 4180).
- Extend `writeArrow` to use `VarCharVector` for string columns.
- Extend `writeParquet` to use `BINARY` (UTF8) primitive type for string
-  columns.
- Update `PointExportData` to declare its columns via the new type system
-  (no functional change for SAS-point export — no strings used today).
-
-## Algorithms
-
-### Grid generation (once per protein)
-
-1. Build a KdTree over `protein.proteinAtoms`
-   (`protein.proteinAtoms.withKdTreeConditional()`). Note: when
-   `CofactorHandler` is enabled, cofactor atoms are already merged into
-   `proteinAtoms` (`Protein.groovy:571-583`) — no separate union step
-   needed (audit #4).
-2. Bounding box around `protein.proteinAtoms`, expanded by
-   `pocket_grid_max_dist` in every direction (reuses
-   `Box.aroundAtoms(...).withMargin(...)`).
-3. Walk a regular cubic lattice with edge `pocket_grid_spacing` inside the
-   box (reuses `GridGenerator.forBox(box, edge)`).
-4. Per-atom VdW radius via CDK `Elements` (audit #2). Reuse the same
-   accessor pattern as `PatchedCdkNumericalSurface` — when CDK returns
-   `null` for an element, fall back to the Krypton proxy (2.02 Å), matching
-   the existing null-VdW workaround. Implemented as a small helper
-   `VdwRadiusTable.get(Atom) → double`.
-5. For each lattice point:
-   - **drop** if `min_dist_to_proteinAtoms < vdw_radius(nearest_atom) + pocket_grid_atom_buffer`
-     — overlaps the protein;
-   - **drop** if `min_dist_to_proteinAtoms > pocket_grid_max_dist` — too
-     far from the surface;
-   - **keep** otherwise.
-
-**Implementation note** (audit #3): extend
-`GridGenerator.sampleGridPointsAroundAtoms` (`GridGenerator.java:157-172`)
-to accept both `minDist` (semantically per-atom: VdW + buffer) and
-`maxDist`. The current method already does the `maxDist` side; the
-extension is the per-atom-VdW exclusion check.
-
-### Per-pocket assignment (multi-valued)
-
-1. For each pocket `p`, take all kept grid points within
-   `pocket_grid_assign_cutoff` of any atom in `p.surfaceAtoms`. That's the
-   *raw shell* — analogous to `SwinSiteLoader`'s `cutoutShell` at
-   `SwinSiteLoader.groovy:92-100`.
-2. **Shape fill** (pluggable via `pocket_grid_fill`, runs **per-pocket** —
-   each pocket's raw shell is dilated independently, audit #6):
-   - `morph_closing` (default): morphological closing on the lattice. Mark
-     any unassigned lattice cell whose 6-/18-/26-neighborhood contains
-     ≥ `pocket_grid_fill_min_neighbors` already-assigned cells; iterate
-     until stable or `pocket_grid_fill_max_iters` reached. Integer-grid
-     native, no extra deps.
-   - `convex_hull`: build the 3D convex hull of the raw shell (Quickhull or
-     equivalent — TBD at plan time); include every lattice point inside.
-     Exact; pulls a hull dependency.
-   - `none`: keep the raw shell exactly.
-
-   The `PocketShapeFiller` strategy interface (see Extensibility) makes
-   adding alternatives a single-file change.
-3. A grid point may belong to multiple pockets. In the output file each
-   `(point, pocket)` membership is a separate row.
-
-### Descriptor computation
-
-After assignment, for each pocket and each name in `pocket_descriptors`,
-look up the registered `PocketDescriptor` and compute. See "Initial
-descriptor menu" below.
-
-## New params (additions to `Params.groovy`)
-
-All carry `@RuntimeParam` (audit #7) — runtime / output concerns, not
-training.
-
-Allowed values for `pocket_grid_format` (audit #8, enumerated explicitly to
-avoid drift): `csv`, `csv.gz`, `csv.zst`, `arrow`, `arrow.gz`, `arrow.zst`,
-`parquet`.
-
-| Param | Default | Notes |
-|---|---|---|
-| `export_pocket_grid` | `false` | gate for the grid-points file |
-| `export_pocket_descriptors` | `false` | gate for the descriptors file |
-| `export_pocket_grid_pml` | `false` | gate for the PyMOL visualization; requires `export_pocket_grid=true` (fail-fast otherwise, audit #16) |
-| `pocket_grid_format` | `"csv"` | one of the enumerated values above |
-| `pocket_grid_include_unassigned` | `false` | include `pocket = 0` rows in the grid file |
-| `pocket_grid_spacing` | `1.0` (Å) | lattice edge; volume scales with this³ |
-| `pocket_grid_max_dist` | `6.0` (Å) | upper bound: nearest-atom distance to keep a grid point |
-| `pocket_grid_atom_buffer` | `0.5` (Å) | additive buffer on per-atom VdW exclusion: keep if `dist > vdw_radius(atom) + buffer` (audit #9) |
-| `pocket_grid_assign_cutoff` | `4.5` (Å) | membership cutoff vs. `pocket.surfaceAtoms`; matches `SwinSiteLoader.SURFACE_ATOMS_CUTOFF` |
-| `pocket_grid_fill` | `"morph_closing"` | one of `morph_closing`, `convex_hull`, `none` |
-| `pocket_grid_fill_min_neighbors` | `3` | morph_closing only — neighbor count threshold |
-| `pocket_grid_fill_max_iters` | `5` | morph_closing only — guard against runaway dilation |
-| `pocket_descriptors` | `["volume"]` | list-param; each name selects a registered descriptor |
-
-**Validation** (audit #10): unknown values in `pocket_descriptors`,
-`pocket_grid_fill`, and `pocket_grid_format`, plus the
-`export_pocket_grid_pml ⇒ export_pocket_grid` invariant, are checked at
-Main startup. Same pattern as the cofactor validation at
-`Main.groovy:142-153`.
-
-## Output schemas
-
-### `{name}_pocket_grid.{format}` (long format)
-
-| Column | Type | Description |
-|---|---|---|
-| `x`, `y`, `z` | f64 | grid point coordinate |
-| `pocket` | i32 | pocket rank this row belongs to; `0` only present if `pocket_grid_include_unassigned` is on |
-
-**Sort order** (audit #5): rows sorted by `pocket` asc, then `x` asc,
-`y` asc, `z` asc. `pocket=0` (if enabled) goes last so readers that only
-care about assigned points can stop early. Deterministic and reproducible
-across runs.
-
-### `{name}_pocket_descriptors.{format}`
-
-Base columns (always present), then one column per name in
-`pocket_descriptors`:
-
-| Column | Type | Source |
-|---|---|---|
-| `name` | string | `pocket.name` (requires `TableData` STRING support, prerequisite refactor) |
-| `rank` | i32 | `pocket.rank` |
-| `score` | f64 | `pocket.score` |
-| `probability` | f64 | from score transformer; **column omitted entirely** when no transformer ran |
-| `center_x`, `center_y`, `center_z` | f64 | `pocket.centroid` |
-| `<descriptor>` | f64 / i32 | one per requested descriptor |
-
-**`probability` column inclusion** (audit #19): controlled by a constructor
-flag on the export-data class, mirroring `PointExportData.includeScore`
-(`PointExportData.groovy:47-48`). Schema is fixed at construction; no
-runtime branching on row write.
-
-## Initial descriptor menu
-
-Shipped registry:
-
-| Name | Output | Definition |
-|---|---|---|
-| `volume` | f64 (Å³) | `\|assigned grid points\| × pocket_grid_spacing³` |
-| `sphericity` | f64 in [0, 1] | `V_pocket / V_bounding_sphere`, where `V_bounding_sphere = (4/3)π · r³` with `r = max(\|p − centroid\|)` over the pocket's grid points. Quantization-free; 1 = perfect sphere. (audit #18 — replaces the boundary-area formula) |
-| `num_residues` | i32 | `pocket.residues.size()` (reuses existing accessor, audit #17) |
-| `num_surface_atoms` | i32 | `pocket.surfaceAtoms.count` |
-
-`volume` is the default value of `pocket_descriptors`. Others must be opted
-in by name.
-
-## Extensibility
-
-All new Groovy classes carry `@CompileStatic` and `@Slf4j` per repo
-convention (audit #20).
-
-```
-src/main/groovy/cz/siret/prank/program/routines/predict/output/descriptors/
-  ├── PocketDescriptor.groovy          # interface: String name(); Object compute(PocketGridContext ctx)
-  ├── PocketDescriptorRegistry.groovy  # name → factory; selection from Params.pocket_descriptors
-  ├── VolumeDescriptor.groovy
-  ├── SphericityDescriptor.groovy
-  ├── NumResiduesDescriptor.groovy
-  └── NumSurfaceAtomsDescriptor.groovy
-
-src/main/groovy/cz/siret/prank/program/routines/predict/output/grid/
-  ├── PocketGrid.groovy                # data: kept points + per-pocket assignment map
-  ├── PocketGridBuilder.groovy         # generation + assignment + fill orchestration
-  ├── VdwRadiusTable.groovy            # Atom → double, via CDK Elements + Krypton fallback
-  └── fill/
-        ├── PocketShapeFiller.groovy   # interface: Set<Point> fill(rawShell, allPoints, params)
-        ├── MorphologicalCloser.groovy
-        ├── ConvexHullFiller.groovy    # may be stub initially
-        └── NoOpFiller.groovy
-```
-
-`PocketGridContext` exposes: the per-pocket grid-point set, the global
-grid, the pocket, the protein, and `Params`. Adding a descriptor = drop one
-file in `descriptors/` + register the name. Adding a fill strategy = drop
-one file in `fill/` + extend the enum.
-
-## Pocket grid visualization
-
-Output:
-- `{outdir}/visualizations/data/{name}_pocket_grid.pdb.gz` — one HETATM per
-  grid point; pocket rank stored in the residue-sequence column (mirrors
-  `writeLabeledPointsPdb` at `PredictionVisualizer.groovy:44-56`); generated
-  in long format (one HETATM per `(point, pocket)` pair so PyMOL can split
-  by residue).
-- `{outdir}/visualizations/{name}_pocket_grid.pml` — small PyMOL script
-  that `load`s the PDB and colors by residue.
-
-This **PDB-sidecar approach** (audit #11) replaces the earlier inline
-`pseudoatom`-per-point design — at ~20k–100k grid points the inline
-approach would take seconds-to-minutes for PyMOL to load.
-
-**Renderer:**
-`src/main/groovy/cz/siret/prank/program/visualization/renderers/PocketGridPymolRenderer.groovy`,
-parallel to `PymolRenderer` / `ChimeraXRenderer`. Takes the in-memory
-`PocketGrid` (not the CSV file — the grid is already in memory and the PDB
-sidecar is derived from it, audit #15 makes the format constraint moot).
-
-**Colors:** reuse `PredictionVisualizer.generatePocketColors(numPockets)`
-(`PredictionVisualizer.groovy:38`) so the grid PML matches the main pocket
-PML palette (audit #13).
-
-**Layout in the PML:**
-- `load .../data/{name}_pocket_grid.pdb.gz, pocket_grid`
-- Per pocket: `create pocket_grid_<rank>, pocket_grid and resi <rank>` and
-  `color <hex>, pocket_grid_<rank>`.
-- `show spheres, pocket_grid_*` with small `sphere_scale` (e.g. 0.3).
-
-**Path layout** (audit #12): data files (`_pocket_grid.{fmt}`,
-`_pocket_descriptors.{fmt}`) at the root of `outdir`, matching the SAS
-points export. Visualization artifacts (`_pocket_grid.pdb.gz`,
-`_pocket_grid.pml`) under `visualizations/` / `visualizations/data/`,
-matching the existing main-PML layout.
-
-**Master visualization switch** (audit #14): respects `visualizations=false`
-— if visualizations are globally off, the grid PML + PDB sidecar are
-skipped even when `export_pocket_grid_pml=true`. Single off-switch for ALL
-viz.
-
-**Independence from `vis_renderers`:** the new renderer has its own gate
-(`export_pocket_grid_pml`) and does *not* tie into the
-`["pymol", "chimerax"]` renderer list. The grid PML is a power-user output
-that shouldn't be implicit. Easy to revisit if usage patterns argue
-otherwise.
-
-## CLI examples
-
-```bash
-# grid + default descriptors (just volume), parquet
-prank predict -f protein.pdb -export_pocket_grid 1 -export_pocket_descriptors 1 \
-    -pocket_grid_format parquet
-
-# custom descriptor list + tighter grid
-prank predict dataset.ds -export_pocket_descriptors 1 \
-    -pocket_descriptors "volume,sphericity,num_residues,num_surface_atoms" \
-    -pocket_grid_spacing 0.75 -pocket_grid_max_dist 5
-
-# rescore with grid export, arrow.zst
-prank rescore fpocket.ds -export_pocket_grid 1 -pocket_grid_format arrow.zst
-
-# switch fill strategy (e.g. for ablation studies)
-prank predict -f protein.pdb -export_pocket_grid 1 -pocket_grid_fill none
-
-# grid CSV + PyMOL visualization
-prank predict -f protein.pdb -export_pocket_grid 1 -export_pocket_grid_pml 1
-
-# also keep the unassigned envelope (e.g. for debugging the grid generator)
-prank predict -f protein.pdb -export_pocket_grid 1 -pocket_grid_include_unassigned 1
-```
-
-## Files touched (preview, plan will refine)
-
-New:
-- `descriptors/` and `grid/` packages as above
-- `PocketGridExporter.groovy` + `PocketDescriptorsExporter.groovy` next to
-  `PointsExporter.groovy`
-- `PocketGridExportData` / `PocketDescriptorsExportData` data classes next
-  to `PointExportData.groovy`
-- `PocketGridPymolRenderer.groovy` under `program/visualization/renderers/`
-- Tests next to each new class
-- **`documentation/export-pocket-grid.md`** — user-facing how-to for the
-  grid file: algorithm summary, sort order, params, format options, CLI
-  examples, PyMOL visualization details
-- **`documentation/export-pocket-descriptors.md`** — descriptors file
-  format, descriptor catalog with formulas, extensibility for adding new
-  descriptors
-
-Modified:
-- `Params.groovy` — 11 new `@RuntimeParam` fields (table above)
-- `Main.groovy` — startup validation hooks for `pocket_descriptors`,
-  `pocket_grid_fill`, `pocket_grid_format`, and the
-  `export_pocket_grid_pml ⇒ export_pocket_grid` invariant
-- `PredictPocketsRoutine.groovy` + `RescorePocketsRoutine.groovy` — wire
-  the new exporters and renderer at the same hook point as
-  `PointsExporter.tryExportPoints`
-- `TableData.groovy` + `TableExporter.groovy` + `PointExportData.groovy` —
-  STRING column-type support (prerequisite refactor)
-- `GridGenerator.java` — extend `sampleGridPointsAroundAtoms` to accept a
-  per-atom minDist (VdW + buffer) alongside the existing maxDist
-- `documentation/export-points.md` — cross-reference the two new docs from
-  the "See also" section
-
-Not touched:
-- `PredictionSummary.toCSV()` / `predictions.csv` schema — descriptors live
-  in their own file.
-- `PocketStats.realVolumeApprox` — keep as-is; SwinSite still uses it. The
-  new grid-volume is independent.
-
-## Scope notes
-
-- Cofactor atoms participate in the bounding box and the VdW exclusion via
-  their inclusion in `protein.proteinAtoms`. They do **not** affect
-  `pocket.surfaceAtoms` membership for assignment — the existing pocket
-  surface-atom set defines membership.
-- Outputs are computed *after* score transformation so `probability` is
-  available when applicable.
-- `breaking-changes.md` (2.7 or whenever this ships) gets a bullet for the
-  new param family and the new output files.
-
-## Followups / not in this spec
-
-- Per-residue descriptors (different file, different aggregation).
-- Pocket overlap matrix (cheap byproduct of the long-format grid file —
-  group-by `pocket` and intersect, or compute eagerly and dump as
-  `{name}_pocket_overlap.csv`).
-- Long-format SAS-points export (parallel change, separate spec).
-- Real-3D-hull `convex_hull` filler (initial ship may stub it).
--- a/src/main/groovy/cz/siret/prank/program/Main.groovy
+++ b/src/main/groovy/cz/siret/prank/program/Main.groovy
@@ -204,10 +204,7 @@ class Main implements Parametrized, Writable {
        }
    }

-    /**
-     * Fail-fast validation for the pocket-grid export feature
-     * (see misc/todo/pocket_grid/SPEC.md).
-     */
+    /** Fail-fast validation for the pocket-grid export feature. */
    private void validatePocketGridParams() {
        // pocket_grid_format must be one of the values supported by TableExporter.
        Set<String> allowedFormats = ['csv', 'csv.gz', 'csv.zst',
--- a/src/main/groovy/cz/siret/prank/program/routines/predict/PredictPocketsRoutine.groovy
+++ b/src/main/groovy/cz/siret/prank/program/routines/predict/PredictPocketsRoutine.groovy
@@ -156,8 +156,7 @@ class PredictPocketsRoutine extends Routine {
                    new GetcleftOutputCalculator().generateGetcleftSasPdbFiles(pair.prediction, outdir)
                }

-                // Pocket grid + descriptors export + optional PyMOL viz
-                // (see misc/todo/pocket_grid/SPEC.md).
+                // Pocket grid + descriptors export + optional PyMOL viz.
                PocketGridOutputs.exportIfEnabled(pair.prediction, item.protein, outdir, item.label)
            }

--- a/src/main/groovy/cz/siret/prank/program/routines/predict/RescorePocketsRoutine.groovy
+++ b/src/main/groovy/cz/siret/prank/program/routines/predict/RescorePocketsRoutine.groovy
@@ -129,8 +129,7 @@ class RescorePocketsRoutine extends Routine {
                // Export SAS points with feature vectors and scores (pocket points only in rescore mode)
                PointsExporter.tryExportPoints(rescorer.exportData, outdir, item.label)

-                // Pocket grid + descriptors export + optional PyMOL viz
-                // (see misc/todo/pocket_grid/SPEC.md).
+                // Pocket grid + descriptors export + optional PyMOL viz.
                PocketGridOutputs.exportIfEnabled(pair.prediction, item.protein, outdir, item.label)
            }