mirror of
https://github.com/rdk/p2rank.git
synced 2026-06-04 12:44:24 +08:00
Per-pocket descriptors: multi-column interface + PrincipalMomentsDescriptor
Unifies the per-pocket descriptor framework with the per-grid-point
framework: same shape (name + columnNames + columnTypes + double[]
compute), same multi-column "{name}.{col}" header convention, same
public register / unregister / dup-column-check registry. Shipped as
breaking change behind the same -pocket_descriptors knob.
Interface change:
String name();
List<String> columnNames();
List<ColumnType> columnTypes();
double[] compute(PocketGridContext);
boolean needsGrid(); // unchanged
Scalar descriptors stay one-liners via the new
AbstractScalarPocketDescriptor adapter (name + scalarType +
computeScalar). The 6 existing descriptors migrated; behavior and
output byte-identical to before.
New descriptor: PrincipalMomentsDescriptor (3 × DOUBLE) — the three
eigenvalues of the pocket grid points' gyration tensor, sorted
descending. Implementation uses Apache Commons Math 3
EigenDecomposition. Shape signature complement to sphericity /
radius_of_gyration; sum equals radius_of_gyration² (verified in test).
Added to the default -pocket_descriptors list.
Default list reordered to put num_* (cheap, integer-valued) first,
then geometric scalars, then principal_moments:
num_residues, num_surface_atoms, num_grid_points,
volume, sphericity, radius_of_gyration,
principal_moments
Tests:
- 5 new PrincipalMomentsDescriptor tests (cube isotropy, rod-shape
eigenvalues, sort order, degenerate empty/single, sum=Rg²)
- PocketDescriptorsRowsTest +2 (multi-column prefix rule, mixed
scalar + multi ordering)
- existing 13 callsites updated for the double[] return signature
- columnType() registry test → columnTypes()
User-visible change: the default -pocket_descriptors output now has
three new columns (principal_moments.lambda1/2/3) and the existing
columns appear in a different order. Scripts parsing by column name
are unaffected; scripts parsing by column index need updating.
This commit is contained in:
@@ -17,8 +17,8 @@ counts, etc.) written to a tabular file alongside any `predict` or
|
||||
## Quick start
|
||||
|
||||
```bash
|
||||
# Default: every shipped descriptor (volume, sphericity, radius_of_gyration,
|
||||
# num_residues, num_surface_atoms, num_grid_points)
|
||||
# Default: every shipped descriptor (num_residues, num_surface_atoms,
|
||||
# num_grid_points, volume, sphericity, radius_of_gyration, principal_moments)
|
||||
prank predict -f protein.pdb -export_pocket_descriptors 1
|
||||
|
||||
# Narrow set + tighter grid for more accurate volume/sphericity
|
||||
@@ -41,21 +41,25 @@ One row per predicted pocket.
|
||||
| `score` | f64 | Raw P2Rank pocket score |
|
||||
| `probability` | f64 | Calibrated probability from the score transformer. **Column is omitted entirely** when no transformer ran |
|
||||
| `center_x`, `center_y`, `center_z` | f64 | Pocket centroid coordinates |
|
||||
| *(one column per requested descriptor)* | f64 / i32 | See descriptor catalog below |
|
||||
| *(one or more columns per requested descriptor)* | f64 / i32 | See descriptor catalog below |
|
||||
|
||||
Descriptor columns appear in the order given on the command line via
|
||||
`-pocket_descriptors`.
|
||||
`-pocket_descriptors`. Most descriptors emit a single column whose header is
|
||||
the descriptor name; multi-column descriptors emit N columns prefixed with
|
||||
`"{name}."` (e.g. `principal_moments.lambda1`, `principal_moments.lambda2`,
|
||||
`principal_moments.lambda3`).
|
||||
|
||||
## Descriptor catalog
|
||||
|
||||
| Name | Output | Definition |
|
||||
| Name | Columns | Definition |
|
||||
|---|---|---|
|
||||
| `volume` | f64 | Pocket volume in **ų**: `\|assigned grid points\| × pocket_grid_spacing³`. Accuracy scales with the lattice spacing (smaller `pocket_grid_spacing` → finer estimate). |
|
||||
| `sphericity` | f64 ∈ [0, 1] | `V_pocket / V_bounding_sphere`. Bounding sphere is centered at the **centroid of the pocket's grid points** (not `pocket.centroid` which is atom-derived); radius is the max distance from that centroid. Quantization-free. 1 = perfect sphere; ≪ 1 = elongated / irregular. |
|
||||
| `radius_of_gyration` | f64 | Radius of gyration in **Å**: `sqrt(mean(\|r_i - r_cm\|²))` over the pocket's grid points (equal weights). Absolute spatial extent — pairs well with `sphericity`, which only captures compactness. `0` for empty / single-point pockets. |
|
||||
| `num_residues` | i32 | Number of distinct residues touching the pocket (reuses `Pocket.getResidues()`). |
|
||||
| `num_surface_atoms` | i32 | Size of `pocket.surfaceAtoms`. |
|
||||
| `num_grid_points` | i32 | Total grid points assigned to the pocket (cardinality of the BitSet after shape fill). Raw count complement to `volume`. |
|
||||
| `volume` | 1 × f64 | Pocket volume in **ų**: `\|assigned grid points\| × pocket_grid_spacing³`. Accuracy scales with the lattice spacing (smaller `pocket_grid_spacing` → finer estimate). |
|
||||
| `sphericity` | 1 × f64 ∈ [0, 1] | `V_pocket / V_bounding_sphere`. Bounding sphere is centered at the **centroid of the pocket's grid points** (not `pocket.centroid` which is atom-derived); radius is the max distance from that centroid. Quantization-free. 1 = perfect sphere; ≪ 1 = elongated / irregular. |
|
||||
| `radius_of_gyration` | 1 × f64 | Radius of gyration in **Å**: `sqrt(mean(\|r_i - r_cm\|²))` over the pocket's grid points (equal weights). Absolute spatial extent — pairs well with `sphericity`, which only captures compactness. `0` for empty / single-point pockets. |
|
||||
| `num_residues` | 1 × i32 | Number of distinct residues touching the pocket (reuses `Pocket.getResidues()`). |
|
||||
| `num_surface_atoms` | 1 × i32 | Size of `pocket.surfaceAtoms`. |
|
||||
| `num_grid_points` | 1 × i32 | Total grid points assigned to the pocket (cardinality of the BitSet after shape fill). Raw count complement to `volume`. |
|
||||
| `principal_moments` | 3 × f64 | Three eigenvalues of the pocket grid points' gyration tensor (equal-weight PCA), sorted descending: `principal_moments.lambda1` ≥ `lambda2` ≥ `lambda3`. Unit Ų. Shape signature: λ₁≈λ₂≈λ₃ → sphere; λ₁≫λ₂,λ₃ → rod; λ₁≈λ₂≫λ₃ → disk. Sum equals `radius_of_gyration²`. `0`s for pockets with <2 grid points. |
|
||||
|
||||
`-pocket_descriptors` defaults to **all of the above** — they share the
|
||||
pocket-grid input, so adding more is essentially free once the grid is
|
||||
@@ -86,22 +90,31 @@ Implementations live under
|
||||
|
||||
1. Implement the `PocketDescriptor` interface:
|
||||
```java
|
||||
String name();
|
||||
ColumnType columnType();
|
||||
double compute(PocketGridContext ctx);
|
||||
boolean needsGrid(); // default true — override to false if compute()
|
||||
// doesn't read ctx.grid() or ctx.gridPointIndices()
|
||||
String name(); // CLI token and multi-column header prefix
|
||||
List<String> columnNames(); // sub-names; scalar entry IGNORED at output
|
||||
List<ColumnType> columnTypes(); // parallel to columnNames()
|
||||
double[] compute(PocketGridContext); // same length as columnNames()
|
||||
boolean needsGrid(); // default true; override to false if compute()
|
||||
// doesn't read ctx.grid() or ctx.gridPointIndices()
|
||||
```
|
||||
`PocketGridContext` exposes `pocket`, `protein`, `grid`, and the
|
||||
per-pocket `gridPointIndices` set. If your `compute()` only reads
|
||||
`ctx.pocket()` (i.e., domain fields like `surfaceAtoms` or `residues`),
|
||||
override `needsGrid()` to return `false` — that lets the orchestrator
|
||||
skip the full grid build when only grid-free descriptors are selected.
|
||||
Leaving it at the default-true on a truly grid-free descriptor is not
|
||||
incorrect, but forces a wasted build per protein.
|
||||
|
||||
For **scalar** descriptors (one column), extend `AbstractScalarPocketDescriptor`
|
||||
instead of implementing the interface directly — it boils the boilerplate down
|
||||
to `name()`, `scalarType()`, and `computeScalar(ctx)`. The 6 base shipped
|
||||
descriptors use this adapter.
|
||||
|
||||
For **multi-column** descriptors (e.g. `principal_moments` with three
|
||||
eigenvalues from a single decomposition), implement `PocketDescriptor`
|
||||
directly; output column headers are `"{name()}.{columnNames()[i]}"`.
|
||||
|
||||
2. Register the implementation in `PocketDescriptorRegistry`'s static
|
||||
initializer (Java; no auto-discovery).
|
||||
initializer (Java; no auto-discovery). The registry rejects descriptors
|
||||
that declare duplicate `columnNames` at registration time.
|
||||
|
||||
3. Users can opt into it by name via
|
||||
`-pocket_descriptors "volume,my_new_descriptor"`.
|
||||
@@ -113,10 +126,9 @@ Implementations live under
|
||||
user's output schema — that's intentional; skip step 4 if the new
|
||||
descriptor is opt-in only.
|
||||
|
||||
INT descriptors return their value as a `double` that the writer
|
||||
downcasts at output time, matching the existing `TableData` convention.
|
||||
Implementations must guarantee the value fits in i32 (no concern in
|
||||
practice for pocket-grid counts).
|
||||
INT columns return their value as a `double` that the writer downcasts at
|
||||
output time, matching the existing `TableData` convention. Implementations
|
||||
must guarantee the value fits in i32.
|
||||
|
||||
## See also
|
||||
|
||||
|
||||
@@ -954,8 +954,9 @@ class Params {
|
||||
* pocket-grid input, so adding more to the list is essentially free.
|
||||
*/
|
||||
@RuntimeParam
|
||||
List<String> pocket_descriptors = ["volume", "sphericity", "radius_of_gyration",
|
||||
"num_residues", "num_surface_atoms", "num_grid_points"]
|
||||
List<String> pocket_descriptors = ["num_residues", "num_surface_atoms", "num_grid_points",
|
||||
"volume", "sphericity", "radius_of_gyration",
|
||||
"principal_moments"]
|
||||
|
||||
/**
|
||||
* Per-grid-point descriptors appended as extra columns to the pocket-grid
|
||||
|
||||
@@ -18,12 +18,19 @@ import org.biojava.nbio.structure.Atom
|
||||
* [probability (DOUBLE) — only when at least one pocket has a calibrated
|
||||
* probaTP set by the score transformer],
|
||||
* center_x (DOUBLE), center_y (DOUBLE), center_z (DOUBLE),
|
||||
* <descriptor1>, <descriptor2>, ... // one column per requested descriptor
|
||||
* <descriptor1 col(s)>, <descriptor2 col(s)>, ...
|
||||
* </pre>
|
||||
*
|
||||
* <p>Each descriptor contributes 1 or more columns. Scalar descriptors emit
|
||||
* one column headed by {@link PocketDescriptor#name()}; multi-column descriptors
|
||||
* emit N columns prefixed with {@code "{name()}."} (e.g. principal_moments emits
|
||||
* {@code principal_moments.lambda1}, {@code principal_moments.lambda2},
|
||||
* {@code principal_moments.lambda3}).
|
||||
*
|
||||
* <p>{@code includeProbability} is derived from the data — no caller has to
|
||||
* thread the flag through. {@code descriptorNames} ordering is preserved in
|
||||
* the output columns.
|
||||
* the output columns; columns within a multi-column descriptor follow the order
|
||||
* declared by {@link PocketDescriptor#columnNames()}.
|
||||
*/
|
||||
@CompileStatic
|
||||
final class PocketDescriptorsRows implements TableData {
|
||||
@@ -34,8 +41,10 @@ final class PocketDescriptorsRows implements TableData {
|
||||
private final boolean includeProbability
|
||||
private final List<String> header
|
||||
private final ColumnType[] columnTypes
|
||||
/** Pre-computed descriptor values: [pocketIndex][descriptorIndex]. */
|
||||
/** Pre-computed descriptor values per pocket: flat array of all descriptor columns. */
|
||||
private final double[][] descriptorValues
|
||||
/** Total number of descriptor columns across all descriptors (sum of arities). */
|
||||
private final int totalDescriptorCols
|
||||
|
||||
PocketDescriptorsRows(List<? extends Pocket> pockets,
|
||||
List<String> descriptorNames,
|
||||
@@ -77,7 +86,8 @@ final class PocketDescriptorsRows implements TableData {
|
||||
}
|
||||
}
|
||||
|
||||
// Build header + column types.
|
||||
// Build header + column types. Apply the "{name}.{col}" prefix rule for
|
||||
// multi-column descriptors; scalar descriptors get the bare name().
|
||||
List<String> h = new ArrayList<>()
|
||||
List<ColumnType> ct = new ArrayList<>()
|
||||
h.add('name'); ct.add(ColumnType.STRING)
|
||||
@@ -89,20 +99,35 @@ final class PocketDescriptorsRows implements TableData {
|
||||
h.add('center_x'); ct.add(ColumnType.DOUBLE)
|
||||
h.add('center_y'); ct.add(ColumnType.DOUBLE)
|
||||
h.add('center_z'); ct.add(ColumnType.DOUBLE)
|
||||
int total = 0
|
||||
for (PocketDescriptor d : descriptors) {
|
||||
h.add(d.name()); ct.add(d.columnType())
|
||||
List<String> cols = d.columnNames()
|
||||
List<ColumnType> types = d.columnTypes()
|
||||
boolean multi = cols.size() > 1
|
||||
for (int i = 0; i < cols.size(); i++) {
|
||||
String headerName = multi ? d.name() + "." + cols.get(i) : d.name()
|
||||
h.add(headerName)
|
||||
ct.add(types.get(i))
|
||||
}
|
||||
total += cols.size()
|
||||
}
|
||||
this.header = h.asImmutable()
|
||||
this.columnTypes = ct.toArray(new ColumnType[0])
|
||||
this.totalDescriptorCols = total
|
||||
|
||||
// Pre-compute descriptor values once.
|
||||
this.descriptorValues = new double[pockets.size()][descriptors.size()]
|
||||
// Pre-compute descriptor values once. Flat layout: descriptorValues[pocket][col],
|
||||
// where col is the index into the flattened descriptor schema (in registration order).
|
||||
this.descriptorValues = new double[pockets.size()][totalDescriptorCols]
|
||||
for (int i = 0; i < pockets.size(); i++) {
|
||||
Pocket p = pockets.get(i)
|
||||
BitSet indices = grid != null ? grid.indicesForPocket(p.rank) : new BitSet()
|
||||
PocketGridContext ctx = new PocketGridContext(p, protein, grid, indices)
|
||||
for (int d = 0; d < descriptors.size(); d++) {
|
||||
descriptorValues[i][d] = descriptors.get(d).compute(ctx)
|
||||
int col = 0
|
||||
for (PocketDescriptor d : descriptors) {
|
||||
double[] vals = d.compute(ctx)
|
||||
for (int k = 0; k < vals.length; k++) {
|
||||
descriptorValues[i][col++] = vals[k]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -0,0 +1,48 @@
|
||||
package cz.siret.prank.program.routines.predict.output.descriptors;
|
||||
|
||||
import cz.siret.prank.program.routines.predict.output.TableData.ColumnType;
|
||||
|
||||
import java.util.Collections;
|
||||
import java.util.List;
|
||||
|
||||
/**
|
||||
* Sugar adapter for single-column {@link PocketDescriptor} implementations.
|
||||
*
|
||||
* <p>Most existing descriptors are scalar (one value per pocket); having them
|
||||
* implement the multi-column interface directly is noise. Subclasses just
|
||||
* override {@link #name}, {@link #scalarType}, and {@link #computeScalar}.
|
||||
*
|
||||
* <p>The single sub-column-name is the empty string — the header builder in
|
||||
* {@link cz.siret.prank.program.routines.predict.output.PocketDescriptorsRows}
|
||||
* special-cases the scalar branch (size 1) by emitting {@link #name} alone,
|
||||
* so the empty entry never reaches the output schema.
|
||||
*
|
||||
* <p>Parallel to
|
||||
* {@link cz.siret.prank.program.routines.predict.output.grid.descriptors.PocketGridPointDescriptor}'s
|
||||
* "single-column impls use a one-element list" convention (no shared adapter
|
||||
* exists for that interface yet because both shipped descriptors are multi-column).
|
||||
*/
|
||||
public abstract class AbstractScalarPocketDescriptor implements PocketDescriptor {
|
||||
|
||||
private static final List<String> SCALAR_NAMES = Collections.singletonList("");
|
||||
|
||||
@Override
|
||||
public final List<String> columnNames() {
|
||||
return SCALAR_NAMES;
|
||||
}
|
||||
|
||||
@Override
|
||||
public final List<ColumnType> columnTypes() {
|
||||
return Collections.singletonList(scalarType());
|
||||
}
|
||||
|
||||
@Override
|
||||
public final double[] compute(PocketGridContext ctx) {
|
||||
return new double[] { computeScalar(ctx) };
|
||||
}
|
||||
|
||||
protected abstract ColumnType scalarType();
|
||||
|
||||
protected abstract double computeScalar(PocketGridContext ctx);
|
||||
|
||||
}
|
||||
@@ -9,13 +9,13 @@ import cz.siret.prank.program.routines.predict.output.TableData.ColumnType;
|
||||
* <p>Useful as a raw size complement to {@code volume}: scales linearly with
|
||||
* cell count whereas volume scales with {@code count * spacing³}.
|
||||
*/
|
||||
public final class NumGridPointsDescriptor implements PocketDescriptor {
|
||||
public final class NumGridPointsDescriptor extends AbstractScalarPocketDescriptor {
|
||||
|
||||
@Override public String name() { return "num_grid_points"; }
|
||||
@Override public ColumnType columnType() { return ColumnType.INT; }
|
||||
@Override protected ColumnType scalarType() { return ColumnType.INT; }
|
||||
|
||||
@Override
|
||||
public double compute(PocketGridContext ctx) {
|
||||
protected double computeScalar(PocketGridContext ctx) {
|
||||
return ctx.gridPointIndices().cardinality();
|
||||
}
|
||||
|
||||
|
||||
@@ -7,14 +7,14 @@ import cz.siret.prank.program.routines.predict.output.TableData.ColumnType;
|
||||
* {@code Pocket.getResidues()} which lazily derives the list from
|
||||
* {@code surfaceAtoms.distinctGroupsSorted}.
|
||||
*/
|
||||
public final class NumResiduesDescriptor implements PocketDescriptor {
|
||||
public final class NumResiduesDescriptor extends AbstractScalarPocketDescriptor {
|
||||
|
||||
@Override public String name() { return "num_residues"; }
|
||||
@Override public ColumnType columnType() { return ColumnType.INT; }
|
||||
@Override protected ColumnType scalarType() { return ColumnType.INT; }
|
||||
@Override public boolean needsGrid() { return false; }
|
||||
|
||||
@Override
|
||||
public double compute(PocketGridContext ctx) {
|
||||
protected double computeScalar(PocketGridContext ctx) {
|
||||
return ctx.pocket().getResidues().size();
|
||||
}
|
||||
|
||||
|
||||
@@ -5,14 +5,14 @@ import cz.siret.prank.program.routines.predict.output.TableData.ColumnType;
|
||||
/**
|
||||
* Number of pocket surface atoms — the size of {@code pocket.getSurfaceAtoms()}.
|
||||
*/
|
||||
public final class NumSurfaceAtomsDescriptor implements PocketDescriptor {
|
||||
public final class NumSurfaceAtomsDescriptor extends AbstractScalarPocketDescriptor {
|
||||
|
||||
@Override public String name() { return "num_surface_atoms"; }
|
||||
@Override public ColumnType columnType() { return ColumnType.INT; }
|
||||
@Override protected ColumnType scalarType() { return ColumnType.INT; }
|
||||
@Override public boolean needsGrid() { return false; }
|
||||
|
||||
@Override
|
||||
public double compute(PocketGridContext ctx) {
|
||||
protected double computeScalar(PocketGridContext ctx) {
|
||||
return ctx.pocket().getSurfaceAtoms().getCount();
|
||||
}
|
||||
|
||||
|
||||
@@ -2,41 +2,58 @@ package cz.siret.prank.program.routines.predict.output.descriptors;
|
||||
|
||||
import cz.siret.prank.program.routines.predict.output.TableData.ColumnType;
|
||||
|
||||
import java.util.List;
|
||||
|
||||
/**
|
||||
* Pluggable per-pocket descriptor.
|
||||
*
|
||||
* <p>Each descriptor produces a fixed-arity vector — 1 column for scalar
|
||||
* descriptors (extend {@link AbstractScalarPocketDescriptor}), N columns for
|
||||
* multi-column descriptors (e.g. principal moments of inertia: three eigenvalues
|
||||
* from a single decomposition).
|
||||
*
|
||||
* <p>Header convention (applied by
|
||||
* {@link cz.siret.prank.program.routines.predict.output.PocketDescriptorsRows}):
|
||||
* <ul>
|
||||
* <li>Scalar (size 1): the sub-name entry is IGNORED at output; column header
|
||||
* is exactly {@link #name()}.</li>
|
||||
* <li>Multi-column (size > 1): each header becomes
|
||||
* {@code "{name()}.{columnNames().get(i)}"} — e.g.
|
||||
* {@code principal_moments.lambda1}.</li>
|
||||
* </ul>
|
||||
*
|
||||
* <p>Implementations should be stateless and thread-safe (descriptors are
|
||||
* computed across pockets, potentially in parallel).
|
||||
*
|
||||
* <p>Numeric INT descriptors return their value as a {@code double} that
|
||||
* a writer can downcast to int; this matches the {@link cz.siret.prank.program.routines.predict.output.TableData}
|
||||
* convention (see {@link cz.siret.prank.program.routines.predict.output.PointExportData} for precedent).
|
||||
* <p>INT columns return their value as a {@code double} that the writer
|
||||
* downcasts at output time, matching the {@link cz.siret.prank.program.routines.predict.output.TableData}
|
||||
* convention. Implementations must guarantee the value fits in i32.
|
||||
*
|
||||
* <p>This framework is intentionally <b>scalar-only</b> — each descriptor
|
||||
* produces exactly one column. The sibling
|
||||
* {@link cz.siret.prank.program.routines.predict.output.grid.descriptors.PocketGridPointDescriptor}
|
||||
* supports multi-column descriptors with a {@code "{name}.{col}"} header
|
||||
* convention; if you need that shape here, the two interfaces should be
|
||||
* unified rather than this one re-extended ad-hoc.
|
||||
* <p>This interface is the sibling of
|
||||
* {@link cz.siret.prank.program.routines.predict.output.grid.descriptors.PocketGridPointDescriptor};
|
||||
* the two share the same shape (name + columnNames + columnTypes + double[] compute)
|
||||
* so future descriptors can move between the two contexts without an interface
|
||||
* mismatch.
|
||||
*/
|
||||
public interface PocketDescriptor {
|
||||
|
||||
/** Stable name; matches a token in {@code -pocket_descriptors}. */
|
||||
/** Stable name; matches a token in {@code -pocket_descriptors} and prefixes multi-column output headers. */
|
||||
String name();
|
||||
|
||||
/** Determines the output column's type in the descriptors file. */
|
||||
ColumnType columnType();
|
||||
|
||||
/**
|
||||
* @return descriptor value for {@code ctx.pocket()}.
|
||||
*
|
||||
* <p>INT descriptors return their value as a {@code double} that the writer
|
||||
* downcasts at output (matches the {@link cz.siret.prank.program.routines.predict.output.TableData}
|
||||
* convention). Implementations must guarantee the value fits in i32 — for
|
||||
* pocket-grid counts (cells, residues, atoms), that's many orders of
|
||||
* magnitude of headroom.
|
||||
* Column names this descriptor produces.
|
||||
* <ul>
|
||||
* <li>Scalar (size 1): entry IGNORED at output; column header is exactly {@link #name()}.</li>
|
||||
* <li>Multi-column (size > 1): each header becomes {@code "{name()}.{columnNames().get(i)}"}.</li>
|
||||
* </ul>
|
||||
*/
|
||||
double compute(PocketGridContext ctx);
|
||||
List<String> columnNames();
|
||||
|
||||
/** Parallel to {@link #columnNames()} — one entry per output column. */
|
||||
List<ColumnType> columnTypes();
|
||||
|
||||
/** @return one value per {@link #columnNames()} entry, in the same order. */
|
||||
double[] compute(PocketGridContext ctx);
|
||||
|
||||
/**
|
||||
* Does {@link #compute} read the pocket grid ({@code ctx.grid()} or
|
||||
|
||||
@@ -3,7 +3,9 @@ package cz.siret.prank.program.routines.predict.output.descriptors;
|
||||
import cz.siret.prank.program.PrankException;
|
||||
|
||||
import java.util.Collections;
|
||||
import java.util.HashSet;
|
||||
import java.util.LinkedHashMap;
|
||||
import java.util.List;
|
||||
import java.util.Map;
|
||||
import java.util.Set;
|
||||
|
||||
@@ -12,11 +14,13 @@ import java.util.Set;
|
||||
* shipped set. Selection at runtime is name-driven via the
|
||||
* {@code -pocket_descriptors} list param.
|
||||
*
|
||||
* <p>Mirrors {@link cz.siret.prank.program.routines.predict.output.grid.descriptors.PocketGridPointDescriptorRegistry}
|
||||
* — same shape, same register/unregister/get/knownNames surface. Multi-column
|
||||
* descriptors are validated at registration time (no duplicate sub-column names
|
||||
* within one descriptor).
|
||||
*
|
||||
* <p>Adding a new descriptor = drop a new {@link PocketDescriptor}
|
||||
* implementation in this package and register it here.
|
||||
*
|
||||
* <p>Java mirror of {@code PocketAssignerRegistry} — same pluggable-registry
|
||||
* pattern, same package layout.
|
||||
*/
|
||||
public final class PocketDescriptorRegistry {
|
||||
|
||||
@@ -29,14 +33,35 @@ public final class PocketDescriptorRegistry {
|
||||
register(new NumResiduesDescriptor());
|
||||
register(new NumSurfaceAtomsDescriptor());
|
||||
register(new NumGridPointsDescriptor());
|
||||
register(new PrincipalMomentsDescriptor());
|
||||
}
|
||||
|
||||
private PocketDescriptorRegistry() {}
|
||||
|
||||
private static void register(PocketDescriptor d) {
|
||||
/**
|
||||
* Add a descriptor to the registry. Public so tests can register fixture
|
||||
* descriptors and future external descriptor plugins can register without
|
||||
* touching the static initializer.
|
||||
*/
|
||||
public static void register(PocketDescriptor d) {
|
||||
List<String> cols = d.columnNames();
|
||||
if (cols.size() > 1 && new HashSet<>(cols).size() != cols.size()) {
|
||||
throw new IllegalStateException(
|
||||
"Descriptor '" + d.name() + "' declares duplicate columnNames: " + cols);
|
||||
}
|
||||
REGISTRY.put(d.name(), d);
|
||||
}
|
||||
|
||||
/**
|
||||
* Remove a descriptor by name. Intended for tests that register a fixture
|
||||
* descriptor via {@link #register} and need to undo the side effect in an
|
||||
* {@code @AfterAll} hook so the registry doesn't leak across test classes.
|
||||
* No-op if {@code name} is not registered.
|
||||
*/
|
||||
public static void unregister(String name) {
|
||||
REGISTRY.remove(name);
|
||||
}
|
||||
|
||||
/** @throws PrankException if {@code name} is unknown. */
|
||||
public static PocketDescriptor get(String name) {
|
||||
PocketDescriptor d = REGISTRY.get(name);
|
||||
|
||||
@@ -0,0 +1,115 @@
|
||||
package cz.siret.prank.program.routines.predict.output.descriptors;
|
||||
|
||||
import com.google.common.collect.ImmutableList;
|
||||
import cz.siret.prank.program.routines.predict.output.TableData.ColumnType;
|
||||
import org.apache.commons.math3.linear.Array2DRowRealMatrix;
|
||||
import org.apache.commons.math3.linear.EigenDecomposition;
|
||||
import org.biojava.nbio.structure.Atom;
|
||||
|
||||
import java.util.Arrays;
|
||||
import java.util.BitSet;
|
||||
import java.util.List;
|
||||
|
||||
/**
|
||||
* Principal moments of the pocket's grid-point distribution — three eigenvalues
|
||||
* of the gyration tensor (equal-weighted PCA on the grid points, centered at the
|
||||
* grid-point centroid).
|
||||
*
|
||||
* <p>Output columns (DOUBLE), sorted descending:
|
||||
* <ul>
|
||||
* <li>{@code principal_moments.lambda1} — largest eigenvalue (Ų)</li>
|
||||
* <li>{@code principal_moments.lambda2} — middle eigenvalue (Ų)</li>
|
||||
* <li>{@code principal_moments.lambda3} — smallest eigenvalue (Ų)</li>
|
||||
* </ul>
|
||||
*
|
||||
* <p>Shape signatures:
|
||||
* <ul>
|
||||
* <li>λ₁ ≈ λ₂ ≈ λ₃ → spherical pocket</li>
|
||||
* <li>λ₁ ≫ λ₂, λ₃ → elongated (rod-like)</li>
|
||||
* <li>λ₁ ≈ λ₂ ≫ λ₃ → flat (disk-like)</li>
|
||||
* </ul>
|
||||
*
|
||||
* <p>Sum of the three eigenvalues equals {@code radius_of_gyration²} —
|
||||
* pairs naturally with the existing {@code radius_of_gyration} descriptor.
|
||||
*
|
||||
* <p>This is the canonical motivating case for the multi-column descriptor
|
||||
* interface: a single eigendecomposition produces all three values, so
|
||||
* splitting into three scalar descriptors would re-do the decomposition
|
||||
* thrice per pocket.
|
||||
*
|
||||
* <p>Empty / degenerate pockets:
|
||||
* <ul>
|
||||
* <li>0 points → all zeros</li>
|
||||
* <li>1 point → all zeros (point coincides with the centroid)</li>
|
||||
* <li>collinear points → λ₃ = 0 (one eigenvector is degenerate)</li>
|
||||
* <li>coplanar points → λ₃ = 0 (one eigenvector lies along the plane normal)</li>
|
||||
* </ul>
|
||||
*/
|
||||
public final class PrincipalMomentsDescriptor implements PocketDescriptor {
|
||||
|
||||
private static final List<String> COLUMN_NAMES = ImmutableList.of("lambda1", "lambda2", "lambda3");
|
||||
private static final List<ColumnType> COLUMN_TYPES = ImmutableList.of(
|
||||
ColumnType.DOUBLE, ColumnType.DOUBLE, ColumnType.DOUBLE);
|
||||
private static final double[] ZEROS = new double[] { 0d, 0d, 0d };
|
||||
|
||||
@Override public String name() { return "principal_moments"; }
|
||||
@Override public List<String> columnNames() { return COLUMN_NAMES; }
|
||||
@Override public List<ColumnType> columnTypes() { return COLUMN_TYPES; }
|
||||
|
||||
@Override
|
||||
public double[] compute(PocketGridContext ctx) {
|
||||
BitSet indices = ctx.gridPointIndices();
|
||||
int n = indices.cardinality();
|
||||
if (n < 2) return ZEROS.clone(); // degenerate — see class javadoc
|
||||
|
||||
List<Atom> allPoints = ctx.grid().getAllPoints().list;
|
||||
|
||||
// Centroid (equal-weighted).
|
||||
double sx = 0d, sy = 0d, sz = 0d;
|
||||
for (int i = indices.nextSetBit(0); i >= 0; i = indices.nextSetBit(i + 1)) {
|
||||
Atom p = allPoints.get(i);
|
||||
sx += p.getX(); sy += p.getY(); sz += p.getZ();
|
||||
}
|
||||
double cx = sx / n, cy = sy / n, cz = sz / n;
|
||||
|
||||
// Gyration tensor (3x3 symmetric): G_ab = (1/n) Σ (r_a - c_a)(r_b - c_b).
|
||||
// Accumulate off-diagonal once; the matrix is symmetric.
|
||||
double gxx = 0d, gyy = 0d, gzz = 0d, gxy = 0d, gxz = 0d, gyz = 0d;
|
||||
for (int i = indices.nextSetBit(0); i >= 0; i = indices.nextSetBit(i + 1)) {
|
||||
Atom p = allPoints.get(i);
|
||||
double dx = p.getX() - cx, dy = p.getY() - cy, dz = p.getZ() - cz;
|
||||
gxx += dx * dx;
|
||||
gyy += dy * dy;
|
||||
gzz += dz * dz;
|
||||
gxy += dx * dy;
|
||||
gxz += dx * dz;
|
||||
gyz += dy * dz;
|
||||
}
|
||||
gxx /= n; gyy /= n; gzz /= n; gxy /= n; gxz /= n; gyz /= n;
|
||||
|
||||
// Eigendecomposition of the symmetric 3x3 tensor. EigenDecomposition's symmetric
|
||||
// path (used when the matrix is symmetric within its tolerance) returns real
|
||||
// eigenvalues in arbitrary order — sort descending below.
|
||||
double[][] m = new double[][] {
|
||||
{ gxx, gxy, gxz },
|
||||
{ gxy, gyy, gyz },
|
||||
{ gxz, gyz, gzz }
|
||||
};
|
||||
EigenDecomposition ed = new EigenDecomposition(new Array2DRowRealMatrix(m, false));
|
||||
double[] eigenvalues = ed.getRealEigenvalues();
|
||||
|
||||
// Sort descending. The gyration tensor is positive semi-definite — any tiny
|
||||
// negative eigenvalue from numerical noise gets clamped to 0 below.
|
||||
Arrays.sort(eigenvalues);
|
||||
double l1 = clampNonNegative(eigenvalues[2]);
|
||||
double l2 = clampNonNegative(eigenvalues[1]);
|
||||
double l3 = clampNonNegative(eigenvalues[0]);
|
||||
return new double[] { l1, l2, l3 };
|
||||
}
|
||||
|
||||
/** PSD eigenvalues should be ≥ 0; numerical noise can push them slightly negative. */
|
||||
private static double clampNonNegative(double v) {
|
||||
return v < 0d ? 0d : v;
|
||||
}
|
||||
|
||||
}
|
||||
@@ -16,13 +16,13 @@ import java.util.List;
|
||||
* extent of the pocket. Two pockets with the same volume can have very different Rg
|
||||
* (compact vs. elongated). {@code Rg = 0} for an empty or single-point pocket.
|
||||
*/
|
||||
public final class RadiusOfGyrationDescriptor implements PocketDescriptor {
|
||||
public final class RadiusOfGyrationDescriptor extends AbstractScalarPocketDescriptor {
|
||||
|
||||
@Override public String name() { return "radius_of_gyration"; }
|
||||
@Override public ColumnType columnType() { return ColumnType.DOUBLE; }
|
||||
@Override protected ColumnType scalarType() { return ColumnType.DOUBLE; }
|
||||
|
||||
@Override
|
||||
public double compute(PocketGridContext ctx) {
|
||||
protected double computeScalar(PocketGridContext ctx) {
|
||||
BitSet indices = ctx.gridPointIndices();
|
||||
int n = indices.cardinality();
|
||||
if (n < 2) return 0.0d;
|
||||
|
||||
@@ -20,13 +20,13 @@ import java.util.List;
|
||||
* the volume ratio, so 1.0 = perfect sphere, low values = elongated /
|
||||
* irregular.
|
||||
*/
|
||||
public final class SphericityDescriptor implements PocketDescriptor {
|
||||
public final class SphericityDescriptor extends AbstractScalarPocketDescriptor {
|
||||
|
||||
@Override public String name() { return "sphericity"; }
|
||||
@Override public ColumnType columnType() { return ColumnType.DOUBLE; }
|
||||
@Override protected ColumnType scalarType() { return ColumnType.DOUBLE; }
|
||||
|
||||
@Override
|
||||
public double compute(PocketGridContext ctx) {
|
||||
protected double computeScalar(PocketGridContext ctx) {
|
||||
BitSet indices = ctx.gridPointIndices();
|
||||
int n = indices.cardinality();
|
||||
if (n == 0) return 0.0d;
|
||||
|
||||
@@ -8,13 +8,13 @@ import cz.siret.prank.program.routines.predict.output.TableData.ColumnType;
|
||||
*
|
||||
* <p>Unit: ų. Accuracy scales with {@code pocket_grid_spacing}.
|
||||
*/
|
||||
public final class VolumeDescriptor implements PocketDescriptor {
|
||||
public final class VolumeDescriptor extends AbstractScalarPocketDescriptor {
|
||||
|
||||
@Override public String name() { return "volume"; }
|
||||
@Override public ColumnType columnType() { return ColumnType.DOUBLE; }
|
||||
@Override protected ColumnType scalarType() { return ColumnType.DOUBLE; }
|
||||
|
||||
@Override
|
||||
public double compute(PocketGridContext ctx) {
|
||||
protected double computeScalar(PocketGridContext ctx) {
|
||||
double s = ctx.grid().getSpacing();
|
||||
return ctx.gridPointIndices().cardinality() * s * s * s;
|
||||
}
|
||||
|
||||
@@ -118,4 +118,39 @@ class PocketDescriptorsRowsTest {
|
||||
assertEquals(['name', 'rank', 'score', 'center_x', 'center_y', 'center_z'], data.header)
|
||||
}
|
||||
|
||||
@Test
|
||||
void multiColumnDescriptorEmitsPrefixedHeadersAndCorrectValues() {
|
||||
// principal_moments is the only multi-column shipped descriptor today.
|
||||
// Verifies the schema-build prefix rule AND that the row layout puts the
|
||||
// descriptor's three eigenvalues in the correct trailing positions.
|
||||
TestPocket p = pocket(1, "pocket.1", 0.5d, new Atoms([heavyAtomAt(0d, 0d, 0d)]))
|
||||
PocketDescriptorsRows data = new PocketDescriptorsRows(
|
||||
[p], ['principal_moments'], null, tinyGrid(1, 27)) // 3³ cube → isotropic eigenvalues
|
||||
assertEquals(['name', 'rank', 'score', 'center_x', 'center_y', 'center_z',
|
||||
'principal_moments.lambda1', 'principal_moments.lambda2', 'principal_moments.lambda3'],
|
||||
data.header)
|
||||
double[] row = data.getRow(0)
|
||||
assertEquals(9, row.length)
|
||||
// For a tinyGrid 27-point sequence along x (not a real cube), the eigenvalues
|
||||
// aren't isotropic — assert non-negative and sorted descending, which is the
|
||||
// contract this row layout test cares about.
|
||||
assertTrue(row[6] >= row[7], "λ₁ ≥ λ₂ in row")
|
||||
assertTrue(row[7] >= row[8], "λ₂ ≥ λ₃ in row")
|
||||
assertTrue(row[8] >= 0d, "λ₃ ≥ 0 in row")
|
||||
}
|
||||
|
||||
@Test
|
||||
void scalarAndMultiColumnDescriptorsAppearInRequestedOrder() {
|
||||
// Mix scalar + multi: column order is determined by the descriptorNames list,
|
||||
// and within each descriptor by its declared columnNames(). 'volume' (scalar)
|
||||
// then 'principal_moments' (3 cols) → 4 trailing columns after the base.
|
||||
TestPocket p = pocket(1, "pocket.1", 0.5d, new Atoms([heavyAtomAt(0d, 0d, 0d)]))
|
||||
PocketDescriptorsRows data = new PocketDescriptorsRows(
|
||||
[p], ['volume', 'principal_moments'], null, tinyGrid(1, 8))
|
||||
assertEquals(['name', 'rank', 'score', 'center_x', 'center_y', 'center_z',
|
||||
'volume',
|
||||
'principal_moments.lambda1', 'principal_moments.lambda2', 'principal_moments.lambda3'],
|
||||
data.header)
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
@@ -70,7 +70,7 @@ class PocketDescriptorsTest {
|
||||
void volumeIs8For8UnitCells() {
|
||||
PocketGrid grid = gridOfPoints(cube(2)) // 2x2x2 = 8 points, spacing 1.0
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double v = new VolumeDescriptor().compute(ctx(grid, p))
|
||||
double v = new VolumeDescriptor().compute(ctx(grid, p))[0]
|
||||
assertEquals(8.0d, v, DELTA)
|
||||
}
|
||||
|
||||
@@ -88,7 +88,7 @@ class PocketDescriptorsTest {
|
||||
assigned.put(1, bs)
|
||||
PocketGrid grid = new PocketGrid(new Atoms(pts), 0.5d, 0d, 0d, 0d, index, assigned)
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double v = new VolumeDescriptor().compute(ctx(grid, p))
|
||||
double v = new VolumeDescriptor().compute(ctx(grid, p))[0]
|
||||
assertEquals(8 * 0.125d, v, DELTA) // 8 cells × 0.5³
|
||||
}
|
||||
|
||||
@@ -100,7 +100,7 @@ class PocketDescriptorsTest {
|
||||
// V_pocket = 125, V_sphere = 4/3·π·3.46³ ≈ 173.5; ratio ≈ 0.72.
|
||||
PocketGrid grid = gridOfPoints(cube(5))
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double s = new SphericityDescriptor().compute(ctx(grid, p))
|
||||
double s = new SphericityDescriptor().compute(ctx(grid, p))[0]
|
||||
assertTrue(s > 0.5d, "cube sphericity ${s} too low")
|
||||
assertTrue(s <= 1.0d, "sphericity in [0,1]")
|
||||
}
|
||||
@@ -117,7 +117,7 @@ class PocketDescriptorsTest {
|
||||
}
|
||||
PocketGrid grid = gridOfPoints(pts)
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double s = new SphericityDescriptor().compute(ctx(grid, p))
|
||||
double s = new SphericityDescriptor().compute(ctx(grid, p))[0]
|
||||
assertTrue(s < 0.2d, "flat disc sphericity ${s} too high")
|
||||
}
|
||||
|
||||
@@ -127,7 +127,7 @@ class PocketDescriptorsTest {
|
||||
new LongIntHashMap(),
|
||||
Collections.<Integer, BitSet> singletonMap(1, new BitSet()))
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double s = new SphericityDescriptor().compute(ctx(empty, p))
|
||||
double s = new SphericityDescriptor().compute(ctx(empty, p))[0]
|
||||
assertEquals(0.0d, s, DELTA)
|
||||
}
|
||||
|
||||
@@ -135,7 +135,7 @@ class PocketDescriptorsTest {
|
||||
void sphericityOneForSinglePoint() {
|
||||
PocketGrid grid = gridOfPoints([new Point(0d, 0d, 0d) as Atom])
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double s = new SphericityDescriptor().compute(ctx(grid, p))
|
||||
double s = new SphericityDescriptor().compute(ctx(grid, p))[0]
|
||||
assertEquals(1.0d, s, DELTA)
|
||||
}
|
||||
|
||||
@@ -147,7 +147,7 @@ class PocketDescriptorsTest {
|
||||
p.rank = 1
|
||||
p.surfaceAtoms = new Atoms([heavyAtomAt(0d, 0d, 0d), heavyAtomAt(1d, 0d, 0d), heavyAtomAt(2d, 0d, 0d)])
|
||||
PocketGrid grid = gridOfPoints([new Point(0d, 0d, 0d) as Atom])
|
||||
double n = new NumSurfaceAtomsDescriptor().compute(ctx(grid, p))
|
||||
double n = new NumSurfaceAtomsDescriptor().compute(ctx(grid, p))[0]
|
||||
assertEquals(3.0d, n, DELTA)
|
||||
}
|
||||
|
||||
@@ -161,8 +161,8 @@ class PocketDescriptorsTest {
|
||||
TestPocket pNull = new TestPocket(); pNull.rank = 1 // surfaceAtoms stays null
|
||||
PocketGrid grid = gridOfPoints([new Point(0d, 0d, 0d) as Atom])
|
||||
NumResiduesDescriptor d = new NumResiduesDescriptor()
|
||||
assertEquals(0.0d, d.compute(ctx(grid, pEmpty)), DELTA)
|
||||
assertEquals(0.0d, d.compute(ctx(grid, pNull)), DELTA)
|
||||
assertEquals(0.0d, d.compute(ctx(grid, pEmpty))[0], DELTA)
|
||||
assertEquals(0.0d, d.compute(ctx(grid, pNull))[0], DELTA)
|
||||
}
|
||||
|
||||
@Test
|
||||
@@ -188,7 +188,7 @@ class PocketDescriptorsTest {
|
||||
void numGridPointsCountsAssignedCells() {
|
||||
PocketGrid grid = gridOfPoints(cube(3)) // 27 points, all assigned to pocket 1
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double n = new NumGridPointsDescriptor().compute(ctx(grid, p))
|
||||
double n = new NumGridPointsDescriptor().compute(ctx(grid, p))[0]
|
||||
assertEquals(27.0d, n, DELTA)
|
||||
}
|
||||
|
||||
@@ -198,7 +198,7 @@ class PocketDescriptorsTest {
|
||||
new com.carrotsearch.hppc.LongIntHashMap(),
|
||||
Collections.<Integer, BitSet> singletonMap(1, new BitSet()))
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double n = new NumGridPointsDescriptor().compute(ctx(empty, p))
|
||||
double n = new NumGridPointsDescriptor().compute(ctx(empty, p))[0]
|
||||
assertEquals(0.0d, n, DELTA)
|
||||
}
|
||||
|
||||
@@ -210,7 +210,7 @@ class PocketDescriptorsTest {
|
||||
new com.carrotsearch.hppc.LongIntHashMap(),
|
||||
Collections.<Integer, BitSet> singletonMap(1, new BitSet()))
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double rg = new RadiusOfGyrationDescriptor().compute(ctx(empty, p))
|
||||
double rg = new RadiusOfGyrationDescriptor().compute(ctx(empty, p))[0]
|
||||
assertEquals(0.0d, rg, DELTA)
|
||||
}
|
||||
|
||||
@@ -218,7 +218,7 @@ class PocketDescriptorsTest {
|
||||
void radiusOfGyrationZeroForSinglePoint() {
|
||||
PocketGrid grid = gridOfPoints([new Point(0d, 0d, 0d) as Atom])
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double rg = new RadiusOfGyrationDescriptor().compute(ctx(grid, p))
|
||||
double rg = new RadiusOfGyrationDescriptor().compute(ctx(grid, p))[0]
|
||||
assertEquals(0.0d, rg, DELTA)
|
||||
}
|
||||
|
||||
@@ -232,7 +232,7 @@ class PocketDescriptorsTest {
|
||||
new Point(-1d, 0d, 0d) as Atom,
|
||||
new Point(+1d, 0d, 0d) as Atom])
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double rg = new RadiusOfGyrationDescriptor().compute(ctx(grid, p))
|
||||
double rg = new RadiusOfGyrationDescriptor().compute(ctx(grid, p))[0]
|
||||
assertEquals(1.0d, rg, DELTA)
|
||||
}
|
||||
|
||||
@@ -245,7 +245,7 @@ class PocketDescriptorsTest {
|
||||
void radiusOfGyrationOfCube() {
|
||||
PocketGrid grid = gridOfPoints(cube(3))
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double rg = new RadiusOfGyrationDescriptor().compute(ctx(grid, p))
|
||||
double rg = new RadiusOfGyrationDescriptor().compute(ctx(grid, p))[0]
|
||||
assertEquals(Math.sqrt(2d), rg, 1e-6d)
|
||||
}
|
||||
|
||||
@@ -254,7 +254,8 @@ class PocketDescriptorsTest {
|
||||
@Test
|
||||
void registryResolvesKnownNames() {
|
||||
['volume', 'sphericity', 'radius_of_gyration',
|
||||
'num_residues', 'num_surface_atoms', 'num_grid_points'].each { String name ->
|
||||
'num_residues', 'num_surface_atoms', 'num_grid_points',
|
||||
'principal_moments'].each { String name ->
|
||||
PocketDescriptor d = PocketDescriptorRegistry.get(name)
|
||||
assertNotNull(d)
|
||||
assertEquals(name, d.name())
|
||||
@@ -273,17 +274,104 @@ class PocketDescriptorsTest {
|
||||
Set<String> known = PocketDescriptorRegistry.knownNames()
|
||||
assertTrue(known.containsAll(
|
||||
['volume', 'sphericity', 'radius_of_gyration',
|
||||
'num_residues', 'num_surface_atoms', 'num_grid_points'] as Set))
|
||||
'num_residues', 'num_surface_atoms', 'num_grid_points',
|
||||
'principal_moments'] as Set))
|
||||
}
|
||||
|
||||
@Test
|
||||
void columnTypesAreCorrect() {
|
||||
assertEquals(ColumnType.DOUBLE, PocketDescriptorRegistry.get('volume').columnType())
|
||||
assertEquals(ColumnType.DOUBLE, PocketDescriptorRegistry.get('sphericity').columnType())
|
||||
assertEquals(ColumnType.DOUBLE, PocketDescriptorRegistry.get('radius_of_gyration').columnType())
|
||||
assertEquals(ColumnType.INT, PocketDescriptorRegistry.get('num_residues').columnType())
|
||||
assertEquals(ColumnType.INT, PocketDescriptorRegistry.get('num_surface_atoms').columnType())
|
||||
assertEquals(ColumnType.INT, PocketDescriptorRegistry.get('num_grid_points').columnType())
|
||||
// Scalar descriptors return a 1-element columnTypes() list; multi-column
|
||||
// descriptors return the full list. Spot-check both groups.
|
||||
assertEquals([ColumnType.DOUBLE], PocketDescriptorRegistry.get('volume').columnTypes())
|
||||
assertEquals([ColumnType.DOUBLE], PocketDescriptorRegistry.get('sphericity').columnTypes())
|
||||
assertEquals([ColumnType.DOUBLE], PocketDescriptorRegistry.get('radius_of_gyration').columnTypes())
|
||||
assertEquals([ColumnType.INT], PocketDescriptorRegistry.get('num_residues').columnTypes())
|
||||
assertEquals([ColumnType.INT], PocketDescriptorRegistry.get('num_surface_atoms').columnTypes())
|
||||
assertEquals([ColumnType.INT], PocketDescriptorRegistry.get('num_grid_points').columnTypes())
|
||||
assertEquals([ColumnType.DOUBLE, ColumnType.DOUBLE, ColumnType.DOUBLE],
|
||||
PocketDescriptorRegistry.get('principal_moments').columnTypes())
|
||||
}
|
||||
|
||||
// --- principal_moments ---
|
||||
|
||||
/**
|
||||
* 3×3×3 cube centered at (1,1,1). All three principal axes are equivalent
|
||||
* (cube is isotropic in axes-aligned directions), so the gyration tensor's
|
||||
* eigenvalues all equal the per-dimension variance = 2/3.
|
||||
*/
|
||||
@Test
|
||||
void principalMomentsOfCubeAreEqual() {
|
||||
PocketGrid grid = gridOfPoints(cube(3))
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double[] lambdas = new PrincipalMomentsDescriptor().compute(ctx(grid, p))
|
||||
assertEquals(3, lambdas.length)
|
||||
assertEquals(2d / 3d, lambdas[0], 1e-9d)
|
||||
assertEquals(2d / 3d, lambdas[1], 1e-9d)
|
||||
assertEquals(2d / 3d, lambdas[2], 1e-9d)
|
||||
}
|
||||
|
||||
/**
|
||||
* Two points along the x-axis: gyration tensor has λ₁ = (per-dim variance
|
||||
* of x), the other two are zero. Distinguishes rod-like shape signatures.
|
||||
*/
|
||||
@Test
|
||||
void principalMomentsOfTwoPointsAlongAxis() {
|
||||
Atom a = new Point(0d, 0d, 0d)
|
||||
Atom b = new Point(2d, 0d, 0d)
|
||||
LongIntHashMap idx = new LongIntHashMap()
|
||||
idx.put(PocketGrid.pack(0, 0, 0), 0)
|
||||
idx.put(PocketGrid.pack(2, 0, 0), 1)
|
||||
BitSet bs = new BitSet(); bs.set(0, 2)
|
||||
Map<Integer, BitSet> assigned = [(1): bs] as LinkedHashMap
|
||||
PocketGrid grid = new PocketGrid(new Atoms([a, b]), 1.0d, 0d, 0d, 0d, idx, assigned)
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double[] lambdas = new PrincipalMomentsDescriptor().compute(ctx(grid, p))
|
||||
// Per-dim variance of {0, 2} = mean((-1)² + 1²) = 1; other two dims have zero spread.
|
||||
assertEquals(1.0d, lambdas[0], 1e-9d)
|
||||
assertEquals(0.0d, lambdas[1], 1e-9d)
|
||||
assertEquals(0.0d, lambdas[2], 1e-9d)
|
||||
}
|
||||
|
||||
@Test
|
||||
void principalMomentsEigenvaluesAreSortedDescending() {
|
||||
// Square in the xy plane: variance(x) = variance(y) = some value > 0,
|
||||
// variance(z) = 0. λ₁, λ₂ are equal (and > 0); λ₃ = 0. Verifies the sort.
|
||||
List<Atom> pts = [
|
||||
new Point(0d, 0d, 0d),
|
||||
new Point(2d, 0d, 0d),
|
||||
new Point(0d, 2d, 0d),
|
||||
new Point(2d, 2d, 0d),
|
||||
]
|
||||
PocketGrid grid = gridOfPoints(pts)
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double[] lambdas = new PrincipalMomentsDescriptor().compute(ctx(grid, p))
|
||||
assertTrue(lambdas[0] >= lambdas[1], "λ₁ ≥ λ₂")
|
||||
assertTrue(lambdas[1] >= lambdas[2], "λ₂ ≥ λ₃")
|
||||
assertEquals(0.0d, lambdas[2], 1e-9d) // flat in xy → λ₃ = 0
|
||||
}
|
||||
|
||||
@Test
|
||||
void principalMomentsOfEmptyOrSinglePocketIsAllZeros() {
|
||||
// Cardinality < 2 short-circuits to zeros — see PrincipalMomentsDescriptor javadoc.
|
||||
PocketGrid empty = gridOfPoints([])
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double[] lambdas = new PrincipalMomentsDescriptor().compute(ctx(empty, p))
|
||||
assertArrayEquals([0.0d, 0.0d, 0.0d] as double[], lambdas, 0d)
|
||||
}
|
||||
|
||||
/**
|
||||
* Pairs with {@code radius_of_gyration}: trace(gyration tensor) = sum of
|
||||
* eigenvalues = Rg². Pins the relationship between the two descriptors so
|
||||
* a future change to one without the other gets caught.
|
||||
*/
|
||||
@Test
|
||||
void principalMomentsSumEqualsRadiusOfGyrationSquared() {
|
||||
PocketGrid grid = gridOfPoints(cube(3))
|
||||
TestPocket p = new TestPocket(); p.rank = 1
|
||||
double[] lambdas = new PrincipalMomentsDescriptor().compute(ctx(grid, p))
|
||||
double rg = new RadiusOfGyrationDescriptor().compute(ctx(grid, p))[0]
|
||||
double sum = lambdas[0] + lambdas[1] + lambdas[2]
|
||||
assertEquals(rg * rg, sum, 1e-9d)
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user