Rename processssItem to processItem. Add per-item conditional cache
clearing after processing to reduce peak memory. Refactor cleanCaches
into clearCache/clearPrimaryCache/clearSecondaryCache with null-safety.
Parametrized test generates random points, builds both trees, verifies
identical results for all query types, and measures relative performance.
Skipped during normal test runs; invoked via kdtree-benchmark.sh script.
Cache aggregated errors in Dataset.Result to avoid recomputing.
Use direct x/y/z field access instead of getCoords() in
Atoms.copyPoints and PointExportData to avoid double[3] allocations.
Add Dataset.Item.getRow() to reconstruct dataset row strings.
In cmdProteins(), partition items into those with and without protein
chains using ConcurrentLinkedQueue, and write split .ds files when any
structures lack protein chains.
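
A minimal sketch of the with/without partitioning described above, assuming items are classified from worker threads. ProteinSplitSketch, record() and classify() are hypothetical names used only for illustration; the real cmdProteins() code may be organized differently.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.IntStream;

// Hypothetical illustration; not the project's actual cmdProteins() code.
final class ProteinSplitSketch {
    // ConcurrentLinkedQueue allows concurrent add() calls without external locking.
    final Queue<String> withProtein = new ConcurrentLinkedQueue<>();
    final Queue<String> withoutProtein = new ConcurrentLinkedQueue<>();

    // Called from any worker thread after an item has been parsed.
    void record(String row, int proteinChainCount) {
        (proteinChainCount > 0 ? withProtein : withoutProtein).add(row);
    }

    // Split files are only worth writing when some item lacks protein chains.
    boolean needsSplit() {
        return !withoutProtein.isEmpty();
    }

    // Classifies items in parallel; chainCounts[i] is the chain count of item i.
    static ProteinSplitSketch classify(int[] chainCounts) {
        ProteinSplitSketch s = new ProteinSplitSketch();
        IntStream.range(0, chainCounts.length).parallel()
                .forEach(i -> s.record("item" + i, chainCounts[i]));
        return s;
    }
}
```

Note that the order of rows inside each queue depends on thread scheduling, so a caller that needs the original dataset order would have to sort before writing.
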
Add Atom-based sqrDist/dist overloads in PerfUtils that use
getX/getY/getZ directly instead of allocating double[] via getCoords().
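
For illustration, the difference between the array-based and accessor-based distance overloads might look like the sketch below. AtomLike, SimpleAtom and DistSketch are hypothetical stand-ins, not the project's Atom or PerfUtils types.

```java
// Hypothetical stand-in for an atom type; only the coordinate accessors matter here.
interface AtomLike {
    double getX();
    double getY();
    double getZ();
}

final class SimpleAtom implements AtomLike {
    private final double x, y, z;

    SimpleAtom(double x, double y, double z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    public double getX() { return x; }
    public double getY() { return y; }
    public double getZ() { return z; }

    // Allocating accessor, kept only to contrast with the direct overload.
    double[] getCoords() {
        return new double[] {x, y, z};
    }
}

final class DistSketch {
    // Array-based variant: callers must first materialize a double[3] per atom.
    static double sqrDist(double[] a, double[] b) {
        double dx = a[0] - b[0], dy = a[1] - b[1], dz = a[2] - b[2];
        return dx * dx + dy * dy + dz * dz;
    }

    // Accessor-based overload: same arithmetic, no temporary arrays.
    static double sqrDist(AtomLike a, AtomLike b) {
        double dx = a.getX() - b.getX();
        double dy = a.getY() - b.getY();
        double dz = a.getZ() - b.getZ();
        return dx * dx + dy * dy + dz * dz;
    }

    static double dist(AtomLike a, AtomLike b) {
        return Math.sqrt(sqrDist(a, b));
    }
}
```
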
Refactor Point to store x/y/z as individual fields instead of a
double[] array. Fix Point.setCoords() which was previously a no-op.
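
A rough sketch of a point that stores its coordinates as individual fields, with a setCoords() that actually writes to them. PointSketch is a hypothetical class for illustration; the validation in setCoords() is an assumption, not taken from the real Point.

```java
// Hypothetical illustration of field-based storage; not the project's Point class.
final class PointSketch {
    private double x, y, z;

    PointSketch(double x, double y, double z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    double getX() { return x; }
    double getY() { return y; }
    double getZ() { return z; }

    // Builds a fresh array on demand; hot loops should prefer the accessors.
    double[] getCoords() {
        return new double[] {x, y, z};
    }

    // Copies the values into the fields so the update is visible afterwards.
    void setCoords(double[] coords) {
        if (coords == null || coords.length != 3) {
            throw new IllegalArgumentException("expected exactly 3 coordinates");
        }
        x = coords[0];
        y = coords[1];
        z = coords[2];
    }
}
```

Because getCoords() returns a copy in this sketch, mutating the returned array does not change the point; only setCoords() does.
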
Pre-build KD tree in Ligands.makeLigands() before the ligand loop.
Simplify KD tree usage in Atoms.dist/sqrDist by removing redundant
size threshold check.
Add writeAggregatedItemErrorsToCsv to Dataset.Result that groups errors
by message and outputs count/error sorted by frequency. Update
getErrorSummary to display an aggregated error table instead of just
the count. Add writeErrorCsvs(outdir) that writes all three error files
(per-item, aggregated, full stack traces) and use it in AnalyzeRoutine.
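
The grouping step could be sketched roughly as follows, assuming each item error is reduced to its message string. ErrorAggregationSketch and its methods are hypothetical; the CSV quoting and the tie-breaking rule are assumptions, not the behaviour of writeAggregatedItemErrorsToCsv.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not the project's Dataset.Result code.
final class ErrorAggregationSketch {

    // Counts identical messages and returns rows sorted by descending count.
    static List<Map.Entry<String, Integer>> aggregate(List<String> messages) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String m : messages) {
            counts.merge(m, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> rows = new ArrayList<>(counts.entrySet());
        // List.sort is stable, so equally frequent messages keep first-seen order.
        rows.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        return rows;
    }

    // Renders a two-column CSV with a count,error header.
    static String toCsv(List<Map.Entry<String, Integer>> rows) {
        StringBuilder sb = new StringBuilder("count,error\n");
        for (Map.Entry<String, Integer> e : rows) {
            sb.append(e.getValue()).append(',').append(quote(e.getKey())).append('\n');
        }
        return sb.toString();
    }

    // Minimal CSV quoting: wrap in double quotes and double any embedded quotes.
    private static String quote(String s) {
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }
}
```
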
Add 'analyze proteins' command that outputs per-protein stats CSV
(chain counts, residues, atoms, ligands, peptides) and a summary table
with min/max/avg/median. Add 'analyze parse-proteins' for parsing
all dataset items and reporting errors only.
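
The min/max/avg/median summary can be computed per column along these lines. SummaryStatsSketch is a hypothetical helper for illustration; in particular, taking the median of an even-sized sample as the mean of the two middle values is an assumption about the convention used.

```java
import java.util.Arrays;

// Hypothetical illustration of a per-column summary; not the project's code.
final class SummaryStatsSketch {
    final double min, max, avg, median;

    private SummaryStatsSketch(double min, double max, double avg, double median) {
        this.min = min;
        this.max = max;
        this.avg = avg;
        this.median = median;
    }

    static SummaryStatsSketch of(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values to summarize");
        }
        double[] sorted = values.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        double sum = 0;
        for (double v : sorted) {
            sum += v;
        }
        int n = sorted.length;
        // Odd n: the middle element. Even n: mean of the two middle elements.
        double median = (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
        return new SummaryStatsSketch(sorted[0], sorted[n - 1], sum / n, median);
    }
}
```
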
Introduce DataTable: a lock-free, pre-registered-column table for
structured data collection across threads, with CSV and summary output.
- Add ResidueSite and SiteLoader
- Update pocket criteria (DCA, DCC, DPA, DSA, DSO, DSWO) for site evaluation
- Extend Evaluation with site-metrics support
- Bump version to 2.6.0-dev.1
Memory optimization: per-prediction cost reduced from ~40 bytes (PPred object)
to ~9 bytes (parallel double[] + boolean[] arrays). For large datasets this
reduces prediction storage by ~77%.
PredictedScores provides: ArrayList-style growth, bulk addAll via arraycopy,
cached observedPositiveCount, stable descending merge sort (required for
reproducible metrics with tied RF scores), and direct backing array access
for hot loops in Metrics/Curves.
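
A simplified sketch of the parallel-array layout: one double (8 bytes) plus one boolean (1 byte per element on typical JVMs) per prediction gives the ~9 bytes quoted above. PredictedScoresSketch is hypothetical and differs from the real PredictedScores in at least one respect: it sorts boxed indices with Arrays.sort, which is documented as stable for object arrays, rather than using a hand-written merge sort.

```java
import java.util.Arrays;

// Hypothetical illustration; not the project's PredictedScores class.
final class PredictedScoresSketch {
    private double[] scores = new double[16];
    private boolean[] observed = new boolean[16];
    private int size = 0;
    private int positives = 0; // cached count of observed == true

    void add(double score, boolean isObservedPositive) {
        ensureCapacity(size + 1);
        scores[size] = score;
        observed[size] = isObservedPositive;
        size++;
        if (isObservedPositive) positives++;
    }

    // Bulk append: two arraycopy calls instead of per-element adds.
    void addAll(PredictedScoresSketch other) {
        int n = other.size;
        ensureCapacity(size + n);
        System.arraycopy(other.scores, 0, scores, size, n);
        System.arraycopy(other.observed, 0, observed, size, n);
        size += n;
        positives += other.positives;
    }

    // ArrayList-style growth by roughly 1.5x.
    private void ensureCapacity(int min) {
        if (min > scores.length) {
            int cap = Math.max(min, scores.length + (scores.length >> 1));
            scores = Arrays.copyOf(scores, cap);
            observed = Arrays.copyOf(observed, cap);
        }
    }

    // Stable descending sort: tied scores keep their insertion order.
    void sortDescending() {
        final double[] src = scores;
        final boolean[] srcObs = observed;
        Integer[] idx = new Integer[size];
        for (int i = 0; i < size; i++) idx[i] = i;
        Arrays.sort(idx, (a, b) -> Double.compare(src[b], src[a]));
        double[] s = new double[src.length];
        boolean[] o = new boolean[srcObs.length];
        for (int i = 0; i < size; i++) {
            s[i] = src[idx[i]];
            o[i] = srcObs[idx[i]];
        }
        scores = s;
        observed = o;
    }

    int size() { return size; }
    int observedPositiveCount() { return positives; }

    // Direct backing arrays for hot loops; only the first size() entries are valid.
    double[] scoresArray() { return scores; }
    boolean[] observedArray() { return observed; }
}
```

The boxed index array makes this sort allocation-heavy, so it only demonstrates the stability requirement, not the memory profile of the sort itself.
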
- Add NativePanamaForest/NativePanamaForestAvx2 availability checks in ModelConverter
- Refactor flattening logic to separate trainable forest preparation from conversion
- Track all eval times and compute average excluding first run (caching warmup)
- Rename TIME_M to TIME_TRAINEVAL_AVG_M, add TIME_EVAL_AVG_M stat
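
The warmup-excluding average could be computed roughly as below. EvalTimingSketch is a hypothetical helper; falling back to the single measurement when only one run exists is an assumption, not necessarily what the project does.

```java
// Hypothetical illustration; not the project's timing code.
final class EvalTimingSketch {

    // Averages all runs except the first, which is treated as cache warmup.
    static double averageExcludingFirst(long[] timesMs) {
        if (timesMs.length == 0) {
            throw new IllegalArgumentException("no timings recorded");
        }
        if (timesMs.length == 1) {
            return timesMs[0]; // assumption: nothing to exclude, report the only run
        }
        long sum = 0;
        for (int i = 1; i < timesMs.length; i++) {
            sum += timesMs[i];
        }
        return (double) sum / (timesMs.length - 1);
    }
}
```
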
Generalize Model classifier from Classifier to Object to support both
trainable classifiers and flat BinaryForest models. Add rf_flatten_target
parameter for selecting forest type (FlatBinaryForest, LegacyFlatBinaryForest,
InterleavedBfsForest, etc). Deprecate rf_flatten_as_legacy in favor of the
new target type selection.
Replace flat jar with local Maven repo dependency at correct path
(groupId/artifactId/version/). Fix GString-to-String type errors in
AnalyzeRoutine that broke compilation with @CompileStatic.
Move misc/tutorials/ to documentation/ and add index readme.
Update rescoring.md: add quick-start examples and paper links for all
methods, and add Pocketeer to the supported methods list.
Fix stale links in README.md (tutorials path, local-env.sh typo).
Add PocketeerLoader that parses pockets.json output from Pocketeer,
including alpha spheres, residues, centroids, and surface atom mapping.
Register "pocketeer" as a prediction method in Dataset. Add unit tests
covering all 7 available datasets (CIF and PDB). Add rescoring tutorial
documenting all supported methods with examples.