AndrewKubaney
|
87264c2cb3
|
feat: add mpnn to refactor/rf3-lab (#704)
* Initial pass at file structure for MPNN merge.
* Copy positional encoding and refactor for clarity.
* PositionWiseFeedForward copied.
* Message passing layers, refactored.
* Token and atom encodings for MPNN.
* Naming consistency.
* Graph featurization for ProteinMPNN,
as well as the membrane and PSSM versions of MPNN.
* Finished LigandMPNN graph features.
* Code cleanup.
* MPNN classes rough draft.
* Additional comments.
* Move structure noise/sc atomize flag to kwargs.
* Finish encoding for protein and ligand MPNN.
* Saved some features for bookkeeping.
* Masks and decoding order. Decoder in progress.
* Added decoding; many bug fixes.
* feat: MPNN pipeline (#268)
* feat: MPNN pipeline, pipeline tests
* feat: backbone occupancy threshold
* chore: MR comments
* chore: fix tests
* Directory cleanup.
* Rework kwargs, change masking.
* Bug fix: permutation of causality masks.
* Chore: push probability utils.
* Chore: variable name typo.
* Feat: Symmetry handling during decoding.
* Bug fix: repeat symmetry input along batch.
* Bug fix: if/else for symmetry_weight.
* Bug fix: node_features -> num_node_features.
* Bug fix: various typos/misnamings.
* Bug fix: np.minimum -> min for python ints.
* Chore: spacing.
* Chore: rename forward output.
* Chore: documentation.
* Feat: loss.
* Feat: weight init static method.
* Chore: int->bool for masks.
* Chore: ensure decode_last_mask is bool.
* Bug: fix modelhub imports.
* Chore: refactor ligand subgraph featurization.
* Chore: missing imports.
* Chore: rename loss.
* Chore: rename model file.
* Chore: bug fixes and documentation.
* Bug fix: symmetry and autograd.
* Chore: documentation.
* Bug: save "pre noise" coords even when no noise.
* Bug: Fix dtypes.
* Chore: Model tests.
* Chore: input names.
* Chore: update comment.
* Chore: change input to model.
* Chore: rename feats.
* Chore: rename feats downstream.
* Chore: S_pred->S_sampled rename for clarity.
* Chore: linter
* Feat: protein-ligand interface calculation. (#410)
* Feat: protein-ligand interface calculation.
* Chore: use datahub validation check.
---------
Co-authored-by: Andrew Kubaney <akubaney@localhost>
* Chore: move token encoding.
* Chore: split transforms from pipeline.
* Feat: protein interface mask and batching.
* Chore: move protein-interface calc.
* Chore: empty commits for sampler/trainer.
* Chore: rename protein-ligand to polymer-ligand
* Feat: turn interface calcs into transform.
* Feat: transform for polymer interface mask.
* Bug: rename feats->input_features in tests.
* Bug: rename S_pred->S_sampled in tests.
* Bug: fix tests to match new model input format.
* Bug: fixed issues with polymer-ligand tests.
* Feat: collator.
* Chore: remove empty sampler.
* Chore: test refactor.
* Chore: update collate default.
* Chore: remove dist calc from interface for speed.
* Feat: auxiliary settings in pipeline.
* Bug: fix mask_for_loss repeating.
* Feat: metrics and test updates.
* Chore: cleanup old files.
* Feat: padded token bucket sampler. (#432)
* Feat: padded token bucket sampler.
* Chore: defaults.
---------
Co-authored-by: Andrew Kubaney <akubaney@localhost>
* Bug: ligand subgraph shapes.
* Feat: trainer.
* Bug: call .item() on metrics.
* Bug: move idx to proper device.
* Chore: remove compute train metrics (unused).
* Feat: checkpointing.
* Feat: minimal return option in pipeline (for mem).
* Chore: refactor sampler.
* Chore: move empty atom_array assert.
* Feat: rough training code.
* Feat: set_epoch and torch generator.
* Bug: sampler name.
* Feat: token budget aware collation.
* Chore: cleanup prints.
* Chore: code style.
* Feat: more robust pipeline (from Nate).
* Feat: batch sampler logic (from Nate).
* Feat: train updates.
* Feat: checks for invalid examples.
* Bug: fix for empty non_atomized_array.
* Feat: shell scripts for training.
* Feat: first pass old weight loading.
* Chore: updates to training hyperparams.
* Chore: partial restructure under src.
* Chore: changes for amp and comment.
* Chore: move MPNN into models.
* Chore: add mpnn to shebang.
* Chore: initial readme.
* Chore: fix imports for atomworks and model.
* Chore: conftest added.
* Chore: move training shell scripts.
* Chore: add __init__.py.
* Chore: restructure mpnn dir.
* Bug: fix atomworks imports.
* Bug: continued import fixing.
* Chore: update autocast dtype functions.
* Bug: fix issues with tests.
* Chore: add model route in data pipeline.
* Chore: fix comment about ligandmpnn legacy bug.
* Chore: rename add auxiliary settings.
* chore: organize transforms.
* feat: pipeline handles atomarray annotation.
* chore: split mpnn and rf3 exec.
* chore: rename old -> legacy.
* chore: rename old->legacy in code.
* chore: update intro README.
* chore: update shebang.
* chore: update training scripts.
* chore: move launch training scripts.
* chore: fix path for train file.
* chore: add additional params for protein vs ligand.
* fix: make train.py/inference.py executable.
* fix: python->srun.
* chore: update notes.
* feat: add back metrics_logging to modelhub.
TODO: deduplicate rf3/rfd3 for callback
StoreValidationMetricsInDFCallback.
* chore: fix atomworks imports.
* chore: add .env call.
* fix: actually fix env setup.
* chore: rename featurization of user setting.
* fix: move featurize user settings to end.
* fix: import.
* chore: import order.
* fix: train date cutoff, ckpt loading.
* fix: update atomworks to fix residue starts.
* chore: rearrange utils.
* chore: create io utils file.
* chore: file rename.
* feat: DRAFT of inference engine/utils/script.
* small inference input loading fixes
* fix: collator and repeat_sample_num handling.
* feat: significant upgrade of cli/inference input.
* fix: additional checks for user inputs.
* chore: update high level inference script.
* chore: minor changes; prepping for refactor.
* chore: comment and small fix for legacy weights.
* chore: reorder constants in legacy weights.
* chore: warnings to README.
* chore: more notes on readme.
* chore: readme updates.
* feat: inference working.
* chore: note on README.
* fix: readme syntax issue.
* chore: readme format.
* chore: tests for inference.
* chore: formatting.
---------
Co-authored-by: Andrew Kubaney <akubaney@localhost>
Co-authored-by: Nathaniel Corley <ncorley@uw.edu>
Co-authored-by: Andrew Kubaney <akubaney@digs>
Co-authored-by: Raktim Mitra <raktim@localhost>
|
2025-11-29 22:34:05 -08:00 |
|