mirror of https://github.com/RosettaCommons/foundry.git synced 2026-06-04 13:24:22 +08:00

lyskov-ai 3ae4ee81e6 chore(mypy): bring models/rfd3 into scope behind an ignore_errors ratchet (#297)

* refactor(mypy): un-ignore 5 easy-tier modules

Fix each module's single pre-existing type error with a pure annotation
or setattr change (no behavior change) and remove it from the
[[tool.mypy.overrides]] ignore_errors list:

- callbacks/train_logging: loss_trackers: dict[str, MeanMetric]
- callbacks/metrics_logging: seen_examples: set[str]
- common: setattr(wrapper, "_has_run", True) for the @wraps wrapper
- hydra/resolvers: attribute_path: str | None (body already guards)
- inference_engines/base: base_overrides: dict[str, Any]

13 modules remain on the ignore list. mypy now type-checks the 5
newly-included modules cleanly.

Co-authored-by: Sergey Lyskov <sergey.lyskov@jhu.edu>

* refactor(mypy): un-ignore 7 medium-tier modules

Resolve the type errors in 7 medium-tier modules and remove them from the
[[tool.mypy.overrides]] ignore_errors list. Mostly narrowing / annotation
fixes; two deliberate type-honesty fixes flagged below.

- utils/weights: lowercase `any` -> `Any` in _PatternPolicyMixin (4x);
  assert-narrow fallback_policy at the call site (matches get_policy idiom)
- model/layers/blocks: class-level w/b: torch.Tensor for the registered
  buffers (avoids nn.Module's Tensor | Module __getattr__ fallback)
- utils/components: is-None narrowing + tip_names local in get_name_mask's
  TIP branch (exists() can't narrow for mypy); drop orphaned exists import
- utils/logging: str(field) for the tree key; assign to a new hparams local
  rather than reassigning the typed cfg param
- foundry_cli/download_checkpoints: guard on `hasher is not None`;
  total_size = 0.0 for the float accumulation
- training/schedulers: SchedulerConfig.scheduler is now a required field
  (was = None, but documented required and assumed non-None everywhere)
- utils/xpu/xpu_accelerator: name @property -> @staticmethod to match
  lightning's Accelerator ABC

6 hard-tier modules remain on the ignore list.

Co-authored-by: Sergey Lyskov <sergey.lyskov@jhu.edu>

* refactor(mypy): un-ignore metrics/metric module

Fix the 11 type errors in foundry.metrics.metric and remove it from the
[[tool.mypy.overrides]] ignore_errors list (5 hard-tier modules remain).

- str(name) coercion of DictConfig.items() keys (str|bytes|int|... union)
- exists() -> 'is not None' narrowing; drop orphaned atomworks import
- widen compute_from_kwargs -> dict|list and kwargs_to_compute_args -> dict|None
  to match the actual returns / documented contract (callers already handle them)
- three type: ignore[arg-type] on nested_dict.get/getitem for an upstream
  atomworks annotation bug (param typed dict[tuple,...] but navigated as nested
  dict[str,Any]); warn_unused_ignores will flag them if upstream is fixed

No behavior change. All gates green (ruff, mypy 41 files, pytest 27 passed).
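
The exists() -> 'is not None' change above can be illustrated with a minimal sketch. The `exists` helper below is a hypothetical stand-in written for this example, not the atomworks implementation, and `name_length` is an invented function. The point is only that mypy does not narrow an optional type through an ordinary bool-returning helper, while an explicit `is not None` check does narrow it.

```python
from __future__ import annotations


def exists(value: object) -> bool:
    """Hypothetical stand-in for a helper such as the `exists` named above."""
    return value is not None


def name_length(name: str | None) -> int:
    # Writing `if exists(name):` here would leave `name` typed as `str | None`
    # for mypy, because a plain bool-returning function carries no narrowing
    # information. The explicit comparison narrows `name` to `str`.
    if name is not None:
        return len(name)
    return 0
```

Both forms behave the same at runtime; only the type checker's view of the branch differs.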

Co-authored-by: Sergey Lyskov <sergey.lyskov@jhu.edu>

* refactor(mypy): un-ignore utils/{ddp,rigid,datasets}

Clear the three remaining foundry.utils.* modules off the mypy ignore_errors list (47 errors: ddp 12, rigid 16, datasets 19). Type-honesty and annotation fixes only, no behavior change: narrow DictConfig|dict params to DictConfig where attribute access requires it (item access kept where a plain-dict default is real), honest int|None / Tensor|None widenings, variable renames to avoid type-reuse, str() coercion of DictConfig keys, the file's own if/elif/else narrowing pattern, and documented type: ignore / cast for genuine torch and atomworks stub limitations. Two hard-tier modules remain (callbacks/health_logging, trainers/fabric).

Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

* refactor(mypy): un-ignore callbacks/health_logging

Clear foundry.callbacks.health_logging off the mypy ignore_errors list
by fixing its 23 type errors (annotation / type-honesty only, no
behavior change):

- import the stdlib 'types' module directly instead of relying on
  'from typing import types' (worked at runtime but fragile/untyped)
- replace 'callable'-used-as-a-type with Mapping[str, Callable[..., Any]]
  on the stat/histogram dict params and Callable[..., bool] | None on the
  filter params; annotate the two MappingProxyType default constants to
  match
- annotate the _hooks / _temp_cache / _cache instance vars
- make implicit-Optional defaults explicit (... | None) on the two
  plot_tensor_* helpers, matching their is-not-None guards
- in plot_tensor_hist, replace two type-changing param reassignments with
  equivalent always-set locals (display_values, step_labels)

Only trainers/fabric remains on the ignore list.

Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

* refactor(mypy): un-ignore trainers/fabric (ratchet complete)

Clear foundry.trainers.fabric (the last and largest module) off the
mypy ignore_errors list and remove the now-empty override block. The
ratchet ignore list is now empty: all of src/foundry + src/foundry_cli
type-checks with no per-module exemptions.

Fixes are annotation / type-honesty only, no behavior change:

- annotate self.state as dict[str, Any] (a heterogeneous, dynamically-
  keyed training-state bag, also merged with arbitrary checkpoint keys);
  this collapses ~69 union-attr/operator/arg-type errors. Also annotate
  default_state and declare _current_train_return (set by subclass
  training_step implementations).
- dataloader types: Fabric.setup_dataloaders is stub-typed to return
  DataLoader | list[DataLoader], so cast its single-loader results to
  DataLoader and change train_loop/validation_loop params from
  _FabricDataLoader to DataLoader (drop the now-unused import).
- precision: widen the param to str | int | None (the body sets it None
  when an XPU plugin takes over), cast to the guarded Literal at the
  XPUMixedPrecision call, and add one documented type: ignore[arg-type]
  where our public API is wider than Fabric's precision Literal.
- narrow the parameter-freezing guard to direct attribute access; type
  get_latest_checkpoint as Path | None (matching its returns) with a
  cast at the single caller; drop a stale type: ignore.

Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

* chore(mypy): bring models/rfd3 into scope behind an ignore_errors ratchet

Add models/rfd3/src/rfd3 to [tool.mypy].files so the rfd3 model package
is type-checked by the standard gate (mypy now covers 99 files: foundry +
rfd3). Seed a fresh [[tool.mypy.overrides]] ignore_errors ratchet listing
the 32 rfd3 modules with pre-existing type errors (194 total), mirroring
the original src/foundry bootstrap; the 26 already-clean rfd3 modules are
type-checked immediately. Modules are cleared from the ratchet one slice
at a time in follow-up work.

Config only, no code changes. rfd3 is an editable install, so imports
resolve without an added mypy_path.

Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>

---------

Co-authored-by: Sergey Lyskov <sergey.lyskov@jhu.edu>
Co-authored-by: Hope Woods <hope.woods@omsf.io>

2026-06-03 15:48:39 -05:00

| Name | Last commit | Date |
| --- | --- | --- |
| .github | test: bootstrap mypy + pytest + coverage CI gates (#284) | 2026-06-02 13:51:28 -05:00 |
| docs | Docs add analytics (#264) | 2026-04-07 10:15:21 -07:00 |
| examples | Update author email for Sergey Lyskov (#228) | 2026-02-25 14:33:29 -08:00 |
| models | test: bootstrap mypy + pytest + coverage CI gates (#284) | 2026-06-02 13:51:28 -05:00 |
| src | refactor(mypy): un-ignore trainers/fabric (ratchet complete) (#296) | 2026-06-03 15:13:37 -05:00 |
| tests | fix: make weighted_rigid_align dtype-agnostic (accept float64) (#286) | 2026-06-02 16:45:33 -05:00 |
| .env | Shorten .env | 2025-12-04 21:36:56 -08:00 |
| .gitignore | fix: weight initialization bug in chunked P_LL (#229) | 2026-02-25 16:29:51 -08:00 |
| .gitmodules | DRAFT: docs for release, soft code hbplus (#699) | 2025-12-01 18:23:02 -08:00 |
| .pre-commit-config.yaml | refactor source files for open sourcing (#648) | 2025-11-20 16:29:47 -08:00 |
| .project-root | refactor: new modelhub (#109) | 2025-04-08 13:33:17 -07:00 |
| CONTRIBUTING.md | Docs: Creating a contribution guide for Foundry (#215) | 2026-04-04 18:43:12 -07:00 |
| LICENSE.md | clean: make pip installable, remove unused files, ruff, add license | 2025-08-14 14:34:42 -07:00 |
| Makefile | Add initial RFD3 Files and passing tests | 2025-11-11 10:07:43 -08:00 |
| pyproject.toml | chore(mypy): bring models/rfd3 into scope behind an ignore_errors ratchet (#297) | 2026-06-03 15:48:39 -05:00 |
| README.md | rfd3na (#269) | 2026-04-16 09:52:02 -07:00 |
| refactor.sh | Add initial RFD3 Files and passing tests | 2025-11-11 10:07:43 -08:00 |

README.md

Protein design with Foundry

Foundry provides tooling and infrastructure for using and training all classes of models for protein design, including design (RFD3), inverse folding (ProteinMPNN) and protein folding (RF3).

All models within Foundry rely on AtomWorks - a unified framework for manipulating and processing biomolecular structures - for both training and inference.

Note

We have a Slack now! Join for updates and to get your questions answered.

Getting Started

Quickstart guide

Installation

pip install "rc-foundry[all]"

Intel XPU Installation

For Intel XPU devices, install PyTorch with XPU support first, then install Foundry.

pip install torch --index-url https://download.pytorch.org/whl/xpu
pip install "rc-foundry[all]"

Note

Use pip (not uv) for XPU installs, since uv re-resolves dependencies and may replace your XPU torch with the standard PyPI version.

macOS (Apple Silicon) Installation

MPS support is available via a community fork. Install PyTorch first, then install directly from the fork:

pip install torch
pip install "rc-foundry[all] @ git+https://github.com/fnachon/foundry.git"

All three models — RFD3, RF3, and ProteinMPNN/LigandMPNN — run on Apple Silicon MPS.

Note

The rf3 extra (cuEquivariance) is Linux-only and is automatically skipped on macOS.

Use float32 precision — bfloat16 is not supported on MPS. The MPS accelerator is selected and float32 precision is enforced automatically.

Inference only; multi-GPU training is not supported on MPS.

For rf3 fold, pass an absolute path to the input CIF file.

Downloading weights

Models can be downloaded to a target folder with:

foundry install base-models --checkpoint-dir <path/to/ckpt/dir>

where --checkpoint-dir defaults to ~/.foundry/checkpoints. During inference and subsequent commands, Foundry always searches ~/.foundry/checkpoints plus any colon-separated entries in $FOUNDRY_CHECKPOINT_DIRS to find checkpoints. base-models installs the latest RFD3, RF3 and MPNN variants. You can also download every supported model (including multiple RF3 checkpoints) with all, or by listing the models in sequence (e.g. foundry install rfd3 rf3 ...). To list the registry of available checkpoints:

foundry list-available

To check what you already have downloaded (searches ~/.foundry/checkpoints plus $FOUNDRY_CHECKPOINT_DIRS if set):

foundry list-installed
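
The search behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not Foundry's implementation: `checkpoint_search_dirs` is a hypothetical helper, and the ordering (default directory first), the skipping of empty entries, and the de-duplication are assumptions made for this example.

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from pathlib import Path

# Default location named in the text above.
DEFAULT_CHECKPOINT_DIR = Path.home() / ".foundry" / "checkpoints"


def checkpoint_search_dirs(env: Mapping[str, str] | None = None) -> list[Path]:
    """Return the default directory plus entries from FOUNDRY_CHECKPOINT_DIRS."""
    env = os.environ if env is None else env
    dirs = [DEFAULT_CHECKPOINT_DIR]
    # The variable holds colon-separated directories; empty entries are skipped.
    for entry in env.get("FOUNDRY_CHECKPOINT_DIRS", "").split(":"):
        if not entry:
            continue
        path = Path(entry).expanduser()
        if path not in dirs:  # assumption: duplicates are listed only once
            dirs.append(path)
    return dirs
```

For example, with FOUNDRY_CHECKPOINT_DIRS set to /data/ckpt:/scratch/ckpt, the sketch returns the default directory followed by those two paths.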

See examples/all.ipynb for how to run each model and design proteins end-to-end in a notebook.

Docker Image

There is an official Foundry image maintained by the Rosetta Commons. The default image ships with the weights for the available models; the slim tag omits them, so you can either supply pre-existing model weights or use the image to download the available ones.

For more information and example syntax, see the Overview on DockerHub.

The recipe to create the Docker image can be found in foundry/examples/docker and can be used as a "blue-print" for creating your own images.

Google Colab

For an interactive Google Colab notebook walking through a basic design pipeline with RFD3, MPNN, and RF3, please see the IPD Design Pipeline Tutorial.

RFdiffusion3 (RFD3)

RFdiffusion3 is an all-atom generative model capable of designing protein structures under complex constraints.

See models/rfd3/README.md for complete documentation.

RFdiffusion3NA (RFD3NA)

RFdiffusion3NA is an extension of RFdiffusion3 that can also design nucleic acid structures under complex constraints.

See models/rfd3na/README.md for complete documentation.

RosettaFold3 (RF3)

RF3 is a structure prediction neural network that narrows the gap between closed-source AF-3 and open-source alternatives.

See models/rf3/README.md for complete documentation.

ProteinMPNN

ProteinMPNN and LigandMPNN are lightweight inverse-folding models that can be used to design diverse sequences for backbones under constrained conditions.

See models/mpnn/README.md for complete documentation.

Development

Code Organization

Strict dependency flow: foundry → atomworks

atomworks: Structure I/O, preprocessing, featurization
foundry: Model architectures, training, inference endpoints
models/<model>: Released models.

For Core Developers (Multiple Packages)

Install both foundry and models in editable mode for development:

uv pip install -e '.[all,dev]'

This approach allows you to:

Modify foundry shared utilities and see changes immediately
Work on specific models without installing all models
Add new models as independent packages in models/

Note

Running tests is not currently supported; test files may be missing.

Adding New Models

To add a new model:

Create models/<model_name>/ directory with its own pyproject.toml
Add foundry as a dependency
Implement model-specific code in models/<model_name>/src/
Users can install with: uv pip install -e ./models/<model_name>

Pre-commit Formatting

We ship a .pre-commit-config.yaml that runs make format (via ruff format) before each commit. Enable it once per clone:

pip install pre-commit  # if not already installed
pre-commit install

After installation the hook automatically formats the repo whenever you git commit. Use pre-commit run --all-files to apply it manually.

Citation

If you use this repository's code or data in your work, please cite the relevant work below:

@article{corley2025accelerating,
  title={Accelerating biomolecular modeling with atomworks and rf3},
  author={Corley, Nathaniel and Mathis, Simon and Krishna, Rohith and Bauer, Magnus S and Thompson, Tuscan R and Ahern, Woody and Kazman, Maxwell W and Brent, Rafael I and Didi, Kieran and Kubaney, Andrew and others},
  journal={bioRxiv},
  year={2025}
}

@article {butcher2025_rfdiffusion3,
    author = {Butcher, Jasper and Krishna, Rohith and Mitra, Raktim and Brent, Rafael Isaac and Li, Yanjing and Corley, Nathaniel and Kim, Paul T and Funk, Jonathan and Mathis, Simon Valentin and Salike, Saman and Muraishi, Aiko and Eisenach, Helen and Thompson, Tuscan Rock and Chen, Jie and Politanska, Yuliya and Sehgal, Enisha and Coventry, Brian and Zhang, Odin and Qiang, Bo and Didi, Kieran and Kazman, Maxwell and DiMaio, Frank and Baker, David},
    title = {De novo Design of All-atom Biomolecular Interactions with RFdiffusion3},
    elocation-id = {2025.09.18.676967},
    year = {2025},
    doi = {10.1101/2025.09.18.676967},
    publisher = {Cold Spring Harbor Laboratory},
    URL = {https://www.biorxiv.org/content/early/2025/11/19/2025.09.18.676967},
    eprint = {https://www.biorxiv.org/content/early/2025/11/19/2025.09.18.676967.full.pdf},
    journal = {bioRxiv}
}

@article{dauparas2022robust,
  title={Robust deep learning--based protein sequence design using ProteinMPNN},
  author={Dauparas, Justas and Anishchenko, Ivan and Bennett, Nathaniel and Bai, Hua and Ragotte, Robert J and Milles, Lukas F and Wicky, Basile IM and Courbet, Alexis and de Haas, Rob J and Bethel, Neville and others},
  journal={Science},
  volume={378},
  number={6615},
  pages={49--56},
  year={2022},
  publisher={American Association for the Advancement of Science}
}

@article{dauparas2025atomic,
  title={Atomic context-conditioned protein sequence design using LigandMPNN},
  author={Dauparas, Justas and Lee, Gyu Rie and Pecoraro, Robert and An, Linna and Anishchenko, Ivan and Glasscock, Cameron and Baker, David},
  journal={Nature Methods},
  pages={1--7},
  year={2025},
  publisher={Nature Publishing Group US New York}
}

Acknowledgments

We thank Rachel Clune and Hope Woods from the RosettaCommons for their collaboration on the codebase, documentation, tutorials and examples.