ProteinMPNN, LigandMPNN, and SolubleMPNN
Warning
Benchmarking: Please use the old repositories of ProteinMPNN, LigandMPNN, and SolubleMPNN for model benchmarking/comparison until the API and public weights stabilize. We are in the process of validating that the re-implementation (both a retrained version and the old weight loading option) is as performant as the original models.
Important
Issues: Please provide feedback on any issues you encounter with the ProteinMPNN/LigandMPNN/SolubleMPNN re-implementation. We are particularly interested in discrepancies between the original models and this re-implementation, issues with performance when loading the original weights from the old repositories, problems with inference hyperparameters/conditioning, and input/output bugs.
ProteinMPNN enables protein sequence design given a fixed backbone structure of a protein. LigandMPNN extends this functionality to enable fixed-backbone sequence design of proteins in the context of ligands (i.e. small molecules, ions, DNA/RNA, etc.). This module represents a re-implementation of the original ProteinMPNN and LigandMPNN models within the modelforge/atomworks framework.
For more information on the original models, please see:
- ProteinMPNN: Robust deep learning–based protein sequence design using ProteinMPNN | ProteinMPNN Original Github
- LigandMPNN: Atomic context-conditioned protein sequence design using LigandMPNN | LigandMPNN Original Github
- SolubleMPNN: Computational design of soluble and functional membrane protein analogues
This guide provides instructions on preparing inputs and running inference for ProteinMPNN/LigandMPNN, as well as training these models.
Installation
A. Installation using uv
git clone https://github.com/RosettaCommons/foundry.git \
&& cd foundry \
&& uv python install 3.12 \
&& uv venv --python 3.12 \
&& source .venv/bin/activate \
&& uv pip install -e ".[mpnn]"
B. Download Model Weights
ProteinMPNN
Please use the following settings with these ProteinMPNN weights:
model_type:"protein_mpnn"is_legacy_weights:True
48 Nearest Neighbors, \sigma = 0.20 Å Gaussian noise during training:
wget https://files.ipd.uw.edu/pub/ligandmpnn/proteinmpnn_v_48_020.pt
Additional ProteinMPNN Weights
48 Nearest Neighbors, \sigma = 0.02 Å Gaussian noise during training:
wget https://files.ipd.uw.edu/pub/ligandmpnn/proteinmpnn_v_48_002.pt
48 Nearest Neighbors, \sigma = 0.10 Å Gaussian noise during training:
wget https://files.ipd.uw.edu/pub/ligandmpnn/proteinmpnn_v_48_010.pt
48 Nearest Neighbors, \sigma = 0.30 Å Gaussian noise during training:
wget https://files.ipd.uw.edu/pub/ligandmpnn/proteinmpnn_v_48_030.pt
LigandMPNN
Please use the following settings with these LigandMPNN weights:
model_type:"ligand_mpnn"is_legacy_weights:True
32 Nearest Neighbors, \sigma = 0.10 Å of Gaussian noise during training, 25 ligand atom context:
wget https://files.ipd.uw.edu/pub/ligandmpnn/ligandmpnn_v_32_010_25.pt
Additional LigandMPNN Weights
32 Nearest Neighbors, \sigma = 0.05 Å of Gaussian noise during training, 25 ligand atom context:
wget https://files.ipd.uw.edu/pub/ligandmpnn/ligandmpnn_v_32_005_25.pt
32 Nearest Neighbors, \sigma = 0.20 Å of Gaussian noise during training, 25 ligand atom context:
wget https://files.ipd.uw.edu/pub/ligandmpnn/ligandmpnn_v_32_020_25.pt
32 Nearest Neighbors, \sigma = 0.30 Å of Gaussian noise during training, 25 ligand atom context:
wget https://files.ipd.uw.edu/pub/ligandmpnn/ligandmpnn_v_32_030_25.pt
SolubleMPNN
Please use the following settings with these SolubleMPNN weights:
model_type:"protein_mpnn"is_legacy_weights:True
48 Nearest Neighbors, \sigma = 0.20 Å Gaussian noise during training:
wget https://files.ipd.uw.edu/pub/ligandmpnn/solublempnn_v_48_020.pt
Additional SolubleMPNN Weights
48 Nearest Neighbors, \sigma = 0.02 Å Gaussian noise during training:
wget https://files.ipd.uw.edu/pub/ligandmpnn/solublempnn_v_48_002.pt
48 Nearest Neighbors, \sigma = 0.10 Å Gaussian noise during training:
wget https://files.ipd.uw.edu/pub/ligandmpnn/solublempnn_v_48_010.pt
48 Nearest Neighbors, \sigma = 0.30 Å Gaussian noise during training:
wget https://files.ipd.uw.edu/pub/ligandmpnn/solublempnn_v_48_030.pt
Inference
Warning
Known Bug: There is currently an issue with loading MPNN user annotation (temperature, designed residues, etc.) from CIF/atom array annotations. Command line passing of these options works as expected, as does
input_dictspecificiation with MPNNInferenceEngine.
Important
API Instability: We are currently finalizing some cleanup work on the inference API. Please expect the API (including input formats and outputs) to stabilize in the upcoming weeks. Thank you for your patience!
Important
When using weights from the original ProteinMPNN/LigandMPNN/SolubleMPNN repositories, please ensure to set
is_legacy_weightstoTruewhen running inference.
A. Command Line Inference
Detailed documentation coming soon!
B. JSON-based Inference
Detailed documentation coming soon!
C. Programmatic (Scripted) Inference
Detailed documentation coming soon!
Important
Currently, 'mpnn_bias' and 'mpnn_pair_bias' annotations cannot be saved to CIF files due to shape limitations. As a result, these annotations must be recreated (either directly with annotation on the atom array or via the input config dictionary) when reloading designed structures from CIF files.
Training
Instructions for training ProteinMPNN/LigandMPNN/SolubleMPNN models will be updated here shortly.
Important
Training Code and New Weights: We are working to release the dataframes used for retrianing the ProteinMPNN, LigandMPNN, and SolubleMPNN re-implementations. Also, we are finalizing the retraining runs and will release weights retrained within this repository shortly.