From fc6689eaf14e3db5b64d5f9710e61045b7e44d67 Mon Sep 17 00:00:00 2001 From: Hannah Baumann <43765638+hannahbaumann@users.noreply.github.com> Date: Fri, 24 Apr 2026 14:03:06 +0200 Subject: [PATCH] User guide updates membrane (#1933) * Membrane support prototype start * Point to corresponding branch in gufe * Make plain md protocol membrane ready * Dry run for membrane system for debugging * Add box vectors to modeller * Adapt htf to allow for membrane barostat * Fix test * Add test files membrane * Change test to cpu * Prints for debugging * Some more debugginh stuff, remove later * Small fix * Temporarily use branches for openmmtools and openmmforcefield * Small fixes * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * support membrane in gather * First pass membrane support for SepTop protocol * Change validation solvent * Move changes in solvent validation to validate_solvent * Some small fixes * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add membrane test septop * Change box vector units for handling membrane barostat * Test box vectors SepTop more * First pass at ABFE membrane prototype * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove some prints * Revert box changes * Some fixes in SepTop to fix tests * Apply suggestion from @hannahbaumann * Add tests for system creation * FIx test * Add ABFE membrane tests * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * More tests membrane septop and more dry outputs in septop * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add validation for box vector in ProteinMembraneComponent * Adapt gather for membranes * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * small changes * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Modify box vectors plain md * Address review comments * Remove unnecessary prints * Subclass membrane test * Small fix * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unnecessary import * Remove prints * Small fix * Small fix * Larger change to checks for BaseSolventComponent, SolvatedPDBComponent, and adding barostat settings * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Some more changes * Small fix * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * ABFE enforce timestep complex solvent the same * Update test files * Update environment.yaml with new openmmtools and openmmforcefield versions * Remove unnecessary checks for box_vectors which are handled in gufe * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update openfe/protocols/openmm_afe/equil_binding_afe_method.py Co-authored-by: Irfan Alibay * Attempt at rewriting validate_solvent * Move get barostat to self * Simplify mapping in test * Add validator for surface_tension and membrane barostat * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add membrane barostat to adaptive settings rfe protocol * Add validation/warning for barostat component mismatch * Fix tests (except CLI) * More merge fixes * Remove unnecessary imports * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add test files * Update tests * Add lipid ff in adaptive settings * remove commented out lines * Update test abfe * Update ABFE results with new settings * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix solvent validation * change env to gufe main * Apply suggestion from @hannahbaumann * Remove files from before src layout change * Update tests * Small fix * mark slow tests * Update env file * Small fix * Update src/openfe/protocols/openmm_afe/equil_binding_afe_method.py Co-authored-by: Irfan Alibay * Update src/openfe/protocols/openmm_afe/equil_binding_afe_method.py Co-authored-by: Irfan Alibay * Update src/openfe/protocols/openmm_rfe/_rfe_utils/relative.py Co-authored-by: Irfan Alibay * Update src/openfe/protocols/openmm_utils/system_validation.py Co-authored-by: Irfan Alibay * Address review comments 1 * update doc string * Update src/openfe/tests/protocols/openmm_abfe/test_abfe_protocol.py Co-authored-by: Irfan Alibay * Update src/openfe/tests/protocols/openmm_abfe/test_abfe_protocol.py Co-authored-by: Irfan Alibay * Update src/openfe/tests/protocols/openmm_abfe/test_abfe_protocol.py Co-authored-by: Irfan Alibay * Update src/openfe/tests/protocols/openmm_abfe/test_abfe_protocol.py Co-authored-by: Irfan Alibay * Address review comments 2 * Address review comments 3 * fix mypy * Fix system creation tests * Roundtrip box vectors to remove vec3 issues * Roundtrip box vectors to remove vec3 issues * Remove unnecessary imports * Apply suggestion from @hannahbaumann * Test mypy failure * Test mypy failure 2 * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * some fixes * small fixes * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test files * Update SepTop results settings * add more tests for adaptive settings and others * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add news entry * Fix for box vectors hybtop * Make test slow again * Make mypy happy * try to make mypy happy again * Adapt gather RFE membrane component types * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * use reference platform for the membrane energy test * Add ChemicalSystem Component validation to Protocols * Add test for chemicalsystem validation * Apply suggestions from code review Co-authored-by: Josh Horton * Update with slight variation to @jthorton 's suggestion * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * point gufe back at main * Drop lipid17 from adaptive settings * just a tiny nit change * fix typo * Add extra class to openfecli check * Add total particle checks to abfe protocol tests * remove lipid17 check from adaptive settings * remove lipid17 check from adaptive settings * Add some checks for number of particles & residues * Fix up some tests * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Minimized protein * Update protein * Revert back to old protein * Apply suggestion from @hannahbaumann * Remove box roundtrip and fix solvent from get_components * Merge conflicts * Update src/openfe/tests/protocols/openmm_abfe/test_abfe_protocol.py Co-authored-by: Irfan Alibay * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix merge conflict * Forgot another one * Move box roundtrip to _pre_equilibrate * First go at modifying the user guide for membranes and ABFE and SepTop * some small fixes * Add adaptive_settings to user guide * Add example code loading protein membrane * Add box vector explanation * Some smaller changes * Make adaptive settings subheading * Some more changes * Add warning adaptive settings * Update docs/guide/setup/defining_protocols.rst Co-authored-by: Irfan Alibay * Addressing review comments * some more updates * Test out subheadings * Restructure user guide * merge conflicts * Small fix * some small updates * addressing review comments * Small fix * address review comments * small fix * Add news entry * Update docs/guide/setup/chemical_systems_and_thermodynamic_cycles.rst Co-authored-by: Alyssa Travitz <31974495+atravitz@users.noreply.github.com> * Update docs/guide/setup/defining_protocols.rst Co-authored-by: Alyssa Travitz <31974495+atravitz@users.noreply.github.com> * Update docs/guide/setup/chemical_systems_and_thermodynamic_cycles.rst Co-authored-by: Alyssa Travitz <31974495+atravitz@users.noreply.github.com> * Addres review comment * Update docs/guide/setup/chemical_systems_and_thermodynamic_cycles.rst Co-authored-by: Josh Horton * Update docs/guide/setup/chemical_systems_and_thermodynamic_cycles.rst Co-authored-by: Josh Horton * Address review comment --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Irfan Alibay Co-authored-by: Alyssa Travitz <31974495+atravitz@users.noreply.github.com> Co-authored-by: Josh Horton Co-authored-by: Josh Horton --- docs/guide/protocols/absolutebinding.rst | 2 + docs/guide/protocols/septop.rst | 2 + ...mical_systems_and_thermodynamic_cycles.rst | 233 +++++++++++++++--- docs/guide/setup/defining_protocols.rst | 36 +++ news/user_guide_membranes.rst | 23 ++ 5 files changed, 260 insertions(+), 36 deletions(-) create mode 100644 news/user_guide_membranes.rst diff --git a/docs/guide/protocols/absolutebinding.rst b/docs/guide/protocols/absolutebinding.rst index f35f573a..9d7580a1 100644 --- a/docs/guide/protocols/absolutebinding.rst +++ b/docs/guide/protocols/absolutebinding.rst @@ -1,3 +1,5 @@ +.. _userguide_abfe_protocol: + Absolute Binding Protocol ========================= diff --git a/docs/guide/protocols/septop.rst b/docs/guide/protocols/septop.rst index bb2e449b..bd660953 100644 --- a/docs/guide/protocols/septop.rst +++ b/docs/guide/protocols/septop.rst @@ -1,3 +1,5 @@ +.. _userguide_septop_protocol: + Separated Topologies Protocol ============================= diff --git a/docs/guide/setup/chemical_systems_and_thermodynamic_cycles.rst b/docs/guide/setup/chemical_systems_and_thermodynamic_cycles.rst index c4983dc0..966bf8b7 100644 --- a/docs/guide/setup/chemical_systems_and_thermodynamic_cycles.rst +++ b/docs/guide/setup/chemical_systems_and_thermodynamic_cycles.rst @@ -1,55 +1,201 @@ .. _userguide_chemicalsystems_and_components: -Chemical Systems, Components and Thermodynamic Cycles +Components, Chemical Systems and Thermodynamic Cycles ===================================================== -.. _userguide_chemical_systems: - -Chemical Systems ----------------- - -A :class:`.ChemicalSystem` represents the end state of an alchemical transformation, -which can then be input to a :class:`.Protocol`. - -A :class:`.ChemicalSystem` **does** contain the following information (when present): - -* exact atomic information (including protonation state) of protein, ligands, co-factors, and any crystallographic - waters -* atomic positions of all explicitly defined components such as ligands or proteins -* the abstract definition of the solvation environment, if present - -A :class:`.ChemicalSystem` does **NOT** include the following: - -* forcefield applied to any component, including details on water model or virtual particles -* thermodynamic conditions (e.g. temperature or pressure) +This page describes the core building blocks used to define simulation states in openfe: +:class:`.Component`\s, which describe what is physically present in a system; +:class:`.ChemicalSystem`\s, which combine components into a complete end state; +and thermodynamic cycles, which connect end states via alchemical transformations. .. _userguide_components: Components ---------- -A :class:`.ChemicalSystem` is composed of many 'component' objects, which together define overall system. +Components are the composable building blocks that define the chemical +composition of a simulated system. Splitting a system into components serves three purposes: -Examples of components include: +1. Alchemical transformations can be easily understood by comparing the differences in components. +2. Components can be reused to compose different systems. +3. :class:`.Protocol`\s can apply component-specific behaviour, e.g. different force fields per component. -* :class:`.ProteinComponent`: an entire biological assembly, typically the contents of a PDB file. -* :class:`.SmallMoleculeComponent`: typically ligands and cofactors -* :class:`.SolventComponent`: solvent conditions -Splitting the total system into components serves three purposes: +Component types — overview +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. list-table:: + :header-rows: 1 + :widths: 25 30 45 + + * - Component + - Role + - Key notes + * - :class:`.ProteinComponent` + - Biological assembly + - Typically the contents of a PDB file. May include crystallographic waters and ions (defined as HETATM entries), + and disulfide bonds (defined via CONECT records). + * - :class:`.SmallMoleculeComponent` + - Ligands and cofactors + - Can optionally contain atomic partial charges. If present, those will be used in the simulation. + * - :class:`.SolventComponent` + - Abstract solvent definition + - Defines solvent conditions and ion concentration. Does **not** include coordinates or box vectors. Solvent is added by the protocol at runtime. + * - :class:`.SolvatedPDBComponent` + - Explicitly solvated system + - Includes atomic coordinates and box vectors. Solvent is already present, + the protocol does not add any further solvation. + * - :class:`.ProteinMembraneComponent` + - Protein-membrane complex + - Subclass of :class:`.SolvatedPDBComponent`. Includes protein, membrane, solvent, + and box vectors. Replaces :class:`.ProteinComponent` in membrane systems. + +.. _userguide_solvation_models: + +Abstract vs explicit solvation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +These two approaches are **mutually exclusive**: + +* **Abstract solvation** — use a :class:`.SolventComponent`. The protocol adds solvent + during system preparation. +* **Explicit solvation** — use a :class:`.SolvatedPDBComponent` or + :class:`.ProteinMembraneComponent`. Solvent molecule coordinates (waters and ions) are explicitly defined in the inputs. + +Either define the solvent abstractly, or provide a fully solvated system — do not mix +both for the same leg of a transformation. + +.. note:: + + Some protocols, such as :class:`.SepTopProtocol` and :class:`.AbsoluteBindingProtocol`, + use a single :class:`.ChemicalSystem` to represent both the complex and solvent legs. + In this case, a :class:`.ChemicalSystem` may contain both a :class:`.SolventComponent` + and a :class:`.ProteinMembraneComponent`. However, these apply to *different* legs: the + :class:`.SolventComponent` is used only for the solvent leg, and the + :class:`.ProteinMembraneComponent` (which is already explicitly solvated) is used only + for the complex leg. The mutual exclusivity rule still holds per leg. + +Box vectors for explicitly solvated systems +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The components :class:`.SolvatedPDBComponent` and :class:`.ProteinMembraneComponent` +require periodic box vectors. These can be provided in three ways: + +1. **CRYST record in the PDB file** — OpenMM reads box vectors automatically. No additional arguments are needed:: + + membrane_protein = openfe.ProteinMembraneComponent.from_pdb_file('./protein_membrane.pdb') + +2. **Manual specification** — box vectors can be provided explicitly as numpy arrays with OpenFF units in OpenMM format via the ``box_vectors`` argument:: + + import numpy as np + import openff.units as offunit + + box_vectors = np.array([ + [6.9587, 0.0, 0.0], + [0.0, 5.9164, 0.0], + [0.0, 0.0, 9.2692] + ]) * offunit.nanometer + + membrane_protein = openfe.ProteinMembraneComponent.from_pdb_file( + './protein_membrane.pdb', box_vectors=box_vectors + ) + +3. **Inference from atomic coordinates** — box vectors can be estimated from the atomic + positions by passing ``infer_box_vectors=True``:: + + membrane_protein = openfe.ProteinMembraneComponent.from_pdb_file( + './protein_membrane.pdb', infer_box_vectors=True + ) + + .. warning:: + + Inferring box vectors from atomic positions can be inaccurate if the PDB originates + from a previous simulation where atoms may be distributed across periodic images. + + +.. _userguide_chemical_systems: + +ChemicalSystem +-------------- + +A :class:`.ChemicalSystem` is composed of components that together describe a model of the system to be simulated. +simulated system. It represents the **end state** of an alchemical transformation +and is the primary input a :class:`.Protocol` consumes to define a simulation state. + +**What a ChemicalSystem defines** + +* Exact atomic information (including protonation state) of protein, ligands, + cofactors, and any crystallographic waters. +* Atomic positions of all explicitly defined components such as ligands or proteins. +* The abstract or explicit definition of the solvent environment (SolventComponent). + +**What a ChemicalSystem does NOT define**, and are instead handled by the Protocol: + +Any simulation parameters including: +* Forcefield applied to any component, including water model or virtual particles. +* Thermodynamic conditions (e.g. temperature or pressure). +* These are handled by the :class:`.Protocol`. + +.. _userguide_system_composition: + +System composition examples +--------------------------- + +The components that make up each :class:`.ChemicalSystem` depend on the protocol and +the nature of the system. The table below summarises the composition for each combination. + + +.. note:: + + Protocol-specific behaviour: + For :class:`.SepTopProtocol` and :class:`.AbsoluteBindingProtocol`, a single + :class:`.ChemicalSystem` represents both legs of the thermodynamic cycle. The protocol + determines internally what is the complex leg and what is the solvent leg. + This differs from the :class:`.RelativeHybridTopologyProtocol`, where each leg (e.g. complex and solvent) is defined by + separate :class:`.ChemicalSystem`\s. This behaviour is expected to change in future versions. + +.. list-table:: + :header-rows: 1 + :widths: 20 40 40 + + * - System + - :ref:`RBFE ` (:class:`.RelativeHybridTopologyProtocol`) + - :ref:`SepTop ` / :ref:`ABFE ` (:class:`.SepTopProtocol`, :class:`.AbsoluteBindingProtocol`) + * - **Standard protein–ligand** + - | **Complex leg:** + | :class:`.ProteinComponent` + :class:`.SmallMoleculeComponent`\s + :class:`.SolventComponent` + | + | **Solvent leg:** + | :class:`.SmallMoleculeComponent`\s + :class:`.SolventComponent` + - | **Single ChemicalSystem (both legs):** + | :class:`.ProteinComponent` + :class:`.SmallMoleculeComponent`\s + :class:`.SolventComponent` + * - **Membrane system** + - | **Complex leg:** + | :class:`.ProteinMembraneComponent` + :class:`.SmallMoleculeComponent`\s + | *(no* :class:`.SolventComponent` *— already explicitly solvated)* + | + | **Solvent leg:** + | :class:`.SmallMoleculeComponent`\s + :class:`.SolventComponent` + - | **Single ChemicalSystem (both legs):** + | :class:`.ProteinMembraneComponent` + :class:`.SmallMoleculeComponent`\s + :class:`.SolventComponent` + | *(protocol applies* :class:`.SolventComponent` *only in the solvent leg)* -1. alchemical transformations can be easily understood by comparing the differences in components. -2. components can be reused to compose different systems. -3. :class:`.Protocol`\s can have component-specific behavior. E.g. different force fields for each component. Thermodynamic Cycles -------------------- -We can now describe a thermodynamic cycle as a set of :class:`.ChemicalSystem`\s. -The exact end states to construct are detailed in the :ref:`pages for each specific Protocol `. +A thermodynamic cycle can be described as a set of :class:`.ChemicalSystem`\s (nodes) connected by +alchemical transformations (edges). The :class:`.Protocol` defines how the +:class:`.ChemicalSystem`\s map onto the cycle and how they are used in practice. +The same :class:`.ChemicalSystem` can be reused across multiple thermodynamic states +depending on the protocol. For details of which end states to construct, consult the +:ref:`pages for each specific Protocol `. -As an example, we can construct the classic relative binding free energy cycle by defining four components: two ligands, -a protein, and a solvent: +Hybrid topology RBFE example +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +As an example, the relative binding free energy cycle requires four +:class:`.ChemicalSystem`\s — one for each node in the cycle: .. figure:: ../protocols/img/rbfe_thermocycle.png :scale: 40% @@ -75,12 +221,27 @@ a protein, and a solvent: ligand_B_complex = openfe.ChemicalSystem(components={'ligand': ligand_B, 'protein': protein, 'solvent': solvent}) # ligand_A + solvent ligand_A_solvent = openfe.ChemicalSystem(components={'ligand': ligand_A, 'solvent': solvent}) - # ligand_A + solvent + # ligand_B + solvent ligand_B_solvent = openfe.ChemicalSystem(components={'ligand': ligand_B, 'solvent': solvent}) +Explicitly solvated variant +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When using a :class:`.SolvatedPDBComponent` or :class:`.ProteinMembraneComponent`, replace :class:`.ProteinComponent` +and :class:`.SolventComponent` for the complex leg. No separate :class:`.SolventComponent` +is required: + +:: + + # explicitly solvated protein-membrane complex (box vectors read from CRYST1 record) + protein_membrane = openfe.ProteinMembraneComponent.from_pdb_file('./protein_membrane.pdb') + + # ligand_A + explicitly solvated protein-membrane — no SolventComponent needed + ligand_A_complex = openfe.ChemicalSystem(components={'ligand': ligand_A, 'protein_membrane': protein_membrane}) + See Also -------- -* To see how to construct a :class:`.ChemicalSystem` \s from your files, see :ref:`the cookbook entry on loading molecules ` -* For details of what thermodynamic cycles to construct, consult the :ref:`pages for each specific Protocol ` +* To see how to construct a :class:`.ChemicalSystem` from your files, see :ref:`the cookbook entry on loading molecules ` +* For details of which thermodynamic cycles to construct, consult the :ref:`pages for each specific Protocol ` diff --git a/docs/guide/setup/defining_protocols.rst b/docs/guide/setup/defining_protocols.rst index e1d4a39b..700f17f1 100644 --- a/docs/guide/setup/defining_protocols.rst +++ b/docs/guide/setup/defining_protocols.rst @@ -10,6 +10,8 @@ there are multiple available ``Protocol``\s to choose from. For example, included in the ``openfe`` package are the following: * :class:`.RelativeHybridTopologyProtocol` + * :class:`.AbsoluteBindingProtocol` + * :class:`.SepTopProtocol` * :class:`.AbsoluteSolvationProtocol` * :class:`.PlainMDProtocol` @@ -47,6 +49,40 @@ For example, to customise the production run length of the RFE Protocol:: protocol = openmm_rfe.RelativeHybridTopologyProtocol(settings) +Adaptive Settings +~~~~~~~~~~~~~~~~~ + +.. warning:: + + The ``_adaptive_settings()`` method is experimental and subject to change. + +In addition to the ``.default_settings()`` method, some protocols +provide an ``_adaptive_settings`` method. This method generates recommended settings +based on properties of the input :class:`.ChemicalSystem`\s and, where required, the :class:`.AtomMapping`. + +For example:: + + from openfe.protocols import openmm_rfe + + settings = openmm_rfe.RelativeHybridTopologyProtocol._adaptive_settings( + stateA=stateA, + stateB=stateB, + mapping=mapping, + ) + + protocol = openmm_rfe.RelativeHybridTopologyProtocol(settings) + +The adaptive settings may modify parameters based on properties of the input systems. +For example (:class:`.RelativeHybridTopologyProtocol`): + +* Transformations involving a change in net charge use a larger number of lambda windows and longer production simulations. +* If both states contain a :class:`.ProteinComponent`, the solvation padding is set to 1 nm. + +Optionally, you can pass a preexisting settings object to the ``_adaptive_settings`` method via the ``initial_settings`` argument. If provided, an adapted copy of these settings will be returned instead +of using the default settings. + +In systems containing membrane-protein complexes (i.e. using a +:class:`.ProteinMembraneComponent`), adaptive settings select a membrane-appropriate barostat, the ``MonteCarloMembraneBarostat``. Creating Transformations from Protocols ----------------------------------------- diff --git a/news/user_guide_membranes.rst b/news/user_guide_membranes.rst new file mode 100644 index 00000000..0234c32e --- /dev/null +++ b/news/user_guide_membranes.rst @@ -0,0 +1,23 @@ +**Added:** + +* + +**Changed:** + +* Updated the chemical systems user guide and the defining protocols user guide to reflect recent protocol updates, including adding membrane support. + +**Deprecated:** + +* + +**Removed:** + +* + +**Fixed:** + +* + +**Security:** + +*