Document the discrepancy between AlphaFold 3 and AlphaFold Server in Known Issues

Big thanks to @stianale for reporting this in https://github.com/google-deepmind/alphafold3/issues/492.

PiperOrigin-RevId: 872892863
Change-Id: Ia24bb492daea44534be4ac743fb304bc72fe9741
2026-06-02 11:54:36 +08:00 · 2026-02-20 07:33:38 -08:00
parent ea1667690e
commit 4ca8a65692
1 changed files with 47 additions and 0 deletions
--- a/docs/known_issues.md
+++ b/docs/known_issues.md
@@ -13,3 +13,50 @@ Between commits https://github.com/google-deepmind/alphafold3/commit/f8df1c7 and
 https://github.com/google-deepmind/alphafold3/commit/4e4023c, AlphaFold 3
 handled incorrectly any two-letter atoms (e.g. Cl, Br) in ligands defined using
 SMILES strings.
+
+## MSA discrepancy between AlphaFold 3 and AlphaFold Server
+
+### The root cause of the problem
+
+The released AlphaFold 3 and AlphaFold Server use the same model weights and
+equivalent featurisation and model code. However, the way they run genetic
+search is slightly different. The released AlphaFold 3 searches each database in
+one go, while AlphaFold Server has a sharded version of each database (split
+into multiple smaller FASTA files) and searches all of the shards in parallel.
+The results of these parallel searches are then merged together at the end.
+
+The discrepancy arises because AlphaFold Server produces a different (deeper)
+MSA in some cases. AlphaFold Server runs sharded Jackhmmer without the `--domZ`
+flag, which has to be set together with the `-Z` flag and to the same value.
+As a result, AlphaFold Server effectively runs with a roughly 100× more
+permissive `--domE` filter, so more sequences are sometimes included in the
+MSA.
+
+We are keeping the behaviour unchanged in both the released AlphaFold 3 and
+AlphaFold Server. However, users with local installs have the option to
+replicate AlphaFold Server behaviour locally. In our large-scale tests the
+difference did not matter; only very specific inputs get better accuracy with
+the deeper MSA.
+
+See https://github.com/google-deepmind/alphafold3/issues/492 for an example
+input where a protein-DNA complex gets significantly higher ipTM and pTM with
+AlphaFold Server compared to a local run.
+
+### Replicating AlphaFold Server behaviour locally
+
+If you want to replicate AlphaFold Server behaviour (i.e. better folding
+accuracy in some cases), you can set the Jackhmmer/Nhmmer `--domE` flag to
+100× its default value.
+
+Alternatively, you can run the sharded MSA search without setting the `--domZ`
+value; this requires modifying the code. We added support for searching
+against sharded databases in AlphaFold 3 in
+https://github.com/google-deepmind/alphafold3/commit/805adc3863841d83d631ccd18136ad58ce3ecb34
+and documented how to run AlphaFold 3 with sharded databases in
+https://github.com/google-deepmind/alphafold3/blob/main/docs/performance.md#sharded-genetic-databases.
+Sharded search can speed up the genetic search by 10–30× (potentially even
+more, depending on hardware).
+
+In general, we recommend experimenting with the MSA if you see a prediction
+with low predicted confidence. Typically, adding more *relevant* sequences to
+the MSA will increase AlphaFold prediction accuracy and model confidence scores.
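
The first workaround in the added section, raising `--domE` to 100× its default, can be sketched in Python. This is a minimal illustration under stated assumptions, not AlphaFold 3 code: the helper `scale_dom_e` is hypothetical, and the `--domE` value (`1e-4`) and the file names in the example command are placeholders rather than the values a local install actually uses.

```python
from typing import Sequence


def scale_dom_e(args: Sequence[str], factor: float = 100.0) -> list[str]:
    """Returns a copy of `args` with the value after `--domE` multiplied.

    Hypothetical helper for illustration; it is not part of AlphaFold 3.
    """
    scaled = list(args)  # Copy so the caller's list is left untouched.
    if "--domE" not in scaled:
        raise ValueError("argument list has no --domE flag")
    value_index = scaled.index("--domE") + 1
    if value_index >= len(scaled):
        raise ValueError("--domE is missing its value")
    # "g" formatting keeps 6 significant digits, enough for a threshold.
    scaled[value_index] = format(float(scaled[value_index]) * factor, "g")
    return scaled


# Placeholder command line: the --domE value and file names are made up.
base_args = ["jackhmmer", "--domE", "1e-4", "query.fasta", "database.fasta"]
relaxed_args = scale_dom_e(base_args)
print(" ".join(relaxed_args))
```

The helper only rewrites an argument list; where the Jackhmmer or Nhmmer command is built in a local install, and what the default `--domE` value is there, must be checked against the installed code.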