alphafold3/docs/known_issues.md
Augustin Zidek 4ca8a65692 Document the discrepancy between AlphaFold 3 and AlphaFold Server in Known Issues
Big thanks to @stianale for reporting this in https://github.com/google-deepmind/alphafold3/issues/492.

PiperOrigin-RevId: 872892863
Change-Id: Ia24bb492daea44534be4ac743fb304bc72fe9741
2026-02-20 07:34:06 -08:00


Known Issues

Numerical performance for CUDA Capability 7.x GPUs

All CUDA Capability 7.x GPUs (e.g. V100) produce visibly wrong output with many clashing residues (the clashes result in a ranking score of -99 or lower) unless the environment variable XLA_FLAGS includes --xla_disable_hlo_passes=custom-kernel-fusion-rewriter.
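
As a minimal sketch of one way to apply this workaround from Python, the helper below appends the flag to XLA_FLAGS while keeping any flags already present. The helper name `add_xla_flag` is hypothetical and not part of AlphaFold 3. The assumption here is that the variable is set before JAX initialises its backend, so this would need to run before the model code starts.

```python
import os

# The flag named in the known issue above.
DISABLE_FUSION_REWRITER = "--xla_disable_hlo_passes=custom-kernel-fusion-rewriter"


def add_xla_flag(env, flag=DISABLE_FUSION_REWRITER):
    """Append `flag` to env["XLA_FLAGS"] unless it is already present.

    Hypothetical helper for illustration; `env` is any dict-like mapping,
    such as os.environ.
    """
    current = env.get("XLA_FLAGS", "")
    if flag not in current.split():
        # Keep existing flags and separate them from the new one with a space.
        env["XLA_FLAGS"] = (current + " " + flag).strip()
    return env["XLA_FLAGS"]


# Assumption: this runs before JAX is imported and initialised.
add_xla_flag(os.environ)
```

This sketch does not merge with an existing --xla_disable_hlo_passes entry. If XLA_FLAGS already disables other passes, the pass names would need to be combined by hand.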

Incorrect handling of two-letter atoms in SMILES ligands

Between commits https://github.com/google-deepmind/alphafold3/commit/f8df1c7 and https://github.com/google-deepmind/alphafold3/commit/4e4023c, AlphaFold 3 incorrectly handled any two-letter atoms (e.g. Cl, Br) in ligands defined using SMILES strings.

MSA discrepancy between AlphaFold 3 and AlphaFold Server

The root cause of the problem

The released AlphaFold 3 and AlphaFold Server use the same model weights and equivalent featurisation and model code. However, the way they run genetic search is slightly different. The released AlphaFold 3 searches each database in one go, while AlphaFold Server has a sharded version of each database (split into multiple smaller FASTA files) and searches all of the shards in parallel. The results of these parallel searches are then merged together at the end.
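
The difference between the two search strategies can be illustrated with a toy sketch. Everything here is hypothetical: `toy_search` is a plain substring match standing in for a real Jackhmmer run, and the database is a short in-memory list rather than FASTA files. It only shows the shape of the approach: split, search shards in parallel, merge.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_shards(records, num_shards):
    """Distribute records round-robin into `num_shards` smaller lists."""
    return [records[i::num_shards] for i in range(num_shards)]


def toy_search(query, records):
    """Stand-in for a real search tool: names of sequences containing `query`."""
    return [name for name, sequence in records if query in sequence]


def sharded_search(query, records, num_shards):
    """Search every shard in parallel, then merge the per-shard hits."""
    shards = split_into_shards(records, num_shards)
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        per_shard_hits = list(pool.map(lambda s: toy_search(query, s), shards))
    # Merge step: flatten the per-shard hit lists into one sorted list.
    return sorted(hit for hits in per_shard_hits for hit in hits)


# Made-up database of (name, sequence) pairs, for illustration only.
DATABASE = [
    ("seq1", "MKTAYIAK"),
    ("seq2", "GGGTAYGG"),
    ("seq3", "PPPPPP"),
    ("seq4", "TAYTAY"),
]

one_go = sorted(toy_search("TAY", DATABASE))
sharded = sharded_search("TAY", DATABASE, num_shards=2)
print(one_go == sharded)
```

In this toy the merged result matches the single-pass result because the hit criterion does not depend on database size. Real searches filter on E-values, which do scale with the assumed database size, so a sharded search has to be told the full database size to reproduce the unsharded thresholds.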

The discrepancy is caused by a different (deeper) MSA on AlphaFold Server in some cases. We discovered that AlphaFold Server runs sharded Jackhmmer without the --domZ flag, which has to be set together with the --Z flag and to the same value. As a result, AlphaFold Server effectively runs with a roughly 100× more permissive --domE filter, so more sequences are sometimes included in the MSA.

We are keeping the behaviour unchanged in both the released AlphaFold 3 and AlphaFold Server. However, we are giving users with local installs an option to replicate AlphaFold Server behaviour locally. In our large-scale tests the difference did not matter; only very specific inputs get better accuracy with the deeper MSA.

See https://github.com/google-deepmind/alphafold3/issues/492 for an example input where a protein-DNA complex gets significantly higher ipTM and pTM with AlphaFold Server compared to a local run.

Replicating AlphaFold Server behaviour locally

If you want to replicate AlphaFold Server behaviour (i.e. better folding accuracy in some cases), you can increase the value of the Jackhmmer/Nhmmer --domE flag by 100× compared to its default value.
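
A minimal sketch of the 100× adjustment, assuming you assemble the Jackhmmer/Nhmmer option list yourself. The helper `scale_dom_e`, the example options, and the value 0.001 are hypothetical placeholders for illustration, not AlphaFold 3's actual defaults or API.

```python
from typing import List


def scale_dom_e(options: List[str], fallback_dom_e: float,
                factor: float = 100.0) -> List[str]:
    """Return a copy of `options` with the --domE value multiplied by `factor`.

    If --domE is absent, it is appended as `fallback_dom_e * factor`.
    Hypothetical helper; `options` holds option flags only, no positional
    arguments such as the query or database paths.
    """
    scaled = list(options)  # Copy so the caller's list is left untouched.
    if "--domE" in scaled:
        value_index = scaled.index("--domE") + 1
        scaled[value_index] = format(float(scaled[value_index]) * factor, "g")
    else:
        scaled += ["--domE", format(fallback_dom_e * factor, "g")]
    return scaled


# Placeholder options; substitute the ones your pipeline actually uses.
example_options = ["--cpu", "8", "--domE", "0.001"]
print(scale_dom_e(example_options, fallback_dom_e=0.001))
```

Check the value your own setup passes for --domE before scaling it, since the starting point here is only an example.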

Alternatively, you can run the sharded MSA search without setting the --domZ value, although you would have to modify the code to do so. We added support for searching against sharded databases in AlphaFold 3 in 805adc3863, and the way to run AlphaFold 3 with sharded databases is documented in https://github.com/google-deepmind/alphafold3/blob/main/docs/performance.md#sharded-genetic-databases. It can provide a 10–30× speedup (potentially even more, depending on hardware) of the genetic search.

In general, we recommend experimenting with the MSA if you are seeing a prediction with low predicted confidence. Typically, adding more relevant sequences to the MSA will increase AlphaFold prediction accuracy and model confidence scores.