mirror of
https://github.com/google-deepmind/alphafold3.git
synced 2026-06-02 11:54:36 +08:00
Document the discrepancy between AlphaFold 3 and AlphaFold Server in Known Issues
Big thanks to @stianale for reporting this in https://github.com/google-deepmind/alphafold3/issues/492. PiperOrigin-RevId: 872892863 Change-Id: Ia24bb492daea44534be4ac743fb304bc72fe9741
This commit is contained in:
committed by
Copybara-Service
parent
ea1667690e
commit
4ca8a65692
@@ -13,3 +13,50 @@ Between commits https://github.com/google-deepmind/alphafold3/commit/f8df1c7 and
|
||||
https://github.com/google-deepmind/alphafold3/commit/4e4023c, AlphaFold 3
|
||||
handled incorrectly any two-letter atoms (e.g. Cl, Br) in ligands defined using
|
||||
SMILES strings.
|
||||
|
||||
## MSA discrepancy between AlphaFold 3 and AlphaFold Server
|
||||
|
||||
### The root cause of the problem
|
||||
|
||||
The released AlphaFold 3 and AlphaFold Server use the same model weights and
|
||||
equivalent featurisation and model code. However, the way they run genetic
|
||||
search is slightly different. The released AlphaFold 3 searches each database in
|
||||
one go, while AlphaFold Server has a sharded version of each database (split
|
||||
into multiple smaller FASTA files) and searches all of the shards in parallel.
|
||||
The results of these parallel searches are then merged together at the end.
|
||||
|
||||
The discrepancy is caused by a different (deeper) MSA on AlphaFold Server in
|
||||
some cases. We discovered that the issue is caused by running sharded Jackhmmer
|
||||
in AlphaFold Server without the `--domZ` flag (has to be set together with the
|
||||
`--Z` flag and set to the same value) which means that effectively the AlphaFold
|
||||
Server is running with roughly 100× more permissive `--domE` filter. This means
|
||||
more sequences are sometimes included in the MSA.
|
||||
|
||||
We are keeping behaviour unchanged in both the released AlphaFold 3 and in the
|
||||
AlphaFold Server, however, we are giving users with local installs an option to
|
||||
replicate AlphaFold Server behaviour locally. In our large scale tests the
|
||||
difference did not matter, it is only very specific inputs that get better
|
||||
accuracy with the deeper MSA.
|
||||
|
||||
See https://github.com/google-deepmind/alphafold3/issues/492 for an example
|
||||
input where a protein-DNA complex gets significantly higher ipTM and pTM with
|
||||
AlphaFold Server compared to a local run.
|
||||
|
||||
### Replicating AlphaFold Server behaviour locally
|
||||
|
||||
If you want to replicate AlphaFold Server behaviour (i.e. better folding
|
||||
accuracy in some cases), you can increase the value of the Jackhmmer/Nhmmer
|
||||
`--domE` flag by 100× compared to its default value.
|
||||
|
||||
Alternatively, you can run the sharded MSA search while not setting the `--domZ`
|
||||
value – you would have to modify the code to do it. We added support for
|
||||
searching against sharded databases in AlphaFold 3 in
|
||||
https://github.com/google-deepmind/alphafold3/commit/805adc3863841d83d631ccd18136ad58ce3ecb34
|
||||
and the way to run AlphaFold 3 with sharded databases is documented in
|
||||
https://github.com/google-deepmind/alphafold3/blob/main/docs/performance.md#sharded-genetic-databases.
|
||||
It can provide 10–30× speedup (potentially even more, depending on hardware) of
|
||||
the genetic search.
|
||||
|
||||
In general, we recommend experimenting with MSA if you are seeing a prediction
|
||||
with low predicted confidence. Typically adding more *relevant* sequences in the
|
||||
MSA will increase AlphaFold prediction accuracy and model confidence scores.
|
||||
|
||||
Reference in New Issue
Block a user