Mirror of https://github.com/google-deepmind/alphafold3.git (synced 2026-06-02 11:54:36 +08:00)
Update advice for different devices
PiperOrigin-RevId: 700993687
@@ -1,9 +1,36 @@
 # Known Issues

-### Devices other than NVIDIA A100 or H100
+## Numerical performance for different GPU devices

-There are currently known unresolved numerical issues with using devices other
-than NVIDIA A100 and H100. For now, accuracy has only been validated for A100
-and H100 GPU device types. See
+There are numerical performance issues with some GPU types that are under
+investigation, see
 [this Issue](https://github.com/google-deepmind/alphafold3/issues/59) for
 tracking.
+
+### Verified devices
+
+We have run successful large-scale numerical tests for the following devices and
+maximum number of tokens:
+
+- H100 80 GB: up to 5,120 tokens.
+- A100 80 GB: up to 5,120 tokens.
+- A100 40 GB: up to 4,352 tokens with
+  [unified memory configuration](https://github.com/google-deepmind/alphafold3/blob/main/docs/performance.md#nvidia-a100-40-gb).
+- P100 16 GB: up to 1,024 tokens.
+
+Note that the 80 GB devices can run larger targets using unified memory, but
+outputs have only been verified on particular examples rather than a large-scale
+test set.
+
+#### CUDA Capability 7.x GPUs: known issues
+
+All CUDA Capability 7.x GPUs (e.g. V100) produce obviously bad output, with lots
+of clashing residues (the clashes cause a ranking score of -99 or lower). With a
+small fix relating to `bfloat16` conversion to `float32` outputs look normal,
+but there are numerical performance regressions for some bucket sizes (tested on
+V100 devices).
+
+#### CUDA Capability 6.x GPUs: no known issues
+
+CUDA Capability 6.x GPUs give reasonable output, but large scale numerical
+testing has only been done for P100.
@@ -98,14 +98,27 @@ AlphaFold 3 can run on inputs of size up to 4,352 tokens on a single NVIDIA A100
 While numerically accurate, this configuration will have lower throughput
 compared to the set up on the NVIDIA A100 (80 GB), due to less available memory.

-#### Devices other than NVIDIA A100 or H100
+#### NVIDIA P100

-There are currently known unresolved numerical issues with using devices other
-than NVIDIA A100 and H100. For now, accuracy has only been validated for A100
-and H100 GPU device types. See
+AlphaFold 3 can run on inputs of size up to 1,024 tokens on a single NVIDIA P100
+with no configuration changes needed.
+
+#### NVIDIA V100
+
+There are known issues with V100 devices. See
 [this Issue](https://github.com/google-deepmind/alphafold3/issues/59) for
 tracking.

+#### Other devices
+
+There are known issues with CUDA Capability 7.x devices. See
+[this Issue](https://github.com/google-deepmind/alphafold3/issues/59) for
+tracking.
+
+CUDA Capability 6.x and 8.x devices other than those listed explicitly here are
+believed to work for AlphaFold 3, but large-scale testing has only been
+performed for the devices mentioned above.
+
 ## Compilation Buckets

 To avoid excessive re-compilation of the model, AlphaFold 3 implements
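
The device guidance added by this commit can be read as a small lookup table plus a rule keyed on the CUDA Capability major version. The Python sketch below is illustrative only and is not part of AlphaFold 3: the names `VERIFIED_MAX_TOKENS`, `capability_status`, and `is_verified` are hypothetical, and the values are copied from the text of the diff. How the device name, memory size, and capability string are obtained is left to the caller.

```python
# Hypothetical helper, not part of AlphaFold 3. Values are copied from the
# "Verified devices" list in the diff; the (name, memory in GB) key format is
# an assumption of this sketch.
VERIFIED_MAX_TOKENS = {
    ("H100", 80): 5120,
    ("A100", 80): 5120,
    ("A100", 40): 4352,  # per the docs, needs the unified memory configuration
    ("P100", 16): 1024,
}


def capability_status(compute_capability: str) -> str:
    """Summarise the notes for a CUDA Capability string such as "7.0"."""
    major = int(compute_capability.split(".")[0])
    if major == 7:
        # The docs report known issues for all 7.x devices (e.g. V100).
        return "known issues"
    if major in (6, 8):
        # The docs say 6.x and 8.x devices are believed to work, with
        # large-scale testing only on the explicitly listed devices.
        return "believed to work"
    # Other capability classes are not discussed by class in the docs;
    # verified devices such as H100 are listed by name instead.
    return "not covered"


def is_verified(device: str, memory_gb: int, num_tokens: int) -> bool:
    """True if this device and input size fall within the verified limits."""
    limit = VERIFIED_MAX_TOKENS.get((device, memory_gb))
    return limit is not None and num_tokens <= limit
```

For example, `is_verified("A100", 40, 4352)` is true, while `is_verified("A100", 40, 5120)` is false because 5,120 tokens exceeds the limit verified for the 40 GB device.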