Update advice for different devices

PiperOrigin-RevId: 700993687
Augustin Zidek
2024-11-28 16:44:11 +00:00
parent 1490230430
commit e56abb7a55
2 changed files with 48 additions and 8 deletions


@@ -1,9 +1,36 @@
# Known Issues
### Devices other than NVIDIA A100 or H100
## Numerical performance for different GPU devices
There are currently known unresolved numerical issues with using devices other
than NVIDIA A100 and H100. For now, accuracy has only been validated for A100
and H100 GPU device types. See
Some GPU types have numerical performance issues that are under
investigation; see
[this Issue](https://github.com/google-deepmind/alphafold3/issues/59) for
tracking.
### Verified devices
We have run successful large-scale numerical tests for the following devices, up
to the listed maximum number of tokens:
- H100 80 GB: up to 5,120 tokens.
- A100 80 GB: up to 5,120 tokens.
- A100 40 GB: up to 4,352 tokens with
[unified memory configuration](https://github.com/google-deepmind/alphafold3/blob/main/docs/performance.md#nvidia-a100-40-gb).
- P100 16 GB: up to 1,024 tokens.
Note that the 80 GB devices can run larger targets using unified memory, but
outputs have only been verified on particular examples rather than a large-scale
test set.
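As an illustration only, the verified limits above can be written as a small lookup table. This is a minimal sketch: the names `VERIFIED_MAX_TOKENS` and `is_verified` are hypothetical and not part of AlphaFold 3; the values are copied from the list above.

```python
from typing import Optional

# Hypothetical lookup table (not part of AlphaFold 3).
# Values are the verified token limits listed above.
VERIFIED_MAX_TOKENS = {
    "H100 80 GB": 5120,
    "A100 80 GB": 5120,
    "A100 40 GB": 4352,  # requires the unified memory configuration
    "P100 16 GB": 1024,
}


def is_verified(device: str, num_tokens: int) -> bool:
    """Return True if num_tokens is within the verified limit for device.

    Devices that are not in the table are reported as not verified.
    """
    limit: Optional[int] = VERIFIED_MAX_TOKENS.get(device)
    return limit is not None and num_tokens <= limit
```

For example, `is_verified("A100 40 GB", 4352)` is true, while a 2,048-token input on a P100 16 GB falls outside the verified range.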
#### CUDA Capability 7.x GPUs: known issues
All CUDA Capability 7.x GPUs (e.g. V100) produce obviously bad output, with many
clashing residues (the clashes cause a ranking score of -99 or lower). With a
small fix relating to `bfloat16` conversion to `float32`, outputs look normal,
but there are numerical performance regressions for some bucket sizes (tested on
V100 devices).
#### CUDA Capability 6.x GPUs: no known issues
CUDA Capability 6.x GPUs give reasonable output, but large-scale numerical
testing has only been done for P100.


@@ -98,14 +98,27 @@ AlphaFold 3 can run on inputs of size up to 4,352 tokens on a single NVIDIA A100
While numerically accurate, this configuration will have lower throughput
compared to the setup on the NVIDIA A100 (80 GB), due to less available memory.
#### Devices other than NVIDIA A100 or H100
#### NVIDIA P100
There are currently known unresolved numerical issues with using devices other
than NVIDIA A100 and H100. For now, accuracy has only been validated for A100
and H100 GPU device types. See
AlphaFold 3 can run on inputs of size up to 1,024 tokens on a single NVIDIA P100
with no configuration changes needed.
#### NVIDIA V100
There are known issues with V100 devices. See
[this Issue](https://github.com/google-deepmind/alphafold3/issues/59) for
tracking.
#### Other devices
There are known issues with CUDA Capability 7.x devices. See
[this Issue](https://github.com/google-deepmind/alphafold3/issues/59) for
tracking.
CUDA Capability 6.x and 8.x devices other than those listed explicitly here are
believed to work for AlphaFold 3, but large-scale testing has only been
performed for the devices mentioned above.
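The guidance for other devices can be summarised by the major version of the CUDA Capability. The sketch below is illustrative only: `capability_status` is a hypothetical helper, not an AlphaFold 3 function, and it covers only devices that are not listed explicitly above (the explicitly listed devices should be checked first).

```python
def capability_status(capability: str) -> str:
    """Classify a CUDA Capability string such as "7.0" or "8.6".

    Hypothetical helper that mirrors the statements above for devices
    that are not listed explicitly in this document.
    """
    major = int(capability.split(".")[0])
    if major == 7:
        # Known issues; see the tracking issue linked above.
        return "known issues"
    if major in (6, 8):
        # Believed to work, but not covered by large-scale testing.
        return "believed to work"
    # Other major versions are not covered by the statement above.
    return "not covered"
```

The helper takes the capability as a string so that it stays independent of how the value is obtained from the device.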
## Compilation Buckets
To avoid excessive re-compilation of the model, AlphaFold 3 implements