[Beowulf] Theoretical peak performance of DGX A100

harsh_google lastname harshscience777 at gmail.com
Thu Jun 3 12:07:41 UTC 2021

 I am calculating the theoretical peak (FP64) performance of the Nvidia DGX
A100 system.

Now, the A100 datasheet lists FP64 performance as 9.7 TFLOPS per GPU.
The two AMD 7742 CPUs give 128 cores x 2.25 GHz base clock x 16 FP64 ops /
cycle = 4.6 TFLOPS.
With 8 GPUs per box, this gives a total of 8 x 9.7 + 4.6 = 82.2 TFLOPS per
DGX A100.
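
To make the per-box arithmetic easy to check, here is a minimal Python
sketch of it. The variable names are my own, and the count of 8 A100 GPUs
per DGX A100 is an assumption taken from the product spec; the other
figures are the ones quoted above.

```python
# Per-node FP64 peak for one DGX A100, using the figures quoted above.
GPUS_PER_NODE = 8          # assumption: 8 A100 GPUs per DGX A100
GPU_FP64_TFLOPS = 9.7      # A100 datasheet FP64 figure quoted above
CPU_CORES = 128            # two 64-core AMD 7742 CPUs
CPU_BASE_GHZ = 2.25        # base clock, not boost
CPU_FP64_PER_CYCLE = 16    # FP64 ops per core per cycle, as assumed above

gpu_tflops = GPUS_PER_NODE * GPU_FP64_TFLOPS
# cores x GHz x ops/cycle gives GFLOPS; divide by 1000 for TFLOPS
cpu_tflops = CPU_CORES * CPU_BASE_GHZ * CPU_FP64_PER_CYCLE / 1000.0
node_tflops = gpu_tflops + cpu_tflops

print(f"GPU: {gpu_tflops:.1f}  CPU: {cpu_tflops:.1f}  "
      f"node: {node_tflops:.1f} TFLOPS")
```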

Here is my problem. For any system with DGX A100 on top500.org, the numbers
just don't add up. For example, Selene has 560 DGX boxes, but its
theoretical peak is listed as 79.2 PFLOPS, whereas I would expect
46 PFLOPS (i.e. 82.2 TFLOPS x 560). The same is true for any other
DGX-based system listed on top500. What am I missing here?
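
To put a number on the gap, here is a short Python sketch that scales my
per-box estimate to the whole machine and compares it with the listed
figure. The variable names are my own; the inputs are the values quoted
above, not anything taken from the top500 methodology.

```python
# Scale the per-node estimate to Selene and compare with the listed Rpeak.
NODES = 560              # DGX A100 boxes in Selene, as quoted above
NODE_TFLOPS = 82.2       # my per-node FP64 estimate
LISTED_PFLOPS = 79.2     # theoretical peak listed on top500.org

expected_pflops = NODES * NODE_TFLOPS / 1000.0   # TFLOPS -> PFLOPS
ratio = LISTED_PFLOPS / expected_pflops

print(f"expected: {expected_pflops:.1f} PFLOPS, "
      f"listed: {LISTED_PFLOPS:.1f} PFLOPS, "
      f"listed/expected: {ratio:.2f}")
```

So the listed peak is roughly 1.7 times what my estimate predicts, which
is too large to be a rounding or clock-speed detail.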


Harsh Hemani
