[Beowulf] Best case performance of HPL on EPYC 7742 processor ...

John Hearns hearnsj at gmail.com
Mon Oct 26 04:22:33 PDT 2020


This article might be interesting here:

https://www.dell.com/support/article/en-uk/sln319015/amd-rome-is-it-for-real-architecture-and-initial-hpc-performance?lang=en

And Hello Joshua. Long time no see.

On Sun, 25 Oct 2020 at 23:11, Joshua Mora <joshua_mora at usa.net> wrote:

> Reach out to AMD;
> they have specific instructions (including BIOS/OS settings) and even
> binaries for getting the best performance.
> Don't go by trial and error, as it is very time consuming.
> BLIS also has multiple parameters because it has nested loops, so you may
> also have to try multiple configurations to get optimal performance.
> Just reach out to them.
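
To give a feel for how quickly the BLIS search space grows, here is a minimal sketch that lists the ways a fixed thread count can be split across three of the BLIS loops. The variable names are the loop-threading environment variables described in the BLIS multithreading documentation; the choice of three loops and of 4 threads per rank are assumptions for illustration, not AMD's recommended settings.

```python
from itertools import product
from math import prod

# Loop-level threading variables as named in the BLIS multithreading docs;
# verify them against the BLIS release actually in use.
LOOPS = ("BLIS_JC_NT", "BLIS_IC_NT", "BLIS_JR_NT")


def loop_factorizations(threads, loops=LOOPS):
    """Return every per-loop thread assignment whose product is `threads`."""
    divisors = [d for d in range(1, threads + 1) if threads % d == 0]
    return [
        dict(zip(loops, combo))
        for combo in product(divisors, repeat=len(loops))
        if prod(combo) == threads
    ]


if __name__ == "__main__":
    # 4 threads per rank is an assumed value (one rank per 4-core CCX).
    for cfg in loop_factorizations(4):
        print(" ".join(f"{name}={count}" for name, count in cfg.items()))
```

Each printed line is one candidate environment to benchmark. Even this small case gives several combinations per rank layout, which is why asking AMD for a known-good set saves time.
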
>
> Joshua
>
> ------ Original Message ------
> Received: 04:30 PM CDT, 08/14/2020
> From: Richard Walsh <rbwcnslt at gmail.com>
> To: Beowulf List <beowulf at beowulf.org>
> Subject: [Beowulf] Best case performance of HPL on EPYC 7742 processor ...
>
> > All,
> >
> > What have people achieved on this SKU on a single-node using the stock
> > HPL 2.3 source... ??
> >
> > I have seen a variety of performance claims, even as high as 90% of its
> > nominal per-node peak of 4.608 TFLOPs.  I can now get above 80% of peak,
> > but not higher.  I have heard that getting higher values requires special
> > BIOS settings, including turning off SMT, which allows the chip to turbo
> > higher.  Remember, this is not the 7542 processor, which has 32 cores per
> > chip and the same bandwidth per socket as the 7742, and which can turbo to
> > over 100% of nominal peak for HPL.
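
For reference, the 4.608 TFLOPs figure can be reproduced with a short calculation. The inputs below are assumptions not stated in the message: a dual-socket node, 64 cores per socket, the 2.25 GHz base clock, and 16 double-precision FLOPs per cycle per core. The function names are made up for this sketch.

```python
# Assumed (not stated in the thread): dual-socket node, 64 cores per socket,
# 2.25 GHz base clock, 16 double-precision FLOPs per cycle per core.
SOCKETS = 2
CORES_PER_SOCKET = 64
BASE_GHZ = 2.25
DP_FLOPS_PER_CYCLE = 16


def nominal_peak_gflops(sockets=SOCKETS, cores=CORES_PER_SOCKET,
                        ghz=BASE_GHZ, flops_per_cycle=DP_FLOPS_PER_CYCLE):
    """Nominal peak in GFLOPS, computed at the base clock."""
    return sockets * cores * ghz * flops_per_cycle


def efficiency(measured_gflops, peak_gflops):
    """Fraction of nominal peak reached by a measured HPL result."""
    return measured_gflops / peak_gflops


if __name__ == "__main__":
    peak = nominal_peak_gflops()
    print(f"nominal peak: {peak / 1000:.3f} TFLOPS")
    for frac in (0.80, 0.90):
        print(f"{frac:.0%} of peak: {peak * frac / 1000:.3f} TFLOPS")
```

Because the nominal figure uses the base clock, a part that holds clocks above base during HPL can report more than 100% of it.
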
> >
> > If people have gotten higher single node numbers ... what is your recipe
> > ... ??
> >
> > I am particularly interested in BIOS settings, and maybe surprise
> > settings in the HPL.dat file.  Do higher performing runs require using
> > close to the maximum memory on the node ... ??  As this is single-node, I
> > would not expect the choice of MPI to make a difference.
> > To get to 80% with SMT on in the BIOS, I am building with an older Intel
> > compiler and MKL that still recognizes the MKL_DEBUG_CPU_TYPE=5 setting.
> > Running so that the number of MPI ranks on the node matches the
> > number of CCXs seems to give the best numbers.
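
The one-rank-per-CCX layout can be worked out as below. This is a sketch under assumptions not stated in the message: a dual-socket node, 4 cores per CCX, and one thread per core within each rank. Picking the most nearly square P x Q grid with P <= Q is a common starting point for HPL.dat, not a guaranteed optimum.

```python
import math


def ranks_and_threads(sockets=2, cores_per_socket=64, cores_per_ccx=4):
    """One MPI rank per CCX, one thread per core inside that CCX."""
    ranks = sockets * cores_per_socket // cores_per_ccx
    return ranks, cores_per_ccx


def process_grid(ranks):
    """Most nearly square (P, Q) with P <= Q and P * Q == ranks."""
    p = math.isqrt(ranks)
    while ranks % p:
        p -= 1
    return p, ranks // p


if __name__ == "__main__":
    ranks, threads = ranks_and_threads()
    p, q = process_grid(ranks)
    print(f"ranks={ranks} threads_per_rank={threads} P={p} Q={q}")
```
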
> >
> > Following the tuning instructions from AMD for using BLIS and GCC for
> > the build does not get me there.
> >
> > Thanks,
> >
> > Richard Walsh
> >
>
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> > To change your subscription (digest mode or unsubscribe) visit
> > https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
>