[Beowulf] Varying performance across identical cluster nodes.
pbisbal at pppl.gov
Mon Feb 19 07:51:58 PST 2018
I know this is an old topic. I'm catching up on months' worth of mailing
list mail right now.
On 09/17/2017 09:09 PM, Christopher Samuel wrote:
> On 15/09/17 04:45, Prentice Bisbal wrote:
>> I'm happy to announce that I finally found the cause this problem: numad.
> Very interesting, it sounds like it was migrating processes onto a
> single core over time! Anything diagnostic in its log?
That's exactly what it was doing. No, I did not see any diagnostics in
the log files, but in some of the documentation I read on numad at the
time, it stated that numad is not good to have enabled for large
multi-core jobs that use a lot of memory, like DB servers and HPC jobs.
More information about the Beowulf