[Beowulf] Definition of HPC

Craig Tierney - NOAA Affiliate craig.tierney at noaa.gov
Mon Apr 22 12:36:28 PDT 2013


On Mon, Apr 22, 2013 at 11:40 AM, Mark Hahn <hahn at mcmaster.ca> wrote:

>>> understood, but how did you decide that was actually a good thing?
>>
>> Mark,
>>
>> Because it stopped the random out-of-memory conditions that we were
>> having.
>
> aha, so basically "rebooting windows resolves my performance problems" ;)


Not really.  We are telling the kernel "we know better than you, flush your
buffers".  Maybe in a perfect world we would bring in some kernel engineers
and make sure that the OOM killer and the other memory subsystem controllers
work as we want when there is no swap.  That isn't something we have the
resources to do.  While we figured this out on our in-house white-box
clusters, it is also needed on the more "supported" SGI ICE system.
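
For illustration only, here is a minimal sketch of what "flush your buffers" could look like on a Linux node. It assumes the flush goes through the standard /proc/sys/vm/drop_caches interface; this message does not say which mechanism is actually used. The function name flush_buffers and its parameters are hypothetical, and writing to the real path requires root.

```python
import os

# Standard Linux procfs knob for dropping clean caches (root only).
# Assumption: the message does not name the mechanism actually used.
DROP_CACHES = "/proc/sys/vm/drop_caches"


def flush_buffers(mode=3, path=DROP_CACHES):
    """Ask the kernel to drop clean caches.

    mode 1 = page cache, 2 = dentries and inodes, 3 = both.
    `path` is a parameter only so the function can be exercised
    against a scratch file without touching the real knob.
    """
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2, or 3")
    # drop_caches frees only clean pages, so write dirty data back first.
    os.sync()
    with open(path, "w") as f:
        f.write(f"{mode}\n")
    return mode

# Example (as root, e.g. from a job prologue or epilogue script):
#   flush_buffers()
```

Something like this would typically run between jobs rather than while an application is running, since dropping caches forces later reads to go back to disk or the parallel file system.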


>>> I'm guessing this may have been a much bigger deal on strongly NUMA
>>> machines of a certain era (high-memory ia64 SGI, older kernels).
>
> and the situation you're referring to was actually on Altix, right?
> (therefore not necessarily a good idea with current machines and kernels.)

>
No, this is on two-socket Intel x86_64 systems.  Standard cluster nodes
running IB, Lustre, and RHEL6 (but we did the same thing in the past on
RHEL5).

Craig