[Beowulf] memory bandwidth scaling

Greg Lindahl lindahl at pbm.com
Tue Oct 6 12:52:52 PDT 2015


On Tue, Oct 06, 2015 at 12:35:30PM -0700, mathog wrote:

> Lately I have been working on a system with >512 GB of RAM and a lot
> of processors.
> This wouldn't be at all a cost effective beowulf node, but it is a
> godsend when the problems being addressed require huge amounts of
> memory and do not partition easily to run on multiple nodes.

Yep, many supercomputer bids include a cluster of identical machines,
and a couple of large-memory nodes for pre- and post-processing.
Might as well stuff as many CPU cores on the large-memory nodes as is
cost effective, especially these days where more sockets == more
memory.

> This machine is also prone to locking up (to the point it doesn't
> answer terminal keystrokes from a remote X11 terminal) when writing
> huge files back to disk.

Watch the number of dirty pages (cat /proc/meminfo | grep Dirty).  At
blekko we had a "polite writer" routine that inserted short sleeps
whenever Dirty got too high. In theory recent Linux kernels throttle
the specific process that's generating the dirty pages, but in
practice the throttling misfires often enough that it's smart to have
big writers pace themselves, if you can.
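
A minimal sketch of the idea (not the actual blekko code; the
threshold and the 100 ms nap are made-up numbers you'd tune for your
own disks):

    #include <stdio.h>
    #include <time.h>

    /* Read the Dirty: line from /proc/meminfo; returns kB, or -1 on error. */
    static long dirty_kb(void)
    {
        FILE *f = fopen("/proc/meminfo", "r");
        char line[128];
        long kb = -1;

        if (!f)
            return -1;
        while (fgets(line, sizeof line, f))
            if (sscanf(line, "Dirty: %ld kB", &kb) == 1)
                break;
        fclose(f);
        return kb;
    }

    /* Nap in 100 ms steps until Dirty drops below max_dirty_kb. */
    static void polite_wait(long max_dirty_kb)
    {
        struct timespec nap = { 0, 100 * 1000 * 1000 };  /* 100 ms */

        while (dirty_kb() > max_dirty_kb)
            nanosleep(&nap, NULL);
    }

A big writer then calls polite_wait() every few tens of megabytes of
output, which gives writeback a chance to catch up before the dirty
pool grows to the point where the whole box stalls.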

-- greg


