[Beowulf] single machine with 500 GB of RAM

Wed Jan 9 09:19:27 PST 2013

On 01/09/2013 12:00 PM, Andrew Holway wrote:
> As its a single thread I doubt that faster memory is going to help you much. It's going to suck whatever you do.

I would think faster memory would be the only thing that could be done 
about it, presuming it's single threaded for very good reason (and not 
merely a lack of parallel programming aptitude on the developers part). 
  Some applications really cannot be parallelized, so you just have to 
grit your teeth and buy the fastest freq CPU you can find with the 
fastest memory you can find.  Nothing else can be done.

However, looking at the user manual for this application however, I 
suspect the bulk of the work can be made parallel, in contrast to the 
original post:

 From page 36, Section 2.5:

"2.5 Parallel calculation
The calculation of a property grid can be extremely time consuming task 
that can take for weeks or even months. As nowadays usually more than 
one processor is available in a computer it would be convenient to 
compute the grids in parallel. Although DGrid 4.6 itself is not ready 
yet for automatic parallelization (except for the calculation of overlap 
integrals) the script dgrid para, included in the DGrid package, 
supplies a work around for such parallel runs. The script dgrid_para 
creates control files where job computes a part of the grid (a slice). 
The operating system distributes the jobs among the processors. As the 
final step all slices are merged together into the whole grid. Similar 
procedure can be applied to the search for the critical points."

There's more on this parallel calculation stuff in the rest of Section 
2.5 Jörg, I'd suggest checking it out (or better yet, have your user 
check it out and use it).  This could work nicely for your suggestion 
about it being cheaper to buy multiple machines with smaller memory, but 
definitely check into how memory intensive the merge process at the end 
is.  The merger at the tail might wreak havoc on your setup if you don't 
have proper memory, it isn't optimized, and as a result you're 
constantly swapping.  Merges are a good example of what /should/ work 
REALLY nicely on an SSD -- buy 4 machines for the bulk of the 
computation with just 128 GB RAM each and fast processors, then do the 
merging on just one of them that is also outfitted with a ramdisk'd 0.5 
TB Fusion-IO PCI-E flash device.  If I am not wildly off the mark on the 
basic process flow for this application, such a setup should work well.

Best,

ellis