diskless clients? beowulf-newbie seeks advice

Jared Hodge jared_hodge at iat.utexas.edu
Fri Jun 22 14:42:34 PDT 2001


> I partially agree. Your thoughts on memory management? Swap over the network
> would greatly decrease performance, so in my opinion you are limited to
> non-memory-consuming computations.

	Wow, someone asking for my thoughts, how gratifying.  This may sound
a little trite, but if you're having to swap intensively for a single
application, you're already up the creek without a paddle.  As cheap as
memory is, you can load up your nodes with a few gigabytes each and
there shouldn't be much of a problem.  In fact, we've got hard disks on
our cluster, but I have swapping disabled because I want programs to
crash when they try to swap.  I know that sounds strange, but I would
rather our programmers (who are physicists, not computer scientists)
find out right away that something is wrong when they try a new code
that is overly liberal with memory, than wonder why the cluster (which
I am responsible for maintaining) is running so slowly.  That way, if
they want to work with more data than fits in physical memory, they
have to explicitly write to files and free some memory.  The extra work
really motivates them to use memory efficiently.  I'm willing to turn
swapping on for specific runs if they can justify why they need it (I'm
sure we could work it into the PBS script), but so far the issue hasn't
come up (I think it's kind of cool how I've used their ignorance to
their advantage and mine).  Swapping is best at keeping desktop PCs
running even when they are overextended.  I don't think it is
appropriate (in most circumstances) in high performance computing.
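
To make "explicitly write to files and free some memory" concrete, here
is a minimal sketch in Python.  It is not code from our cluster: the
function names, the chunk size, and the toy workload (squares of 0..n-1)
are all made up for illustration.  The idea is just that only one chunk
is ever held in memory; everything else lives in a scratch file.

```python
import array
import os
import tempfile


def spill_squares(n, chunk_size, path):
    """Write the squares of 0..n-1 to `path` as doubles, one chunk at a time.

    Hypothetical workload: stands in for any large intermediate result.
    """
    with open(path, "wb") as f:
        for start in range(0, n, chunk_size):
            stop = min(start + chunk_size, n)
            chunk = array.array("d", (float(i) * i for i in range(start, stop)))
            chunk.tofile(f)
            del chunk  # drop the buffer before building the next one


def sum_from_file(path, chunk_size):
    """Stream the doubles back in fixed-size chunks and sum them."""
    total = 0.0
    item_size = array.array("d").itemsize
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size * item_size)
            if not data:
                break
            chunk = array.array("d")
            chunk.frombytes(data)
            total += sum(chunk)
    return total


def run_demo(n=1000, chunk_size=128):
    """Spill to a temporary scratch file, read it back, and clean up."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    os.close(fd)
    try:
        spill_squares(n, chunk_size, path)
        return sum_from_file(path, chunk_size)
    finally:
        os.remove(path)
```

Peak memory here is set by chunk_size rather than by n, which is the
trade the programmer makes deliberately instead of letting the kernel
page behind their back.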

> 
> IMHO Crays are another beast. Each micro has access to the main memory via
> a high-speed bus. That's not my situation (100 Mbps) and probably not the
> situation of most Beowulfs.

	Very true, Crays are "currently" very different from clusters. 
However, as GigE- and Myrinet-class networks become more common, the
distinction becomes a little blurrier.  100 Mbps is probably not enough
to support any type of network swapping, much less for large numbers of
nodes (although latency is really more of an issue than bandwidth if
you use a clever network topology).  Also, I really think our little
Beowulf project has started to get even "The Mighty Cray's" attention,
and if they're smart they are heading toward cluster-style systems too. 
Really, I can't even tell the difference anymore between a true
"supercomputer" and a cluster.  They may have the higher price and
performance that come with non-COTS hardware, but the goal is the same:
to get parallel jobs done.
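
A back-of-envelope sketch of the latency-versus-bandwidth point, in
Python.  The numbers are assumptions, not measurements: a 4 KiB page and
a 100 microsecond round trip are placeholders I picked for illustration,
and the model ignores protocol overhead and contention entirely.

```python
def page_fetch_seconds(page_bytes, link_bits_per_s, round_trip_s):
    """Rough time to pull one page over the network.

    Modeled as one round trip plus the wire time for the payload.
    """
    return round_trip_s + page_bytes * 8 / link_bits_per_s


PAGE_BYTES = 4096       # assumed page size
ROUND_TRIP_S = 100e-6   # assumed round-trip latency, illustrative only

fast_ethernet = page_fetch_seconds(PAGE_BYTES, 100e6, ROUND_TRIP_S)
gigabit = page_fetch_seconds(PAGE_BYTES, 1e9, ROUND_TRIP_S)

# Share of each fetch that is pure latency rather than transfer time.
latency_share_100m = ROUND_TRIP_S / fast_ethernet
latency_share_1g = ROUND_TRIP_S / gigabit

print(f"100 Mbps: {fast_ethernet * 1e6:.1f} us per page")
print(f"1 Gbps:   {gigabit * 1e6:.1f} us per page")
```

Under these assumptions the link gets ten times faster but the per-page
cost drops by only about a factor of three, because the fixed round trip
starts to dominate.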
-- 
Jared Hodge
Institute for Advanced Technology
The University of Texas at Austin
3925 W. Braker Lane, Suite 400
Austin, Texas 78759

Phone: 512-232-4460
Fax: 512-471-9096
Email: Jared_Hodge at iat.utexas.edu



