[Beowulf] recommendations for a good ethernet switch for connecting ~300 compute nodes

Mark Hahn hahn at mcmaster.ca
Wed Sep 2 23:18:20 PDT 2009


> That brings me to another important question. Any hints on speccing
> the head-node?

I think you imply a single, central admin/master/head node.  this is 
a very bad idea.  first, it's generally a bad idea to have users on 
a fileserver.  next, it's best to keep cluster-infrastructure
(monitoring, management, pxe, scheduling) on a dedicated admin machine.
for 300 compute nodes, it might be a good idea to provide more than 
one login node (for editing, compilation, etc).

> Especially the kind of storage I put in on the head
> node. I need around 1 Terabyte of storage. In the past I've used
> RAID5+SAS in the server.

1 TB is, I assume you know, half a disk these days (ie, trivial).
for a 300-node cluster, I'd configure at least 10x and probably 
100x that much.  (my user community is pretty diverse, though,
and with a wide range of IO habits.)

> Mostly for running jobs that access their I/O
> via files stored centrally.

it would be wise to get some sort of estimates of the actual numbers - 
even the total size of all files accessed by a job and its average 
runtime would let you figure an average data rate.
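As a rough sketch of that estimate: divide the bytes a job touches by its runtime, then scale by the number of jobs running at once. The function name and the sample figures (20 GB per job, 2-hour runtime, 300 concurrent jobs) are made up for illustration and do not come from the original question. The result is an average only; real jobs burst well above it.

```python
def average_data_rate_mb_s(total_bytes_per_job, runtime_seconds, concurrent_jobs=1):
    """Average aggregate I/O rate in MB/s (1 MB = 10**6 bytes).

    total_bytes_per_job: total size of all files a job reads and writes
    runtime_seconds:     average wall-clock runtime of one job
    concurrent_jobs:     how many such jobs run at the same time
    """
    if runtime_seconds <= 0:
        raise ValueError("runtime_seconds must be positive")
    per_job_bytes_per_s = total_bytes_per_job / runtime_seconds
    return per_job_bytes_per_s * concurrent_jobs / 1e6


# Hypothetical workload: 20 GB per job, 2-hour runtime, 300 jobs at once.
rate = average_data_rate_mb_s(20e9, 2 * 3600, concurrent_jobs=300)
print(f"average aggregate rate: {rate:.1f} MB/s")
```

Comparing that figure against what the fileserver's disks and network link can sustain shows whether a single server is even plausible.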

> For muscle I was thinking of a Nehalem E5520 with 16 GB RAM. Should I

I don't think I'd use such a nice machine for any of fileserver, admin or
login nodes.  for admin, it's not needed.  for login it'll be unused a lot of
the time.  for fileservers, you want to sweat the IO system, not the CPU 
or memory.

> boost the RAM up? Or any other comments. It is tricky to spec the
> central node.

spec'ing a single one may be, but a single one is a bad idea...

> Or is it more advisable to go for storage-box external to the server
> for NFS-stores and then figure out a fast way of connecting it to the
> server. Fiber perhaps?

10G Ethernet (Cu or SiO2, i.e. copper or fiber; it doesn't matter) is the 
right choice for an otherwise-gigabit cluster.
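
A back-of-the-envelope way to see why the server link matters: if every node hits the fileserver at once, each gets at most an equal share of the server's link. The sketch below uses a hypothetical helper name and ignores protocol overhead and disk limits, so treat the numbers as upper bounds rather than predictions.

```python
def per_node_share_mb_s(link_gbit_s, active_nodes):
    """Upper bound on per-node throughput in MB/s when active_nodes
    share one server link of link_gbit_s (1 Gbit = 10**9 bits).
    Ignores protocol overhead and assumes an even split."""
    if active_nodes <= 0:
        raise ValueError("active_nodes must be positive")
    link_bytes_per_s = link_gbit_s * 1e9 / 8
    return link_bytes_per_s / active_nodes / 1e6


# 300 nodes all reading at once, server on gigabit vs 10G.
for link in (1, 10):
    share = per_node_share_mb_s(link, 300)
    print(f"{link:>2} Gbit/s server link: {share:.2f} MB/s per node")
```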


