[Beowulf] Need recommendation for a new 512 core Linux cluster
landman at scalableinformatics.com
Wed Nov 7 16:22:40 PST 2007
Steven Truong wrote:
> Hi, all. I would like to know for that many cores, what kind of file
> system should we go with? Currently we have a couple of clusters with
> around 100 cores and NFS seems to be ok but not great. We definitely
> need to put in place a parallel file system for this new cluster and I
> do not know which one I should go with? Lustre, GFS, PVFS2 or what
> else? Could you share your experiences regarding this aspect?
What is the nature of your IO? That is, are your jobs dominated by
large sequential reads and writes, or are the nodes effectively
reading/writing when they want (small, random-ish IO)? Are your
programs already set for parallel IO (MPI-IO), or is there a single node
that handles most of your IO requests for your jobs?
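If you are not sure which pattern dominates, one rough way to find out is to capture the (offset, size) pairs a representative job issues against a file and summarize them. The sketch below assumes you already have such a list from whatever tracing tool you prefer; the function name `summarize_io` and the sample traces are hypothetical and only for illustration.

```python
def summarize_io(requests):
    """Summarize a list of (offset, size) requests issued in order against one file.

    A request counts as sequential when it starts exactly where the
    previous request ended.
    """
    if not requests:
        return {"count": 0, "mean_size": 0.0, "sequential_fraction": 0.0}

    sequential = 0
    expected_next = None
    total_bytes = 0
    for offset, size in requests:
        if expected_next is not None and offset == expected_next:
            sequential += 1
        expected_next = offset + size
        total_bytes += size

    transitions = len(requests) - 1
    return {
        "count": len(requests),
        "mean_size": total_bytes / len(requests),
        # A single request has no transitions; treat it as sequential.
        "sequential_fraction": sequential / transitions if transitions else 1.0,
    }


# Hypothetical traces: large streaming reads vs. small scattered reads.
streaming = [(i * 1048576, 1048576) for i in range(4)]
scattered = [(0, 4096), (81920, 4096), (8192, 4096)]

print(summarize_io(streaming))
print(summarize_io(scattered))
```

A high sequential fraction with large mean request sizes points toward streaming IO; a low fraction with small requests points toward the random-ish case, which is usually the harder one for any file system.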
We have looked at GFS recently for some of our storage cluster
offerings, and while inexpensive, it appears to have some bottlenecks
which render it less than ideal for HPC cluster storage. There are some
papers on technologies to improve it:
The general contenders could be Lustre, PVFS2, and a few others. As
Lustre was just acquired by Sun, my concern would be continued Linux
support going forward.
Again, all of this depends upon your read/write patterns, and what
you want to do with it (is this scratch/temp space, or "permanent"
storage)?
> I also would like to know how many head nodes should I need to manage
> jobs and queues. And what else should I have to worry about?
This depends upon usage patterns. How critical is it that your job
scheduler stay up? How many users will be submitting jobs? Will they
do so interactively, or via web tools? Which scheduler do you plan to
deploy? Which OS?
512 cores would fit nicely in 32 quad-socket, quad-core nodes (32 x 4 x 4 = 512).
Floor space shouldn't be an issue. Heat/power could be.
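The node count is simple arithmetic, and it is worth redoing for other node configurations. A minimal sketch; the function name `nodes_needed` is hypothetical, and the dual-socket case is an added comparison rather than a recommendation.

```python
def nodes_needed(total_cores, sockets_per_node, cores_per_socket):
    """Return the number of nodes required to reach total_cores."""
    cores_per_node = sockets_per_node * cores_per_socket
    # Ceiling division, so a partial node still counts as a whole node.
    return -(-total_cores // cores_per_node)


# Quad-socket, quad-core nodes: 16 cores per node.
print(nodes_needed(512, 4, 4))  # 32

# Dual-socket, quad-core nodes for comparison: 8 cores per node.
print(nodes_needed(512, 2, 4))  # 64
```

Halving the sockets per node doubles the node count, which in turn affects rack space, switch ports, and the power and cooling budget.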
> Thank you very much for sharing any experiences.
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 866 888 3112
cell : +1 734 612 4615