[Beowulf] 512 nodes Myrinet cluster Challenges

Jaime Perea jaime at iaa.es
Wed May 3 00:21:04 PDT 2006


On Wednesday, 3 May 2006 02:54, Robert Latham wrote:
> On Fri, Apr 28, 2006 at 09:27:19AM +0200, Jaime Perea wrote:
> > From my point of view, the big problem there is the IO, we installed
> > on our small cluster the pvfs2 system, it works well, using the
> > myrinet gm for the passing mechanism, the pvfs2 is only a solution
> > for parallel IO, since mpi can use it. On the other hand it can not
> > be used for the normal user stuff, so you have to take that into
> > account and think carefully on how to install a good powerful nfs
> > server machine which has to be on an alternative standard network. On
> > the "other" architecture IBM's gpfs is really a nice alternative.
>
> Hi
>
> I'm one of the PVFS2 developers, so I'm biased, but good choice of
> parallel file system!
>
> when you say "can not be used for the normal user stuff" what do you
> mean exactly?  It's a pretty common setup on many clusters to have a
> slower home directory file system and a faster scratch file system for
> applications.
>
> Just to clarify, you can use PVFS2 for just about any workload,
> though yes, we are a lot slower for some typical "home directory"
> workloads than the other file systems out there.  PVFS2 performs best
> when coupled with an MPI-IO application, but we have a lot of users
> who just use plain old serial unix read() and write() (or some variant
> of those commands) and have satisfactory experiences.
>
> Hope you are having good success with PVFS2.  If you run into any
> problems or questions, please don't hesitate to ask on
> pvfs2-users at beowulf-underground.org
>
> ==rob

Oops, sorry, English is not my native language and I can make
mistakes :-) I liked pvfs before and I love pvfs2 now.

Well, I think the problems are the ones you mention. First, it is a
bit slower than, say, nfs or something like gfs over gnbd (for small
clusters)... in any case it is not that slow. The other is that all
the nodes acting as metadata or I/O servers have to be up, which means
that the probability of a file system failure is higher.
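
To put a rough number on that second point, here is a minimal sketch in
Python. It assumes that every server must be up for the file system to
be usable and that servers fail independently of each other; both are
simplifications, and the 1% figure in the demo is made up for
illustration, not measured on any cluster. The function name is
hypothetical.

```python
def prob_fs_unavailable(p_server_down: float, n_servers: int) -> float:
    """Probability that at least one of n_servers is down.

    Assumes independent failures and that the file system needs
    every server to be up (no redundancy between servers).
    """
    if not 0.0 <= p_server_down <= 1.0:
        raise ValueError("p_server_down must be between 0 and 1")
    if n_servers < 0:
        raise ValueError("n_servers must not be negative")
    # P(all servers up) = (1 - p)^n, so P(at least one down) is its complement.
    return 1.0 - (1.0 - p_server_down) ** n_servers


if __name__ == "__main__":
    # Illustrative value only: each server is down 1% of the time.
    for n in (1, 4, 16):
        print(n, "servers:", round(prob_fs_unavailable(0.01, n), 4))
```

Under these assumptions the risk grows with every server added, which is
the trade-off against a single nfs server.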

The advantages are many. Parallel I/O is a plus, not only for mpi
programs but also for normal tasks: if you want to convert the format
of a lot of images, you can split the work between nodes. But this is
an advantage only if your file system can handle that, which is
obviously not the case with nfs.
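
As an illustration of the image example, a minimal Python sketch of how
the list of files could be divided between nodes. The function name and
the file names are hypothetical, and the conversion step itself is left
out; each node would take its own share and read and write it on the
shared parallel file system.

```python
def split_between_nodes(files, n_nodes):
    """Return one list of files per node, assigned round-robin."""
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    # Node i gets files i, i + n_nodes, i + 2 * n_nodes, ...
    return [files[i::n_nodes] for i in range(n_nodes)]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    images = ["img_%03d.fits" % k for k in range(10)]
    for node, share in enumerate(split_between_nodes(images, 4)):
        print("node", node, "->", share)
```

Round-robin is only one possible choice; it keeps the shares within one
file of each other without knowing anything about file sizes.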

In other words, pvfs2 is free, great and useful. It works well as a
scratch area and it uses resources that are otherwise not visible
to the user. And for myrinet users it goes over gm, which is nice.

That is what I think!

Regards
-- 

           Jaime D. Perea Duarte. <jaime at iaa dot es>
             Linux registered user #10472

           Dep. Astrofisica Extragalactica.
           Instituto de Astrofisica de Andalucia (CSIC)
           Apdo. 3004, 18080 Granada, Spain. 


