[Beowulf] Re: TOE on Linux
Joe Landman landman at scalableinformatics.com
Tue May 20 12:10:26 PDT 2008
Greg Lindahl wrote:
> Joe Landman wrote:
>
>> Contrary to the detractors of the technologies comments, the
>> TOE/RDMA card *did* provide fairly significant performance delta for
>> real apps running MPI over gigabit ethernet.
>
> As a detractor of TOEs, I should point out that one data point does
> not prove that it's common that apps get a benefit.

True, however it does point out that it is possible to get better
performance.

> I'd be willing to bet that this app was doing extremely large
> transfers, and maybe even managed to get more concurrency with the

StarCD. Not big transfers; it doesn't move GB to its nodes.

> TOE... which could easily be a flaw in the MPI implementation's TCP
> driver, a pretty common thing to be wrong. For example, LAM was always

Yes, this could be possible.

> much better than MPICH over TCP, and I wouldn't be surprised if
> OpenMPI continues this superiority over MPICH-2.

There are minor issues with OpenMPI and things like Overflow, but other
than that, it does work extremely well.

> The most interesting thing, to me, is that the various people selling
> TOEs in the HPC arena publish almost no benchmarks. What's the message
> rate and N1/2? The only N1/2 I've ever seen published was 100 kbytes.

Microbenchmarks concern me less than real application wallclock
differences. Frankly, we have seen far too many microbenchmarks pushed
where real applications are avoided.

For this test, on 16 machines with 2 processors per machine, the StarCD
run was about 4x better on the TOE/RDMA Ammasso card than it was over
this exact same infrastructure without the TOE/RDMA. Every MPI
application we ran showed some similar behavior (Fluent, etc.).

As Ammasso is out of business, this is sadly nothing we could really use
these days. Mark Hahn and others pointed out that the cost-benefit
analysis (CBA) for this may not work well, and I agree. For its cost,
TOE/RDMA honestly does not look like it provides significant benefits in
HPC relative to other technologies. There may be some specific corner
cases where it does, but I think the hardware has improved, and baseline
SDR IB is competitive enough with TOE that using TOE may not make much
sense in many situations.

> (Obviously I'm not including Myricom in this bucket: they do publish
> microbenchmarks.)
>
> -- greg

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
       http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615
