[Beowulf] Correct networking solution for 16-core nodes
Joe Landman landman at scalableinformatics.com
Thu Aug 3 19:38:35 PDT 2006
Gilad Shainer wrote:
> There was a nice debate on message rate, how important is this factor
> when you want to make a decision, what are the real application needs,
> and if this is just a marketing propaganda. For sure, the message rate
> numbers that are listed on Greg's web site regarding other interconnects
> are wrong.
>
> I would take a look on the new cluster in Tokyo Institute of Technology.
> The servers there are "fat nodes" too.

Not to try to inject an application-centric view into the middle of a theoretical debate ...

We have some nice data from November 2005 directly comparing the performance of single thread per node to multi-thread per node on a system with both Myrinet and InfiniPath HTX units in there. I don't have the data in front of me, but I can recall some of the features.

For LAMMPS running on this cluster we saw a definite degradation when more than 1 thread per node ran, which we ascribed to resource contention after doing more tests. With 2 threads per node, it was a noticeable impact on all interconnects. Something in the 5-10% region. At one thread per core (4 cores in 2 sockets), we were seeing significant performance impact from this contention. We ran the same tests on an 8-core system with Myrinet, and the performance impact was more than "noticeable".

We ran the 4-thread test on 1, 2, and 4 systems, and the 4-way test using InfiniPath, Myrinet, ch_p4, and ch_shmem, all with MPICH 1.2.7. We didn't have time to investigate message size or messaging rate.

What we did find was that for this code, using this input deck, on these systems, running one thread per node gave us the best performance, and one thread per socket was already showing performance degradation. One thread per core on the dual cores was showing significant performance degradation on InfiniPath and Myrinet, but not ch_shmem. ch_p4 was simply a baseline.

Someday it would be nice to explore this again with a range of other codes.

Joe

> Gilad.
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452 or +1 866 888 3112
cell : +1 734 612 4615
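For anyone repeating this kind of comparison, the degradation figures quoted above amount to a relative slowdown against the one-thread-per-node run at the same total thread count. A minimal sketch of that arithmetic follows. The function names and the timings are hypothetical placeholders, invented for illustration; they are not the November 2005 measurements, which are not reproduced in this message.

```python
def slowdown_percent(baseline_seconds, measured_seconds):
    """Percent increase in wall-clock time relative to the baseline run."""
    if baseline_seconds <= 0:
        raise ValueError("baseline time must be positive")
    return 100.0 * (measured_seconds - baseline_seconds) / baseline_seconds


def compare_placements(timings, baseline="1 per node"):
    """Map each placement label to its slowdown versus the baseline placement."""
    base = timings[baseline]
    return {label: slowdown_percent(base, t) for label, t in timings.items()}


# Placeholder wall-clock times in seconds, invented purely for illustration.
# They are NOT measured results from the cluster discussed in this thread.
example_timings = {
    "1 per node": 100.0,
    "1 per socket": 108.0,
    "1 per core": 125.0,
}

for label, pct in compare_placements(example_timings).items():
    print(f"{label:>12}: {pct:+.1f}% vs baseline")
```

Keeping the total thread count fixed while changing only the placement is what lets the difference be attributed to contention within a node rather than to the problem decomposition.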
