[Beowulf] MPI performance gain with jumbo frames
laytonjb at charter.net
Wed Jun 13 16:30:16 PDT 2007
One of the purposes of interrupt coalescence is to reduce the
load on the CPU by ganging interrupt requests together (sorry
for all of the technical jargon there). In a multi-core situation,
do the interrupts affect all of the cores or just one core?
If the interrupts affect all of the cores, then interrupt
coalescence might be a good thing (even if the latency is much
higher). I think Doug has some benchmarks that show some
strange things when running NPB on multi-core nodes. This
might show us something about what's going on.
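One rough way to look at the per-core question on a Linux node is to
compare the per-CPU columns in /proc/interrupts for the NIC. Below is a
minimal Python sketch, assuming the NIC shows up under a name such as
"eth0" (an assumption; check your own box). The sample text and its
counts are made up for illustration and are not measurements from any
cluster.

```python
# Sum per-CPU interrupt counts for one device from /proc/interrupts-style text.
# SAMPLE is invented for illustration; on a real Linux node you would pass
# open("/proc/interrupts").read() instead.

SAMPLE = """\
           CPU0       CPU1
  0:     100000          0   IO-APIC-edge      timer
 16:      52340         12   IO-APIC-fasteoi   eth0
 17:        900        450   IO-APIC-fasteoi   eth1
NMI:          0          0
"""


def per_cpu_irq_counts(text, device):
    """Return {cpu_name: interrupt_count} for lines whose description mentions device."""
    lines = text.splitlines()
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    totals = {cpu: 0 for cpu in cpus}
    for line in lines[1:]:
        fields = line.split()
        # Layout: IRQ label, one count per CPU, then a free-form description.
        description = " ".join(fields[1 + len(cpus):])
        if device not in description:
            continue
        for cpu, count in zip(cpus, fields[1:1 + len(cpus)]):
            totals[cpu] += int(count)
    return totals


if __name__ == "__main__":
    for cpu, count in per_cpu_irq_counts(SAMPLE, "eth0").items():
        print(cpu, count)
```

If nearly all of the counts for the NIC land in one column, the
interrupts are being serviced by a single core; sampling twice and
differencing gives a rate rather than a running total.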
I personally like the concept that Level 5 Networks used in conjunction
with their GigE cards - user space drivers. From what I remember the
Level 5 driver runs a TCP stack on each core as well as the driver.
Then when the core needs to communicate it sends the data directly
to the card (I think) bypassing the kernel. The cool thing though is that the
TCP stack is running on the specific core that is communicating. So the
TCP stack makes use of the fact that the core is idle while the data is
being sent so why not use it for TCP processing? You now have the
fastest TCP processor in the box (much faster than the TOE ASICs).
I think that's a really interesting concept.
> So this begs the question, if we are "core rich and packet small"
> do we care about packet size and overhead? In other words if we have
> plenty of cores when do we not care about communication
> overhead. Most GigE drivers have various interrupt coalescence
> strategies and of course Jumbo Frames to lessen the processor
> load, but if we have multi-core do we need to care about this
> as much ... any thoughts?
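To put a rough number on the processor load in question, here is a
back-of-the-envelope Python sketch of how many full-size frames per
second a saturated GigE link carries at a 1500 byte MTU versus a 9000
byte MTU. It assumes standard Ethernet framing (38 bytes of preamble,
MAC header, FCS and inter-frame gap per frame, no VLAN tag) and treats
one frame as one potential interrupt when coalescence is off. That is a
worst case, not a measurement.

```python
# Worst-case frame rate on a saturated link for a given MTU.
# Per-frame wire overhead, assuming untagged standard Ethernet:
# 8 (preamble + SFD) + 14 (MAC header) + 4 (FCS) + 12 (inter-frame gap) bytes.
WIRE_OVERHEAD = 8 + 14 + 4 + 12


def frames_per_second(mtu, link_bps=1_000_000_000):
    """Full-size frames per second at line rate for the given MTU in bytes."""
    bytes_per_second = link_bps / 8
    return bytes_per_second / (mtu + WIRE_OVERHEAD)


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(mtu, round(frames_per_second(mtu)))
```

That works out to roughly 81,000 frames per second at 1500 bytes against
roughly 14,000 at 9000 bytes, so Jumbo Frames cut the per-frame work by
about a factor of six before interrupt coalescence is even considered.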
> > Doug and Jeff have good points (and some good links). One thing to
> > also pay attention to is the CPU utilization during the bandwidth and
> > application testing. We found that on our cluster (various Dells
> > with built in GigE NICs) while we did not see huge differences in
> > effective bandwidth, the CPU overhead was notably less when using
> > Jumbo Frames.
> > Again, YMMV.
> > Good luck,
> > -bill
> > On Jun 11, 2007, at 11:57 AM, Jeffrey B. Layton wrote:
> >> Doug brings up some good points. If you want to try Jumbo
> >> Frames to improve MPI performance you might have to
> >> tweak the TCP buffers as well. There are some links around
> >> the web on this. Sometimes it helps performance, sometimes
> >> it doesn't. Your mileage may vary.
> >> Jeff
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit
> > http://www.beowulf.org/mailman/listinfo/beowulf