[Beowulf] raw ethernet

Keith D. Underwood kdunder at sandia.gov
Sat Jul 24 06:41:00 PDT 2004


> are connected by a point-to-point gigabit link). I'm interested in
> understanding why I have packet loss (interrupt management, full rx ring,
> rx buffer overflow, or all of them combined?) Now I'm trying to modulate the

You lose packets because you can.  That sounds sarcastic, but it is the
very unfortunate reality.  I have a bit of low-level GigE experience,
and I am pretty sure that is the primary reason.

For example, say you have a high end GigE switch that guarantees sub-5
microsecond LIFO latency.  Now, assume you have enabled flow control
(yes, flow control was defined in the GigE spec) and that you know the
cards have a decent amount of buffer.  Next, simultaneously send 35
packets to one destination with origins distributed evenly among 7
sources.  Want to know what will happen?  Most of those packets will get
dropped.  Why?  The switch should have enough buffer to handle that.
Well... as best I could tell, the answer is that Ethernet can drop
packets (it is defined as an unreliable protocol), and the switch
recognized a momentary flood that would have prevented it from meeting
its latency guarantees, so it started dropping packets.  Because the
protocol is inherently unreliable, "drop" is a valid design decision at
any point along the path.
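
To put rough numbers on that example, here is a toy model in Python.  It
is only a sketch under stated assumptions, not a description of how any
real switch works: it assumes full-size frames (1538 bytes on the wire,
counting preamble and inter-frame gap), that all frames arrive at the
output port at the same instant, and a hypothetical drop rule where a
frame is discarded if the backlog ahead of it already exceeds the
latency bound.  The names (simulate_burst and so on) are made up for
illustration.

```python
# Toy model: a burst of frames arriving at once at one gigabit output port.
LINK_BPS = 1_000_000_000   # gigabit Ethernet line rate
WIRE_BYTES = 1538          # assumed: 1500 payload + 18 header/FCS + 8 preamble + 12 gap
LATENCY_BOUND_US = 5.0     # the "sub-5 microsecond" figure from the example


def frame_time_us(wire_bytes=WIRE_BYTES, link_bps=LINK_BPS):
    """Time to serialize one frame onto the output link, in microseconds."""
    return wire_bytes * 8 / link_bps * 1e6


def worst_case_wait_us(n_frames):
    """Queueing delay seen by the last frame if nothing is ever dropped."""
    return (n_frames - 1) * frame_time_us()


def simulate_burst(n_frames, bound_us=LATENCY_BOUND_US):
    """Return (forwarded, dropped) under the hypothetical drop rule."""
    forwarded = dropped = 0
    backlog_us = 0.0  # time needed to drain frames already accepted
    for _ in range(n_frames):
        if backlog_us > bound_us:
            dropped += 1          # forwarding would break the latency bound
        else:
            forwarded += 1
            backlog_us += frame_time_us()
    return forwarded, dropped


if __name__ == "__main__":
    print("frame time (us):", frame_time_us())
    print("last-frame wait without drops (us):", worst_case_wait_us(35))
    print("forwarded, dropped:", simulate_burst(35))
```

Under these assumptions a single full-size frame takes about 12.3
microseconds to serialize, so the 35th frame would sit behind roughly
418 microseconds of backlog.  No amount of buffer fixes that: the switch
can hold the frames, but it cannot hold them and stay under 5
microseconds, which is consistent with most of the burst being dropped.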

					Keith




