[Beowulf] tcp error: Need ideas!


Nifty Tom Mitchell niftyompi at niftyegg.com
Sat Jan 24 11:03:18 PST 2009


On Sat, Jan 24, 2009 at 09:36:09AM -0600, Gerry Creager wrote:
> 
> Couple of follow-up notes.
>
> MTU=4500:  Had one node fall over with the same overflow errors.
> MTU=3000:  A WRF model is running, but single timesteps are executing  
> 2.5x slower than MTU=1500
>
> I'll go snag the new driver and compile it.  After all: What can it hurt!
>
> Thanks, Guy!
>
> Regards, Gerry
>
> Guy Coates wrote:
>> Hi,
>>
>> We have also seen problems with the bnx2 drivers.
>>
>> I got a more recent set of bnx2 drivers from Broadcom:
>>
......

Has the traffic been snooped to see whether all
is as expected?

If you are seeing the standard MTU (1500) run faster than a jumbo MTU,
then something on the path is fragmenting the data.

If MTU=4500 causes overflow errors, that might be related to fragmentation.
Both the sender and the receiver have to hold all the bits of a reliable
transfer until the data has been acknowledged.   At one time fragmentation
could only be done once, to a minimum MTU, in the life of a packet.
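
To put rough numbers on the fragmentation idea, here is a small sketch of
the arithmetic. It assumes IPv4 with a 20-byte header and no options, and
it assumes (purely for illustration, nothing in this thread confirms it)
that some hop on the path only passes 1500 bytes. The function name
fragment_count is made up for this example.

```python
import math

IP_HEADER = 20  # bytes; IPv4 header without options (assumption)


def fragment_count(packet_size: int, path_mtu: int) -> int:
    """Number of IPv4 fragments needed to carry one packet of packet_size
    bytes (header included) across a hop whose MTU is path_mtu."""
    if packet_size <= path_mtu:
        return 1
    payload = packet_size - IP_HEADER
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_fragment = (path_mtu - IP_HEADER) // 8 * 8
    return math.ceil(payload / per_fragment)


# Hypothetical case: full-size packets at each interface MTU crossing a 1500-byte hop.
for mtu in (1500, 3000, 4500):
    print(mtu, fragment_count(mtu, 1500))
```

Under those assumptions a full 3000-byte packet becomes 3 fragments and a
full 4500-byte packet becomes 4, and each set has to be held and
reassembled on the far side before the data is usable.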

In addition to snooping packets, try "tracepath" to and from all
the involved boxes to discover what is going on.
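
If there are many boxes, the tracepath check can be scripted. The sketch
below assumes the iputils tracepath, whose last line normally looks like
"Resume: pmtu 1500 hops 2 back 2"; if your version prints something else
the regular expression will need adjusting. The helper names parse_pmtu
and path_mtu are made up for this example.

```python
import re
import subprocess


def parse_pmtu(tracepath_output):
    """Return the path MTU from tracepath's final 'Resume:' line, or None."""
    match = re.search(r"Resume:\s+pmtu\s+(\d+)", tracepath_output)
    return int(match.group(1)) if match else None


def path_mtu(host):
    """Run 'tracepath -n host' and return the path MTU it reports, or None."""
    result = subprocess.run(
        ["tracepath", "-n", host],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_pmtu(result.stdout)
```

Calling path_mtu("node01") from each box (node01 is a placeholder host
name) and comparing the result with the interface MTU should show whether
some hop is passing less than the configured jumbo size.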


-- 
	Regards,
	T o m   M i t c h e l l



