[Beowulf] tcp error: Need ideas!
Paulo Afonso Lopes
pal at di.fct.unl.pt
Sat Jan 24 10:33:05 PST 2009
I have tried to use jumbo frames with the tg3 driver on Scientific Linux 5.0
(both the original 2.6.18-8.1.3.el5 and then 2.6.18-8.1.15.el5)
- netperf shows a sizeable decrease in CPU usage (mainly in softirq) and high
bandwidths (above 100MB/s)
- A LAM/MPI application hangs (it works fine with MTU 1500 and over IB)
- NFS hangs
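
For reference, a rough sketch of why a larger MTU lowers the softirq load: the number of frames per second the host must handle drops sharply, while the usable payload rate rises only slightly. The figures assume Gigabit Ethernet at line rate, IPv4 + TCP without options, and no VLAN tag. The MTU of 9000 is an assumed typical jumbo size, not a value reported in this thread, and the function names are illustrative.

```python
# Per-frame overhead on the wire, in bytes (standard Ethernet, no VLAN tag):
# preamble + SFD (8), MAC header (14), FCS (4), inter-frame gap (12).
ETH_OVERHEAD = 8 + 14 + 4 + 12

# IPv4 header + TCP header, both without options (assumption).
IP_TCP_HEADERS = 20 + 20

# Gigabit Ethernet line rate in bytes per second (assumption: 1 Gb/s link).
LINK_BYTES_PER_SEC = 1_000_000_000 / 8


def frames_per_second(mtu):
    """Full-size frames per second at line rate for a given MTU."""
    return LINK_BYTES_PER_SEC / (mtu + ETH_OVERHEAD)


def payload_mb_per_second(mtu):
    """TCP payload rate in MB/s (1 MB = 10**6 bytes) at line rate."""
    return frames_per_second(mtu) * (mtu - IP_TCP_HEADERS) / 1e6


if __name__ == "__main__":
    for mtu in (1500, 3000, 4500, 9000):
        print(f"MTU {mtu}: {frames_per_second(mtu):8.0f} frames/s, "
              f"{payload_mb_per_second(mtu):6.1f} MB/s payload")
```

This is only the theoretical ceiling. It says nothing about why a particular driver hangs or overflows with jumbo frames enabled.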
BTW, there is a string of NFS write problems with these RH-based releases:
2.6.18-8.1.15.el5: writes at 2MB/s (SL 5.0, updated kernel)
2.6.18-92.el5: writes at 2MB/s (CentOS 5.2)
2.6.18-92.1.18.el5: writes at 30MB/s (CentOS 5.2, updated kernel)
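
A minimal sketch of one way to get comparable write figures across kernels: time a sequential write, including the final fsync, into a directory on the NFS mount. This is not necessarily how the numbers above were obtained. The function name, file size, and block size are illustrative assumptions, and MB here means 10**6 bytes.

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory, total_mb=64, block_kb=1024):
    """Write total_mb MiB of zeros into `directory` and return MB/s.

    The timing includes flush + fsync, so data still sitting in the
    client's page cache does not inflate the result.
    """
    block = b"\0" * (block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.monotonic()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())
        # Guard against a zero interval on very coarse clocks.
        elapsed = max(time.monotonic() - start, 1e-9)
    finally:
        os.remove(path)
    return blocks * len(block) / elapsed / 1e6


if __name__ == "__main__":
    # Replace the directory with a path on the NFS mount under test.
    print(f"{write_throughput_mb_s(tempfile.gettempdir()):.1f} MB/s")
```

Running it several times and against a local disk as a baseline helps separate NFS client behaviour from disk or network limits.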
What's going on in the "upstream vendor"?!
> Couple of follow-up notes.
> MTU=4500: Had one node fall over with the same overflow errors.
> MTU=3000: A WRF model is running, but single timesteps are executing
> 2.5x slower than MTU=1500
> I'll go snag the new driver and compile it. After all: What can it hurt!
> Thanks, Guy!
> Regards, Gerry
> Guy Coates wrote:
>> We have also seen problems with the bnx2 drivers.
>> I got a more recent set of bnx2 drivers from Broadcom:
>> bnx2-1.8.2b (1Gig E cards)
>> bnx2x-1.46.12 (10 Gig E cards)
>> you can grab a copy from
>> You should be able to build these drivers against kernel >= 2.6.9 so long
>> as you have the appropriate kernel headers.
> Gerry Creager -- gerry.creager at tamu.edu
> Texas Mesonet -- AATLT, Texas A&M University
> Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983
> Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843
Paulo Afonso Lopes | Tel: +351- 21 294 8536
Departamento de Informática | 294 8300 ext.10763
Faculdade de Ciências e Tecnologia | Fax: +351- 21 294 8541
Universidade Nova de Lisboa | e-mail: pal at di.fct.unl.pt
2829-516 Caparica, PORTUGAL