[Beowulf] tcp error: Need ideas!

Gerry Creager gerry.creager at tamu.edu
Fri Jan 23 05:49:23 PST 2009


First, thanks to all who've responded.  I've been looking a bit this 
morning and am trying to grok the results.

Joe Landman wrote:
> Hi Gerry
> 
> Gerry Creager wrote:
>> History/background/description of the cluster
>> * 126 node Dell 1950 cluster with dual-quad core Xeons
>> * HP 5412zl switch for gigabit cluster backplane and 10GBE 
>> interconnect to selected services (file server, etc)
>> * Gigabit interconnect
>> * Hand compiled 2.6.26 kernel
>> * bnx2 module loaded for the Broadcom onboard nics
>> * Switch, compute nodes, head node set to 9000 byte MTU
> 
> We have had *lots* of problems with Broadcom nics and jumbo frames. From 
> 2.6.9 timeframe onwards.

Marvelous.  I'd prefer to not have to back-rev if I can avoid it...

>>
>> We're seeing the following error in WRF compiled with openMPI and the 
>> PGI 7.2 compiler:
>> mca_btl_tcp_frag_send:writev failed with errno=104
>>
>> While all nodes were accessible prior to the run and returned 
>> appropriate "stuff" when queried with, eg., ssh and a command, two 
>> nodes now return something like this:
>> [gerry at brazos SCOOP12km]$ ssh c0522
>> Received disconnect from 192.168.200.154: 2: Bad packet length 808464432.
> 
> Hmmm... sounds like a link tried re-negotiating.  Can you get on via 
> serial/console and

My guess is that the driver wandered across memory boundaries.  This 
stinks of a buffer problem to me.  Typically, after this happens, I 
can't log into the node via any interface, nor on the console.  It 
requires an IPMI or physical reboot.
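
A side note on the numbers in the two error messages.  SSH frames each 
packet with a 4-byte big-endian length, so the "Bad packet length" value 
can be turned back into the raw bytes that were read in that position. 
The sketch below does that, and also looks up errno 104 by name.  The 
helper name decode_ssh_length is made up for illustration, and the errno 
lookup assumes a Linux host, where 104 is ECONNRESET.

```python
import errno
import os


def decode_ssh_length(value: int) -> bytes:
    """Return the four big-endian bytes that were read as a packet length."""
    return value.to_bytes(4, "big")


if __name__ == "__main__":
    # Value taken from the "Bad packet length" message quoted above.
    print(decode_ssh_length(808464432))
    # errno numbering is platform-specific; on Linux 104 is ECONNRESET.
    print(errno.errorcode.get(104), "-", os.strerror(104))
```

The length decodes to four ASCII "0" characters (0x30 each), so text-like 
bytes arrived where a binary header was expected.  That fits a corrupted 
or misaligned stream, but by itself it does not say which layer did the 
damage.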

> root at lightning:~# ethtool eth0

-bash-3.2# ethtool eth1
Settings for eth1:
         Supported ports: [ TP ]
         Supported link modes:   10baseT/Half 10baseT/Full
                                 100baseT/Half 100baseT/Full
                                 1000baseT/Full
         Supports auto-negotiation: Yes
         Advertised link modes:  10baseT/Half 10baseT/Full
                                 100baseT/Half 100baseT/Full
                                 1000baseT/Full
         Advertised auto-negotiation: Yes
         Speed: 1000Mb/s
         Duplex: Full
         Port: Twisted Pair
         PHYAD: 1
         Transceiver: internal
         Auto-negotiation: on
         Supports Wake-on: g
         Wake-on: d
         Link detected: yes

> You might want to
> 
>     ethtool eth0 autoneg off
> 
> to force it not to renegotiate its speed.  Also, look at

-bash-3.2# ethtool -A eth1 autoneg off
autoneg unmodified, ignoring
no pause parameters changed, aborting
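
For reference, ethtool has two different "autoneg" settings: -A changes 
pause-frame (flow control) parameters, while -s changes the link 
settings (speed, duplex, link autonegotiation).  The "no pause 
parameters changed" reply above would be consistent with -A having 
touched only the pause side.  The sketch below just builds the two 
command lines without running them; the function names are hypothetical, 
and eth1 is the interface from this thread.

```python
def link_autoneg_cmd(dev, enabled):
    """argv for link auto-negotiation (ethtool -s), not pause frames."""
    return ["ethtool", "-s", dev, "autoneg", "on" if enabled else "off"]


def pause_autoneg_cmd(dev, enabled):
    """argv for pause-frame auto-negotiation (ethtool -A)."""
    return ["ethtool", "-A", dev, "autoneg", "on" if enabled else "off"]


if __name__ == "__main__":
    # Dry run only: print the commands instead of executing them.
    print(" ".join(link_autoneg_cmd("eth1", False)))
    print(" ".join(pause_autoneg_cmd("eth1", False)))
```

Note that 1000BASE-T normally relies on autonegotiation, so a driver may 
refuse or ignore a request to turn it off at gigabit speed.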

> root at lightning:~# ethtool -g eth0

-bash-3.2# ethtool -g eth1
Ring parameters for eth1:
Pre-set maximums:
RX:             1020
RX Mini:        0
RX Jumbo:       4080
TX:             255
Current hardware settings:
RX:             255
RX Mini:        0
RX Jumbo:       765
TX:             255

> See if you can do something like
> 
>     ethtool  -G eth0 rx-jumbo 100
> 
> if you have zero jumbo ring rx entries.

Doesn't look like this requires much change.
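
To put numbers on that, here is a small sketch that parses the 
ethtool -g text above and reports each ring's current size against its 
pre-set maximum.  The function names are illustrative, and the parser 
assumes the two-section layout shown in this output; other ethtool 
versions may print it differently.

```python
SAMPLE = """\
Ring parameters for eth1:
Pre-set maximums:
RX:             1020
RX Mini:        0
RX Jumbo:       4080
TX:             255
Current hardware settings:
RX:             255
RX Mini:        0
RX Jumbo:       765
TX:             255
"""


def parse_ring_params(text):
    """Parse `ethtool -g` output into {"max": {...}, "current": {...}}."""
    rings = {"max": {}, "current": {}}
    section = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Pre-set maximums"):
            section = "max"
        elif line.startswith("Current hardware settings"):
            section = "current"
        elif section and ":" in line:
            name, _, value = line.partition(":")
            value = value.strip()
            if value.isdigit():
                rings[section][name.strip()] = int(value)
    return rings


def ring_usage(rings):
    """Return {ring name: (current, maximum)} for every ring reported."""
    return {name: (cur, rings["max"].get(name))
            for name, cur in rings["current"].items()}


if __name__ == "__main__":
    for name, (cur, top) in ring_usage(parse_ring_params(SAMPLE)).items():
        print(f"{name}: {cur} of {top}")
```

On this node the jumbo RX ring is non-zero (765 of 4080), so the "zero 
jumbo ring entries" case does not apply; RX and RX Jumbo both have room 
to grow if ring exhaustion is ever suspected.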

Also, while I'm in the neighborhood, to respond to Mark's suggestions:

-bash-3.2# ethtool -k eth1
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off

Hmmm.  Might be worth changing TCP segmentation offload here.
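
A sketch of how that could be scripted across nodes: parse the 
ethtool -k text, then print (not run) the ethtool -K command for any 
listed offload that is currently on.  The helper names are hypothetical; 
the short keywords in SHORT_NAMES are the classic ethtool -K ones, and 
the parser assumes the "name: on/off" layout shown above.

```python
SAMPLE = """\
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off
"""

# Long names printed by `ethtool -k` mapped to `ethtool -K` keywords.
SHORT_NAMES = {
    "rx-checksumming": "rx",
    "tx-checksumming": "tx",
    "scatter-gather": "sg",
    "tcp segmentation offload": "tso",
    "udp fragmentation offload": "ufo",
    "generic segmentation offload": "gso",
}


def parse_offloads(text):
    """Parse `ethtool -k` output into {feature name: True/False}."""
    features = {}
    for line in text.splitlines():
        name, sep, state = line.partition(":")
        state = state.strip()
        if sep and state in ("on", "off"):
            features[name.strip()] = (state == "on")
    return features


def disable_commands(features, dev, names):
    """Build commands that turn off the named offloads that are enabled."""
    return [f"ethtool -K {dev} {SHORT_NAMES[name]} off"
            for name in names if features.get(name)]


if __name__ == "__main__":
    feats = parse_offloads(SAMPLE)
    for cmd in disable_commands(feats, "eth1", ["tcp segmentation offload"]):
        print(cmd)  # dry run: review before executing as root
```

Turning TSO off is only a diagnostic step here; whether it helps with 
the bnx2 and jumbo frame combination is untested in this thread.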

-bash-3.2# ethtool -S eth1
NIC statistics:
      rx_bytes: 43454
      rx_error_bytes: 0
      tx_bytes: 51103
      tx_error_bytes: 0
      rx_ucast_packets: 231
      rx_mcast_packets: 0
      rx_bcast_packets: 329
      tx_ucast_packets: 250
      tx_mcast_packets: 0
      tx_bcast_packets: 4
      tx_mac_errors: 0
      tx_carrier_errors: 0
      rx_crc_errors: 0
      rx_align_errors: 0
      tx_single_collisions: 0
      tx_multi_collisions: 0
      tx_deferred: 0
      tx_excess_collisions: 0
      tx_late_collisions: 0
      tx_total_collisions: 0
      rx_fragments: 0
      rx_jabbers: 0
      rx_undersize_packets: 0
      rx_oversize_packets: 0
      rx_64_byte_packets: 365
      rx_65_to_127_byte_packets: 166
      rx_128_to_255_byte_packets: 20
      rx_256_to_511_byte_packets: 7
      rx_512_to_1023_byte_packets: 1
      rx_1024_to_1522_byte_packets: 1
      rx_1523_to_9022_byte_packets: 0
      tx_64_byte_packets: 42
      tx_65_to_127_byte_packets: 84
      tx_128_to_255_byte_packets: 31
      tx_256_to_511_byte_packets: 97
      tx_512_to_1023_byte_packets: 0
      tx_1024_to_1522_byte_packets: 0
      tx_1523_to_9022_byte_packets: 0
      rx_xon_frames: 0
      rx_xoff_frames: 0
      tx_xon_frames: 0
      tx_xoff_frames: 0
      rx_mac_ctrl_frames: 0
      rx_filtered_packets: 60
      rx_discards: 0
      rx_fw_discards: 0
-bash-3.2# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:1E:C9:AC:27:FB
           inet addr:192.168.200.154  Bcast:192.168.203.255  Mask:255.255.252.0
           UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1
           RX packets:574 errors:0 dropped:0 overruns:0 frame:0
           TX packets:265 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:44422 (43.3 KiB)  TX bytes:54606 (53.3 KiB)
           Interrupt:16 Memory:f4000000-f4012100
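
The counters above are easier to scan mechanically.  The sketch below 
parses ethtool -S text, lists any non-zero error-type counters, and sums 
the 1523-to-9022-byte buckets to see whether jumbo-sized frames have been 
counted at all.  Only a subset of the counters is reproduced in SAMPLE, 
the function names are illustrative, and the keyword list used to spot 
"error-type" counters is an assumption based on the bnx2 names shown 
here.

```python
SAMPLE = """\
NIC statistics:
     rx_error_bytes: 0
     tx_error_bytes: 0
     tx_mac_errors: 0
     rx_crc_errors: 0
     rx_oversize_packets: 0
     rx_1024_to_1522_byte_packets: 1
     rx_1523_to_9022_byte_packets: 0
     tx_1523_to_9022_byte_packets: 0
     rx_discards: 0
     rx_fw_discards: 0
"""

ERROR_WORDS = ("error", "discards", "collisions", "fragments",
               "jabbers", "undersize", "oversize")


def parse_nic_stats(text):
    """Parse `ethtool -S` output into {counter name: integer value}."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def nonzero_errors(stats):
    """Return the error-type counters whose value is not zero."""
    return {k: v for k, v in stats.items()
            if v and any(word in k for word in ERROR_WORDS)}


def jumbo_frames_seen(stats):
    """Sum RX and TX frames counted in the 1523-9022 byte buckets."""
    return (stats.get("rx_1523_to_9022_byte_packets", 0)
            + stats.get("tx_1523_to_9022_byte_packets", 0))


if __name__ == "__main__":
    stats = parse_nic_stats(SAMPLE)
    print("non-zero error counters:", nonzero_errors(stats))
    print("jumbo-sized frames counted:", jumbo_frames_seen(stats))
```

With only a few hundred packets on the interface and nothing in the jumbo 
buckets, these counters probably reflect the state after a reboot rather 
than the failing run, so a clean result here does not rule the NIC out.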



>> I'm stumped and looking for causes and solutions.  Yeah, the WRF as 
>> compiled did run before the change to Jumbos.
>>
>> Do I reduce the size of the frames to something smaller, like 8800 
>> bytes? 7500?  1500?
> 
> In the past I had heard that jumbo frames may work on Broadcom NICs 
> around 6000 byte length.  We haven't tried this in a while ... YMMV.
> 
>>
>> I'm not completely out of ideas but stumped.
>>
>> Thanks, gerry
> 
> 

-- 
Gerry Creager -- gerry.creager at tamu.edu
Texas Mesonet -- AATLT, Texas A&M University	
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983
Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843


