problems with 3com and intel 100MB cards
jim at ks.uiuc.edu
Wed Oct 9 09:13:38 PDT 2002
Intel and 3com cards have good performance because they do all of the
checksums themselves, rather than making the driver use the main CPU.
This means that the data needs to survive its trip across the PCI bus
without errors: the card computes (or verifies) the checksum on its own
side of the bus, so anything corrupted between main memory and the card
is never checked. If the CPU does the checksum, then (most) PCI bus
errors are caught just as if they were a normal wire error.
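To make the difference concrete, here is a rough sketch in Python. It
uses the standard 16-bit Internet checksum (RFC 1071) and simulates a
PCI error as a single flipped bit. The payload and the flip_bit helper
are made up for illustration; this is not how the 3c59x or eepro100
drivers are written, only a model of where the checksum gets computed
relative to the bus.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by IP/TCP/UDP (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def flip_bit(data: bytes, bit: int) -> bytes:
    """Hypothetical stand-in for a single-bit error on the PCI bus."""
    buf = bytearray(data)
    buf[bit // 8] ^= 1 << (bit % 8)
    return bytes(buf)


# Made-up payload, purely for illustration.
payload = b"example payload crossing the PCI bus"

# Case 1: the host CPU checksums the data, then the bus corrupts it.
# The checksum no longer matches, so the receiver drops the packet
# and TCP retransmits, exactly as for an error on the wire.
cpu_sum = internet_checksum(payload)
after_bus = flip_bit(payload, 5)
software_detects = internet_checksum(after_bus) != cpu_sum

# Case 2: the bus corrupts the data first, then the card checksums
# whatever it received. The checksum is valid for the bad data, so
# nobody downstream can tell anything went wrong.
on_card = flip_bit(payload, 5)
nic_sum = internet_checksum(on_card)
offload_detects = internet_checksum(on_card) != nic_sum

print("CPU checksum catches the bus error:", software_detects)
print("Card checksum catches the bus error:", offload_detects)
```

In the second case the application receives wrong data with no error
reported anywhere, which fits a program dying with nothing in the logs.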
My advice is to check your BIOS settings. Set any "performance" options
down to the most conservative, particularly anything related to the PCI
bus. This may fix your problems, or at least reduce their frequency.
On 9 Oct 2002, Marcin Kaczmarski wrote:
> I am trying to run materials science calculations using parallel code
> with the LAM library. I use a cluster of 6 Athlons connected with Fast
> Ethernet cards. I use 3Com 3c905C-TX cards with dual channel bonding. I
> use Red Hat Linux with a 2.4.x kernel and the 3c59x driver. Everything
> works quite well, but unfortunately my program (CPMD) simply dies
> without any information in the logs (only a confirmation from LAM that
> the process died, and that's all). I also tried Intel eepro100 cards
> without channel bonding, and the results are the same. I discussed this
> problem with someone from Germany who administers a 60-PC cluster and
> also uses the same program, CPMD. He told me that 3Com and Intel cards
> are very unreliable and cannot bear extremely high network load. So are
> these cards only fit for the wastebasket? Do you know how to solve this
> problem?
> I know about the SCI Dolphinics cards, but they cost $1500 each. I also
> heard from the same person that the only reliable 100 Mbit cards were
> those with the DEC Tulip chip. However, they seem to be off the market,
> especially after Intel bought the Tulip chipset and did something wrong
> with it. I've heard that many clusters run SMC EtherPower cards (with
> the Tulip chipset), but I checked and SMC currently has no card with
> Tulip on the market.
> I was also thinking about gigabit, but first, my channel bonding gives
> me almost gigabit performance; second, I use motherboards with the VIA
> KT266A chipset, which I have heard is a very bad chipset, and I don't
> know if it can really handle a gigabit card properly; third, SCI
> Dolphinics cards are attractively priced compared with reliable gigabit
> switches, but still very expensive.
> I have also heard about Linux Alpha clusters that fail to operate when
> running with 3Com cards.
> Please give me some hints on how to rebuild the cluster to make it
> usable. Although I use Epox boards with the VIA KT266 chipset, I have
> confirmation that the problem is primarily related to these network
> cards.
> Kind regards
> Marcin Kaczmarski
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf