[Beowulf] RX-polling in sk98lin driver
Mario Storti mstorti at intec.unl.edu.ar
Wed Feb 15 10:08:18 PST 2006
- Previous message: [Beowulf] Nas parallel benchmarks issue
- Next message: [Beowulf] What did I neglect to add? Specing hardware and software and support for a 16 node beowulf
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi all,

In January we posted a message about very low performance that we detected on a home-made cluster of P4 machines (D915PGN motherboard) with 3c2000t NIC cards.

http://thread.gmane.org/gmane.comp.clustering.beowulf.general/14343

In brief, we found that for very small packets the time spent sending the messages was too high. Please note that this is not related to latency, which is on the order of 50 usec; rather, for every 30 packets or so, the round trip takes about 0.1 sec. This reduces the effective bandwidth so much that our applications (mainly a self-written Finite Element code for fluids (CFD), http://www.cimec.org.ar/petscfem) run slower than on a Fast Ethernet network.

The cluster was set up with the WareWulf diskless package on top of Fedora Core 3 (Kernel 2.6.15), which comes with the `sk98lin' driver for the 3c2000t NIC card. The code used MPICH-1.2.6 and PETSC-2.1.6 (http://www.mcs.anl.gov/petsc).

At that time we mainly suspected the TCP layer of the Linux kernel, since the symptoms were very similar to those reported at http://www.icase.edu/coral/LinuxTCP.html

After trying a lot of things, we found that upgrading to MPICH2 (version mpich2-1.0.3) almost fixed the problem.

But recently we found a similar fault. When solving large linear systems in parallel with PETSc, the code was very slow when using the GMRES method. Again, after trying a lot of things, including tuning the TCP kernel parameters, we found that upgrading the NIC driver to sk98lin 8-23 from the 3COM site (or also 8-30 from www.syskonnect.com) and disabling the RX-polling option eliminates the fault.

However, with a kernel built this way (RX polling disabled) the nodes hang randomly, at a rate of about 2 hung nodes out of 12 per day. We tried several things.

* The same kernel with both versions of the driver (8-23 and 8-30) and RX polling enabled is stable (but slow).

* Using a 2.4 kernel instead of the 2.6 doesn't change things.

* We tried to disable NFS by building a large VNFS ramdisk with all the files needed, but couldn't perform the experiment properly, so we are unable to say whether the fault is related to NFS or not. (Note: the nodes are diskless, but NFS traffic is reduced by loading most files (almost all except for /usr..) in a ramdisk.)

* Nodes hang regardless of load. They may hang even when they are idle.

RX polling seems to be an option present in many drivers and (if I understand correctly) tries to gather incoming packets into larger ones. The help that we obtain in `$ make menuconfig' about the RX polling option is the following.

> Use Rx polling (NAPI)
> CONFIG_SK98LIN_NAPI:
> NAPI is a new driver API designed to reduce CPU and interrupt load
> when the driver is receiving lots of packets from the card.

Any hints are welcome, TIA,

Mario

--
-------------------------
Mario Alberto Storti [cel. +54-342-156144983]
CIMEC (INTEC/CONICET-UNL), Guemes 3450 - 3000 Santa Fe, Argentina
Tel: +54-342-4511594 (ext 1015), Tel/Fax: +54-342-4511169
e-mail: mstorti at intec dot unl dot edu dot ar
http://www.cimec.org.ar/mstorti
-------------------------
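
For a rough sense of why one slow round trip in every 30 hurts so much, the figures quoted in the message can be put into a back-of-the-envelope calculation. This is only a sketch: the function name is made up for illustration, and treating the ~50 usec latency as the normal round-trip time is an assumption, not a measurement from the cluster.

```python
def mean_round_trip(fast_s, slow_s, period):
    """Average round-trip time when one exchange out of every `period` stalls."""
    if period < 1:
        raise ValueError("period must be at least 1")
    return ((period - 1) * fast_s + slow_s) / period


# Figures taken from the message. Using the ~50 usec latency as the normal
# round-trip time is an assumption made for this illustration.
FAST_S = 50e-6   # normal exchange, seconds
SLOW_S = 0.1     # stalled exchange, seconds
PERIOD = 30      # roughly one stall every 30 packets

mean_s = mean_round_trip(FAST_S, SLOW_S, PERIOD)
print(f"mean round trip: {mean_s * 1e3:.2f} ms")
print(f"slowdown versus no stalls: {mean_s / FAST_S:.0f}x")
```

Under these assumptions the average round trip comes out at about 3.4 ms, roughly 68 times the stall-free value, which is consistent with small-message codes running slower than on Fast Ethernet.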
