newbie: 16-node 500Mbps design
Josip Loncaric josip at icase.edu
Mon Aug 28 16:00:31 PDT 2000
Mark Hahn wrote:
>
> no. Josip's (fine) works is a specific tuning for small-packet performance;
> it violates the standards, or at least accepted practice for TCP.

Using retransmit timeouts shorter than 200ms may break TCP connections to older BSD hosts. However, this fixed 200ms floor value of the retransmit interval is ridiculous on a closed Beowulf network (200ms at 100Mbit/s represents 2.5 MBytes). BTW, retransmit timeout is adaptively estimated by TCP, so limiting this estimate from below by using the floor of 20ms (which Linux can easily handle on Intels) is appropriate for Beowulf use.

> that's fine for tweaking your cluster, but it does NOT show a general problem
> with stalls. it's a little unclear to me why he calls these events
> "deadlocks", since afaikt, they're simply retransmit timeouts in TCP
> terminology, part of TCP's congestion-avoidance heuristics.

TCP stalls happen often, but as long as there is a good reason (e.g. congestion) I would not call them 'deadlocks'. However, when both sender and receiver have the capacity to transfer more data, but are forced to wait for a timeout because of a deadlock in TCP logic, then the term is appropriate. One form of this deadlock is described in:

"How a large ATM MTU causes deadlocks in TCP data transfers," by Kjersti Moldeklev and Per Gunningberg, IEEE/ACM Trans. on Networking, v3, No. 4, Aug. 1995, pp. 409-422.
(see http://www2.comp.polyu.edu.hk/~comp555/INPSII/deadlock.pdf)

A common feature of this deadlock and the one my patch addresses is the fact that delayed ACKs could be mistaken for network congestion. My simple fix reduces the probability of deadlocks by using immediate ACKs with (adjustable) probability (we use p=1/8). In seeking a deadlock-free TCP, others have proposed a more elaborate Adaptive Acknowledgment Algorithm:

Adam Yeung and Rocky K. C. Chang, "Improving TCP Throughput Performance on High-Speed Networks with a Receiver-Side Adaptive Acknowledgment Algorithm."
(see http://www2.comp.polyu.edu.hk/~comp555/INPSII/d2-5.pdf)

Sincerely,
Josip

--
Dr. Josip Loncaric, Senior Staff Scientist        mailto:josip at icase.edu
ICASE, Mail Stop 132C           PGP key at http://www.icase.edu./~josip/
NASA Langley Research Center             mailto:j.loncaric at larc.nasa.gov
Hampton, VA 23681-2199, USA    Tel. +1 757 864-2192  Fax +1 757 864-6134
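Editorial illustration of two points in the message above: the cost of one retransmit timeout at a given link speed, and the idea of sending an immediate ACK with a fixed probability. This is a rough Python sketch, not the kernel patch discussed in the post. The function names `bytes_idle` and `ack_immediately` are hypothetical, and the random draw only models the decision "ACK now or delay", not any real TCP state.

```python
import random


def bytes_idle(timeout_s: float, link_bits_per_s: float) -> float:
    """Bytes the link could have carried during one retransmit timeout."""
    return timeout_s * link_bits_per_s / 8.0


def ack_immediately(rng: random.Random, p: float = 1.0 / 8.0) -> bool:
    """Return True when a segment should be ACKed at once instead of delayed.

    Hypothetical model: each received segment independently triggers an
    immediate ACK with probability p (the post mentions p = 1/8).
    """
    return rng.random() < p


if __name__ == "__main__":
    link = 100e6  # 100 Mbit/s, as in the post
    print("200 ms floor:", bytes_idle(0.200, link), "bytes of lost capacity")
    print(" 20 ms floor:", bytes_idle(0.020, link), "bytes of lost capacity")

    rng = random.Random(0)  # fixed seed so repeated runs match
    segments = 80_000
    immediate = sum(ack_immediately(rng) for _ in range(segments))
    print("fraction of immediate ACKs:", immediate / segments)
```

With the 200ms floor, one timeout idles the link for the equivalent of about 2.5 MBytes, matching the figure in the message. With the 20ms floor the loss is ten times smaller.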
More information about the Beowulf mailing list
