[Beowulf] fast file copying
Bill Broadley bill at cse.ucdavis.edu
Thu May 10 14:51:20 PDT 2007
Felix Rauch Valenti wrote:
> On 04/05/07, Bill Broadley <bill at cse.ucdavis.edu> wrote:
>> Geoff Galitz wrote:
>> > During an HPC talk some years ago, I recall someone mentioned a tool
>> > which can copy large datasets across a cluster using a ring topology.
>> > Perhaps someone here knows of this tool?
>>
>> Not sure about a ring topology, seems kinda silly...
>
> Why would that be silly?

The normal ring-based disadvantages: reliability and performance.

In the case where your head node and the client nodes have the same speed
network, all clients are present, all clients are idle, and all clients
survive until the end of the transfer, you can get great performance. It
certainly seems like 90% or so of line speed is possible.

Seems like any number of things could make the ring-based approach a poor
choice, where the worst case of the ring could dramatically slow things
down. Things like:
* Head node's network connection is 10 times faster
* A single node dies during the transfer
* A single node joins late
* A single node is very busy (I/O, memory constrained, or CPU)

A bittorrent-like approach would handle all of the above relatively
gracefully.

The nettee approach does have the advantage that all disk accesses are
sequential. But with a large chunk size of, say, 64 MB (when transferring
a file of a few GB) it seems like seeks wouldn't be a major issue. I've seen
15 MB/sec per client with the default chunk size (fairly small); when I
wrote the file to a better disk system I managed 30 MB/sec. I've yet to
try larger chunk sizes on normal compute node disk systems. I'll do some
more testing.

> More advantages of the ring topology: It uploads every block on every

Sounds like bittorrent.

> node exactly once, no prefetching and no seeks are required (if you
> replicate a whole partition or a single large file).

bittorrent does seek more, but it seems trivial to reduce the seeks so
that it's not a performance impact... say 1 per second.
> If you are interested in more details about the technology, like
> models and performance measurements (somewhat old by now), check out
> the second paper in this list:
>
> http://www.cs.inf.ethz.ch/cops/patagonia/#relmat

Interesting paper, I'll try a run with GigE so I can compare fairly.
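
A back-of-envelope sketch of two points from the message: a ring runs at the pace of its slowest hop, and large chunks keep the seek rate low. This is a deliberately crude model, not a measurement. The function names, the 16-node cluster, the 100 MB/sec and 10 MB/sec link rates, and the 4 GB file size are all assumptions made up for illustration; only the 30 MB/sec and 64 MB figures come from the message. It ignores pipeline fill time and protocol overhead, does not model a node dying, and pessimistically assumes one seek per chunk.

```python
def ring_rate(hop_rates_mb_s):
    """Steady-state rate of a store-and-forward ring.

    Every byte passes through every node in turn, so the slowest hop
    limits the whole ring.
    """
    return min(hop_rates_mb_s)


def ring_transfer_seconds(file_mb, hop_rates_mb_s):
    """Approximate time to push file_mb through the ring (fill time ignored)."""
    return file_mb / ring_rate(hop_rates_mb_s)


def seeks_per_second(throughput_mb_s, chunk_mb):
    """Upper bound on seek rate, assuming one seek per chunk written."""
    return throughput_mb_s / chunk_mb


if __name__ == "__main__":
    file_mb = 4096.0                        # hypothetical 4 GB file
    healthy = [100.0] * 16                  # hypothetical: 16 equal nodes
    one_busy = [100.0] * 15 + [10.0]        # hypothetical: one node 10x slower

    print("all nodes healthy:", ring_transfer_seconds(file_mb, healthy), "s")
    print("one busy node:    ", ring_transfer_seconds(file_mb, one_busy), "s")

    # 30 MB/sec and 64 MB chunks are the figures mentioned in the message.
    print("seeks/sec:", seeks_per_second(30.0, 64.0))
```

Under these assumptions a single node running at a tenth of the speed stretches the whole transfer tenfold, while 64 MB chunks at 30 MB/sec stay under one seek every two seconds, comfortably below the "1 per second" target.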
