[Beowulf] copying big files (Henning Fehrmann)

David Mathog mathog at caltech.edu
Mon Aug 18 08:38:09 PDT 2008


Henning Fehrmann wrote:

> 
> I spread successfully a 10G file to 50 nodes. The rate was 140Mb/s for
> nettee and a bit slower using dolly.
> I guess it was due to a busy node somewhere in the chain.  
> Increasing the number of clients up to 100 failed in both cases.
> 
> For nettee I got:
> nettee: fatal error writing to child: Connection reset by peer

> 
> I will do more systematic test the next days. 
> David Mathog, are you interested in bug reports?

Yes, please. 

If memory serves, you will see that error whenever a child node, or
nettee on that child, crashes.  For instance, if you "kill -9" nettee on
a child, the parent should see it.  The command option -colwf will let
the chain continue if the failure is caused by a full disk or a failing
stdout pipe.  The option -conwf should let the chain continue the
transfer down to the node just above the failed one, and it should tell
you which node failed, so long as -v is used with the appropriate bits.

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech


