[Beowulf] copying data between clusters

David Simas dgs at slac.stanford.edu
Fri Mar 5 14:54:48 PST 2010


On Fri, Mar 05, 2010 at 12:32:37PM -0500, Michael Di Domenico wrote:
> As I expect from the smartest sysadmins on the planet, everyone has
> over-analyzed the issue... :)
> 
> let's see if I can clarify
> 
> assuming there are two clusters - clusterA and clusterB
> 
> Each cluster is 32 nodes and has 50TB of storage attached
> 
> the aggregate network bandwidth between the clusters is 800MB/sec
> 
> the problem is the per-node bandwidth on clusterB is 30MB/sec
> 
> so if I use a single node to copy the 20TB of data from clusterB, yes,
> it's going to take me over 7 days to copy everything
> 
> I'd like to parallelize that across multiple nodes to drive the aggregate up
> 
> I was hoping someone would pop up and say, hey, use this magical piece of
> software (which I'm unable to locate)..
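
The figures quoted above can be checked with a little arithmetic. The
sketch below assumes decimal units (1 TB = 10**12 bytes, 1 MB = 10**6
bytes) and that every node on clusterB can sustain its 30MB/sec at the
same time; neither assumption is stated in the thread, and the function
names are made up for illustration.

```python
# Numbers taken from the quoted message; decimal units are an assumption.
DATA_BYTES = 20 * 10**12       # 20TB to copy
PER_NODE_BPS = 30 * 10**6      # 30MB/sec per node on clusterB
AGGREGATE_BPS = 800 * 10**6    # 800MB/sec between the clusters


def transfer_seconds(nodes):
    """Copy time when `nodes` nodes send in parallel, capped by the link."""
    rate = min(nodes * PER_NODE_BPS, AGGREGATE_BPS)
    return DATA_BYTES / rate


def nodes_to_saturate():
    """Smallest node count whose combined rate reaches the aggregate link."""
    return -(-AGGREGATE_BPS // PER_NODE_BPS)  # ceiling division


if __name__ == "__main__":
    print("1 node  : %.1f days" % (transfer_seconds(1) / 86400))
    print("32 nodes: %.1f hours" % (transfer_seconds(32) / 3600))
    print("nodes needed to fill the link:", nodes_to_saturate())
```

Under those assumptions a single node needs a bit under 8 days, and
once about 27 nodes are sending the 800MB/sec link becomes the limit,
which brings the copy down to roughly 7 hours.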

You might be able to use "dar" for this:

	http://dar.linux.free.fr/

Dar will let you slice up your 20 TB of data into evenly sized pieces
that you can transfer in parallel, then reconstruct on the receiving
side.
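
For comparison, here is a minimal sketch of the same "even pieces, one
per node" idea done at the file level instead of with dar slices. This
is not how dar works and it does not call dar; it only splits a list of
(path, size) pairs into roughly equal shares, largest file first, so
that each sending node could run its own copy tool on its share. The
function name and the sample file list are hypothetical.

```python
import heapq


def partition_by_size(files, workers):
    """Split (path, size_bytes) pairs into `workers` lists of similar total size.

    Greedy approach: take files from largest to smallest and always give
    the next file to the worker with the smallest total so far.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    shares = [[] for _ in range(workers)]
    # Heap of (bytes assigned so far, worker index).
    heap = [(0, i) for i in range(workers)]
    heapq.heapify(heap)
    for path, size in sorted(files, key=lambda f: f[1], reverse=True):
        total, idx = heapq.heappop(heap)
        shares[idx].append(path)
        heapq.heappush(heap, (total + size, idx))
    return shares


if __name__ == "__main__":
    # Made-up sizes, just to show the shape of the result.
    sample = [("a", 8), ("b", 7), ("c", 6), ("d", 5), ("e", 4)]
    for node, share in enumerate(partition_by_size(sample, 2)):
        print("node", node, "copies", share)
```

A split like this keeps files intact, so ownership and timestamps are
whatever the per-node copy tool preserves, but a single file much
larger than the others will still hold up one node.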

David S.


> 
> 
> 
> On Fri, Mar 5, 2010 at 11:30 AM, kyron <kyron at neuralbs.com> wrote:
> > On Fri, 05 Mar 2010 11:22:14 -0500, Mike Davis <jmdavis1 at vcu.edu> wrote:
> >> Michael Di Domenico wrote:
> >>> How does one copy large (20TB) amounts of data from one cluster to
> >>> another?
> >>>
> >>> Assuming that each node in the cluster can only do about 30MB/sec
> >>> between clusters and i want to preserve the uid/gid/timestamps, etc
> >>>
> >> If the clusters are co-lo I wouldn't copy I would use shared storage.
> >> If they are not co-located I would use patience.
> >>
> >> Seriously though, for a one time copy, I would consider copying to an
> >> external system and then physically moving that system. To do this and
> >> preserve ownerships you will need to duplicate accounts and groups.
> >
> >
> > ...and we are all assuming non-compressibility; otherwise, use pbzip2 ;)
> >
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


