[Beowulf] copying data between clusters

David Simas dgs at slac.stanford.edu
Fri Mar 5 14:54:48 PST 2010


On Fri, Mar 05, 2010 at 12:32:37PM -0500, Michael Di Domenico wrote:
> As I'd expect from the smartest sysadmins on the planet, everyone has
> over-analyzed the issue... :)
> 
> Let's see if I can clarify
> 
> assuming there are two clusters - clusterA and clusterB
> 
> Each cluster is 32 nodes and has 50TB of storage attached
> 
> the aggregate network bandwidth between the clusters is 800MB/sec
> 
> the problem is the per-node bandwidth on clusterB is 30MB/sec
> 
> so if I use a single node to copy the 20TB of data from clusterB, yes,
> it's going to take me about 7 days to copy everything
> 
> I'd like to parallelize that across multiple nodes to drive the aggregate up
> 
> I was hoping someone would pop up and say, hey, use this magical piece of
> software (which I'm unable to locate)..
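
For reference, the arithmetic behind the numbers quoted above can be
sketched as follows. This is only a back-of-the-envelope estimate: it
assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes), perfectly
even work distribution, and no protocol overhead. The function name and
parameters are made up for illustration.

```python
MB = 10**6   # assumption: decimal megabytes
TB = 10**12  # assumption: decimal terabytes
SECONDS_PER_DAY = 86400


def transfer_days(data_bytes, per_node_rate, nodes, aggregate_cap):
    """Estimate copy time in days for `nodes` parallel senders.

    The usable rate is the sum of the per-node rates, but it can never
    exceed the aggregate link bandwidth between the clusters.
    """
    rate = min(nodes * per_node_rate, aggregate_cap)
    return data_bytes / rate / SECONDS_PER_DAY


if __name__ == "__main__":
    for n in (1, 8, 27, 32):
        days = transfer_days(20 * TB, 30 * MB, n, 800 * MB)
        print(f"{n:2d} nodes: {days:6.2f} days ({days * 24:6.1f} hours)")
```

Under those assumptions a single node needs a bit under 8 days, and
about 27 nodes (800 / 30, rounded up) are enough to reach the 800MB/sec
aggregate limit, at which point the copy takes roughly 7 hours.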

You might be able to use "dar" for this:

	http://dar.linux.free.fr/

Dar will let you slice up your 20 TB of data into evenly sized pieces
that you can transfer in parallel, then reconstruct on the receiving
side.
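
To illustrate the general idea of even-sized pieces (this is not how
dar does it internally, just a rough sketch of the concept), here is a
greedy way to split a list of files into batches of roughly equal total
size, one batch per sending node. The input dict and the function name
are hypothetical.

```python
import heapq


def split_into_batches(sizes, n_batches):
    """Greedily assign files to n_batches with roughly even byte totals.

    sizes: dict mapping file path -> size in bytes (hypothetical input).
    Returns a list of (total_bytes, [paths]) tuples, one per batch.
    """
    if n_batches < 1:
        raise ValueError("n_batches must be at least 1")
    # Min-heap of (total_bytes, batch_index): lightest batch is on top.
    heap = [(0, i) for i in range(n_batches)]
    heapq.heapify(heap)
    batches = [[] for _ in range(n_batches)]
    totals = [0] * n_batches
    # Place the largest files first; ties are broken by path name.
    for path, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        batches[idx].append(path)
        totals[idx] = total + size
        heapq.heappush(heap, (totals[idx], idx))
    return [(totals[i], batches[i]) for i in range(n_batches)]


if __name__ == "__main__":
    demo = {"a": 5, "b": 4, "c": 3, "d": 3, "e": 1}
    for total, paths in split_into_batches(demo, 2):
        print(total, paths)
```

Each batch could then be handed to a separate node for transfer. Note
that this only balances whole files; a few very large files would still
dominate, which is where slicing an archive into fixed-size pieces helps.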

David S.


> 
> 
> 
> On Fri, Mar 5, 2010 at 11:30 AM, kyron <kyron at neuralbs.com> wrote:
> > On Fri, 05 Mar 2010 11:22:14 -0500, Mike Davis <jmdavis1 at vcu.edu> wrote:
> >> Michael Di Domenico wrote:
> >>> How does one copy large (20TB) amounts of data from one cluster to
> >>> another?
> >>>
> >>> Assuming that each node in the cluster can only do about 30MB/sec
> >>> between clusters and I want to preserve the uid/gid/timestamps, etc.
> >>>
> >> If the clusters are co-lo I wouldn't copy I would use shared storage.
> >> If they are not co-located I would use patience.
> >>
> >> Seriously though, for a one time copy, I would consider copying to an
> >> external system and then physically moving that system. To do this and
> >> preserve ownerships you will need to duplicate accounts and groups.
> >
> >
> > ...and we are all assuming non-compressibility; otherwise, use pbzip2 ;)
> >
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


