[Beowulf] Rsync - checksums

Michael Di Domenico mdidomenico4 at gmail.com
Mon Jun 17 09:07:11 PDT 2019


just out of morbid curiosity i popped through the rsync code.  it
doesn't look terribly difficult to wedge in a new checksum algo.  but
honestly, if i were going to go to the trouble i'd write a new tool
that walks the file tree in parallel and logs the checksums to a
database.  i've had problems rsync'ing big filesystems in the past, so
i try to avoid using it for DR or as a poor-man's snapshotting tool
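
a rough sketch of what i mean, not a finished tool.  everything in it
is an assumption for illustration: the names (hash_file, iter_files,
index_tree), the sqlite table layout, and blake2b as the default hash.
it only hashes in parallel (the walk itself is a serial os.walk), and
it will stop on the first unreadable file.

```python
import hashlib
import os
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def hash_file(path, algo="blake2b", chunk_size=1 << 20):
    """Return (path, size, mtime, hexdigest), reading the file in chunks."""
    h = hashlib.new(algo)  # any algorithm name hashlib knows, e.g. "md5"
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    st = os.stat(path)
    return path, st.st_size, st.st_mtime, h.hexdigest()


def iter_files(root):
    """Yield regular files under root, skipping symlinks."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full) and not os.path.islink(full):
                yield full


def index_tree(root, db_path, workers=8, algo="blake2b"):
    """Hash every file under root and upsert the results into sqlite.

    Returns the total number of rows in the table afterwards.
    """
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS checksums ("
        "path TEXT PRIMARY KEY, size INTEGER, mtime REAL, digest TEXT)"
    )
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # hashing runs in worker threads; all database writes stay here
        rows = pool.map(lambda p: hash_file(p, algo), iter_files(root))
        with con:  # commit once at the end, roll back on error
            con.executemany(
                "INSERT OR REPLACE INTO checksums VALUES (?, ?, ?, ?)", rows
            )
    count = con.execute("SELECT COUNT(*) FROM checksums").fetchone()[0]
    con.close()
    return count
```

run it on the source and on the destination into two databases and you
can diff the digests with plain SQL instead of re-reading both trees.
threads are probably good enough here since hashlib releases the GIL
on large buffers, but i haven't measured it on a big filesystem.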

On Mon, Jun 17, 2019 at 11:30 AM Christopher Samuel <chris at csamuel.org> wrote:
>
> On 6/17/19 6:43 AM, Bill Wichser wrote:
>
> > md5 checksums take a lot of compute time with huge files and even with
> > millions of smaller ones.  The bulk of the time for running rsync is
> > spent in computing the source and destination checksums and we'd like to
> > alleviate that pain of a cryptographic algorithm.
>
> First of all I would note that rsync only uses checksums if you tell it
> to, otherwise it just uses file times and sizes to determine what to
> transfer.
>
> rsync is also single-threaded, so I would take a look at what was
> previously called parsync, but is now parsyncfp :-)
>
> http://moo.nac.uci.edu/~hjm/parsync/
>
> There is the caveat there though:
>
> # As a warning, the main use case for parsyncfp is really only
> # very large data transfers thru fairly fast network connections
> # (>1Gb). Below this speed, rsync itself can saturate the
> # connection, so there’s little reason to use parsyncfp and in
> # fact the overhead of testing the existence of and starting more
> # rsyncs tends to worsen its performance on small transfers to
> # slightly less than rsync alone.
>
> Good luck!
> Chris
> --
>    Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA

