[Beowulf] Re: dedupe filesystem

Gerry Creager gerry.creager at tamu.edu
Mon Jun 29 06:57:19 PDT 2009


Dave Love wrote:
> Ashley Pittman <ashley at pittman.co.uk> writes:
> 
>> If you relied on the md5 sum alone there would be collisions and those
>> collisions would result in you losing data.
> 
> The question is whether the probability of collisions is high compared
> with other causes -- presumably hardware, assuming no-one puts figures
> on the software reliability.  As far as I remember, the calculation for
> SHA-1 for Plan 9's Venti¹, which no-one seems to have mentioned, says
> ignore collisions for petabyte filesystems.
> 
> Ob-Beowulf:  You can run Venti on GNU/Linux,² but I don't know how the
> current implementation performs.  Also, GlusterFS has a `data
> de-duplication translator' on its roadmap, which I didn't see mentioned.

Our initial results with a GlusterFS implementation led us back to NFS.
Who's got a really successful GlusterFS implementation working?
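
For a rough sense of the collision estimate quoted above, here is a minimal sketch of the birthday bound for accidental hash collisions. The 1 PiB store size, the 8 KiB block size, and the function name collision_probability are assumptions chosen for illustration, not figures from this thread. The bound treats the hash as a uniform random function, so it says nothing about deliberately constructed collisions.

```python
from fractions import Fraction


def collision_probability(num_blocks: int, hash_bits: int) -> Fraction:
    """Upper bound on the chance that any two distinct blocks share a hash.

    Union bound over all pairs: p <= n(n-1)/2 / 2^bits, assuming the hash
    output is uniformly distributed.
    """
    pairs = num_blocks * (num_blocks - 1) // 2  # exact: n(n-1) is always even
    return Fraction(pairs, 2 ** hash_bits)


# Assumed parameters (hypothetical, for illustration only).
STORE_BYTES = 2 ** 50        # 1 PiB of unique data
BLOCK_BYTES = 8 * 1024       # 8 KiB blocks
blocks = STORE_BYTES // BLOCK_BYTES

for name, bits in (("MD5", 128), ("SHA-1", 160)):
    p = collision_probability(blocks, bits)
    print(f"{name}: collision probability at most about {float(p):.3e}")
```

Under these assumptions the store holds 2^37 blocks, so the bound is just under 2^-55 for MD5 and just under 2^-87 for SHA-1. How that compares with hardware error rates depends on figures not given here.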

> --
> 1. http://plan9.bell-labs.com/sys/doc/venti/venti.html
> 2. http://swtch.com/plan9port/

-- 
Gerry Creager -- gerry.creager at tamu.edu
Texas Mesonet -- AATLT, Texas A&M University	
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983
Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843


