[Beowulf] Re: failure trends in a large disk drive population (google fileing system)
matt jones jamesjamiejones at aol.com
Sun Feb 18 13:49:47 PST 2007
I've read somewhere in the past that the Google File System can keep many copies of the data, often 4 copies on different nodes, and, as you say, run the query against many of them. If one fails there are still 3; if another fails there are still 2. I've also read elsewhere that if one fails, it can automatically recreate the image from the remaining ones on a spare node, bringing the count back to 4.

This approach is rather OTT, but it works and works well. I suspect this sort of thing could be done cheaper by just keeping 3 copies and hoping that you never lose 2 or more nodes at once. Essentially this is a huge distributed file system with integrated RAID software.

Chris Samuel wrote:
> IIRC they also have figured out a way to be fault tolerant by sending queries out to multiple systems for each part of the DB they are querying, so if one of those fails others will respond anyway.
>
> Apparently they use more reliable hardware for things like the advertising service

--
matt.
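The keep-4-copies-and-rebuild-on-a-spare idea described above can be sketched roughly as follows. This is only a toy illustration under stated assumptions, not how the Google File System is actually implemented: the function names `place_replicas` and `handle_failure`, the node names, and the random choice of spare node are all hypothetical.

```python
import random

def place_replicas(nodes, copies=4, rng=random):
    """Pick `copies` distinct nodes to hold one piece of data (hypothetical helper)."""
    return set(rng.sample(sorted(nodes), copies))

def handle_failure(holders, failed, live_nodes, target=4, rng=random):
    """Drop the failed node, then copy onto spare nodes until back at `target`."""
    survivors = holders - {failed}
    if not survivors:
        # With no surviving copy there is nothing left to rebuild from.
        raise RuntimeError("all copies lost")
    # Spares are live nodes that do not already hold a copy.
    spares = sorted(live_nodes - survivors - {failed})
    needed = max(0, min(target - len(survivors), len(spares)))
    return survivors | set(rng.sample(spares, needed))

# Toy usage: 10 nodes, 4 copies, one holder fails and a spare takes over.
nodes = {f"node{i}" for i in range(10)}
holders = place_replicas(nodes)
failed = sorted(holders)[0]
holders_after = handle_failure(holders, failed, nodes - {failed})
```

After the failure the data is back on 4 nodes, 3 of which are the original survivors. Dropping `target` and `copies` to 3 gives the cheaper variant suggested above, at the cost of less headroom when several nodes fail together.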
