[Beowulf] Big storage

Joe Landman landman at scalableinformatics.com
Thu Aug 23 04:56:15 PDT 2007

Greetings Jakob:

Jakob Oestergaard wrote:
> Hi all,
> While not a typical beowulf, I thought the list would have some input for this
> "storage cluster" or whatever we should call it :)

Up-front disclaimer: we design, build, market, and support such systems.

> I'm looking at getting some big storage. Of all the parameters, getting as low
> dollars/(month*GB) is by far the most important. The price of acquiring and
> maintaining the storage solution is the number one concern.

Should I presume that density, reliability, and performance also factor
in somewhere, as numbers 2, 3, and 4 (in some order) on the concern list?
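
To make the dollars/(month*GB) figure concrete, here is a minimal sketch
of how one might compute it.  The function name, its parameters, and the
numbers in the usage note are all hypothetical illustrations, not quotes
or measurements; it assumes a fixed service lifetime and a flat monthly
operating cost (power, support, admin time).

```python
def cost_per_gb_month(acquisition_cost, monthly_operating_cost,
                      usable_gb, lifetime_months):
    """Total cost of ownership divided by GB-months of service.

    All inputs are assumptions supplied by the caller: purchase price,
    flat monthly running cost, usable (not raw) capacity in GB, and the
    number of months the system is expected to stay in service.
    """
    total_cost = acquisition_cost + monthly_operating_cost * lifetime_months
    return total_cost / (usable_gb * lifetime_months)
```

With made-up inputs of 36000 dollars up front, 500 dollars per month,
100000 GB usable, and 36 months in service, this works out to 0.015
dollars per GB per month.  Using usable rather than raw capacity matters,
since RAID parity and hot spares reduce what you actually get.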

> The setup will probably have a number of "head nodes" which receive a large
> amount of data over standard gigabit from a large amount of remote sources.
> Data is read infrequently from the head nodes by remote systems. The primary
> load on the system will be data writes.

Ok, so you are write dominated.  Could you describe (guesses are fine)
what the writes will look like?  Large sequential data, small random
data (seek, write, close)?
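
The two write patterns I am asking about can be sketched roughly as
follows.  This is only an illustration of the access patterns, not a
benchmark; the function names, block sizes, and the fixed random seed
are my own assumptions.

```python
import random


def sequential_write(path, total_bytes, block_size=1 << 20):
    """Stream total_bytes to path in large contiguous blocks."""
    block = b"\0" * block_size
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
    return written


def random_small_writes(path, file_size, count, record_size=4096, seed=0):
    """Do count small writes at random offsets: open, seek, write, close."""
    rng = random.Random(seed)
    # Preallocate the file so that every chosen offset is valid.
    with open(path, "wb") as f:
        f.truncate(file_size)
    record = b"\1" * record_size
    for _ in range(count):
        offset = rng.randrange(0, file_size - record_size + 1)
        with open(path, "r+b") as f:  # reopen per write, as in seek/write/close
            f.seek(offset)
            f.write(record)
    return count * record_size
```

The first pattern is friendly to wide RAID stripes on rotating disks;
the second is dominated by seeks, and on parity RAID it also tends to
incur read-modify-write overhead, which is why the answer affects the
design.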

> The head nodes need not see the same unified storage; so I am not required to
> have one big shared filesystem. If beneficial, each of the head nodes could
> have their own local storage.

There are some interesting designs with a variety of systems, including
GFS/Lustre/... on those head nodes, and a big pool of drives behind
them.  These designs will add to the overall cost, and increase complexity.

> The storage pool will start out at around 100TiB and will grow to ~1PiB within
> a year or two (too early to tell). It would be nice to use as few racks as
> possible, and as little power as possible  :)

Ok, so density and power are important.  This is good.  Coupled with the
low management cost and low acquisition cost, we have about 3/4 of what
we need.  We just need a little more description of the writes.

Also, do you intend to back this up?  How important is resiliency of the
system?  Can you tolerate a failed unit (assume the units have hot
spares, RAID6, etc.)?  When you look at storage of this size, you have to
start planning for the eventual (and likely) failure of a chassis (or
some number of them), and think about a RAIN configuration.  Either
that, or invest in massive low-level redundancy (which should be
scope-limited to the box it is on anyway).
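
As a rough sketch of what hot spares, RAID6, and chassis-level
redundancy cost in capacity: the layout below (48 drives, 4 hot spares,
RAID6 groups of 11) is a hypothetical example, not a recommendation or
any vendor's configuration, and the RAIN model is a deliberately simple
assumption that reserves the capacity of a fixed number of chassis.

```python
def usable_drives(total_drives, hot_spares, raid_group_size,
                  parity_per_group=2):
    """Data drives left in one chassis after hot spares and parity.

    RAID6 uses two parity drives per group, hence the default.  Drives
    that do not fill a complete group are left unused in this model.
    """
    groups = (total_drives - hot_spares) // raid_group_size
    return groups * (raid_group_size - parity_per_group)


def rain_usable_tb(chassis_count, usable_tb_per_chassis, redundant_chassis):
    """Usable capacity if the equivalent of redundant_chassis is reserved.

    Assumes a simple parity-like scheme across chassis; replication
    would cost considerably more capacity.
    """
    return (chassis_count - redundant_chassis) * usable_tb_per_chassis
```

For the hypothetical layout, usable_drives(48, 4, 11) gives 36 data
drives, i.e. 36 TB per chassis with 1 TB drives.  Reserving two chassis
out of 22 then leaves rain_usable_tb(22, 36, 2) = 720 TB, noticeably
below the raw figure.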

> It *might* be possible to offload older files to tape; does anyone have
> experience with HSM on Linux?  Does it work?  Could it be worthwhile to
> investigate?

Hmmm...  First, I would suggest avoiding tape.  You should likely be
looking at disk-to-disk for backup, using slower nearline mechanisms.

> One setup I was looking at, is simply using SunFire X4500 systems (you can put
> 48 standard 3.5" SATA drives in each 4U system). Assuming I can buy them with
> 1T SATA drives shortly, I could start out with 3 systems (12U) and grow the
> entire setup to 1P with 22 systems in little over two full racks.
> Any better ideas?  Is there a way to get this more dense without paying an arm
> and a leg?  Has anyone tried something like this with HSM?

Yes, but I don't want to turn this into a commercial, so I will be
succinct.  Scalable Informatics (my company) has a similar product,
which does have a good price and price per gigabyte, while providing
excellent performance.  Details (white paper, benchmarks, presentations)
at the http://jackrabbit.scalableinformatics.com web site.
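
As a sanity check on the X4500 sizing quoted above, the arithmetic can
be sketched as follows.  The drive count, drive size, and 4U height come
from the quoted text; the 42U rack height is my assumption, and all
capacities are raw (before RAID parity or hot spares).

```python
DRIVES_PER_SYSTEM = 48      # from the quoted text
DRIVE_TB = 1                # 1 TB drives, decimal terabytes
RACK_UNITS_PER_SYSTEM = 4   # 4U chassis, from the quoted text
RACK_HEIGHT_U = 42          # assumption: standard full-height rack


def raw_tb(systems):
    """Raw capacity in decimal TB for a number of systems."""
    return systems * DRIVES_PER_SYSTEM * DRIVE_TB


def rack_units(systems):
    """Rack units occupied by a number of systems."""
    return systems * RACK_UNITS_PER_SYSTEM


def racks_needed(systems):
    """Fractional number of racks, ignoring switches and PDUs."""
    return rack_units(systems) / RACK_HEIGHT_U


def tb_to_pib(tb):
    """Convert decimal terabytes to binary pebibytes."""
    return tb * 10**12 / 2**50
```

Three systems give 144 TB raw in 12U, and 22 systems give 1056 TB raw in
88U, which is a little over two 42U racks, consistent with the quoted
estimate.  Note that 1056 decimal TB is roughly 0.94 PiB, so reaching a
full PiB of usable space would take more than 22 systems.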

> Thanks all, your input will be greatly appreciated! :)

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615
