[Beowulf] SSD caching for parallel filesystems

Jonathan Aquilina eagles051387 at gmail.com
Fri Feb 8 08:29:20 PST 2013


Brock, the PCIe SSDs from OCZ (the enterprise ones) seem to have insane
performance.


On Fri, Feb 8, 2013 at 5:20 PM, Brock Palen <brockp at umich.edu> wrote:

> To add another side note: in the process of interviewing the Gluster team
> for my podcast (www.rce-cast.com), he mentioned writing a plugin that
> would first write data locally on the host, and Gluster would then take it
> to the real disk in the background.  There were constraints to doing this,
> I assume because there was no locking to promise consistency, but for some
> workloads it might be useful, especially combined with local flash.
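>
> A toy Python sketch of that staging idea, just to make the flow concrete
> (the paths, queue policy, and function names below are invented for
> illustration; this is not the actual Gluster plugin):
>
>     # Write lands on local flash immediately; a background thread moves it
>     # to the shared filesystem later.  No locking here, so other nodes will
>     # not see the file until the move completes (the consistency caveat).
>     import os, queue, shutil, threading
>
>     LOCAL_STAGE = "/local-ssd/stage"    # assumed fast local flash
>     SHARED_FS   = "/mnt/gluster/data"   # assumed parallel filesystem mount
>
>     _pending = queue.Queue()
>
>     def write(name, data):
>         path = os.path.join(LOCAL_STAGE, name)
>         with open(path, "wb") as f:
>             f.write(data)
>         _pending.put(name)              # migrate in the background
>
>     def _migrator():
>         while True:
>             name = _pending.get()
>             shutil.move(os.path.join(LOCAL_STAGE, name),
>                         os.path.join(SHARED_FS, name))
>             _pending.task_done()
>
>     threading.Thread(target=_migrator, daemon=True).start()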
>
> That episode should be up Feb 24th.
>
> Brock Palen
> www.umich.edu/~brockp
> CAEN Advanced Computing
> brockp at umich.edu
> (734)936-1985
>
>
>
> On Feb 8, 2013, at 11:12 AM, Ellis H. Wilson III wrote:
>
> > On 02/06/2013 04:36 PM, Prentice Bisbal wrote:
> >> Beowulfers,
> >>
> >> I've been reading a lot about using SSD devices to act as caches for
> >> traditional spinning disks or filesystems over a network (SAN, iSCSI,
> >> SAS, etc.).  For example, Fusion-io had directCache, which works with
> >> any block-based storage device (local or remote) and Dell is selling
> >> LSI's CacheCade, which will act as a cache for local disks.
> >>
> >> http://www.fusionio.com/data-sheets/directcache/
> >> http://www.dell.com/downloads/global/products/pedge/en/perc-h700-cachecade.pdf
> >>
> >> Are there any products like this that would work with parallel
> >> filesystems, like Lustre or GPFS? Has anyone done any research in this
> >> area? Would this even be worthwhile?
> >
> > Coming late to this discussion, but I'm currently doing research in this
> > area and have a publication in submission about it.  What are you trying
> > to do specifically with it?  NAND flash, with its particularly nuanced
> > performance behavior, is not right for all applications, but it can help
> > if you think through most of your workloads and cleverly architect your
> > system around them.
> >
> > For instance, there has been some discussion about PCIe vs. SATA -- this
> > is a good conversation, but what's left out is that many manufacturers
> > do not actually use native PCIe "inside" the SSD.  Data is piped out from
> > the individual NAND packages in something like a SATA format and then
> > transcoded to PCIe before leaving the device.  This results in latency
> > and bandwidth degradation, and although a bunch of those devices on
> > Newegg and elsewhere under the PCIe category report 2, 3, or 4 GB/s, the
> > reality is closer to 1 GB/s or less.  Latency is still better on these
> > than on SATA-based ones, but if I just wanted bandwidth I'd buy a few
> > cheaper SATA-based drives and strap them together with RAID.
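> >
> > To put rough numbers on that trade-off (every figure below is an
> > assumption picked just for illustration, not a measurement of any
> > particular product):
> >
> >     # Back-of-envelope comparison: one PCIe card vs. striped SATA SSDs.
> >     pcie_bw_gbs, pcie_cost = 1.0, 3000   # what many "2-4 GB/s" cards deliver
> >     sata_bw_gbs, sata_cost = 0.5, 400    # per drive, near the SATA ceiling
> >     n = 4                                # SATA drives striped with RAID-0
> >     print("PCIe card  : %.1f GB/s for $%d" % (pcie_bw_gbs, pcie_cost))
> >     print("SATA RAID-0: %.1f GB/s for $%d" % (n * sata_bw_gbs, n * sata_cost))
> >     # Bandwidth scales roughly with the stripe width until the HBA or
> >     # PCIe lanes saturate; latency, though, stays at SATA levels.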
> >
> > On a similar note (discussing the nuances of the setup and the
> > components), if your applications are embarrassingly parallel or your
> > network is really slow (1Gb Ethernet), then client-side caching is
> > definitely the way to go.  But if there are ever sync points in the
> > applications, or you have a higher-throughput, lower-latency network
> > available to you, going for a storage-side cache will allow for
> > write-back capabilities as well as higher throughput and lower latency
> > than a single client-side cache could provide.  Basically, this boils
> > down to: do you want M cheaper devices in each node that don't have to
> > go over the network but that have higher latency and lower bandwidth, or
> > do you want an aggregation of N more expensive devices that can give
> > lower latency and much higher bandwidth, but over the network?
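> >
> > As a crude model of that choice (again, every number here is an assumed
> > placeholder, meant only to show the shape of the comparison):
> >
> >     # M cheap local devices vs. N faster shared devices over the network.
> >     M, local_bw_gbs, local_lat_us   = 16, 0.5, 80   # one SSD per client
> >     N, shared_bw_gbs, shared_lat_us = 4, 2.0, 25    # server-side pool
> >     net_lat_us = 10                                 # low-latency fabric
> >     print("client-side : %.1f GB/s per node, ~%d us, no network hop"
> >           % (local_bw_gbs, local_lat_us))
> >     print("storage-side: up to %.1f GB/s aggregate, ~%d us incl. network"
> >           % (N * shared_bw_gbs, shared_lat_us + net_lat_us))
> >     # Which side wins depends on how many clients hit the shared pool at
> >     # once and whether the workload ever needs to sync across nodes.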
> >
> > For an example on how some folks did it with storage-local caches up at
> > LBNL see the following paper:
> >
> > Zheng Zhou et al., "An Out-of-core Eigensolver on SSD-equipped
> > Clusters," in Proc. of Cluster '12.
> >
> > If anybody has other worthwhile papers that look at this, or other
> > projects that look particularly at client-side SSD caching, I'd love to
> > hear about them.  I think in about five years all new compute nodes will
> > be built with some kind of non-volatile cache or storage -- it's just
> > too good a solution, particularly with network bandwidth and latency
> > not scaling as fast as NVM properties.
> >
> > Best,
> >
> > ellis
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>



-- 
Jonathan Aquilina