<div dir="ltr">Brock, the enterprise PCIe SSDs from OCZ seem to have insane performance.</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Feb 8, 2013 at 5:20 PM, Brock Palen <span dir="ltr">&lt;<a href="mailto:brockp@umich.edu" target="_blank">brockp@umich.edu</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">To add another side note: while I was interviewing the Gluster team for my podcast (<a href="http://www.rce-cast.com" target="_blank">www.rce-cast.com</a>), one of them mentioned writing a plugin that would first write data locally on the host, and Gluster would then move it to the real disk in the background.  There were constraints on doing this, I assume because there was no locking to guarantee consistency, but for some workloads this might be useful, especially combined with local flash.<br>


<br>

That episode should be up Feb 24th.<br>

<span class="HOEnZb"><font color="#888888"><br>

Brock Palen<br>

<a href="http://www.umich.edu/~brockp" target="_blank">www.umich.edu/~brockp</a><br>

CAEN Advanced Computing<br>

<a href="mailto:brockp@umich.edu">brockp@umich.edu</a><br>

<a href="tel:%28734%29936-1985" value="+17349361985">(734)936-1985</a><br>

</font></span><div class="HOEnZb"><div class="h5"><br>

<br>

<br>

On Feb 8, 2013, at 11:12 AM, Ellis H. Wilson III wrote:<br>

<br>

> On 02/06/2013 04:36 PM, Prentice Bisbal wrote:<br>

>> Beowulfers,<br>

>><br>

>> I've been reading a lot about using SSD devices to act as caches for<br>

>> traditional spinning disks or filesystems over a network (SAN, iSCSI,<br>

>> SAS, etc.).  For example, Fusion-io had directCache, which works with<br>

>> any block-based storage device (local or remote) and Dell is selling<br>

>> LSI's CacheCade, which will act as a cache for local disks.<br>

>><br>

>> <a href="http://www.fusionio.com/data-sheets/directcache/" target="_blank">http://www.fusionio.com/data-sheets/directcache/</a><br>

>> <a href="http://www.dell.com/downloads/global/products/pedge/en/perc-h700-cachecade.pdf" target="_blank">http://www.dell.com/downloads/global/products/pedge/en/perc-h700-cachecade.pdf</a><br>

>><br>

>> Are there any products like this that would work with parallel<br>

>> filesystems, like Lustre or GPFS? Has anyone done any research in this<br>

>> area? Would this even be worthwhile?<br>

><br>

> Coming late to this discussion, but I'm currently doing research in this<br>

> area and have a publication in submission about it.  What are you trying<br>

> to do specifically with it?  NAND flash, with its particularly nuanced<br>

> performance behavior, is not right for all applications, but can help if<br>

> you think through most of your workloads and cleverly architect your<br>

> system based off of that.<br>

><br>

> For instance, there has been some discussion about PCIe vs SATA -- this<br>

> is a good conversation, but what's left out is that many manufacturers<br>

> do not actually use native PCIe "inside" the SSD.  It is piped out from<br>

> the individual NAND packages in something like a SATA format, and then<br>

> transcoded to PCIe before going out of the device.  This results in<br>

> latency and bandwidth degradation, and although a bunch of those devices<br>

> on Newegg and elsewhere under the PCIe category report 2 or 3 or 4 GB/s,<br>

> real throughput is closer to 1 GB/s or less.  Latency is still better on these than<br>

> SATA-based ones, but if I just wanted bandwidth I'd buy a few, cheaper,<br>

> SATA-based ones and strap them together with RAID.<br>

><br>

> On a similar note (discussing the nuances of the setup and the<br>

> components), if your applications are embarrassingly parallel or your<br>

> network is really slow (1 Gb Ethernet), then client-side caching is<br>

> definitely the way to go.  But, if there are ever sync points in the<br>

> applications or you have a higher throughput, lower latency network<br>

> available to you, going for a storage-side cache will allow for<br>

> write-back capabilities as well as higher throughput and lower latency<br>

> promises than a single client side cache could provide.  Basically, this<br>

> boils down to whether you want M cheaper devices in each node that don't have<br>

> to go over the network but that are higher latency and lower bandwidth,<br>

> or an aggregation of N more expensive devices that can give<br>

> lower latency and much higher bandwidth, but over the network.<br>

><br>

> For an example of how some folks did it with storage-local caches up at<br>

> LBNL see the following paper:<br>

><br>

> Zheng Zhou et al. An Out-of-core Eigensolver on SSD-equipped Clusters,<br>

> in Proc. of Cluster’12<br>

><br>

> If anybody knows of any other worthwhile papers that look at this, or other<br>

> projects that look particularly at client-side SSD caching, I'd love to<br>

> hear about them.  I think in about five years all new compute-nodes will<br>

> be built with some kind of non-volatile cache or storage -- it's just<br>

> too good of a solution, particularly with network bandwidth and latency<br>

> not scaling as fast as NVM properties.<br>

><br>

> Best,<br>

><br>

> ellis<br>

> _______________________________________________<br>

> Beowulf mailing list, <a href="mailto:Beowulf@beowulf.org">Beowulf@beowulf.org</a> sponsored by Penguin Computing<br>

> To change your subscription (digest mode or unsubscribe) visit <a href="http://www.beowulf.org/mailman/listinfo/beowulf" target="_blank">http://www.beowulf.org/mailman/listinfo/beowulf</a><br>

<br>


</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br>Jonathan Aquilina

</div>