[Beowulf] SSD caching for parallel filesystems

Mon Feb 11 21:28:40 PST 2013

Vincent, seer of seers, prognosticator of prognosticators, Grendel of 
Grendels, answer me this (ironically, this will get us back to the OP's 
question, in a form at least, which would be just swell):

"You are charged with creating the most efficient and cost-effective 
cache layer possible using retail pricing and commodity components.  Let 
us assume that, in order to keep this conversation about bandwidth 
going, this cache is geared to perform well for highly sequential read 
workloads.  Further, since (naturally) not all data is hot at the same 
time, given the imaginary and unspecified size of this parallel file 
system, we know that on any given week there's only about 250GB of data 
that is really, seriously utilized over and over and therefore good for 
caching.  However, this needs to be fed fast -- 3GB/s for instance. 
Ignore the network; this can be one monolithic PFS with local cache. 
Yes, this diverges from the OP's question, but few took that seriously 
anyhow."

Summary: What's the cheapest, fastest, read-bandwidth-optimized caching 
medium for a single machine, serving hot data of about 0.25TB at ideally 
3GB/s to keep the machine busy?

Examination (I'm using tomshardware for quick performance numbers and 
pricing figures -- drop the non-retail convo Vince, nobody is buying it):

Rough (for reads) MB/s/$ of HDD (>=250GB): Ranges from 1 to 2.5
Rough (for reads) MB/s/$ of SSD (>=250GB): Ranges from 1 to 2.9

Let's consider real examples at the top of my quick perusal for each 
category.  For the SSD, I'll use the Samsung 840, costing about $178 and 
delivering about 520MB/s, giving it about 2.9MB/s/$.  For the HDD, 
I'll use the Toshiba DT01ACA100, costing around $73 and delivering about 
185MB/s, giving it 2.5MB/s/$.

So, with my best HDD, I'll need about 16 of them to deliver the 3GB/s 
figure I want, which will cost me in aggregate $1168.  For my best SSD, 
I'll need only about 6 of them, which will cost me $1068.
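
For anyone who wants to check the arithmetic, here is a minimal Python 
sketch of the sizing above.  It uses only the prices and read rates 
quoted in this post.  Two things are assumptions of the sketch and not 
measurements: 1GB is taken as 1000MB, and read bandwidth is assumed to 
scale linearly with drive count.  The function name size_array and the 
round_up flag are made up for illustration.  Note that "about 16" HDDs 
is 3000/185 = 16.2 rounded to nearest; rounding up to actually meet 
3GB/s takes 17 drives.

```python
import math

TARGET_MBPS = 3000  # 3GB/s target, assuming 1GB = 1000MB

# (name, retail price in USD, sequential read MB/s), as quoted in the post
DRIVES = [
    ("Toshiba DT01ACA100 (HDD)", 73, 185),
    ("Samsung 840 (SSD)", 178, 520),
]


def size_array(price, mbps, target=TARGET_MBPS, round_up=True):
    """Return (drive count, total cost, aggregate MB/s) for one drive model.

    Assumes aggregate bandwidth is simply count * per-drive bandwidth.
    round_up=True guarantees the target is met; round_up=False rounds to
    the nearest whole drive, which is how the "about 16" figure arises.
    """
    ratio = target / mbps
    count = math.ceil(ratio) if round_up else round(ratio)
    return count, count * price, count * mbps


if __name__ == "__main__":
    for name, price, mbps in DRIVES:
        per_dollar = mbps / price
        for round_up in (False, True):
            count, cost, total = size_array(price, mbps, round_up=round_up)
            mode = "round up" if round_up else "nearest"
            print(f"{name}: {per_dollar:.1f} MB/s/$, {mode}: "
                  f"{count} drives, ${cost}, {total} MB/s")
```

With nearest rounding this gives the 16 HDDs at $1168 and 6 SSDs at 
$1068 quoted above; with strict rounding up the HDD side becomes 17 
drives at $1241, which only widens the gap in favor of the SSDs.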

So this discussion about SSDs being pointless for bandwidth should 
(hopefully) be over.  They can be used for bandwidth acceleration, 
particularly (as the OP mentioned) if used on the compute node when a 
weak network link sits between it and the PFS.  In those cases, there is 
rarely space enough to shove all the HDDs Vincent is espousing in there, 
and therefore SSDs are ideal solutions whether you want a cache for 
latency or bandwidth.

If, on the other hand, we are talking about building general filers with 
huge capacity, minimized cost, and sequential workloads, of course HDDs 
rock.  But we aren't/weren't/never have been talking about that.

Best,

ellis