[Beowulf] SSD caching for parallel filesystems

Vincent Diepeveen diep at xs4all.nl
Sun Feb 10 05:40:09 PST 2013


On Feb 10, 2013, at 2:09 PM, Ellis H. Wilson III wrote:

> On 02/10/13 04:41, Vincent Diepeveen wrote:
>> SSD's are not about bandwidth, they're about latency.
>
> This is a bit aggressive of a vantage point -- let's tone it back:
> "SSD's aren't always the cheapest way to achieve bandwidth, but they are
> critical for latency-sensitive applications that are too large for main
> memory."
>

SSD's are never the cheapest way to achieve bandwidth and never will be.

> In any event, your original statement used to be wholly correct.  It has
> changed to a certain degree to "SSDs are about IOPs," which isn't quite
> the same thing.  However, more pointedly, with modern HDDs barely
> approaching 200MB/s and SSD solutions approaching 2-4GB/s, this is an
> increasingly limited viewpoint.  We have to start considering their use
> for bandwidth.

Find me an application that needs big bandwidth and doesn't need
massive storage.

So any SSD solution that's *not* used for latency-sensitive workloads
needs thousands of dollars' worth of SSDs.

In that case plain old hard drive technology is unbeatable for storage
and bandwidth. The buy-in price right now is $35 for a 2 TB disk, or
$17.50 a terabyte. (That's the actual price for big shops that buy in
volume; neither you nor I get them for that price, of course.)
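
The per-terabyte figure is simply the quoted price divided by the
capacity. A minimal sketch in Python, assuming only the $35 and 2 TB
numbers quoted above; the function name cost_per_tb is hypothetical and
used here only for illustration:

```python
def cost_per_tb(price_usd: float, capacity_tb: float) -> float:
    """Return the price per terabyte for a single drive."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return price_usd / capacity_tb


# Numbers taken from the text: a 2 TB disk at a $35 volume price.
hdd_usd_per_tb = cost_per_tb(35.0, 2.0)
print(f"${hdd_usd_per_tb:.2f} per TB")  # prints "$17.50 per TB"
```

The same function can be fed any SSD price you have on hand to compare
cost per terabyte directly; no SSD price is assumed here.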

We're talking about a sustained 200MB/s per dirt cheap hard drive
here. Put 16 of them in a RAID array and you get more bandwidth than
you can deliver over the network from the file server, and more than
your motherboard can effectively handle per second.
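
As a rough check of that arithmetic, here is a sketch in Python. It
assumes ideal striping (aggregate bandwidth scales linearly with the
number of drives, which real arrays only approximate). The function
names are hypothetical, and the 1250 MB/s link figure is only an
illustrative assumption (the raw line rate of a 10 Gbit/s link), not a
number from this thread:

```python
import math


def aggregate_mb_s(n_drives: int, per_drive_mb_s: float) -> float:
    """Ideal striped bandwidth: every drive contributes its full rate."""
    return n_drives * per_drive_mb_s


def drives_needed(target_mb_s: float, per_drive_mb_s: float) -> int:
    """Smallest number of drives whose ideal aggregate reaches the target."""
    return math.ceil(target_mb_s / per_drive_mb_s)


def deliverable_mb_s(array_mb_s: float, link_mb_s: float) -> float:
    """What a client sees is capped by the slower of array and link."""
    return min(array_mb_s, link_mb_s)


array = aggregate_mb_s(16, 200)        # 16 drives at 200 MB/s each
print(array)                           # 3200
print(drives_needed(3000, 200))        # 15 drives for 3 GB/s
print(deliverable_mb_s(array, 1250))   # 1250, the assumed link is the cap
```

Under these assumptions the array outruns the link, so adding faster
storage behind the same link would not raise the delivered bandwidth.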

We're talking about a buy-in price of total peanuts for 16 hard drives
here, while the same storage in SSDs costs a total fortune.

So SSDs are just for latency. I would never hire anyone who uses them
for anything else.

>
>> With a raid array of cheapo disks we can also get 3GB/s bandwidth,
>> more than most 2 socket nodes effectively can handle.
>
> 3GB/s divided by 200MB/s gives me something like 15 drives, unless my
> math is wrong, which will be something like $2-$3K, and that's really
> only possible in RAID0, so you're only going to get the capacity of one
> drive.  If all I'm looking for is bandwidth I'd rather spend that 3k on
> an expensive SSD (or RAID a bunch of cheaper SSDs) and get it for far
> less power, wire complexity, space consumption, and risk of failure.
> Moreover, it'll have better latency.  This gap will continue to widen,
> so while we can talk about 15 disks reasonably right now, in a year
> we'll be talking more like 25-30 and then it just becomes absurd.  Just
> buy the SSD(s) at that point.
>
>> Only theoretically a higher bandwidth will be possible (benchmarks huh).
>>
>> However getting 20 bytes from a SSD is in the few dozens of
>> microseconds, versus several milliseconds for the cheapskate disks.
>>
>> That factor of roughly 50-100 in latency is the reason SSD's exist.
>>
>> Any bandwidth test of a SSD is total nonsense.
>
> (I wish you'd put [In my personal opinion] in front of all of your
> sentences.  It would make them less nails on a chalkboard.)
>
> So what happened to "perfectly parallel"?  Seems to me like a perfectly
> parallel device would be well tuned to deliver good bandwidth.
>
> Best,
>
> ellis



