[Beowulf] single machine with 500 GB of RAM

Ellis H. Wilson III ellis at cse.psu.edu
Thu Jan 10 09:08:44 PST 2013


On 01/10/13 00:04, Mark Hahn wrote:
>> procs. Within each process the accesses to their "cube" of data were
>> near to completely random.
>
> "completely random" is a bit like von Neumann's "state of sin" ;)
> if they managed to make actually uniform random accesses, they'd have
> discovered a new PRNG, possibly the most compute-intensive known!

Alright, alright.  Fair enough: not "completely random" or even near to 
it, but damn close enough for a file-system guy, since things were 
thrashing like a cat in a bath.

> my guess is that apps that claim to be random/seeky often have pretty
> non-uniform patterns. they obviously have a working set,

Some really do start "pseudo-random" (hopefully that avoids getting me 
in trouble) and converge slowly, building a more defined working set as 
they go.  Genetic algorithms come to mind here, and some types of ANNs, 
both of which I used in my short couple-year stint working with 
computational chemists.  Towards the end, things pick up nicely because 
effectively all the data stays in cache (short of mutation events in 
the GA case).

> let's eyeball a typical memory latency at 50 ns, a mediocre disk at 10 ms,
> but the real news here is that completely mundane SSD latency is 130 us.
> 200,000x slower is why thrashing is painful - 2600x slower than ram is
> not something you can ignore, but it's not crazy.
>
> it's a curious coincidence that a farm of Gb servers could provide
> random 4k blocks at a latency similar to the SSD (say 150 us).
> of course, ScaleMP is, abstractly, based on this idea (over IB.)
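
Quick sanity check on those ratios (pure arithmetic in Python, just 
plugging in the numbers above):

mem, ssd, disk = 50e-9, 130e-6, 10e-3        # 50 ns RAM, 130 us SSD, 10 ms disk
print("disk vs. RAM: %.0fx" % (disk / mem))  # ~200,000x: why thrashing hurts
print("SSD vs. RAM: %.0fx" % (ssd / mem))    # ~2,600x: painful, but not crazy
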
>
>> IOPS, but is there anywhere a scaling study on them where individual
>> request latencies are measured and CDF'd? That would be really
>
> http://markhahn.ca/ssd.png
> like this? those are random 4k reads, uniformly distributed.

No, we've got a couple of those in the paper I shared with Vincent 
yesterday, but that only looks at the distribution of latencies for a 
single drive (right?).  I was referring to a scaling study looking at 
how a basic SATA drive connected directly to the mobo is impacted when 
you put a RAID card in between, and then progressively add more SSDs 
behind it.  I would expect the latency to (in the best case) stay flat 
as drives are added: it shouldn't drop, since in RAID0 each request is 
still bound by the latency of a single SSD, nor should it rise, if the 
RAID card is decent.  This all hinges on the efficacy of the RAID card, 
of course, and my hunch is that most of these cards aren't built for 
O(us) response times, but rather for the O(ms) response times of the 
HDDs they have been serving for ages.
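
Something like this rough sketch is the measurement I have in mind: 
time random, uniformly distributed 4k direct reads and CDF the 
latencies, rerunning at each step (bare SATA port, card with one drive, 
RAID0 with two, four, ...).  Linux-only, needs root, and untested; 
/dev/sdb and the sample count are placeholders:

import mmap, os, random, time

DEV, BLOCK, SAMPLES = "/dev/sdb", 4096, 100000   # hypothetical device

fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)     # bypass the page cache
size = os.lseek(fd, 0, os.SEEK_END)              # device size in bytes
buf = mmap.mmap(-1, BLOCK)                       # page-aligned, as O_DIRECT requires

lats = []
for _ in range(SAMPLES):
    off = random.randrange(size // BLOCK) * BLOCK
    t0 = time.perf_counter()
    os.preadv(fd, [buf], off)                    # one random 4k read
    lats.append(time.perf_counter() - t0)
os.close(fd)

lats.sort()                                      # empirical CDF; print a few percentiles
for p in (0.5, 0.9, 0.99, 0.999):
    print("p%g: %.0f us" % (p * 100, lats[int(p * (SAMPLES - 1))] * 1e6))

Overlay the resulting CDFs from each configuration and any latency the 
card adds (or fails to hide) should jump right out.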

If I stumble on a pile of SSDs and a nice RAID card, I'll report back :D.

Best,

ellis


