[Beowulf] Software RAID?

Greg Lindahl lindahl at pbm.com
Tue Nov 27 22:10:00 PST 2007


On Tue, Nov 27, 2007 at 02:22:03AM -0800, Bill Broadley wrote:

> Hrm, why?  Do context switches not scale with core speed?  Or number
> of cores?  Can't interrupts be spread across CPUs?

No, no, no, and kinda. Caches and main memory access cause problems.

There's a reason why high speed networks eschew interrupts. Haven't
you ever noticed that your InfiniPath cards rarely show any
interrupts? OK, 2 to bring the link up at boot, but MPI doesn't
generate any.
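
For anyone who wants to check this on their own nodes, here is a
minimal sketch that totals per-CPU interrupt counts from the text of
/proc/interrupts on Linux. The function name parse_interrupts and the
SAMPLE text are made up for illustration (the sample is not a capture
from a real machine, and "fast_nic" is a hypothetical device label).

```python
def parse_interrupts(text):
    """Return {irq_label: [count_cpu0, count_cpu1, ...]} from /proc/interrupts text."""
    lines = text.strip("\n").splitlines()
    ncpu = len(lines[0].split())  # header row looks like: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not an "IRQ: counts..." row
        counts = []
        # The first ncpu fields are per-CPU counts; summary rows such as
        # ERR may carry fewer, so stop at the first non-numeric field.
        for field in rest.split()[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        table[label.strip()] = counts
    return table


# Invented sample in the usual /proc/interrupts layout (assumption, not real data).
SAMPLE = """\
           CPU0       CPU1
  0:        125          0   IO-APIC-edge      timer
 19:      90211      88734   IO-APIC-fasteoi   disk_ctrl
 58:          2          0   PCI-MSI-edge      fast_nic
ERR:          0
"""

if __name__ == "__main__":
    for irq, counts in parse_interrupts(SAMPLE).items():
        print(irq, sum(counts), counts)
```

On a live system the same function could be fed
open("/proc/interrupts").read(); comparing two readings taken a few
seconds apart shows which devices are actually interrupting.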

In the disk case, every context switch is an opportunity to dirty the
wrong cache. Accessing data from another CPU's cache is slower than
going to main memory, and on NUMA machines like the Opteron, remote
main memory is slower than local main memory.

And round-robining your interrupts, well, that's a recipe for
disaster.
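
As an illustration of the alternative to round-robin, Linux lets you
pin an IRQ to chosen CPUs by writing a hex CPU bitmask to
/proc/irq/<irq>/smp_affinity. The sketch below builds that mask; the
helper names affinity_mask and pin_irq are hypothetical, pinning is
offered here as an assumption rather than a recommendation from this
thread, and a running irqbalance daemon may overwrite the setting.

```python
def affinity_mask(cpus):
    """Build the hex bitmask string used by /proc/irq/<irq>/smp_affinity.

    Bit n set means CPU n may service the interrupt. Masks wider than
    32 bits are written as comma-separated 32-bit groups, highest first.
    """
    cpus = list(cpus)
    if not cpus:
        raise ValueError("need at least one CPU")
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu
    groups = []
    while mask:
        groups.append(format(mask & 0xFFFFFFFF, "08x"))
        mask >>= 32
    return ",".join(reversed(groups))


def pin_irq(irq, cpus):
    """Write the mask for one IRQ. Needs root; not every IRQ can be moved."""
    with open(f"/proc/irq/{irq}/smp_affinity", "w") as f:
        f.write(affinity_mask(cpus))


if __name__ == "__main__":
    print(affinity_mask([2]))      # one CPU only
    print(affinity_mask([0, 1]))   # two CPUs, e.g. cores sharing a socket
```

Keeping a disk controller's interrupt on a CPU in the same NUMA node as
the process doing the I/O is the kind of placement this would be used
for; which CPUs those are depends on the machine.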

> If you
> need to install a 48 disk server at the top of a 48U rack I am definitely
> busy ;-).

They make lifts for this situation ;-) (plus you have a steady supply
of undergraduates!)

-- greg



