[Beowulf] Ethernet connected drives
Ellis H. Wilson III
ellis at cse.psu.edu
Thu May 8 10:30:04 PDT 2014
On 05/08/2014 10:29 AM, John Hearns wrote:
> Very interesting idea.
And an old-ish one (circa early 2000s, maybe even late '90s) too. See
research referencing "Active Disk," "Smart Drive," et al.
These are just "simpler" versions of those, at least on the surface.
And I'm entirely unsurprised Seagate and others are going this way --
with shingled disks your HDD basically starts needing to do a lot of the
stuff SSDs already cope with, since rewrites become prohibitively
costly. Since you'll have to push more logic (i.e., more cpu) onto the
HDD to cope with that, you might as well try to squeeze more
vendor-specific gotchas in while you're at it (see the history of
highly-proprietary SSD firmware development/acquisitions).
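
To make the rewrite cost concrete, here is a toy sketch under a deliberately simplified assumption: tracks are grouped into bands, each track partially overlaps the next, and updating one track in place forces every later track in the same band to be rewritten. The function name and the band size are illustrative only, not taken from any drive's firmware.

```python
def tracks_rewritten(band_tracks, updated_track):
    """Tracks that must be rewritten to update one track in a shingled band.

    Simplified model (an assumption, not a vendor spec): writing track k
    clobbers the overlapped part of track k+1, so tracks k..end of the
    band all have to be written again.
    """
    if not 0 <= updated_track < band_tracks:
        raise ValueError("updated_track must lie inside the band")
    return band_tracks - updated_track


# Hypothetical 20-track band: compare updating the first and last track.
first = tracks_rewritten(20, 0)
last = tracks_rewritten(20, 19)
print(first, last)
```

In this model the cost of a small update depends on where it lands in the band, which is the kind of bookkeeping the drive's own logic ends up hiding from the host.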
HGST is just behind Seagate in getting the ball rolling on the
mechanisms to make shingled (or HAMR, whatever) work, so I see this move
as partly a crowd-sourcing approach to getting these mechanisms coded out.
> Forget building compute clusters - soon we will be building Beowulfs
> with disk drives!
Color me dubious. I highly doubt there will be any entire clusters of
just HDDs anytime soon. The cpu/ram you can fit on them will be far
less than a full machine's, even if you consider 16 of them or so.
What these will be good at (and what the Active Disk research espouses)
is applying very simple filters (e.g., simple greps) to avoid pushing
data across buses needlessly. Basically, only the data you actually
want to compute on gets returned. The compute- and ram-intensive tasks
will still be reserved for the real machine.
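
A minimal sketch of that filtering idea, as a toy model only: `SmartDrive`, `read_all`, and `read_filtered` are hypothetical names made up for illustration, not any vendor's interface, and the "bus" is just a byte count.

```python
class SmartDrive:
    """Toy stand-in for a drive with a little on-board compute."""

    def __init__(self, records):
        self.records = list(records)  # data resident on the drive

    def read_all(self):
        # Conventional path: every record crosses the bus to the host.
        return list(self.records)

    def read_filtered(self, predicate):
        # Active-disk path: the drive applies the filter itself,
        # so only matching records cross the bus.
        return [r for r in self.records if predicate(r)]


def bytes_moved(records):
    """Total payload size of the records handed to the host."""
    return sum(len(r) for r in records)


def wanted(record):
    # A simple grep-like filter: keep only error lines.
    return record.startswith(b"error")


drive = SmartDrive([b"ok"] * 100 + [b"error: disk full", b"error: timeout"])

host_side = [r for r in drive.read_all() if wanted(r)]   # filter on the host
drive_side = drive.read_filtered(wanted)                 # filter on the drive

print(bytes_moved(drive.read_all()), bytes_moved(drive_side))
```

Both paths hand the host the same matching records; the only difference is how many bytes had to leave the drive to get there.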
Decent paper on one approach to this using SSDs (since they already have
compute and memory on-board to deal with all their firmware complexities).
Department of Computer Science and Engineering
The Pennsylvania State University