[Beowulf] fast array of wimpy nodes

Fri Apr 17 02:21:20 PDT 2009

http://www.technologyreview.com/printer_friendly_article.aspx?id=22504&channel=computing&section= 

Netbook Chips Create a Low-Power Cloud

A "fast array of wimpy nodes" could replace behemoth server infrastructure.

By Christopher Mims

Using a cluster of the same processors that normally show up in netbooks and
similar mobile devices, researchers have created a powerful server
architecture that draws less power than a lightbulb.

The architecture, dubbed a "fast array of wimpy nodes," or FAWN, offers a way
to decrease by an order of magnitude the amount of power used by the
computational infrastructure of Internet giants like Google, Microsoft,
Amazon, eBay, Facebook, and others. If the predictions of its inventors are
borne out, it could have a significant impact on both the bottom line and the
environmental impact of cloud computing.

Power now accounts for up to 50 percent of the cost of operating data
centers, and in the United States, its cost per kilowatt-hour is increasing.
Even relative newcomers like Facebook use up to $1 million a month in
electricity, and the Environmental Protection Agency (EPA) projects that by
2011, data centers in the United States could use up to 100 billion
kilowatt-hours of electricity, for a total annual cost of $7.4 billion, with
an estimated emissions impact of 59 million metric tons of CO2.
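
As a rough check on the EPA figures above, the implied average electricity price and emissions per kilowatt-hour can be worked out directly. This is only back-of-the-envelope arithmetic on the numbers quoted in the article; the variable names are illustrative.

```python
# Figures quoted in the article (EPA projection for 2011).
annual_energy_kwh = 100e9   # up to 100 billion kilowatt-hours
annual_cost_usd = 7.4e9     # $7.4 billion per year
emissions_tonnes = 59e6     # 59 million metric tons of CO2

# Implied averages; these are derived here, not stated by the EPA or the article.
price_per_kwh = annual_cost_usd / annual_energy_kwh
kg_co2_per_kwh = emissions_tonnes * 1000 / annual_energy_kwh

print(f"implied price: ${price_per_kwh:.3f} per kWh")
print(f"implied emissions: {kg_co2_per_kwh:.2f} kg CO2 per kWh")
```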

FAWN, which is described in an as-yet-unpublished paper by David Andersen and
his team at Carnegie Mellon University, tackles this problem with a
combination of relatively slow processors (the kind used in netbooks and
other mobile devices) and flash memory (the kind that stores data in digital
cameras and USB drives). The somewhat counterintuitive result is an
architecture whose performance per watt is a hundred times better
than that of traditional servers, which use faster (but much more
energy-hungry) processors and disk-based storage.

The exceptional performance of FAWN is limited to certain kinds of
problems--random access of small bits of information--but this kind of
input/output-intensive task is exactly what strains the existing
infrastructure of Web companies like Facebook.

"When you go to Facebook.com, the home page has hundreds of individual data
elements on it, which get translated into hundreds of internal lookups," says
Andersen. Requests for those hundreds of elements, which include friends'
updates, the number of messages in an inbox, and more, are handed off to a
specialized piece of software, called memcached, that stores relevant data in
RAM. Memcached prevents Facebook's disk-based databases from being
overwhelmed by a fire hose of millions of simultaneous requests for small
chunks of information. Amazon, which has more or less the same problem as
Facebook with its shopping cart and custom recommendations, uses a similar
piece of custom-built software, called Dynamo, to perform nearly the same
function.
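
The caching pattern described above can be sketched in a few lines. This is a minimal illustration, not memcached's or Dynamo's actual interface: the class names SlowDatabase and LookAsideCache, the keys, and the in-process dict are all assumptions made for the example, whereas real memcached is a separate network service.

```python
class SlowDatabase:
    """Hypothetical stand-in for a disk-based database."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.reads = 0  # counts how often the slow path is taken

    def get(self, key):
        self.reads += 1
        return self._rows.get(key)


class LookAsideCache:
    """Keeps previously requested values in an in-memory dict."""

    def __init__(self, backend):
        self._backend = backend
        self._store = {}

    def get(self, key):
        if key in self._store:           # cache hit: served from RAM
            return self._store[key]
        value = self._backend.get(key)   # cache miss: fall through to the database
        if value is not None:
            self._store[key] = value     # remember it for later requests
        return value


db = SlowDatabase({"inbox_count:42": 3, "status:42": "at lunch"})
cache = LookAsideCache(db)

# A thousand page loads, each needing the same two small data elements.
for _ in range(1000):
    cache.get("inbox_count:42")
    cache.get("status:42")

print(db.reads)  # only the first request for each key reached the database
```

The point of the sketch is the ratio: 2,000 lookups arrive, but the database sees only the first miss for each key.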

One way that FAWN replaces software like memcached and Dynamo is by
conquering what computer scientists call the memory wall, which is the huge
disparity between the rate at which disk-based storage can feed data to a CPU
and the rate at which a CPU, which is much faster, can chew through that
data. (Andersen points out that modern CPUs use an enormous number of
transistors trying to guess what data to expect, fetching data in advance or
caching it in memory to make sure that the chip always has a steady supply of
bits to process.)

There are two ways to get around the memory wall: the first is to increase
the performance of a system's memory, and the second is simply to slow down
its CPU. FAWN does both: flash memory has much faster random access than
disk-based storage, and FAWN's slower processors require less power and waste
fewer transistors trying to guess what's coming next.

FAWN is composed of many individual nodes, each with a single 500-megahertz
AMD Geode processor (the same chip used in the first One Laptop Per Child
$100 laptop) with 256 megabytes of RAM and a single four-gigabyte compact
flash card. The largest FAWN cluster built to date, consisting of 21 nodes,
draws a maximum of 85 watts under real-world conditions.

Each FAWN node performs 364 queries per second per watt, which is a hundred
times better than can be accomplished by a traditional disk-based system
working on an input/output-intensive task, such as gathering all the
disparate bits of information required to display a Facebook or FriendFeed
page or a Google search result.
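
The figures in the last two paragraphs can be combined into a rough estimate. The sketch below assumes the 364 queries-per-second-per-watt figure holds across the whole 21-node cluster at its 85-watt peak, which the article does not state; treat the cluster throughput as an illustration rather than a measured result.

```python
# Figures quoted in the article.
nodes = 21
cluster_watts = 85.0               # maximum draw of the 21-node cluster
queries_per_sec_per_watt = 364.0   # equivalently, queries per joule
advantage = 100.0                  # "a hundred times better"

# Average power per node, including whatever overhead the 85 W covers.
watts_per_node = cluster_watts / nodes

# Assumption: the efficiency figure applies to the cluster at peak draw.
cluster_qps = queries_per_sec_per_watt * cluster_watts

# Implied efficiency of the disk-based system being compared against.
disk_queries_per_sec_per_watt = queries_per_sec_per_watt / advantage

print(f"{watts_per_node:.2f} W per node")
print(f"{cluster_qps:.0f} queries per second for the cluster (estimate)")
print(f"{disk_queries_per_sec_per_watt:.2f} queries per second per watt for disk")
```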

This kind of performance may have applications beyond the data center, says
Steven Swanson, an assistant professor in the department of computer science
and engineering at the University of California, San Diego. Swanson's own
high-performance, flash-memory-based server, called Gordon, which currently
exists only as a simulation, is similar to FAWN in its architecture but was
designed with scientific applications as well as data centers in mind.

Swanson's goal is to exploit the unique qualities of flash memory to handle
problems that are currently impossible to address with anything other than
the most powerful and expensive supercomputers on earth--systems with up to a
petabyte of RAM. "We work with the San Diego Supercomputing Center on large
genomics and bioinformatics patterns," says Swanson. "We want to do queries
very quickly, and if the data graphs won't fit in RAM, they get very slow,
which means you have to give up fidelity in the simulation."

FAWN is "the right direction to push," says Niraj Tolia, a researcher in the
Exascale Computing Lab at HP Labs. "The days are gone when we simply looked
at raw performance as a metric," he adds.

Currently, FAWN is not suitable for CPU-intensive tasks such as processing
video, but Andersen says that future iterations will use the more powerful
Atom processors (which Swanson is also contemplating for his Gordon system).
Having been designed for netbooks, these more powerful processors draw the
same amount of power as the AMD chips--about four watts each. Throw in a
power supply and some networking equipment, and "you could very easily run a
small website on one of these servers, and it would draw 10 watts," says
Andersen--a tenth of what a typical Web server draws.
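
To put Andersen's 10-watt figure in yearly terms, the following sketch converts continuous power draw into annual energy. The 100-watt figure for a typical Web server is only what "a tenth" implies, and the helper name annual_kwh is made up for this example.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annual_kwh(watts):
    """Energy used in a year by a machine drawing `watts` continuously."""
    return watts * HOURS_PER_YEAR / 1000.0


wimpy_watts = 10.0                # Andersen's figure for a small FAWN-style server
typical_watts = wimpy_watts * 10  # implied by "a tenth of what a typical Web server draws"

print(f"wimpy server:   {annual_kwh(wimpy_watts):.1f} kWh per year")
print(f"typical server: {annual_kwh(typical_watts):.1f} kWh per year")
```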

The next generation of FAWN is something that Andersen hopes the largest
users of data centers will investigate. "I would love it if we could get
Facebook or Google or Microsoft to start building clusters with this," he
says.