[Beowulf] 3.79 TFlops sp, 0.95 TFlops dp, 264 TByte/s, 3 GByte, 198 W @ 500 EUR
prentice at ias.edu
Thu Dec 22 11:49:15 PST 2011
If you or anyone else on this list is interested in learning more about the
Anton architecture, there are a bunch of links here:
A couple of them give good descriptions of the Anton architecture.
I read most of the computer-related ones over the summer. Yes, that's
my idea of light summer reading!
On 12/22/2011 12:33 PM, Lux, Jim (337C) wrote:
> That's an interesting approach of combining ASICs with FPGAs. ASICs will
> blow the doors off anything else in a FLOPS/Joule, FLOPS/kg, or
> FLOPS/dollar contest, for the tasks the ASIC is designed for. FPGAs to handle
> the routing/sequencing/variable parts of the problem and ASICs to do the
> crunching is a great idea. Sort of the same idea as including DSP or
> PowerPC cores on a Xilinx FPGA, at a more macro scale.
> (and of interest in the HPC world, since early 2nd generation Hypercubes
> from Intel used Xilinx FPGAs as their routing fabric)
> The challenge with this kind of hardware design is PWB design. Sure, you
> have 1100+ pins coming out of that FPGA. Now you have to route them
> somewhere, and do it in a manufacturable board: I've worked recently with
> a board that had 22 layers, and we were at the ragged edge of tolerances
> with the close pitch column grid array parts we had to use.
> I would expect the clever folks at DE Shaw did an integrated design with
> their ASIC: make the ASIC pinouts line up with the FPGAs, and so
> make the routing problem simpler.
> On 12/22/11 8:53 AM, "Prentice Bisbal" <prentice at ias.edu> wrote:
>> Just for the record - I'm only the messenger. I noticed a
>> not-insignificant number of booths touting FPGAs at SC11 this year, so I
>> reported on it. I also mentioned other forms of accelerators, like GPUs
>> and Intel's MIC architecture.
>> The Anton computer architecture isn't just an FPGA - it also has
>> custom-designed processors (ASICs). The ASICs handle the parts of the
>> molecular dynamics (MD) algorithms that are well-understood, and
>> unlikely to change, and the FPGAs handle the parts of the algorithms
>> that may change or might have room for further optimization.
>> As far as I know, only 8 or 9 Antons have been built. One is at the
>> Pittsburgh Supercomputing Center (PSC), the rest are for internal use at
>> DE Shaw. A single Anton consists of 512 cores, and takes up 6 or 8
>> racks. Despite its small size, it's orders of magnitude faster at
>> doing MD calculations than even supercomputers like Jaguar and
>> Roadrunner with hundreds of thousands of processors. So overall, Anton
>> is several orders of magnitude faster than a supercomputer based on
>> general-purpose processors. And I'm sure it uses a LOT less power. I don't
>> think the Antons are clustered together, so I'm pretty sure the
>> published performance on MD simulations is for a single Anton with 512
>> cores.
>> Keep in mind that Anton was designed to do only 1 thing: MD, so it
>> probably can't even run LINPACK, and if it did, I'm sure its score
>> would be awful. Also, the designers cut corners where they knew they
>> safely could, like using fixed-precision (or is it fixed-point?) math,
>> so the hardware design is only half the story in this example.
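On the fixed-point question above, a minimal Python sketch of one reason fixed-point accumulation can be attractive in special-purpose hardware: integer addition is associative, so a sum gives bit-identical results in any order, while floating-point addition does not. The format here (`FRAC_BITS`, `to_fixed`, `fixed_sum`) is an assumption chosen for illustration, not Anton's actual number format.

```python
# Illustrative only: FRAC_BITS and the helper names are hypothetical,
# not taken from any Anton documentation.
FRAC_BITS = 24            # assumed number of fractional bits
SCALE = 1 << FRAC_BITS    # 2**24 steps per unit


def to_fixed(x: float) -> int:
    """Quantize a float to a signed fixed-point integer."""
    return round(x * SCALE)


def from_fixed(q: int) -> float:
    """Convert a fixed-point integer back to a float."""
    return q / SCALE


def fixed_sum(values):
    """Accumulate in fixed point; integer adds are order-independent."""
    total = 0
    for v in values:
        total += to_fixed(v)
    return total


def float_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


a = [0.1, 0.2, 0.3]
b = list(reversed(a))

# Floating point: the two orderings round differently.
print(float_sum(a) == float_sum(b))
# Fixed point: the two orderings agree bit for bit.
print(fixed_sum(a) == fixed_sum(b))
```

The price is a fixed range and resolution, which is only acceptable when the magnitudes in the problem are known in advance, as they tend to be in a single-purpose machine.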
>> On 12/22/2011 11:27 AM, Lux, Jim (337C) wrote:
>>> The problem with FPGAs (and I use a fair number of them) is that you're
>>> never going to get the same picojoules/bit transition kind of power
>>> consumption that you do with a purpose designed processor. The extra
>>> logic needed to get it "reconfigurable", and the physical junction sizes
>>> as well, make it so.
>>> What you will find is that on certain kinds of problems, you can implement
>>> a more efficient algorithm in FPGA than you can in a conventional
>>> processor or GPU. So, for that class of problem, the FPGA is a winner
>>> (things lending themselves to fixed point systolic array type processes
>>> are a good candidate).
>>> Bear in mind also that while an FPGA may have, say, 10-million gate
>>> equivalent, any given practical design is going to use a small fraction of
>>> those gates. Fortunately, most of those unused gates aren't toggling, so
>>> they don't consume clock related power, but they do consume leakage
>>> current, so the whole clock rate vs core voltage trade winds up a bit
>>> different for FPGAs.
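The clock-versus-leakage trade described above can be sketched with the usual first-order CMOS power model, P = a*C*V^2*f + V*I_leak. This is a rough illustration only: every constant below is a made-up placeholder, and the assumption that only the used fraction of gates toggles while all gates leak is a simplification.

```python
# Hypothetical numbers for illustration; none come from a real FPGA datasheet.
ACTIVITY = 0.1        # assumed fraction of used gates toggling per cycle
CAPACITANCE = 50e-9   # assumed total switched capacitance of all gates (F)
VDD = 1.0             # assumed core voltage (V)
FREQ = 200e6          # assumed clock frequency (Hz)
I_LEAK = 0.5          # assumed leakage current of the whole die (A)


def dynamic_power(used_fraction, vdd=VDD, freq=FREQ):
    """Clock-related power: only the used gates are assumed to toggle."""
    return used_fraction * ACTIVITY * CAPACITANCE * vdd ** 2 * freq


def leakage_power(vdd=VDD, i_leak=I_LEAK):
    """Static power: every gate leaks, used or not."""
    return vdd * i_leak


def total_power(used_fraction, vdd=VDD, freq=FREQ):
    return dynamic_power(used_fraction, vdd, freq) + leakage_power(vdd)


def leakage_share(used_fraction):
    """Fraction of total power that is leakage."""
    return leakage_power() / total_power(used_fraction)


for used in (1.0, 0.5, 0.2):
    print(f"used={used:.0%}  total={total_power(used):.2f} W  "
          f"leakage share={leakage_share(used):.0%}")
```

In this toy model, the smaller the used fraction of the device, the more the fixed leakage term dominates, so lowering the clock rate saves proportionally less than it would on a fully used chip.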
>>> The biggest problem with FPGAs is that they are difficult to write high
>>> performance software for. With FORTRAN on conventional and vectorized
>>> pipelined processors, we've got 50 years of compiler writing expertise,
>>> and real high performance libraries. And, literally millions of people
>>> who know how to code in FORTRAN or C or something, so if you're looking
>>> for the highest performance coders, even at the 4 sigma level, you've got
>>> a fair number to choose from. For numerical computation in FPGAs, not
>>> many. I'd guess that a large fraction of FPGA developers are doing one of
>>> two things: 1) digital signal processing, flow through kinds of stuff
>>> (error correcting codes, compression/decompression, crypto); 2) bus
>>> interface and data handling (PCI bus, disk drive controls, etc.).
>>> Interestingly, even with the relative scarcity of FPGA developers versus
>>> conventional CPU software, the average salaries aren't that far apart.
>>> The distribution on "generic coders" is wider (particularly on the low
>>> end: barriers to entry are lower for C, Java, whathaveyou code monkeys),
>>> but there are very, very few people making more than, say, 150-200k/yr
>>> doing either. (except in a few anomalous industries, where compensation
>>> is higher than normal in general). (also leaving out "equity
>>> participation" type deals)
>>> On 12/22/11 7:42 AM, "Prentice Bisbal" <prentice at ias.edu> wrote:
>>>> On 12/22/2011 09:57 AM, Eugen Leitl wrote:
>>>>> On Thu, Dec 22, 2011 at 09:43:55AM -0500, Prentice Bisbal wrote:
>>>>>> Or if your German is rusty:
>>>>> Wonder what kind of response will be forthcoming from nVidia,
>>>>> given developments like
>>>>> It does seem that x86 is dead, despite good Bulldozer performance
>>>>> in Interlagos
>>>>> (engage dekrautizer of your choice).
>>>> At SC11, it was clear that everyone was looking for ways around the
>>>> power wall. I saw 5 or 6 different booths touting the use of FPGAs for
>>>> improved performance/efficiency. I don't remember there being a single
>>>> FPGA booth in the past. Whether the accelerator is GPU, FPGA, GRAPE,
>>>> Intel MIC, or something else, I think it's clear that the future of
>>>> architecture is going to change radically in the next couple years,
>>>> unless some major breakthrough occurs for commodity processors.
>>>> I think DE Shaw Research's Anton computer, which uses FPGAs and custom
>>>> processors, is an excellent example of what the future of HPC might
>>>> look like.
>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit