[Beowulf] Revelations on Roadrunner's Retirement
Joshua mora acosta
joshua_mora at usa.net
Fri Apr 5 09:00:59 PDT 2013
It would be good to know what levels of efficiency the applications
reached wrt FLOP/s and GB/s, and the typical node count for the runs.
Those figures could then be compared against the current PF/s systems.
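As a rough illustration of the comparison I have in mind, here is a minimal Python sketch. It assumes "efficiency" means the sustained rate divided by the theoretical peak, per node, for both compute (FLOP/s) and memory bandwidth (GB/s). The function names, field names and every number are hypothetical placeholders, not measurements from Roadrunner or any other machine.

```python
# Sketch only: all names and numbers below are hypothetical placeholders,
# not measurements from Roadrunner or any other system.

def fraction_of_peak(sustained, peak):
    """Return sustained/peak as a fraction; both must use the same units."""
    if peak <= 0:
        raise ValueError("peak must be positive")
    return sustained / peak


def summarize_run(name, nodes,
                  sustained_gflops_per_node, peak_gflops_per_node,
                  sustained_gbs_per_node, peak_gbs_per_node):
    """Collect per-run figures: FLOP/s and GB/s efficiency plus node count."""
    return {
        "name": name,
        "nodes": nodes,
        "flops_efficiency": fraction_of_peak(
            sustained_gflops_per_node, peak_gflops_per_node),
        "bandwidth_efficiency": fraction_of_peak(
            sustained_gbs_per_node, peak_gbs_per_node),
        # Aggregate sustained rate across all nodes of the run, in TFLOP/s.
        "aggregate_tflops": sustained_gflops_per_node * nodes / 1000.0,
    }


if __name__ == "__main__":
    # Made-up inputs purely to show the shape of the comparison.
    run = summarize_run("app_a", nodes=1000,
                        sustained_gflops_per_node=50.0,
                        peak_gflops_per_node=400.0,
                        sustained_gbs_per_node=10.0,
                        peak_gbs_per_node=25.0)
    print(run)
```

Filling in one such record per application and per system would give a like-for-like table of compute efficiency, bandwidth efficiency and job size.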
------ Original Message ------
Received: 05:49 PM CEST, 04/05/2013
From: Eugen Leitl <eugen at leitl.org>
To: Beowulf at beowulf.org
Subject: [Beowulf] Revelations on Roadrunner's Retirement
> Revelations on Roadrunner's Retirement
> Nicole Hemsoth
> Earlier this week we reported on the decommissioning of the Roadrunner
> supercomputer at Los Alamos National Laboratory, which was being shuttered
> following a stint of fame as the first system to break the petascale barrier
> back in 2008.
> According to Paul Henning from the computational physics division at Los
> Alamos, Roadrunner’s checkout made big news, but the end of the line for the
> super was well-planned, if not right on schedule.
> The system served its purpose chewing a bevy of mostly classified and some
> key civilian code. However, in the end, the combination of a finite
> an extinct chip, the cost of crumpling up code to fit into IBM’s Cell, and
> the promise of swifter, more efficient technologies were main factors in the
> planned clipped lifecycle of the petaflop pioneer.
> “Rather than think of these machines as physical entities, we think of them
> as projects,” he explained. “At the beginning of the Roadrunner project,
> we laid out a project lifetime for this, and that lifetime considered a lot
> of things, including the cost of maintenance, power, vendor and licensing
> contracts, and how we would upgrade the system.”
> Henning detailed that the support contract with IBM was up and since they
> don’t even produce the core of the machine’s architecture, the Cell, the
> question of even scrounging up some spare parts would have presented a
> tricky issue. The retirement party had been planned years ago anyway, but
> there are some meaty learning opportunities to glean from the scrap metal.
> When any system at the lab is shuttered, the autopsy, which looks at
> everything from the integrity of the memory and OS to the more nuts-and-bolts
> physical properties, is performed. A key finding of the post-mortem centers
> around the condition of the boxes after five years of heat, wear and
> tear—it’s here where the materials analysis begins. It’s given the
> materials science team at the center an insider’s view into the real effects
> on systems after high-yield, high-heat production—and from what we read
> between the lines, these boxes are maxed out.
> Then again, there were never any plans to build the system out to new glory
> à la the Jaguar to Titan transformation. Anyway, even if the hardware wasn’t
> on its last, weak leg, considering they’d have to retrofit the entire system
> since IBM would return a 404 on their build-out needs, it makes sense that
> they’d want to rip…and of course, replace.
> Currently, Los Alamos has sent its applications on a redirect course to the
> smaller, slightly more efficient and roughly performance-equivalent Cielo
> system, which is housed in the same space as the now-defunct Roadrunner.
> Henning said the developer-friendly architecture saves time and money on
> retooling, ostensibly while they try to fit something new into their
> And so here is where things get interesting. Because we can speculate on what
> Los Alamos might dream up to fill the 6,000 square foot gap left behind.
> That’s a pretty large spate of empty space for any upstart system to move
> into. Titan’s sprawl is right under 5,000 square feet and a lot of flops
> fit in less than that.
> There are a few hints at what might sit on the charred spot Roadrunner once
> occupied post-ripdown. However, it’s worth noting that a quick perusal of
> NNSA’s procurement plans for the next year include something on the order of
> a $50 million to (yes) one billion dollar project, which is currently
> accepting proposals. And it’s kind of hard to imagine what else would be
> filed under tech procurements to that monetary tune. If any of you know
> anything about this, that comments section down there looks awfully
> empty….(hint, hint).
> All speculation aside, it looks like we’ll find out soon
> later this year—just what will turn off that vacancy sign at the lab.
> Until then, the Roadrunner story serves as a reminder about how quickly the tides
> of this type of tech shift and leave superhero machines drifting into
> forgotten waters.
> When national labs and large HPC sites sit down to spill ink on new system
> designs, they’re hedging their bets on what future technologies will look
> like. It’s rare, unless folks are on a TACC/Stampede-like course to go from
> ground to super in a tick over a year, to know what innovations on the
> architecture, efficiency or acceleration front will yield big
> price-performance dividends. So at the time that Los Alamos set about
> architecting Roadrunner based on the very unique Cell approach, they were
> placing their bets on the future of that technology.
> Since that development cycle, the rise of GPU acceleration, the arrival
> of the promising Phi, and some efficiency tweaks on the software side have
> rendered some of what made Roadrunner shine seem rather dated. It’s now
> possible to get more compute power in a smaller power envelope…and with a lot
> less in the way of programming hassle, as well, notes Henning. However, for
> the NNSA and Los Alamos, whatever the clandestine code was they cooked up for
> the Cell, it must have been worth the effort on the retooling side.
> Although the story of the Roadrunner being forced into retirement found its
> way into a number of mainstream tech media stories over the course of the
> week, this is a pretty standard order of operations for large HPC centers,
> especially national labs. Henning stressed that the shutdown of the
> once-famous system is not unlike the series of other supers they’ve retired
> in succession at the center. They build a plan for acquisition, see a system
> run its course, learn from it post-mortem and shuttle it off in parts to make
> way for something fresh.