[Beowulf] SC13 wrapup, please post your own
stewart at serissa.com
Sat Nov 23 17:32:54 PST 2013
Here are my impressions from the exhibits, in no particular order.
Carbon Nanotube Computer
In the Emerging Technologies area, there was a Stanford graduate student with a MIPS-compatible microprocessor made out of carbon nanotube transistors. He said this was feasible because, other than the nanotubes, it used standard semiconductor processes.
Thomas Sohmers of REX Computing was in the Emerging Technologies area with a working Adapteva Epiphany board.
Hybrid Memory Cube
Last year, Micron was talking about their Hybrid Memory Cubes: stacked DRAM dice on top of a controller that exports high-speed SERDES links instead of DDRx-style interfaces. The things have 120 GB/s bandwidth to a single cube. This year they seem to be real. Pico Computing had a PCI Express module with four big Altera FPGAs tied to the four channels of an HMC. They are still pretty low density, 2 or 4 GB per HMC, but that will change. Hynix, Samsung, Xilinx, Altera, and ARM are in the consortium.
These guys are brilliant or crazy. Only time will tell. They have a 256 core chip, with 16 clusters of 16 cores each. They have floating point. There is on-chip interconnect. Each cluster has 2 MB of RAM but all the clusters are distributed memory rather than shared. There are two DDR3 controllers.
This is sort of like a 256 core version of the 36-core Tilera, but without shared memory.
nCore has built a machine. Each node is 4 ARM A15 cores and 24 C66x DSPs. Up to 144 nodes per rack, connected by serial RapidIO. 3.7 TB RAM. 69 SP TFlops or 18 DP TFlops. Linux, OpenMPI, OpenMP.
DEEP and Extoll
The European Commission is working on a technology demonstrator. The interesting part is the "Booster Node" which has two Xeon Phi "Dense Form Factor" modules connected by Extoll NIC/Switch chips. I had not known about the DFF modules, which Eurotech assembled using water cooling. The other new thing, to me at least, was Extoll. These guys have a combined NIC/6 port switch, and the NIC can pretend to be a PCIe root complex. That means you can send I/O commands from some master node somewhere and boot the Xeon Phi nodes remotely without having a local host node.
I think Extoll is interesting because some of the ex-SiCortex guys tried to start a company to do the same thing in late 2009 but we couldn't get funding. And here it is!
This is another European project, with low-power ARM+GPU nodes based on the Samsung Exynos 5 SOC. It has two ARM A15 cores and a Mali-T604 GPU. Gigabit Ethernet.
Nallatech is ostensibly an FPGA accelerator company, but they had a strange brick thing on their table. It was about 4 inches square and two inches thick, made of 32 thin slices in two ranks, with water cooling. Each slice contained an Altera Arria FPGA and 8 GB DRAM. All these modules were wired up to several thousand pins coming out the bottom of the module. It was made for a special customer. So if you want processor-in-memory, and you have money, I guess you can get it now.
This isn't new this year, but I hadn't seen one in person. This is the HP 4U (5U?) box with 45ish cartridge slots. New this year are ARM and FPGA cartridges along with the Atoms. Storage cartridges are coming. The HP guy didn't really want to talk to me with my "Quanta" badge :)
There was a lot of water cooling equipment on the floor. I liked the Staubli booth for sheer mechanical pron. They make the drip-free connectors.
OK. I just haven't been paying attention, I think. SGI is using IB in their ICE X machines, connected in hypercubes. Hypercubes seemed like a good idea in 1991, with the Thinking Machines, but now?
The problem is that you get a maximum hop count of log-base-2(nodes), with half that as the average. A fat tree gives you 2*log-base-switch-degree(nodes), which probably comes out around the same for 1000 nodes, but with simpler wiring. On the hypercube, routing only works as dimension-order routing, which doesn't need the virtual channels that IB doesn't let you use anyway.
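For anyone who wants to plug in their own numbers, here is a rough sketch of the hop-count comparison above, plus what dimension-order routing on a hypercube looks like. The function names are made up for illustration, the switch degree of 36 in the usage lines is my assumption (the post does not name one), and the fat tree figure is just the 2*log formula from the text, not a model of any real fabric.

```python
import math

def hypercube_hops(nodes):
    """Return (dimensions, max_hops, avg_hops) for the smallest
    hypercube that can hold `nodes` endpoints."""
    dims = math.ceil(math.log2(nodes))
    # Worst case flips every address bit; a random pair differs in
    # about half of them.
    return dims, dims, dims / 2

def fat_tree_hops(nodes, switch_degree):
    """Estimate from the text: 2 * log-base-switch-degree(nodes)."""
    return 2 * math.log(nodes) / math.log(switch_degree)

def dimension_order_route(src, dst):
    """Nodes visited from src to dst on a hypercube, correcting the
    differing address bits one dimension at a time, lowest bit first."""
    path = [src]
    node = src
    diff = src ^ dst          # bits set where the addresses differ
    bit = 0
    while diff:
        if diff & 1:
            node ^= 1 << bit  # cross the link for this dimension
            path.append(node)
        diff >>= 1
        bit += 1
    return path

if __name__ == "__main__":
    print(hypercube_hops(1000))
    print(fat_tree_hops(1000, 36))   # 36 is an assumed switch degree
    print(dimension_order_route(0b0000, 0b1011))
```

Because every packet crosses dimensions in the same fixed order, no cycle of link dependencies can form, which is why this scheme gets by without virtual channels.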
I know they've been at the show for several years, but this is the first time I stopped in to talk, because I am working on FPGAs now. Their specialty seems to be tight coupling of PCIe-space FPGAs with application programs by direct memory mapping. This lets you hand off rather small work items to the hardware, without having to deal with the OS kernel. Right now this requires a custom kernel, but they say next summer it will work with standard kernels plus modules.
I like the free buses on the 16th Street Mall. Maybe not all-free downtown like Portland, but nice.
A pollster gave me a $5 Starbucks card for telling her how awful I thought the Intel booth was. Score!
I thought the Swiss and the Saudis had pretty good coffee. (I didn't try the coffee at SI though!)
The Lego Turing machine was nice.
The Mythbusters exhibit at the Denver Museum of Nature and Science was good. I hope it comes to Boston.
I did not get an LSU scarf, which I really want. Ah well.