[Beowulf] i7-4770R 128MB L4 cache CPU in compact 0.79 litre box - DIY cluster?
rw at firstpr.com.au
Sun Jan 19 19:08:42 PST 2014
In the Beowulf tradition, here are some ideas about how one might go
about making a cluster with off-the-shelf consumer products, with a few
potential advantages compared to using traditional PC cases and
motherboards: compactness, lower power consumption and ease of running
from a 24 volt lead-acid battery system for uninterruptible and/or
This would probably not be as good for some applications as using a
smaller number of large motherboards with two or four AMD Opteron G34
socket devices with potentially very large amounts of RAM, where each
device has 8, 12 or 16 cores. Where the bottleneck is inter-process
communications, this enables 32 to 64 cores to work in a shared memory
environment or, even if they are not sharing memory, for their processes
to communicate faster than by Ethernet or, I guess, InfiniBand networks.
Also, the following approach is still arguably amateur in that it
doesn't use ECC memory. I am thinking of someone (not yet me) who has
$5k to $10k and wants to build their own cluster. The total cost looks
like not much more than $250 per 3.2GHz 22nm Haswell core, each
supporting two threads.
Intel's i7-4770R is a BGA (soldered to the PCB) Haswell 22nm CPU with 4
cores, Hyper-Threading (so it can run 8 threads at once), 256kB L2 cache
per core (as with all recent Intel CPUs, I think) and 6MB of L3 cache.
Most i7 CPUs have 8MB or more L3 cache.
The base clock frequency is 3.2GHz, with turbo mode to 3.9GHz.
The i7-4770R is intended for compact consumer desktop devices and has a
fancy graphics GPU on chip, which I guess is of little interest for HPC.
It is compact (no socket) and relatively inexpensive (I guess, since it
is for cost-sensitive consumer products):
The price quoted is $392. Maximum memory is 32GB, via two channels of
The most unusual thing about it, which I assume is of interest for HPC,
is that in the same package, there is a 128MB L4 cache chip. There are
6 other mobile and desktop CPUs with this "Crystal Well" L4 cache chip:
Of these, the i7-4770R has the highest base clock frequency. The
i5-4570R (2.7GHz base clock frequency) is used in some iMacs.
The 128MB L4 cache functions as a "victim cache" to the CPU's L3 cache:
It is a DRAM cache, rather than SRAM. I haven't researched it any
further, but I guess this extra cache would help with at least some HPC
applications. These may be photos of this or similar dual chip devices:
The 128MB cache is supported by Linux kernel 3.12 (Nov 2013).
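To make the "victim cache" idea above concrete, here is a toy Python
sketch. It is only an illustration under simplifying assumptions: both
levels are modelled as fully associative LRU sets of line addresses,
the capacities are made-up line counts rather than the real 6MB and
128MB, and the names LRUCache and simulate are my own. The real
hardware is certainly more complicated. The point is just that lines
evicted from L3 drop into L4, and an L3 miss that hits in L4 avoids a
trip to main memory.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny fully associative LRU set of cache-line addresses (toy model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def __contains__(self, addr):
        return addr in self.lines

    def touch(self, addr):
        # Mark addr as most recently used.
        self.lines.move_to_end(addr)

    def remove(self, addr):
        del self.lines[addr]

    def insert(self, addr):
        """Insert addr and return the evicted address, or None."""
        self.lines[addr] = True
        self.lines.move_to_end(addr)
        if len(self.lines) > self.capacity:
            evicted, _ = self.lines.popitem(last=False)
            return evicted
        return None


def simulate(accesses, l3_lines, l4_lines):
    """Count where each access is satisfied: L3, L4 (victim cache) or DRAM."""
    l3, l4 = LRUCache(l3_lines), LRUCache(l4_lines)
    hits = {"L3": 0, "L4": 0, "DRAM": 0}
    for addr in accesses:
        if addr in l3:
            hits["L3"] += 1
            l3.touch(addr)
            continue
        if addr in l4:
            hits["L4"] += 1
            l4.remove(addr)      # the line moves back up into L3
        else:
            hits["DRAM"] += 1
        victim = l3.insert(addr)
        if victim is not None:
            l4.insert(victim)    # L4 is filled only by L3 evictions
    return hits


# A working set twice the size of L3, swept three times.
sweep = list(range(8)) * 3
print("with L4:   ", simulate(sweep, l3_lines=4, l4_lines=8))
print("without L4:", simulate(sweep, l3_lines=4, l4_lines=0))
```

In this toy run the working set is too big for L3 but fits in L3 plus
L4, which is the sort of access pattern where I would expect the extra
cache to help most.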
Another new feature is on-die voltage regulation (FIVR = Fully
Integrated Voltage Regulator) which reduces motherboard complexity:
The next question is how to get a bunch of PCs with this chip, in as
compact a form as possible. There will no doubt be many sources, but at
present, I found this from Gigabyte - the Brix Pro Ultra Compact PC kit:
GB-BXi7-4770R. "Kit" means it needs RAM, hard drive, operating system,
keyboard, trackball and monitor.
The box is 6.2 x 11.2 x 11.5 cm = 0.79 litres. It has gigabit Ethernet
and two SO-DIMM DDR3L slots with a maximum of 16GB. The memory speed is
quoted as 1333/1600 MHz, but the memory compatibility list indicates
that the 8GB G-Skill F3-1866C10D-16GRSL runs at 1866MHz.
The power supply is an external unit which provides 19 volts at 7.1
amps. This means that it would be straightforward to power a bunch of
these devices via switching buck regulators (there are plenty of
adjustable regulators on eBay), one per Brix, from two 12 volt lead-acid
batteries in series. I think this or a conventional UPS is essential
for any HPC work, at least with the frequency of brief power outages
where I live (a north-eastern suburb of Melbourne, Australia).
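A rough power budget for the battery idea can be sketched in Python as
follows. Only the 19 volt, 7.1 amp supply rating and the 24 volt
battery voltage come from the text above. The regulator efficiency,
battery capacity and usable fraction are guesses of mine, not datasheet
figures, and the supply rating is an upper bound rather than a measured
draw, so the result is pessimistic.

```python
PSU_VOLTS = 19.0       # rating of the Brix external supply
PSU_AMPS = 7.1
BATTERY_VOLTS = 24.0   # two 12 V lead-acid batteries in series (nominal)

# Assumptions for illustration only, not taken from any datasheet:
REGULATOR_EFFICIENCY = 0.90   # guess for a switching buck regulator
BATTERY_AMP_HOURS = 100.0     # hypothetical battery bank capacity
USABLE_FRACTION = 0.5         # avoid deep-discharging lead-acid cells


def max_node_watts():
    """Worst-case power per node, taken from the supply rating."""
    return PSU_VOLTS * PSU_AMPS


def battery_amps(nodes, watts_per_node):
    """Current drawn from the 24 V bank, allowing for regulator losses."""
    return nodes * watts_per_node / (REGULATOR_EFFICIENCY * BATTERY_VOLTS)


def runtime_hours(nodes, watts_per_node):
    """Approximate hold-up time, ignoring battery ageing and rate effects."""
    usable_ah = BATTERY_AMP_HOURS * USABLE_FRACTION
    return usable_ah / battery_amps(nodes, watts_per_node)


nodes = 8
watts = max_node_watts()
print(f"per node (worst case): {watts:.1f} W")
print(f"battery current for {nodes} nodes: {battery_amps(nodes, watts):.1f} A")
print(f"hold-up time: {runtime_hours(nodes, watts):.2f} h")
```

Measuring the real draw of one box under load and substituting it for
max_node_watts() would give a much more useful figure.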
The power supply and some of the innards are shown in the pcper.com page
mentioned above. There is a fan on the CPU, but I haven't yet seen a
photo of the CPU heatsink. There are some longevity questions about
fans. Their bearings can wear out unless they are ball bearings - which
are generally more expensive and noisier. Heatsinks and the fins of
centrifugal blowers can clog with dust, but presumably a cluster could
be housed somewhere with minimal dust. Here are some photos of the
centrifugal blowers of other Brix models - not the 4770R model, which is
in a taller box and I guess has a larger heatsink and blower:
The last one shows a Delta Electronics BSB05505HP centrifugal blower,
which can be bought elsewhere. I am not sure if it is a ball bearing
blower, but some or many of the Delta Electronics blowers are.
It seems the heatsink of the 4770R model is thicker and that the PC uses
two widely spaced PCBs with the fan in between them, rather than in the
above photo which shows the fan on the underside, I guess, of two
closely-spaced PCBs. So the fan or blower of the 4770R model may be
totally different to those in the shorter models, which have CPUs which
dissipate less power.
The total cost in the USA, in US dollars, not counting local taxes and
shipping, would be:
~$650 Brix Pro GB-BXi7-4770R
~$260 2 x 8GB RAM @~$130
~$100 120GB SSD drive
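The arithmetic behind these figures can be checked with a few lines of
Python. This is only a sketch using the approximate prices listed above
and the 4 cores per CPU mentioned earlier. The helper name
nodes_for_budget is my own, and it ignores the Ethernet switch, cables,
taxes and shipping.

```python
# Per-node parts list, using the approximate US prices quoted above.
parts_usd = {
    "Brix Pro GB-BXi7-4770R": 650,
    "2 x 8GB RAM": 260,
    "120GB SSD": 100,
}
CORES_PER_NODE = 4   # i7-4770R: 4 cores, 2 threads each

node_cost = sum(parts_usd.values())
cost_per_core = node_cost / CORES_PER_NODE


def nodes_for_budget(budget_usd, per_node=node_cost):
    """Whole nodes affordable, ignoring switch, cables, taxes and shipping."""
    return budget_usd // per_node


print(f"cost per node: ${node_cost}")
print(f"cost per core: ${cost_per_core:.2f}")
for budget in (5_000, 10_000):
    n = nodes_for_budget(budget)
    print(f"${budget}: {n} nodes, {n * CORES_PER_NODE} cores")
```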
All that remains is to add some Ethernet cables and an Ethernet switch
and maybe some fans for the power supply farm. If the machines booted
via the network it may not be necessary to have a hard drive.
Ordinary mass market PC cases, motherboards and i7 CPUs might work out a
little cheaper, but they would not be as compact, would not have the
128MB L4 cache (but would have 8MB L3, rather than this 6MB) and would
probably chew more power.