[Beowulf] Joe Blaylock's notes on running a MacOS cluster, Nov. 2007

Tue Nov 20 11:19:07 PST 2007

This is not strictly about Beowulfs, but it is probably of interest to
their users.

My friend Joe's team from Indiana University just fielded a MacOS
cluster for the Supercomputing '07 Cluster Challenge.  His experience
wasn't that great; I encouraged him to jot down some quick notes so that
other people could benefit from his hard-won lessons.

There's more information about the challenge at
http://sc07.supercomp.org/?pg=clusterchlng.html&pp=conference.html.

----- Forwarded message from Kragen Javier Sitaker <kragen at pobox.com> -----

From: Kragen Javier Sitaker <kragen at pobox.com>
To: kragen-fw
Subject: Joe Blaylock's notes on running a MacOS cluster, Nov. 2007

  Disordered thoughts on using MacOS X for HPC.

By Joe Blaylock, 2007-11.

    Recollections:

    * We were the first people to ever try that particular combination:
      Tiger on Xeons with Intel's ICC 10 compiler suite and MKL linear
      algebra libraries. Blazing a new trail is never easy.
    * We didn't use XGrid or Apple's cluster management stuff, only
      Server Admin and ARD.
    * POV-Ray was easy; OpenMPI was easy; using Myrinet over 10Gig
      Ethernet was easy.
    * GAMESS was more challenging, but we got it working somewhat. We
      still don't know how to run jobs of type ccsd(t), which require
      System V shared memory.
    * We never got POP to work.
    * Apparently, ICC 10 has some bugs. Several times when we were
      trying to use it to build, IIRC, GAMESS or POP, it would give
      illegal instruction errors during compilation. Or it would build
      a binary that, when we ran it, would do something horrible (like
      hang the machine, probably because of a bug interaction between
      icc and MacOS X).
    * OpenDirectory doesn't seem ready for prime time. It's pretty easy
      to set up, but it's unreliable and mysterious. In MacOS X, there
      seems to be a fundamental disconnect between things in the CLI
      world and things in the GUI world. Setting something up in one
      place won't necessarily be reflected in the other place. I'm sure
      that this is all trivial, if you're a serious Darwin user. But
      none of us were. So for example, you set up your NFS exports in
      the Server Admin tool, rather than by editing /etc/exports. The
      Admin tool won't put anything into /etc/exports. So if you're on
      the command line, how do you check what you're exporting? With the
      complexity of LDAP, this becomes a real problem. You set up
      accounts on your head node, and say to export that information.
      But perhaps you create an account, but can't log into it on a
      node. If you're ssh'd in from the outside, where do you check to
      see (from the command-line) what the authentication system is
      doing? Our local Mac guru couldn't tell us. And then you'd create
      another account, and the first one would start working again. WTF?
    * This may be the most frustrating thing about working with OS X
      Server. The CLI is the redheaded stepchild, and lots of HPC is
      mucking about on the command-line. You can use VNC to connect to
      ARD (but only if a user is logged in on the desktop and running
      ARD!), but it's slow, and only provides desktop control, not
      cluster management. ARD can then be run on the desktop to provide
      desktop control of the nodes in the cluster, and some cluster
      management: run a Unix command everywhere, shut nodes down, etc.
      There were a handful of tasks which seemed important, but which I
      couldn't figure out how to do on the command-line at all. The most
      heinous of these is adding and removing users to/from LDAP.
    * Most of the time, I found it more convenient to use a 'for' loop
      that would ssh to nodes to run some command for me.
    * MacOS X lacks a way to do CPU frequency scaling. This killed us in
      the competition. We couldn't scale cores down to save on our power
      budget; we could only leave them idle.
    * Being a Linux dude, I found having to have license keys for my
      operating systems, and (separately) my administration and
      management tools, to be odious in the extreme. Having to
      separately license ICC and IFORT and MKL just added frustration
      and annoyance.
    * We didn't make detailed performance comparisons between stuff
      built with the Intel suite and things built with, e.g., the GNU
      suite and GotoBLAS. We were too busy just trying to get everything
      to work. I'm sure that Intel produces better code under normal
      circumstances, but we had lots of cases where version 10 couldn't
      even produce viable binaries. So, make of that what you will.
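
The 'for' loop over ssh mentioned above might look something like the
following. This is only a minimal Python sketch, not what the team
actually ran: the function names, the node hostnames, and the choice of
the BatchMode option are all assumptions made for illustration, and it
presumes key-based ssh logins to the nodes are already working.

```python
import subprocess


def build_ssh_command(host, command):
    """Return the argv list for running `command` on `host` over ssh."""
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # so one misconfigured node cannot stall the whole loop.
    return ["ssh", "-o", "BatchMode=yes", host, command]


def run_everywhere(hosts, command, runner=subprocess.run):
    """Run `command` on each host in turn.

    Returns a dict mapping host -> (return code, captured stdout).
    `runner` is injectable so the loop can be exercised without a cluster.
    """
    results = {}
    for host in hosts:
        proc = runner(build_ssh_command(host, command),
                      capture_output=True, text=True)
        results[host] = (proc.returncode, proc.stdout)
    return results


# Example use (hostnames are hypothetical):
#   nodes = ["node%02d" % i for i in range(1, 5)]
#   for host, (rc, out) in run_everywhere(nodes, "uptime").items():
#       print(host, rc, out.strip())
```

The loop is deliberately serial, which keeps the output easy to read on
a small cluster; a nonzero return code for a host is a quick hint that
something is wrong with that node or its login setup.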

    What I would recommend (if you were going to use MacOS X):

    * Learn Darwin, in detail. Figure out the CLI way to do everything,
      and do it. In fact, forget Mac OS X; just use Darwin. Learn the
      system's error codes, figure out how to manipulate fat binaries
      (and how to strip them to make skinny ones), be able to manipulate
      users, debug the executing binaries, etc. Consider looking into
      the Apple disk imaging widget so you can boot the nodes diskless.
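
For the fat-binary manipulation mentioned above, Apple's lipo tool can
report which architectures a binary contains and extract a single one.
The following is a minimal Python sketch that just wraps those two
invocations; the helper names and the file names and architecture in
the example are hypothetical, and it assumes lipo is on the PATH of a
Darwin or MacOS X machine.

```python
import subprocess


def lipo_info_command(binary):
    """Return the argv that asks lipo which architectures `binary` holds."""
    return ["lipo", "-info", binary]


def lipo_thin_command(binary, arch, output):
    """Return the argv that extracts the `arch` slice of `binary` to `output`."""
    return ["lipo", binary, "-thin", arch, "-output", output]


def thin_binary(binary, arch, output, runner=subprocess.check_call):
    """Write a single-architecture ("skinny") copy of a fat binary.

    `runner` is injectable so the helper can be tried without lipo installed.
    """
    runner(lipo_thin_command(binary, arch, output))
    return output


# Example use (file names and architecture are hypothetical):
#   subprocess.check_call(lipo_info_command("mybinary"))
#   thin_binary("mybinary", "x86_64", "mybinary.x86_64")
```

Writing the thinned copy to a new file, rather than over the original,
leaves the fat binary available in case the wrong architecture was
picked.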

    What I would do differently (whether I stick with MacOS X or not):

    * Diskless clients
    * Flash drive for head node
    * No GPUs
    * Get Serial Console set up and available, even if you don't use it
      routinely
    * CPU Frequency Scaling!!
    * Many more, smaller cores. We had 36 at 3GHz. This was crazy. We
      were way power hungry.
    * Go to Intel 45nm dies.

----- End forwarded message -----