[Beowulf] Re: cluster fails to boot with managed switch, but 5-port switch works OK


Ashley Pittman ashley at pittman.co.uk
Thu Dec 3 01:20:58 PST 2009


On Wed, 2009-12-02 at 14:58 -0500, Joe Landman wrote:
> David Mathog wrote:
> >> What's got me and the IT guys stumped is that while the compute nodes
> > boot via PXE from the head node without trouble on the NetGear, they
> > barf with the SMC.  To be specific, after the initial boot with a
> > minimal Linux kernel, there is a "fatal error" with "timeout waiting for
> > getfile" when the compute node attempts to download the provisioning
> > image from head.  However, when they were running Rocks before I
> > arrived, the cluster worked fine with the SMC switch.
> 
> Wondering aloud whether or not the ethernet driver has been correctly 
> included in the kernel/initrd for the PXE booted image.  I've 
> seen/experienced this before, PXE works fine, the kernel boots, and is 
> missing the ethernet driver.

Or the new distro you are trying enumerates the ethernet devices
differently, and it's trying to load the getfile from a different,
unconnected ethernet port.  That's fairly common as well.  It could even
be worse than that, in that the enumeration could be non-deterministic,
which would really confuse you.
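
One way to check this from a shell on a booted compute node is to list
each interface with its MAC address and link state, then compare the
output across reboots.  The following is only a rough sketch, assuming a
Linux node that exposes /sys/class/net and has Python available in the
boot image; the function names are made up for illustration and are not
part of any provisioning tool.

```python
import os


def _read_attr(base, attr):
    """Read one sysfs attribute, returning None if it cannot be read."""
    try:
        with open(os.path.join(base, attr)) as f:
            return f.read().strip()
    except OSError:
        # Reading "carrier" fails while the interface is administratively down.
        return None


def list_interfaces(sysfs_root="/sys/class/net"):
    """Return (name, mac, carrier) tuples sorted by interface name.

    carrier is "1" when a link is detected, "0" when there is none, and
    None when the state cannot be read.
    """
    results = []
    for name in sorted(os.listdir(sysfs_root)):
        base = os.path.join(sysfs_root, name)
        if not os.path.isdir(base):
            continue  # skip plain files such as bonding_masters
        results.append(
            (name, _read_attr(base, "address"), _read_attr(base, "carrier"))
        )
    return results


if __name__ == "__main__":
    for name, mac, carrier in list_interfaces():
        print(name, mac, "carrier=%s" % carrier)
```

If the MAC address attached to eth0 changes between boots, or the port
the image is being fetched over shows carrier=0, then the enumeration
rather than the switch is the likely culprit.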

Ashley,

-- 

Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk



