[Beowulf] Re: cluster fails to boot with managed switch, but 5-port switch works OK
Greg Keller Greg at keller.net
Thu Dec 3 12:17:56 PST 2009
>>> What's got me and the IT guys stumped is that while the compute nodes
>>> boot via PXE from the head node without trouble on the NetGear, they
>>> barf with the SMC. To be specific, after the initial boot with a
>>> minimal Linux kernel, there is a "fatal error" with "timeout waiting for
>>> getfile" when the compute node attempts to download the provisioning
>>> image from head. However, when they were running Rocks before I
>>> arrived, the cluster worked fine with the SMC switch.

This is very common with spanning tree enabled. Essentially, once the port has a physical link light, it may take a while before spanning tree allows traffic to actually flow through the port, often longer than a typical timeout. When loading/reloading the driver there seems to be an instantaneous drop of the link that forces a new delay cycle.

With the Dell PowerConnect (SMC Rebrand??) series you have to "enable" portfast or "disable" spanning tree to avoid this delay before traffic passes. I generally do both. The Web based GUI is bad enough to make this more difficult than it needs to be, but you can globally disable spanning tree through it.

I use the command line, connect to interface range all, and then configure my ports as:

    !
    enable
    config
    interface range ethernet all
    spanning-tree disable
    spanning-tree portfast
    mtu 9216
    exit
    !

Hope this helps!

Cheers!
Greg
Technical Principal
R Systems NA, inc.
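
To put rough numbers on the delay described above, here is a minimal sketch in Python. It assumes the classic IEEE 802.1D default forward delay of 15 seconds, which a port spends once in the listening state and once in the learning state before it forwards traffic. The switch in question may be configured differently, and the 10 second client timeout is purely a hypothetical value for illustration, not something taken from the provisioning software.

```python
# Assumption: classic IEEE 802.1D default forward delay, in seconds.
# A given switch may use a different value.
DEFAULT_FORWARD_DELAY_S = 15


def seconds_until_forwarding(stp_enabled=True, portfast=False,
                             forward_delay_s=DEFAULT_FORWARD_DELAY_S):
    """Rough time between link-up and the port actually passing traffic."""
    if not stp_enabled or portfast:
        # Spanning tree off, or the port is treated as an edge port:
        # traffic flows as soon as the link is up.
        return 0
    # Listening state plus learning state, one forward delay each.
    return 2 * forward_delay_s


def fetch_would_succeed(client_timeout_s, **port_settings):
    """True if the port forwards before the client gives up."""
    return seconds_until_forwarding(**port_settings) < client_timeout_s


if __name__ == "__main__":
    hypothetical_timeout_s = 10  # illustrative only
    for settings in ({"portfast": False}, {"portfast": True},
                     {"stp_enabled": False}):
        delay = seconds_until_forwarding(**settings)
        ok = fetch_would_succeed(hypothetical_timeout_s, **settings)
        print(settings, "delay:", delay, "s, fetch ok:", ok)
```

Because every link drop (for example a driver reload) restarts the wait, the node can hit the full delay again right when it tries to download its image.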
