Thu Jun 12 22:07:40 PDT 2014
with certainty what is causing the failures:
- is it a LAM bug?
- is it a 3c59x driver bug?
- is it a 2.4 kernel bug?
Besides this problem, I have by now encountered several Red Hat 7.1 machines
on campus (UP or SMP) that had network problems which could be solved by
including the "noapic" option in lilo.conf. Is there a chance that the
APIC problems in the 2.4 kernels will be resolved soon (there seem to be changes
to the APIC code in 2.4.10, but I still have problems)? Is there a performance
hit related to the "noapic" option?
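
For reference, "noapic" is passed to the kernel through the append line of
the relevant image section in /etc/lilo.conf. A minimal sketch follows; the
image path, label and root device are placeholders and will differ from
machine to machine:

    # /etc/lilo.conf (excerpt; path, label and root device are assumed)
    image=/boot/vmlinuz
        label=linux
        root=/dev/hda1
        read-only
        append="noapic"

After editing the file, /sbin/lilo has to be rerun so that the boot map is
rewritten; the option takes effect at the next reboot.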
Anyway, with the release of mpich-1.2.2 this problem is no longer as pressing
as it was a few weeks ago. The performance of MPI jobs under
mpich-1.2.2 is much improved, particularly for smaller message sizes. A big
thank you to the mpich developers!
Academic Computing Services phone: (604) 291-4691
Simon Fraser University fax: (604) 291-4242
Burnaby, British Columbia email: siegert at sfu.ca
Canada V5A 1S6