[Beowulf] Strange hardware? problems
David Mathog mathog at caltech.edu
Tue May 1 12:27:46 PDT 2007
- Previous message: Fwd: Re: [Beowulf] Why is communication so expensive for very small messages?
- Next message: [Beowulf] Strange hardware? problems
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
"Robert G. Brown" <rgb at phy.duke.edu> wrote:

> I've been coding, one way or another, for coming up on 35 years or
> thereabouts, starting with paper tape, going through cards (lots of
> cards), and up the evolutionary ladder. In all of that time, I've
> encountered one -- count it, one -- time that a consistent error in code
> I was running was due to a real failure in the hardware I was running on
> and not a bug in my own code.

RGB has an extra 5 years on me, but my experience has been similar: only very, very, very rarely is a program fault the result of a true hardware issue. (This excludes anything that runs from one box to another over a cable or fiber, where hardware issues are more common.)

We once tracked a bug in an FFT subroutine running on an array processor to faulty memory, right down to a memory pattern suggesting that two address pins were shorted together. On opening the beast up, sure enough, the short was right where it had to be, and it was repaired with a scalpel. This was around 1982.

Anyway, one caveat. With the proliferation of x86 variants, I now occasionally hit a binary that was compiled for some other processor variant and that blows up when it tries to use an instruction not supported on the processor it is actually running on. As I mentioned previously, valgrind can catch these for you. Alternatively, recompile using switches you know are supported on the target processor.

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
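
On the processor-variant caveat above: one way to check a node before running a binary built elsewhere is to compare the instruction-set extensions the build assumed against what the kernel reports for the CPU. The following is a minimal sketch, not a definitive tool. It assumes a Linux x86 node, where /proc/cpuinfo contains a "flags" line. The function names and the list of required features are hypothetical and would need to match your own build options.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of
    /proc/cpuinfo-style text, or an empty set if there is none."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(required, cpuinfo_text):
    """Return, sorted, the required features this CPU does not report."""
    return sorted(set(required) - parse_cpu_flags(cpuinfo_text))


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the kernel's CPU description (assumption: Linux only)."""
    with open(path) as handle:
        return handle.read()
```

On a compute node, `missing_features(["sse2", "ssse3"], read_cpuinfo())` returns an empty list when every listed extension is present. Flag names follow the kernel's spelling, which does not always match the compiler's (SSE3, for example, is reported as `pni`). With GCC, the matching fix on the build side is an `-march` value naming the oldest processor in the cluster.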
