MPICH, malloc, and my impending assault of one (1) beowulf cluster
mundy erik erik.mundy at HAMPTONU.EDU
Wed Jul 18 12:34:53 PDT 2001
Hello, my name is Erik, and I am an MPICH abuser.

I am running a simple one-master, two-slave Beowulf test cluster: RHL 6.1, kernel 2.4.4, MPICH 1.2.1, NFS mount from master to slave, on old PII 400s. MPICH is giving me some serious headaches - every MPI program I execute with a malloc in it crashes with the good old "p4 error: interrupt SIGSEGV: 11" message.

I have been experimenting with the test programs that come with MPICH for simplicity. For example, 'cpi' runs well on all three computers. It calculates pi, and I rejoice. Mpptest also works without a problem between any two of the three computers. But when I try to mpirun "sendrecv" or "overtake" from examples/test/pt2pt (both of which use a malloc), MPICH gives it the good old college try and then throws me the errors.

Normally I would just try to do as much as humanly possible to ignore this problem. But the code that this beowulf was designed for works when I execute it directly on one computer, and crashes rather spectacularly with the segmentation violation error when I try to mpirun it, even on just one computer. That leads me to think that there is some sort of conflict between MPICH and malloc. Granted, these computers aren't exactly state-of-the-art - each has only 128M RAM with ~400M swap. But that should be more than enough to execute those simple examples.

Has anyone had trouble with the Linux version of malloc in the past in a situation like this? If you shudder when you hear the words "malloc" and "MPICH" used in the same sentence, please email me back. This might be a bit difficult to track down, and I'm really not the best man for the job; all I did was build a beowulf :). I've only been on this list for the last two months, but it's taught me that if anyone can help, it's probably you guys. I am EXTREMELY appreciative of any assistance you can offer.
Thanks,
Erik
erik.mundy at hamptonu.edu

PS - also, I should mention that yes, the code I am trying to run WAS designed for use with MPI, and yes, I did patch MPICH with the bug fixes from the Argonne page. Sorry to take the obvious 'he's so dumb!' solutions away... I'm hoping there's one more that maybe I'm just missing :)
