[Beowulf] Seg Fault with pvm_upkstr() and Linux.

Robert G. Brown rgb at phy.duke.edu
Wed Mar 16 06:30:41 PST 2005


On Tue, 15 Mar 2005, Josh Zamor wrote:

> I have just started in on programming using PVM and have run into an 
> odd problem. I have written a C (c99) program that calculates the 
> factorial of a number by calculating parts of the range. I'm using the 
> GMP for dealing with large numbers (I have done this program 
> successfully before using numerous methods including pthreads). The 
> basic way it works for the cluster is that the program starts on a 
> machine, determines the subrange to be calculated for each task and 
> then waits for each process to come back with the answer for its 
> subrange. The main process then finds the total result of the factorial 
> from multiplying the subrange results back together... Pretty 
> standard...
> 
> The problem is that after a subrange is calculated by a task the result 
> is put into a character array (null terminated, created from GMP's 
> mpz_get_str()), it is packaged using pvm_pkstr() and sent to the parent. 
> The parent can receive this using pvm_recv(), but as soon as it tries 
> to store this into a character array in the program using pvm_upkstr(), 
> or explore it with pvm_bufinfo(), it segfaults.
> 
> I've also tried this in a couple of different ways: passing ints 
> works, but passing strings (either large or small) results in a 
> segfault.

This SOUNDS like a programming error: using a pointer as an int or vice
versa.

The first thing I'd do is look at the actual result produced by GMP on
the client side in some detail.  Dumping it bytewise, a character at a
time, isn't that dumb an idea.  GMP introduces all sorts of new types,
and I'll BET that these types are structs, not the actual data.  So is
the result a normal pointer-addressable string or a struct?  Maybe what
you are returning is a container for a pointer to anonymous memory on
the client, not the contents of that memory.  (Note that I've never used
GMP, so I don't know, but you definitely need to check that what you
are returning is an actual complete data object and not a container,
e.g. a struct or linked list.)

I assume that you've experimented and have no difficulty returning and
unpacking ordinary ints, strings, or raw data blocks with PVM.  If so
you probably aren't making a pointer error on the master server side,
although it never hurts to check.

If you want other eyes on your actual code (might be useful if it is
indeed programmer error), please post it.

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu




