[Beowulf] Stroustrup regarding multicore
Many of your questions may have already been answered in earlier discussions or in the FAQ. The search results page will indicate current discussions as well as past list serves, articles, and papers.
Robert G. Brown rgb at phy.duke.eduTue Aug 26 15:48:04 PDT 2008
- Previous message: [Beowulf] Stroustrup regarding multicore
- Next message: [Beowulf] Stroustrup regarding multicore
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Tue, 26 Aug 2008, Perry E. Metzger wrote: > > "Robert G. Brown" <rgb at phy.duke.edu> writes: >> I think that matrix allocation per se is contiguous because of how and >> where it occurs at the compiler level. Immediate matrices allocated >> like a[10][10] are just translated into displacements relative to the >> stack pointer, are they not? > > It depends. You're not guaranteed per se that they'll be allocated on > the stack, though in practice they are. Unless they are really large, I imagine. I'm not sure what the stack limits are (I don't code to ever stress them) but I'm sure there are some. >> Global/external matrices at the a[10][10] >> are prespecified in the data segment of the program, > > Depending on the particular OS. Certainly that's the usual model. What OS's do it some other way? I'm just curious. Is this a kernel feature or a compiler feature? >> The only place where it might not be true is when one dynamically >> allocates memory for a **matrix. There, if you loop through malloc >> calls, there is no guarantee that they return contiguous blocks > > An vector of pointers to vectors is not a matrix. :) > > It acts a bit like one, but that's guaranteed NOT to be contiguous > allocation because malloc will at the very least add the (invisible to > the programmer) malloc headers etc. No, a vector of pointers to addresses in a vector. To make a 3x3 matrix **m indexed from -1 to 1 that has actual data in vector *v: a) v = (whatever *) malloc(9*sizeof(whatever); b) m = (whatever **) malloc(3*sizeof(*whatever)) + 1; c) for(i=0;i<3;i++) m[i-1] = &v[i*3 + 1]; (or thereabouts). m[-1][-1] now dereferences to v[0]. m[0][-1] points to v[3]. m[1][-1] points to v[6]. m[-1][0] derefs to v[1]. Etc. If I did the seat of the pants displacement correctly. The point is that a) v is a contiguous block of memory. If one puts both v and m and index limits in a struct, one can make an entire "matrix" object a single contiguous block of memory that actually knows its own limits and permits limits checking with semi-portable routines. One can now call e.g. an ODE solver that requires a vector argument v but write the derivs routine using m and completely natural addressing. And if one wishes, obviously one can do more complicated things to compute the displacements of the addresses packed into m to do e.g. triangular arrays. Is this a pain to set up? One is basically doing by hand what the compiler does for you for rectilinear, 0-indexed arrays, so yes. However, it is often WORTH the pain to save space, to be able to produce usable code, to be able to create an array in C that can be passed to a fortran array, although I think macros are likely more useful for that. In fact, macros are another way to do some of the same things, but not identical things. Oh, and free(v), free(m) is all it takes to free the space. In contrast to a more usual NR malloc loop, which has to loop frees of each vector block (with, as you note, its header:-) and then free the ** pointer. I use/have used both, but generally prefer the vector repacking approach as it is more obviously efficient. And yeah, there's little point to this if all you're doing is allocating a "boring" rectilinear array that starts at 0, or even 1. I still often do it for 1 because I can't stand creating shadow im1 = i-1 variables in loops from 1 to N that contain lots of complicated algebraic stuff. As I said, angular momentum table displacements can get pretty ugly, especially when one gets to 3 pairs of indices. rgb [My largest piece of fortran, lifetime, was writing a program to check a physicist's algebraic reduction of a multichannel nuclear scattering cross-section from its "bare" form -- summing over umpty different channel angular momenta -- into a "cooked" form that used all the algebraic reductions of angular momentum tensors into more compact irreducible forms, which was a sum over a much smaller set of input and output channel spins of a product of a few 9j symbols, a CG coefficient, and some phases. Box-full-of-cards, so to speak. The "hard way" sums were over many, many clebsch-gordan (3j) and 6j coefficients, all of which I had to evaluate with my own code, this being pre-library days at least if you were poor. The later stuff with Gaunt numbers was from structure factors in crystal band structure computations, where Gaunt numbers (triple integrals of Ylm's) are themselves products/sums of 3j symbols, so I reused some of the code, and eventually ported it -- twice -- into C. Actually, it was a pretty big fortran program too once upon a time -- lots of bessel functions and integration and coupled ODE solution.] > > > > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977
- Previous message: [Beowulf] Stroustrup regarding multicore
- Next message: [Beowulf] Stroustrup regarding multicore
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Beowulf mailing list
