[Beowulf] Stroustrup regarding multicore
Eric Thibodeau kyron at neuralbs.com
Tue Aug 26 11:30:55 PDT 2008
Perry E. Metzger wrote:
> "Robert G. Brown" <rgb at phy.duke.edu> writes:
>
>> On Tue, 26 Aug 2008, Michael H. Frese wrote:
>>
>>> C is not much better. I once worked a young computational
>>> programmer for almost a week to get him to prove to himself that a C
>>> source program couldn't walk through a 2-d array the hard way as
>>> fast as a Fortran source program unless the stepping was coded by
>>> hand. He didn't believe that a 2-d array in C is syntactically a 1-d
>>> array of pointers to 1-d arrays, and the row pointers must be
>>> fetched from memory! And separate compilation of functions
>>>
>
> As I said already, he's wrong....
>
>
>> Perhaps, but don't most C programmers allocate such an array as a single
>> vector and then repack the indices?
>>
>
> I've never seen anyone allocate "as a single vector and repack the
> indices", though I'm sure that a counterexample exists in someone's
> code out there somewhere. In any case, one has no need to do such a
> thing.
>
> (This is not to say that when one calls malloc, if you're calling
> malloc to allocate an array, that you don't pass it a single size_t
> indicating what you're looking for, but that's a different issue.)
>
>

I am not sure if this is what you mean, but anyone who has been
programming in C long enough (hrm... to use malloc at least once ;) )
_should_ know that malloc reserves X bytes of memory and neither cares
nor needs to know what the memory is used for. As for the contiguous
nature of the allocation, doing otherwise would be horrendously
inefficient, given that most processors take this memory layout for
granted to optimize cache usage and prefetches (noting that the actual
memory allocation is done by the OS in pages). I am currently unable to
dig out the reference, but there are very few processors (none that are
Beowulf COTS material, iirc) that implement any sort of semantics to
detect the actual fetching pattern (i.e., understand that the data is
being fetched in strides of x bytes).
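To make the "single vector and repack the indices" idea from the quoted
discussion concrete, here is a minimal sketch in C. The matrix type and
the matrix_* helper names are hypothetical, invented for this
illustration; they are not taken from anyone's code in this thread. The
sketch assumes row-major storage, so element (i, j) sits at offset
i * cols + j inside one block returned by a single allocation call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper type: a rows x cols matrix kept in ONE contiguous,
   row-major block, so no row pointers have to be fetched from memory. */
typedef struct {
    size_t rows, cols;
    double *data;
} matrix;

/* Allocate (and zero) the whole matrix with a single call.
   Returns 0 on success, -1 if the allocation failed. */
int matrix_alloc(matrix *m, size_t rows, size_t cols)
{
    m->rows = rows;
    m->cols = cols;
    m->data = calloc(rows * cols, sizeof *m->data);
    return m->data ? 0 : -1;
}

/* "Repack the indices": map (i, j) onto the flat offset i * cols + j. */
double *matrix_at(const matrix *m, size_t i, size_t j)
{
    assert(i < m->rows && j < m->cols);
    return &m->data[i * m->cols + j];
}

/* Row-major walk: the inner loop touches consecutive addresses. */
double matrix_sum(const matrix *m)
{
    double sum = 0.0;
    for (size_t i = 0; i < m->rows; i++)
        for (size_t j = 0; j < m->cols; j++)
            sum += *matrix_at(m, i, j);
    return sum;
}

/* Release the block and clear the pointer. */
void matrix_free(matrix *m)
{
    free(m->data);
    m->data = NULL;
}
```

With this layout the last element of row 1 and the first element of
row 2 are neighbours in memory, which is the contiguity the paragraph
above relies on. Whether this is faster than a declared 2-d array on a
given compiler is something to measure, not something the sketch shows.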
Also, I am actually doing some studies comparing the use of data
structures (or not) in some simple C code, to see whether the use of
structures has a significant impact on the cache's fetching
capabilities... efforts which might become useless (stay readable to
the human and let the compiler do its work) since (from GCC-4.3.1's
manpage):

-fipa-struct-reorg
    Perform structure reorganization optimization, that change C-like
    structures layout in order to better utilize spatial locality. This
    transformation is affective for programs containing arrays of
    structures. Available in two compilation modes: profile-based
    (enabled with -fprofile-generate) or static (which uses built-in
    heuristics). Require -fipa-type-escape to provide the safety of
    this transformation. It works only in whole program mode, so it
    requires -fwhole-program and -combine to be enabled. Structures
    considered cold by this transformation are not affected (see
    --param struct-reorg-cold-struct-ratio=value).

> Perry
>

Eric Thibodeau
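For readers trying to picture the "structures or not" comparison
mentioned in this message, here is a minimal sketch in C of the two
layouts one might compare. Everything in it is an assumption made for
illustration: the names struct particle, struct particles, N and the
sum_x_* functions are hypothetical, this is not the code used in the
study, and it is not claimed to be the transformation that
-fipa-struct-reorg performs.

```c
#include <assert.h>
#include <stddef.h>

#define N 1024  /* hypothetical element count for the illustration */

/* Array of structures: consecutive x values are
   sizeof(struct particle) bytes apart. */
struct particle {
    double x, y, z, mass;
};

/* Structure of arrays: all x values are adjacent in memory. */
struct particles {
    double x[N], y[N], z[N], mass[N];
};

/* Sum only the x field; each access strides over y, z and mass. */
double sum_x_aos(const struct particle *p, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += p[i].x;
    return sum;
}

/* Same sum, but reading one densely packed array. */
double sum_x_soa(const struct particles *p, size_t n)
{
    assert(n <= N);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += p->x[i];
    return sum;
}
```

Both functions compute the same result; only the distance between
successive loads differs. Timing the two over large arrays is one way
to see whether the layout matters on a particular machine.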
