[Beowulf] Cell
Joe Landman landman at scalableinformatics.com
Wed Apr 27 15:46:54 PDT 2005
Vincent Diepeveen wrote:
> A raid5 array of 2 terabyte costs like $2000-$3000 and it can deliver
> 400-600MB/s i/o hands down when attached to a single machine. So if you
> make the 1 tflop processor, there is no need to worry!

I need to find out where you are getting your raids...

[...]

> Anything that has to do with huge calculations is in the first place cpu
> power limited. Not anything else.

There is a statement I like to make when I see comments like this: "Gross generalizations tend to be incorrect". If you think about it long enough, you can see the recursive humor.

There are many different factors that will affect the overall performance of a machine on a particular code/data set. To illustrate this, I often suggest the following gedankenexperiment.

Imagine you have a CPU that is infinitely fast, coupled to resources that are not infinitely fast. This means that while operations take exactly 0 time on the CPU, we haven't done a thing to make the memory or IO faster.

In this gedankenexperiment, how much of a speedup do you get from an infinitely fast CPU?

Memory moves still take time. Data loading and storing still take time. Data motion is quickly becoming one of the (if not the) most critical aspects of performance for a fair number of calculations. So unless all parts are infinitely fast, you still have to pay for the data motion time, the IO time, the memory->memory time, the memory->CPU time (and the CPU->memory time).

In short, an infinitely fast CPU would reduce (possibly significantly) the execution time of the class of applications that are purely CPU bound (say, operating out of internal cache only). It will do very little for a code which is IO bound, or memory bandwidth or latency bound.

> Big RAM is nice to have for most clever algorithms, but it is second most
> important. CPU power is most important. If there is some bottleneck that
> limits the RAM we have, do not worry!
>
> We will find a solution!
>
> The real bottleneck is in the end the number of instructions a cpu can
> process a second.

Not really. The bottleneck in performance is how full you can keep the multiple pipelines of the processor. Branch statements tend to force pipeline flushes. You can "handle" this with speculative execution. Real memory accesses can bottleneck the memory subsystem, so real processors allow only specific mixtures of instructions in flight at once to reduce resource contention. If you overflow any of the fixed CPU resources, you can stall a pipeline while waiting for the contention to be eliminated, or you can stall the entire CPU while flushing the TLB and other shared resources.

Basically you have multiple simultaneous zero-sum games (a fixed number of operations per unit time, and specific mixtures of operations that maximize the performance of instructions in flight). Compilers are, as I indicated before, not particularly smart in most cases, and they generate code locally that might not make sense globally. Moreover, how instructions are ordered and presented to the CPU will fundamentally impact the overall performance.

Code optimizers are, in a large sense, an attempt to better fit the emitted instructions to the processor architecture by rewriting loops, mathematical constructs, and the like. Optimizers are not perfect. Some architectures are pretty much impossible to write optimal code for (the problem turns out to be NP-hard), and you have to accept a set of compromises at some point to avoid having your compilation take 24 hours (my MD codes used to take about 24 hours to build on a Multiflow Trace, a VLIW architecture).
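The gedankenexperiment above can be put into numbers with a rough sketch. It assumes the CPU, memory, and IO phases of a run do not overlap, which real machines only approximate. The function name and the timing breakdowns are made up for illustration and are not measurements from any real code.

```python
def speedup_with_infinite_cpu(cpu_s, memory_s, io_s):
    """Speedup if CPU time drops to zero while memory and IO time stay fixed.

    Assumes the three phases are strictly sequential (no overlap).
    """
    before = cpu_s + memory_s + io_s
    after = memory_s + io_s  # an infinitely fast CPU contributes 0 seconds
    if after == 0:
        return float("inf")  # purely CPU bound: nothing else left to wait on
    return before / after


# Hypothetical time breakdowns, in seconds, for two kinds of code.
cache_resident = speedup_with_infinite_cpu(cpu_s=99.0, memory_s=1.0, io_s=0.0)
io_bound = speedup_with_infinite_cpu(cpu_s=10.0, memory_s=30.0, io_s=60.0)

print(f"mostly in-cache code: {cache_resident:.1f}x")
print(f"IO/memory bound code: {io_bound:.2f}x")
```

Under these made-up numbers the in-cache code speeds up by two orders of magnitude, while the IO/memory bound code gains only about 11%, since the data motion time is untouched.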
The overall point of this is

a) writing good code is hard
b) writing fast code is harder
c) CPUs don't automagically make things faster, compilers are implicated in this mess
d) some optimizers are better left off :(

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615
