[Beowulf] fast interconnects
Jim Lux James.P.Lux at jpl.nasa.gov
Mon Dec 5 07:33:53 PST 2005
At 06:47 AM 12/5/2005, you wrote:
> > There's all kinds of useful things one can do if you have a multi Gbit/sec
>
>well, the bar is at more like 1.5 GB/s right now. quadrics trails a bit
>because it's not full-duplex in its current pcix incarnation, but everything
>10GE or IB-based is well above. myri 2G is pretty much a legacy product,
>as is gigabit (only being nearly free keeps it from being obsolete...)
>
> > interconnect. Various signal processing (Synthetic Aperture Radar
> > processing, Hyperspectral imaging compression, signal analysis) spring
>
>I guess you mean that practical applications of this stuff would need to
>work on data much larger than a single node's memory (which is at least 2-16GB
>in the current market). I have users who swear by FFTW 2.x's MPI code,
>and claim that it works well even on gigabit-based systems.

Indeed. Think in terms of a real-time stream of data, as opposed to batch processing: multiple data streams coming in from a sensor at 100 Mbps or more, processed data coming out at a few Mbit/sec. A trivial example (and clearly a bad use for a cluster, since there is custom silicon available) is compressing digital video from the raw CCIR601-style samples (4:2:2, 270 Mbps) into a compressed stream at 19 Mbps. Most of the compression is small 8x8 transforms.

Another example is where you have a wideband signal coming in and you are implementing some sort of digital analysis/receiver. Imagine digitizing the entire low-VHF communications band from 30 to 88 MHz (about 120 Msamples/second), feeding it out to a raft of processors, and having each processor find, extract, and process one signal. Or, more practically, you'd have a series of band receivers, each of which grabs, say, 10 MHz of bandwidth and feeds it to the cluster. You want to track a frequency-hopping radio, so when a signal disappears, you need to look for a new signal with similar modulation characteristics popping up somewhere else at about the right time.
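As a rough sketch, the figures in the two examples above can be turned into back-of-envelope numbers. The rates (270 Mbps, 19 Mbps, 30 to 88 MHz, about 120 Msamples/second, 10 MHz per receiver) come from the message; the 16-bit sample width and all function names are assumptions made only for illustration.

```python
import math


def compression_ratio(raw_bps, compressed_bps):
    """Ratio of the input bit rate to the output bit rate."""
    return raw_bps / compressed_bps


def stream_rate_bps(samples_per_s, bits_per_sample):
    """Raw bit rate of a digitized stream."""
    return samples_per_s * bits_per_sample


def receivers_needed(band_lo_hz, band_hi_hz, receiver_bw_hz):
    """Number of fixed-width band receivers needed to cover a band."""
    return math.ceil((band_hi_hz - band_lo_hz) / receiver_bw_hz)


if __name__ == "__main__":
    # Video example: 270 Mbps raw in, 19 Mbps compressed out.
    print("compression ratio:", round(compression_ratio(270e6, 19e6), 1))

    # Wideband example: about 120 Msamples/second.
    # ASSUMPTION: 16 bits per sample (the message gives no sample width).
    rate = stream_rate_bps(120e6, 16)
    print("digitizer output: %.2f Gbit/s" % (rate / 1e9))

    # Band receivers of 10 MHz each covering 30 to 88 MHz.
    print("10 MHz receivers:", receivers_needed(30e6, 88e6, 10e6))
```

Under the 16-bit assumption the single wideband digitizer already produces more than a gigabit link can carry, which is one way to read the preference for splitting the band across several narrower receivers.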
There ARE other ways to solve these problems, even with general-purpose hardware, particularly if you can "batch" the data. However, batching the data increases the latency, and there are applications where low latency is required and you don't want to wait until you've got 10 million samples to work on.

> > hard work, the problem has been getting the data in and out (a 1 GFLOP
> > processor doesn't buy you much if the data rate in and out is 60+
> > Megatransfer/second (=133 MHz:2 (1 in, 1 out)))
>
>really? what do you think the flops/byte ratio is for this domain?

Depending on the algorithms, it could be quite low: a few multiplies and adds per sample. In systolic arrays (which rely on extremely parallelized algorithms) it might be only one op/sample. I'd never claim that a cluster of commodity processors is a *good* way to implement a systolic array, but it's an example where you have data flowing "through" a processor at a high rate but don't need much processing on each node.

I guess any of these exceedingly fine-grained processes puts a heavy burden on the interconnect, and faster is always better, especially if it lets you use commodity processors and commodity C programmers in place of custom ASICs and chip designers at a million bucks a spin.

> > The interconnect speed is what drives folks to incredibly expensive ASIC or
> > FPGA solutions.
>
>hmm.

Yep. But just as folks like using clusters built of commodity computers in preference to specialized supercomputers in the more traditional HPC world, the same is true in signal processing. All the good stuff about Beowulfs in general:
- cheap hardware to get started
- scalability so you can start small (cheap)
- easy access to tools (e.g. gcc, all manner of libraries, matlab/octave)
- low learning curve to get started
applies to the signal processing world as well.

James Lux, P.E.
Spacecraft Radio Frequency Subsystems Group
Flight Communications Systems Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875
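The batching and flops-per-sample points in the message can be made concrete with a small sketch. The 10 million sample batch, the 1 GFLOP processor, and the "133 MHz:2" transfer rate are taken from the message. Pairing the batch with the 120 Msamples/second stream, the figure of 3 operations per sample (standing in for "a few multiplies and adds"), and the function names are all assumptions for illustration.

```python
def fill_latency_s(batch_samples, sample_rate_hz):
    """Time spent waiting for a batch to fill before work can start."""
    return batch_samples / sample_rate_hz


def flops_available_per_sample(peak_flops, samples_per_s):
    """Operations the processor could do per sample at a given I/O rate."""
    return peak_flops / samples_per_s


def compute_fraction_used(flops_needed_per_sample, peak_flops, samples_per_s):
    """Fraction of peak compute used when I/O sets the sample rate (capped at 1)."""
    available = flops_available_per_sample(peak_flops, samples_per_s)
    return min(1.0, flops_needed_per_sample / available)


if __name__ == "__main__":
    # ASSUMPTION: the 10 million sample batch is filled from a
    # 120 Msamples/second stream.
    print("batch fill latency: %.1f ms" % (1e3 * fill_latency_s(10e6, 120e6)))

    # 1 GFLOP processor fed at 133 MHz / 2 = 66.5 Mtransfers/second.
    io_rate = 133e6 / 2
    print("flops available per sample: %.1f"
          % flops_available_per_sample(1e9, io_rate))

    # ASSUMPTION: the algorithm needs 3 operations per sample.
    print("fraction of peak compute used: %.2f"
          % compute_fraction_used(3, 1e9, io_rate))
```

With these numbers the processor spends most of its time waiting on data, which is the sense in which the interconnect, not the arithmetic, is the limit.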
