[Beowulf] coprocessor to do "physics calculations"
Douglas OFlaherty douglas at shore.net
Mon May 15 21:35:37 PDT 2006
- Previous message: [Beowulf] OT: informatics software for linux clusters
- Next message: [Beowulf] Cluster OpenMP
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
(apologies for going a bit further off topic)

Ageia is chasing the broad market, so the pricing is $275 for the Ageia card.[1] It certainly spreads the spectrum of price/performance for tuned code. It doesn't do much for time to solution.

The programming model is clearly the single biggest obstacle to success. The GPU market benefited from standardized APIs for video display like DirectX & OpenGL. Ageia has its own game engines, so it explicitly controls the programming model.[2] There are rumors of a MSFT DirectPhysics API to bring together the underlying APIs for game physics to be supported on GPUs, PPUs & CPUs.[3] A great idea, but daunting to execute.

GPUs, Ageia, Cell & GRAPE have a luxury: developing the hardware & software for a specific application area large enough to amortize development costs. Systems mixing FPGAs and CPUs are broadly used in high-end imaging solutions. These look on paper to be a programming nightmare. Perhaps the obstacle is not programmability, but defining the right application.[4]

In the age of (relatively) inexpensive FPGA gates and ASICs, why aren't communities of users seeking hardware & software partners to speed up the critical loops? With increased bandwidth, lower latencies and standard interfaces available on commodity platforms, you would only pay a premium for the part of the system that delivers the performance. Wasn't this some of the same thinking that moved codes from SMPs to Beowulf clusters in the first place? A modestly difficult programming task to take advantage of emerging hardware performance.[5]

However, I recognize the uphill battle.[6] Math & solver libraries can deliver improved performance on a hardware platform with minimal changes to code. Despite the ease & possible performance gains, I only know a handful of commercial codes that make use of vendor-supplied libraries. It is expensive to qualify multiple environments.
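
One reason qualifying multiple environments is expensive: floating-point addition is not associative, so a library that sums the same numbers in a different order can return a slightly different answer. A minimal sketch of that effect follows. The two functions are hypothetical stand-ins for "the original binary" and "a tuned library"; they are not taken from any real vendor library.

```python
import math


def sum_left_to_right(values):
    """Hypothetical 'original binary': a plain sequential loop."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_pairwise(values):
    """Hypothetical 'tuned library': a tree (pairwise) reduction,
    the kind of reordering a vectorized or parallel sum might do."""
    n = len(values)
    if n == 0:
        return 0.0
    if n == 1:
        return values[0]
    mid = n // 2
    return sum_pairwise(values[:mid]) + sum_pairwise(values[mid:])


data = [0.1, 0.2, 0.3]
a = sum_left_to_right(data)   # ((0.0 + 0.1) + 0.2) + 0.3
b = sum_pairwise(data)        # 0.1 + (0.2 + 0.3)

print("sequential:", repr(a))
print("pairwise:  ", repr(b))
print("bitwise equal:", a == b)
print("equal within tolerance:", math.isclose(a, b))
```

Both results are correct to within rounding error, yet a regression test that compares output bit for bit would flag them as different, and that is the support call.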
Once you give the end-users some latitude on libraries, who knows what else they may develop. They may plug in an APU that improves performance 20x, lowers licensing revenue and introduces a different rounding scheme than the original binary - accurate, yet different - and call you for support.

[1] Based upon Alienware's online configurations
[2] They run in lower resolution on CPUs and high realism when the Ageia processor is present.
[3] http://digg.com/software/Microsoft_making_their_own_physics_SDK_API
[4] I nominate weather codes
[5] If I can divide my code into intelligent work units, then I don't need to run them in a single shared memory machine. If I can divide my code into intelligent work units, then I don't need to run them on the same type of processor.
[6] And let's not start on the complexity of managing a cluster of truly heterogeneous nodes... I'm sure Don Becker is already working on this one ;)

Date: Sun, 14 May 2006 14:37:22 -0400 (EDT)
From: Mark Hahn <hahn at physics.mcmaster.ca>
Subject: Re: [Beowulf] coprocessor to do "physics calculations"
To: beowulf at beowulf.org
Message-ID: <Pine.LNX.4.44.0605141251570.7486-100000 at coffee.psychology.mcmaster.ca>
Content-Type: TEXT/PLAIN; charset=US-ASCII

> > Didn't see anyone post this link regarding Ageia PhysX processor. It is the most comprehensive write up I have seen.
> >
> > http://www.blachford.info/computer/articles/PhysX1.html
>

yes, and even so it's not very helpful. "fabric connecting compute and memory elements" pretty well covers it! the block diagram they give could almost apply directly to Cell, for instance.

fundamentally, about these cell/ageia/gpu/fpga approaches, you have to ask:

- how cheap will it be in final, off-the-shelf systems? GPUs are most attractive this way, since absurd gaming cards have become a check-off even on corporate PCs (and thus high volume.) it's unclear to me whether Cell will go into any million-unit products other than dedicated game consoles.

- does it run efficiently enough? most sci/eng I see is pretty firmly based on 64b FP, often with large data. but afaict, Cell (eg) doesn't do well on anything but in-cache 32b FP. GPUs have tantalizingly high local-mem bandwidth, but also don't really do anything higher than 32b.

- how much time will it take to adapt to the peculiar programming model necessary for the device? during the time spent on that, what will happen to the general-purpose CPU market?

I think price, performance and time-to-market are all stacked against this approach, at least for academic/research HPC. it would be different if the general-purpose CPU market stood still, or if there were no way to scale up existing clusters...

------------------------------
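
The 64b-versus-32b point in the quoted message can be made concrete with a small experiment. The sketch below is an illustration only, not a benchmark of any of the devices discussed: it emulates 32-bit accumulation on an ordinary CPU by rounding through the standard `struct` module after every addition. The helper names (`to_float32`, `accumulate`) and the choice of adding 0.1 one million times are assumptions made for the example.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step, count, single_precision):
    """Add `step` to a running total `count` times.

    With single_precision=True, the step and every partial sum are
    rounded to 32 bits, mimicking hardware that only offers 32b FP.
    """
    if single_precision:
        step = to_float32(step)
    total = 0.0
    for _ in range(count):
        total += step
        if single_precision:
            total = to_float32(total)
    return total


COUNT = 1_000_000
EXACT = 100_000.0  # 0.1 * 1,000,000 in exact arithmetic

sum32 = accumulate(0.1, COUNT, single_precision=True)
sum64 = accumulate(0.1, COUNT, single_precision=False)
err32 = abs(sum32 - EXACT)
err64 = abs(sum64 - EXACT)

print(f"32-bit accumulation: {sum32!r}  (abs error {err32:.3g})")
print(f"64-bit accumulation: {sum64!r}  (abs error {err64:.3g})")
```

The 32-bit running total drifts visibly because, once the total is large, each added 0.1 is rounded to the coarse spacing of nearby 32-bit values. Long reductions over large data are where this matters, which is why codes built on 64b FP do not port trivially to 32b-only hardware.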
