[Beowulf] How Can Microsoft's HPC Server Succeed?

John Vert jvert at windows.microsoft.com
Fri Apr 18 16:22:13 PDT 2008


Microsoft's MPI stack has never used DAPL.

V1 ("Windows Compute Cluster Server 2003") uses WinsockDirect. High-speed interconnects like InfiniBand plug into this stack through the existing WinsockDirect provider interface.

V2 ("Windows HPC Server 2008", coming soon) introduces a new provider interface called NetworkDirect, which maps much better to the hardware. So far we are seeing excellent performance; the 2-microsecond latency quoted earlier is one example.

Our V2 job scheduler also has a lot of performance improvements. If you care about how long the scheduler takes to submit, allocate, reserve, and activate a 1,000+ CPU job, I think you'll like that. This really has nothing to do with "Linux clusters", as it's largely a job scheduler issue and most job schedulers support multiple platforms.

John Vert
Development Manager
High Performance Computing


-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Greg Lindahl
Sent: Friday, April 18, 2008 3:13 PM
To: beowulf at beowulf.org
Subject: Re: [Beowulf] How Can Microsoft's HPC Server Succeed?

On Fri, Apr 18, 2008 at 11:59:19PM +0200, Bogdan Costescu wrote:

> The reasons
> for using Windows were more or less the same that have been mentioned
> in this thread, so I won't repeat them. Note that they weren't
> using Windows exclusively, but only on a part of the cluster, the rest
> running Linux.

Well, in that case the part of the argument that goes "You don't have
any Linux admins in your organization" doesn't apply. But I could see
the seductive line "You have a lot of faculty who are exclusively
Windows, how will they use your Linux cluster?" used.

BTW, isn't it still the case that Microsoft is exclusively using DAPL
for their MPI?  So yes, they do reduce performance on all high-speed
interconnects.

> And then
> things turned really strange after a statement saying that in Linux it
> takes several minutes to start a parallel job while in Windows only
> about 10 seconds.

I posted a link to a blog posting from Microsoft about that. It's
pretty terrible marketing for anyone who knows someone with a Linux
cluster, because few Linux clusters are misconfigured that badly.

-- greg


