[Beowulf] SGI to offer Windows on clusters

Robert G. Brown rgb at phy.duke.edu
Fri Apr 13 04:43:31 PDT 2007


On Thu, 12 Apr 2007, laytonjb at charter.net wrote:

> I really think the web interface is the way to go. This way you can submit jobs from
> any machine that has a browser (Linux, Windows, Mac, etc.).

Isn't that what gridware basically does already?  Doesn't SGE provide a
web interface to resources (a cluster) it controls?  Isn't that a
significant part of the point of the Globus project?  The ATLAS grid
(IIRC) uses a grid interface, more or less, to provide just this sort of
layer isolation between cluster/grid resource and the user.

There are problems with this, of course.  It is wonderful if the
grid/cluster already has a canned package installed, so that what the
user "submits" is a parametric dataset that tells the cluster how and
what to run with that package.  BLAST etc. work fine this way, and there
exist clusters/grids architected just this way for this purpose, from
what I can tell via Google.  If you want to run YOUR source code on the
cluster from a different OS, however, well, that's a problem, isn't it?
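
As a rough illustration of the "parametric dataset" style of submission
described above, here is a minimal Python sketch.  The names JobRequest,
package, params and input_files, and the JSON layout, are all hypothetical;
this is not the format used by SGE, Globus or any real gridware.  It only
shows that the user ships parameters for a package the cluster already has,
not code.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class JobRequest:
    """Hypothetical job description for a package preinstalled on the cluster."""
    package: str                                      # e.g. "blast"
    params: dict = field(default_factory=dict)        # parameters handed to the package
    input_files: list = field(default_factory=list)   # data uploaded with the job

    def to_json(self) -> str:
        # Plain text that any OS or browser front end could post.
        return json.dumps(asdict(self), sort_keys=True)


def validate(request: JobRequest, installed: set) -> None:
    # The cluster side can only honor requests for packages it already has.
    if request.package not in installed:
        raise ValueError(f"package {request.package!r} is not installed on the cluster")


# Example values are made up for illustration.
job = JobRequest(package="blast", params={"evalue": 1e-5}, input_files=["query.fasta"])
validate(job, installed={"blast"})
payload = job.to_json()
```

A request naming the user's own program fails validate() unless someone has
already ported and installed that program on the cluster, which is exactly
the problem case.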

I think that there have been efforts to resolve even this -- ways of
submitting your source (or a functional binary) with build (or run)
instructions in a "package" -- I vaguely remember that ATLAS attempts to
implement a little horror called "pacman" for this purpose.  I leave to
your imaginations the awesome mess of dealing with library requirements,
build incompatibilities, mistaken assumptions, and worse across
architectures, especially the ones likely for people who write in MS C++
(or a C downshift thereof) and expect the source to "just run" when
recompiled on a linux box.

Practically speaking, for source code based applications, if the user has
a linux box (or even a canned vmware linux development environment they
can run as a windows appliance -- there are many of them prebuilt
and available for free, so this is no longer that crazy a solution on a
moderately powerful windows workstation) and sufficient linux
expertise to work through builds thereupon, they can develop binaries or
build packages that they can submit to a cluster via a web interface
that hides all cluster detail.  If not, then not.

Joe of course is building specific purpose clusters for many of his
clients and hence can either implement canned software solutions OR
manage the porting, building, and preinstallation of the client's
software so that they can use it via a web-appliance interface.
Basically they purchase his expertise to do the code migration -- which
is again fine if the source is mature and unlikely to need a lot of
real-time tweaking and if they mostly want an appliance with which to
process a very large data space or parametric space a little at a time
(so "jobs" are parametric descriptions used to start up a task).

There are various other details associated with gridware and cluster
usage of this sort that make the idea "good" or "bad" per application.
If the application is bottlenecked by data access -- it processes huge
files, basically -- one can spend a lot of time loading data onto the
cluster via e.g. the web interface compared to a little time running the
application on the data, something that can perhaps be done more
smoothly and faster with a native shared disk implementation instead of
double hits on native disk on both ends plus a (probably slow) network
transfer.  Accessing other resources -- GUI access to the program being
run, for example -- similarly depends strongly on having the right hooks
on both ends.
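
To put rough numbers on the data-access point above, here is a
back-of-the-envelope Python sketch.  It assumes a crude serial model (read
on the client disk, network transfer, write on the cluster disk, with no
overlap between stages), and the function names and all figures are
hypothetical, chosen only for illustration.

```python
def staging_overhead_s(size_mb: float, net_mb_s: float, disk_mb_s: float) -> float:
    """Seconds spent moving data before the job can start (serial, no pipelining)."""
    disk_time = 2 * size_mb / disk_mb_s   # one disk pass on each end
    net_time = size_mb / net_mb_s         # the (probably slow) network transfer
    return disk_time + net_time


def overhead_fraction(size_mb: float, net_mb_s: float, disk_mb_s: float,
                      compute_s: float) -> float:
    """Fraction of total wall-clock time that is staging rather than computing."""
    overhead = staging_overhead_s(size_mb, net_mb_s, disk_mb_s)
    return overhead / (overhead + compute_s)


# Made-up example: 10000 MB of input, 10 MB/s network, 50 MB/s disks,
# 300 s of actual computation.
overhead = staging_overhead_s(10000, 10, 50)
fraction = overhead_fraction(10000, 10, 50, 300)
```

With these made-up numbers the staging step dominates the run; a shared
disk that the compute nodes read directly would avoid most of that
overhead.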

    rgb

>
> Jeff
>
>> Here is a proactive suggestion for keeping open source
>> ahead of Microsoft CCS:
>> 1. I think CCS will appeal to small shops with no prior cluster
>>     and no admin capability beyond a part time windows person.
>> 2. such customers are the volume seats for a range of desktop
>>     CAD/CAE tools.
>> 3. Such ISVs will see potential of license growth, and will
>>     likely choose to tie-in to the Microsoft message of ease-of-use.
>>     A big feature here, in my view, is the one-button-job-launch.
>>
>> This means, for Linux to have a position as the backend
>> compute cluster, we must have this one button job launch
>> capability.  A Windows library must be available to
>> the ISV, to provide a job submission API to the batch
>> scheduler.  With such a feature, the ISVs can be
>> pursued to incorporate it.
>>
>> Ideally the job submission API is a kind of standard, so
>> the ISV does not see duplicate work versus the batch scheduler
>> used.
>>
>> So,
>> a) we need a job submission API, and
>> b) we need the Windows library added to Linux batch schedulers.
>>     (I'm not saying the scheduler runs on Windows, we just need
>>     the submission/retrieve portion).
>>
>> Does such exist already?
>>
>> Thanks, Rich
>> Rich Altmaier, SGI
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org
>> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu




