[Beowulf] SGI to offer Windows on clusters

Joe Landman landman at scalableinformatics.com
Thu Apr 12 18:47:08 PDT 2007


Hi Jeff, Rich, and Beo-world:

laytonjb at charter.net wrote:
> Rich,
> 
> I absolutely agree with you! I've been thinking about this for a long time but
> I don't have any answers for you :)  I think there are a number of people
> thinking about this same problem. My best suggestion is to use the web to
> launch and monitor jobs. Everyone knows how to use a browser, so just going
> to a URL, logging in, and filling out a simple form to run a job would be great.

The internals also need to use http and related transports.

> I know that Platform (LSF), Altair (PBS), and Cluster Resources (MOAB) have
> solutions that do this. I started to look at them at my previous job. Not too
> bad in general. I know that Joe Landman at Scalable Informatics also
> has a web interface (SWICE?) that he offers his customers.

Egad... I didn't realize that my naming was that bad.  It is (currently)
called SICE (Scalable Informatics Computing Environment).  It will
change in very short order.  It is in use by a number of customers.

The next version (in development now), which will sport a name change and
lots of nice new bits, will go a long way toward solving pretty much all of
the issues Rich raised, and then some.  It will work with any scheduler
(it needs only a minimal API to be functional), and should work on any OS
(Linux, and yes, that means you Microsoft) as a client and a "server".
Clients can be CLI, Web, yadda yadda yadda.

Not re-inventing wheels here.   Avoiding building stuff we don't have
to.  Focusing on building the stuff we have to.

One version or another has been in the field on live production machines
for the better part of 4 years.

> I also recently got an email about some kind of interface for CFD codes that
> allows you to easily submit jobs to various schedulers and watch the progress
> of the run (you could even do a simple plot if you wanted). I can't remember the
> name of it and it's on my home box right now.
> 
> I think an API, whatever it looked like, that could submit jobs to various schedulers,
> was simple to use, and made it simple to add other applications would work well.
> 
> I really think the web interface is the way to go. This way you can submit jobs from
> any machine that has a browser (Linux, Windows, Mac, etc.). 

For one group, I demoed submitting jobs from my palm pilot.  While that
was neat, the folks who watched me control the JackRabbit appliance from
the Palm were really geeked.

> Jeff
> 
>> Here is a proactive suggestion for keeping open source
>> ahead of Microsoft CCS:
>> 1. I think CCS will appeal to small shops with no prior cluster
>>     and no admin capability beyond a part time windows person.

I think the concept of a "sealed appliance" is a good one.  Last year I
asked, in jest, whether a cluster is a toaster, but frankly, the
small ones are.  If they are "sealed appliances" you don't care what OS
runs on it.  Think of your home firewall/router/gateways.  You manage
them from a web browser.  Do you know or care what OS is underneath?
For the most part no.  You just want to know it works, and works well.
This is an area Linux could play extremely well in.

>> 2. such customers are the volume seats for a range of desktop
>>     CAD/CAE tools.

We are seeing a number (not insignificant) migrating over to Linux, and
running windows in a window or on a laptop next to them.  We have
several customers like this now, and more are moving in this direction.
This does not mean everyone will.  I would personally prefer better
interoperability between windows desktops and linux systems
(desktops/clusters).  Would make *everyone's* life easier.  The linux
units bend over backwards to provide this, and we work fairly hard at
making the cluster look like a web page and a big disk (e.g. an
appliance) to the end user.  Not perfect, but it is improving over time.

>> 3. Such ISVs will see potential of license growth, and will
>>     likely choose to tie-in to the Microsoft message of ease-of-use.
>>     A big feature here, in my view, is the one-button-job-launch.

Hmmm... I think the potential for growth closely tracks the "personal"
supercomputer.  It remains to be seen if this will take off.

>>
>> This means, for Linux to have a position as the backend
>> compute cluster, we must have this one button job launch
>> capability.  A Windows library must be available to

First off:  Linux is the de facto standard back end compute cluster.  A
smattering of others exist, but Linux is fairly dominant in this area.
That said, complacency is not a long term survival strategy.

Second:  "one button job launch" exists in some products today.  No, not
the Platform tools, or the Moab, or the other tools.  Those focus upon
the queuing, and that is frankly one thing that customers don't really
give a rip about.  Way way back in 1999/2000 (when I was at SGI,
building the SGI GenomeCluster bits) we demonstrated the concept of
hiding the scheduler from the user.  The user didn't know squat about
the scheduler or job submission bits at all.  They didn't have to.   The
system handled all of that for them.   Again, this was operational (and
I remember demonstrating it on the Linuxworld show floor at the SGI
booth with BLAST running across some 80 client machines) in 2000 on Linux.
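
As a rough illustration of what "hiding the scheduler" can mean (this is
not the SGI GenomeCluster or SICE code; the application table, the
run_blast.sh wrapper, and every function name below are hypothetical),
the user supplies only an application name and file names, and the
system writes the PBS-style batch script for them:

```python
# Hypothetical sketch: the user never sees the batch script.
from dataclasses import dataclass

# Assumed per-application defaults that an administrator would maintain.
# "run_blast.sh" is a made-up wrapper script, not a real tool.
APP_DEFAULTS = {
    "blast": {"command": "run_blast.sh {input} {output}", "nodes": 4},
}


@dataclass
class JobRequest:
    app: str          # application name chosen by the user
    input_path: str   # input file supplied by the user
    output_path: str  # where results should be written


def build_pbs_script(req: JobRequest) -> str:
    """Turn an application-level request into a PBS-style batch script."""
    defaults = APP_DEFAULTS[req.app]
    command = defaults["command"].format(
        input=req.input_path, output=req.output_path
    )
    lines = [
        "#!/bin/sh",
        f"#PBS -N {req.app}",                  # job name
        f"#PBS -l nodes={defaults['nodes']}",  # resource request
        "cd $PBS_O_WORKDIR",                   # run from the submit directory
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_pbs_script(JobRequest("blast", "query.fa", "hits.out")))
```

The resource request and queue details live in one table maintained by
whoever runs the cluster, so the person launching the job only ever
deals with the application and its files.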

>> the ISV, to provide a job submission API  to the batch
>> scheduler.  With such a feature, the ISVs can be
>> pursued to incorporate.

It is much deeper than that.  There are much better (saner, smarter,
less heavyweight) ideas around now.  DRMAA is the lower level of this.
It's what you build atop this that matters.  When Microsoft came out with
their own API (which, as far as I remember, did not talk DRMAA), I was
quite critical of it.  Still am (unless DRMAA support has been added).
With DRMAA, you can talk to any scheduler which talks DRMAA.  Neat.
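
A minimal sketch of that idea, assuming nothing about the real DRMAA
bindings: the class and function names below are hypothetical, and the
in-memory backend merely stands in for a real scheduler.  The point is
only that application code is written against one interface, so the
scheduler behind it can be swapped without touching the application:

```python
# Hypothetical sketch of a scheduler-neutral submission layer.
# This is NOT the DRMAA API; it only illustrates the design idea.
from abc import ABC, abstractmethod


class SchedulerBackend(ABC):
    """The one interface that application code is written against."""

    @abstractmethod
    def submit(self, command, args):
        """Queue a job and return its identifier."""

    @abstractmethod
    def status(self, job_id):
        """Return a short status string for a job."""


class InMemoryBackend(SchedulerBackend):
    """Stand-in for a real scheduler binding; it only records jobs."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.jobs = {}

    def submit(self, command, args):
        job_id = f"{self.prefix}-{len(self.jobs) + 1}"
        self.jobs[job_id] = (command, list(args))
        return job_id

    def status(self, job_id):
        return "queued" if job_id in self.jobs else "unknown"


def launch(backend, command, args):
    """Application-side call: knows the interface, not the scheduler."""
    return backend.submit(command, args)


if __name__ == "__main__":
    # Two backends, same application code.
    for backend in (InMemoryBackend("pbs"), InMemoryBackend("lsf")):
        job_id = launch(backend, "blast", ["query.fa"])
        print(job_id, backend.status(job_id))
```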

>>
>> Ideally the job submission API is a kind of standard, so
>> the ISV does not see duplicate work versus the batch scheduler
>> used.

See above.

>>
>> So,
>> a) we need a job submission API, and

Low level exists in DRMAA.  Higher level logic also exists in some apps
(SICE et al.) which do some pretty neat things.

>> b) we need the Windows library added to Linux batch schedulers.
>>     (I'm not saying the scheduler runs on Windows, we just need
>>     the submission/retrieve portion).

Again, most users won't (and shouldn't) care where the scheduler runs as
long as it all works together.  Interoperability is critical.

>>
>> Does such exist already?

Yes.  See above.  Bug folks offline if you want more info.

Joe

>>
>> Thanks, Rich
>> Rich Altmaier, SGI
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org
>> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452 or +1 866 888 3112
cell : +1 734 612 4615


