[Beowulf] HPC on the cloud

Rayson Ho raysonlogin at gmail.com
Fri Dec 2 13:01:57 PST 2011


On Tue, Oct 4, 2011 at 3:29 PM, Chris Dagdigian <dag at sonsorol.org> wrote:
> Here is a cliche example: Amazon S3
>
> Before the S3 object storage service will even *acknowledge* a
> successful PUT request, your file is already at rest in at least three
> amazon facilities.
>
> So to "really" compare S3 against what you can do locally you at least
> have to factor in the cost of your organization being able to provide 3x
> multi-facility replication for whatever object store you choose to deploy...

Agreed. Users who can tolerate less durable storage can use Reduced
Redundancy Storage (RRS) instead. RRS keeps only 2 copies instead of
3, and its price is only 2/3 the price of standard S3 storage:

http://aws.amazon.com/s3/#pricing
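
A rough sketch of the comparison Chris describes, in Python. The
function names and every number in the usage note are hypothetical
placeholders, not real AWS or hardware prices; the only figure taken
from this thread is the 2/3 ratio for RRS. The point is just that a
local per-GB cost has to be multiplied by the number of replicas
before it is compared with the hosted price:

```python
def monthly_storage_cost(size_gb, price_per_gb_month, copies=1):
    """Monthly cost of keeping `copies` full replicas of `size_gb` of data."""
    if size_gb < 0 or price_per_gb_month < 0 or copies < 1:
        raise ValueError("size and price must be >= 0, copies must be >= 1")
    return size_gb * price_per_gb_month * copies


def rrs_price(standard_price_per_gb_month):
    """RRS price per GB-month, assuming the 2/3 ratio quoted in this thread."""
    return standard_price_per_gb_month * 2 / 3


def local_vs_hosted(size_gb, local_price, hosted_price, local_copies=3):
    """Return (local_cost, hosted_cost) per month.

    The hosted price is assumed to already include replication, so only
    the local side is multiplied by the number of copies.
    """
    local = monthly_storage_cost(size_gb, local_price, copies=local_copies)
    hosted = monthly_storage_cost(size_gb, hosted_price, copies=1)
    return local, hosted
```

For example, local_vs_hosted(1000, 0.05, 0.12) compares 1 TB stored
three times locally at a made-up $0.05/GB against a made-up hosted
price of $0.12/GB; plug in your own honest numbers.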


And Amazon recently introduced the "Heavy Utilization Reserved
Instances" and "Light Utilization Reserved Instances", which bring the
cost down quite a bit as well:

http://aws.typepad.com/aws/2011/12/reserved-instance-options-for-amazon-ec2.html
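
Whether a reservation pays off depends on how many hours you actually
run. A minimal sketch of the break-even arithmetic, assuming a simple
model of one upfront fee plus a discounted hourly rate charged only
for hours used (the function names are made up for illustration, and
the real billing rules for each reservation type may differ, so check
the pricing page before relying on this):

```python
def break_even_hours(upfront_fee, reserved_hourly, on_demand_hourly):
    """Hours of use at which reserved and on-demand totals are equal.

    Returns None when the reserved hourly rate is not lower than the
    on-demand rate, because the upfront fee is then never recovered.
    """
    if on_demand_hourly <= reserved_hourly:
        return None
    return upfront_fee / (on_demand_hourly - reserved_hourly)


def cheaper_option(hours_used, upfront_fee, reserved_hourly, on_demand_hourly):
    """Name the cheaper option for a given number of hours used."""
    reserved_total = upfront_fee + reserved_hourly * hours_used
    on_demand_total = on_demand_hourly * hours_used
    return "reserved" if reserved_total < on_demand_total else "on-demand"
```

With a made-up $100 upfront fee, $0.50/hour reserved and $1.00/hour
on demand, break_even_hours(100.0, 0.5, 1.0) gives 200 hours; below
that, on-demand is cheaper under this model.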


With VFIO, the latency difference between 10Gb Ethernet and InfiniBand
should be narrowing quite a bit as well:

http://blogs.cisco.com/performance/open-mpi-over-linux-vfio/


Finally, Amazon's cloud supercomputer ranks #42 on the most recent TOP500 list:

http://i.top500.org/system/177457


I still think that a lot of companies will keep buying their own
servers for compute farms & HPC clusters. But for those who don't want
to own their servers, who want a cluster quickly (a basic HPC cluster
takes less than 30 mins to build[1], and of course StarCluster or
CycleCloud can do most of the heavy lifting faster), or who don't have
the expertise, remote HPC clusters (whether Amazon EC2 Cluster Compute
Instances or Gridcore/Gompute[2]) are getting very attractive.

[1]: http://www.youtube.com/watch?v=5zBxl6HUFA4
[2]: https://www.gompute.com/web/guest/how-it-works

Rayson

=================================
Grid Engine / Open Grid Scheduler
http://gridscheduler.sourceforge.net/

Scalable Grid Engine Support Program
http://www.scalablelogic.com/





> I don't want to be seen as a shill so I'll stop with that example. The
> results really are surprising once you start down the "true cost of IT
> services..." road.
>
>
> As for industry trends with HPC and IaaS ...
>
> I can assure you that in the super practical & cynical world of biotech
> and pharma there is an HPC migration to IaaS platforms that is already
> years old. It's a lot easier to see where and how your money is
> being spent inside a biotech startup or pharma, and that is shunting
> (and has shunted) a decent amount of spending towards cloud platforms.
>
> The easy stuff is moving to IaaS platforms. The hard stuff, the custom
> stuff, the tightly bound stuff and the data/IO-bound stuff is staying
> local of course - but that still means lots of stuff is moving externally.
>
> The article that prompted this thread is a great example of this. The
> client company had a boatload of one-off molecular dynamics simulations
> to run. So many, in fact, that the problem was computationally
> infeasible to even consider doing in-house.
>
> So they did it on AWS.
>
> 30,000 CPU cores. For ~$9,000.
>
> Amazing.
>
> It's a fun time to be in HPC actually. And getting my head around "IaaS"
> platforms turned me on to things (like Opscode Chef) that we are now
> bringing in-house and integrating into our legacy clusters and grids.
>
>
> Sorry for rambling but I think there are 2 main drivers behind what I
> see moving HPC users and applications into IaaS cloud platforms ...
>
>
> (1) The economies of scale are real. IaaS providers can run better,
> bigger and cheaper than we can and they can still make a profit. This is
> real, not hype or sales BS. (as long as you are honest about your actual
> costs...)
>
>
> (2) The benefits of "scriptable everything" or "everything has an API".
> I'm so freaking sick of companies installing VMware and excreting a
> press release calling themselves a "cloud provider". Virtual servers and
> virtual block storage on demand are boring, basic and pedestrian. That
> was clever in 2004. I need far more "glue" to build useful stuff in a
> virtual world and IaaS platforms deliver more products/services and
> "glue" options than anyone else out there. The "scriptable everything"
> nature of IaaS is enabling a lot of cool system and workflow building,
> much of which would be hard or almost impossible to do in-house with
> local resources.
>
>
>
> My $.02
>
> -Chris
>
> (corporate hat: chris at bioteam.net)





