[Beowulf] HPC in the cloud question (Hutcheson, Mike)

Greg Keller Greg at Keller.net
Fri May 8 09:33:56 PDT 2015


FWIW - SR-IOV on Mellanox is good and turning to great this year, so near
bare-metal performance in a VM is becoming possible, along with the
flexibility of migrating VMs over IB.  We don't use it in production yet,
but expect to by SC15.
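
If you want a quick way to see whether a host's Mellanox HCA already exposes
SR-IOV virtual functions, a rough Python sketch like the one below works on
most recent kernels.  It only reads the generic Linux PCI sysfs attributes;
the device names and VF counts it prints will of course differ per host, and
nothing in it is specific to our setup.

#!/usr/bin/env python
# Rough sketch: list Mellanox PCI devices and report their SR-IOV virtual
# function status using the standard sysfs attributes (kernel 3.8+).
# Vendor ID 0x15b3 is Mellanox; everything else here is generic.
import glob
import os

MELLANOX_VENDOR_ID = "0x15b3"

def read(path):
    try:
        with open(path) as f:
            return f.read().strip()
    except IOError:
        return None

for dev in glob.glob("/sys/bus/pci/devices/*"):
    if read(os.path.join(dev, "vendor")) != MELLANOX_VENDOR_ID:
        continue
    total = read(os.path.join(dev, "sriov_totalvfs"))   # absent if no SR-IOV
    active = read(os.path.join(dev, "sriov_numvfs"))
    print("%s  SR-IOV VFs: %s active / %s supported"
          % (os.path.basename(dev), active or "0", total or "n/a"))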

[Shameless_Promo]
R-HPC (www.r-hpc.com) sells bare-metal "HPC as a Service" in two modes today:
1) Utility shared queues with IB and Lustre on Scientific Linux (pay per
job).
2) Dedicated clusters with a flexible OS/FS and superuser access if you need
it (pay per days/months dedicated).

Our access model is ssh or ssh over VPN, so there is a great deal of
flexibility in mode 2.  We can help make sure the user experience is
minimally impacted either way.  We have HPC admins who can dive deep to
support compiling codes or other challenges, so we act as an extension of
your admin/support team any time we are invited to help.  We are open source
and happy to help you recreate anything we do on your own site.  Our only
"vendor lock-in strategy" is that you will love our support.

We also use Dell as a sales channel (Dell HPC Cloud Services), which has
simplified purchasing for some academic institutions.  We are Internet2
connected and working on Net+ service-provider status... so you may already
be connected at 10Gb speeds!
[/Shameless_Promo]

Hope This Helps...
Cheers!
Greg W. Keller





> Date: Thu, 7 May 2015 22:28:11 +0000
> From: "Hutcheson, Mike" <Mike_Hutcheson at baylor.edu>
> To: "beowulf at beowulf.org" <beowulf at beowulf.org>
> Subject: [Beowulf] HPC in the cloud question
> Message-ID: <D1714A97.56B37%Mike_Hutcheson at baylor.edu>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi.  We are working on refreshing the centralized HPC cluster resources
> that our university researchers use.  I have been asked by our
> administration to look into HPC in the cloud offerings as a possibility to
> purchasing or running a cluster on-site.
>
> We currently run a 173-node, CentOS-based cluster with ~120TB (soon to
> increase to 300+TB) in our datacenter.  It's a standard cluster
> configuration:  IB network, distributed file system (BeeGFS.  I really
> like it), Torque/Maui batch.  Our users run a varied workload, from
> fine-grained, MPI-based parallel apps scaling to 100s of cores to
> coarse-grained, high-throughput jobs (we're a CMS Tier-3 site) with high
> I/O requirements.
>
> Whatever we transition to, whether it be a new in-house cluster or
> something "out there", I want to minimize the amount of change or learning
> curve our users would have to experience.  They should be able to focus on
> their research and not have to spend a lot of their time learning a new
> system or trying to spin one up each time they have a job to run.
>
> If you have worked with HPC in the cloud, either as an admin and/or
> someone who has used cloud resources for research computing purposes, I
> would appreciate learning your experience.
>
> Even if you haven't used the cloud for HPC computing, please feel free to
> share your thoughts or concerns on the matter.
>
> Sort of along those same lines, what are your thoughts about leasing a
> cluster and running it on-site?
>
> Thanks for your time,
>
> Mike Hutcheson
> Assistant Director of Academic and Research Computing Services
> Baylor University
>

