[Beowulf] Cluster Novice. I want to know more about user space
richard at hinditron.com
Thu Jan 20 03:50:48 PST 2005
Thank you for the response; I appreciate your promptness. But isn't it
funny? The guys who sell solutions to the customer are actually
non-technical guys. In my case, the marketing people who sold the cluster of
nodes did not suggest an external RAID box for the cluster. They were going
ga-ga over the amount of hard disk real estate each node will have. They
just say that it is enough. In my case it will be a 96-node cluster, with
each node having a mirrored 72 GB drive. According to them it comes to 6 TB
of space for the user. Do you think that is possible? If yes, how? As for me,
like I said, I am new to cluster computing and they are old players in this
field. So maybe they are right!
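A rough check of that figure can be sketched as below. It assumes each node's mirrored pair presents a single 72 GB volume and ignores whatever the OS and swap take up; both are assumptions, not numbers from the vendor, and the function name is only illustrative.

```python
# Back-of-the-envelope check of the quoted "6 TB" figure.
NODES = 96
USABLE_GB_PER_NODE = 72  # assumption: the mirrored pair presents one 72 GB volume


def aggregate_local_disk_tb(nodes: int, gb_per_node: float) -> float:
    """Sum of per-node local disk in decimal terabytes (1 TB = 1000 GB)."""
    return nodes * gb_per_node / 1000


total_tb = aggregate_local_disk_tb(NODES, USABLE_GB_PER_NODE)
print(f"{total_tb:.2f} TB spread over {NODES} separate local disks")
```

So the total is in the range of the quoted figure, but it is the sum of 96 separate local disks rather than one volume. Whether a user can actually reach that space depends on how the nodes' disks are exported or aggregated, which nobody has told me so far.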
BTW, I will also be using SuSE. So would you mind if I ask you for help
with SuSE in the future?
----- Original Message -----
From: "Mark Westwood" <mark.westwood at ohmsurveys.com>
To: "Richard Chang" <richard at hinditron.com>
Cc: "Beowulf Mail List" <beowulf at beowulf.org>
Sent: Thursday, January 20, 2005 1:56 PM
Subject: Re: [Beowulf] Cluster Novice. I want to know more about user space
> I manage a small cluster here, so my answers are based on experience of
> one Beowulf. The cluster runs on SuSE Linux, which is probably irrelevant
> to any of the answers. We use it for running Fortran codes crunching a
> lot of numbers in parallel - and intended use does have some influence on
> the configuration of a cluster.
> Richard Chang wrote:
>> Hi all,
>> Here is my situation.
>> I am really new to clusters and I have been assigned the task of learning
>> about them, as maintaining one such cluster will become my bread and butter.
>> Let me start. I want to know how a cluster is viewed from the desktop of
>> a user. I have to maintain a Linux cluster; is it the same as when the user
>> logs in to a standalone Linux box? Will he see all the nodes as a whole,
>> or can he see all the nodes individually?
> Regular users log on to the head node which is just another Linux box. We
> use grid engine for job scheduling, so users submit their jobs to grid
> engine, which takes care of placing them onto the compute nodes. Regular
> users never log on directly to the compute nodes - though I guess we could
> construct an artificial (for us at least) scenario where this would be
>> Is he going to see only the master node, which perhaps is the only node
>> connected to the site network, with the rest of the nodes connected to
>> the master node through some internal network not accessible to the site
> Yes, the regular user only 'sees' the head node. The cluster has a
> private network not shared with the office network. I guess we could
> configure it differently and effectively put all the compute nodes on the
> office network, but that's kind of moving towards Low Performance
> Computing and we put a lot of effort into extracting High Performance from
> the cluster.
>> When we log in to the cluster, are we connected to the whole setup or
>> just the master node?
> Best think of it as just logging into the cluster. But like the
> marketeers say, 'the network is the computer' (or is that 'the computer is
> the network' ?) so the user gets all the cluster services while running an
> interactive session on the head node.
>> What happens to the abundant hard disk space we have in all the other
>> nodes? Can the user use it? If yes, how, since he is logging into the
>> master node only, and how can he access the other nodes? If the hard disk
>> space is only used for scratch, then why do we need a 72 GB hard drive
>> for that matter?
> The cheap and cheerful IDE disks in the compute nodes store O/S etc.
> Grendel forbid that they would ever be used for swap space during a
> computation but it does happen sometimes. All the useful, fast SCSI disks
> are in a RAID array attached to the head node. But this arrangement is
> quite use-specific, albeit very common. Our big computations do not do a
> lot of input / output once the initial data has been read from disk and
> distributed to the compute nodes' memory. Our cluster is configured for
> high performance parallel computing; I suppose a cluster which is built
> for a web-server farm would have a requirement for much faster i/o on all
> compute nodes. That's getting outside my area of expertise so I will go
> no further.
>> These are some of the issues bothering me. Please forgive me if I am a
>> little boring; I will be glad if someone can really guide me.
> There's a lot of information about all this out there. I like the book
> 'Beowulf Cluster Computing with Linux' as a good survey of many / most
> aspects of cluster computing but there are plenty of others available.
> Then there's google ...
> Hope some of this is useful
> Mark Westwood
> Parallel Programmer
> OHM Ltd
> The Technology Centre
> Offshore Technology Park
> Claymore Drive
> AB23 8GD
> United Kingdom
> +44 (0)870 429 6586