[Beowulf] PostgreSQL

Michael Will mwill at penguincomputing.com
Thu Aug 19 10:15:56 PDT 2004


Postgres and most other databases are inherently designed around the SMP model
rather than message passing. This means that even though they may run many
concurrent threads or processes, they still need shared memory to coordinate things efficiently.


There are several approaches to running it on a Beowulf-style distributed cluster, though:

1. Run the database on one node and the apps on other nodes
I successfully installed tomcat+jsp+postgresql to serve web applications on a
Scyld Beowulf cluster, but that just means that some nodes run the application server
container (tomcat) and one node runs the database that the web apps connect to.

This works especially well if you have a quad-Opteron (SMP) node designated as the
database server.
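
A minimal sketch of that layout, assuming hypothetical node names (node0 for the
database, node1..node3 for the application servers) and a hypothetical user and
database name. It only builds the libpq-style connection URI that every
application node would use; nothing here comes from an actual Scyld setup.

```python
# Hypothetical host names; on a real cluster these come from your own node list.
DB_NODE = "node0"                          # the single SMP node running PostgreSQL
APP_NODES = ["node1", "node2", "node3"]    # nodes running the application server


def db_uri(user, dbname, host=DB_NODE, port=5432):
    """Build a libpq-style connection URI for the one database node.

    5432 is PostgreSQL's default port.
    """
    return f"postgresql://{user}@{host}:{port}/{dbname}"


def cluster_config(user, dbname):
    """Give every application node the same URI: they all share one database."""
    uri = db_uri(user, dbname)
    return {node: uri for node in APP_NODES}


if __name__ == "__main__":
    for node, uri in cluster_config("webapp", "shop").items():
        print(node, "->", uri)
```

The point is only that the application tier scales out across nodes while the
database stays on one machine.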


2. Split the data between independent postgresql nodes
Some people split the data that they process, for example by hashing on the primary key,
and run separate postgres databases on their nodes, each managing its own set of data.

I know somebody who has 30 nodes running his real-time data analysis with postgresql
databases, splitting up the data into 30 subsets by the primary key.
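
One way to picture the hashing step is the sketch below. It is my own
illustration, not the setup of the person mentioned above: the function names,
the use of SHA-256, and the modulo mapping are all assumptions. The application
would use the returned index to decide which node's postgres database to talk to.

```python
import hashlib

NUM_NODES = 30  # matches the 30-node example; any positive count works


def node_for_key(primary_key, num_nodes=NUM_NODES):
    """Map a primary key to a node index in the range [0, num_nodes).

    A stable digest is used instead of Python's built-in hash(), which is
    salted per process for strings and so would not agree across nodes.
    """
    digest = hashlib.sha256(str(primary_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def partition(rows, key_field, num_nodes=NUM_NODES):
    """Group rows (dicts) into per-node buckets by hashing the key field."""
    buckets = {i: [] for i in range(num_nodes)}
    for row in rows:
        buckets[node_for_key(row[key_field], num_nodes)].append(row)
    return buckets


if __name__ == "__main__":
    rows = [{"id": i, "value": i * i} for i in range(1000)]
    buckets = partition(rows, "id")
    print("rows per node:", [len(b) for b in buckets.values()])
```

Note that with a plain modulo mapping like this, changing the number of nodes
moves most keys to a different node, so the node count is best fixed up front.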


3. Implement shared memory on a cluster (yuck) and pretend to be SMP
This requires a high-speed, low-latency interconnect in order to perform well,
and is definitely no longer pure commodity hardware.

4. Use database middleware that distributes the data out to several nodes
Such solutions exist. They still only give you single-node updates, but they allow
multi-node reads and failover scenarios.

5. MySQL.com has an alternative: an in-RAM clustered database that was
designed from the ground up with the message-passing model in mind. It should be
the most efficient way to do this.
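
To illustrate the routing idea behind approach 4 (single-node updates, multi-node
reads, failover), here is a toy sketch. It is not the API of any real middleware
product; the class, its methods, and the node names are hypothetical, and it only
decides which node a query should go to.

```python
class ReadWriteRouter:
    """Toy router: writes go to one primary, reads rotate over live replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        self.down = set()   # nodes currently considered unreachable
        self._next = 0      # round-robin counter for reads

    def mark_down(self, node):
        """Record that a node failed, so it is skipped from now on."""
        self.down.add(node)

    def mark_up(self, node):
        """Record that a node is reachable again."""
        self.down.discard(node)

    def node_for_write(self):
        """All updates go to the single primary node."""
        if self.primary in self.down:
            raise RuntimeError("primary is down; writes are unavailable")
        return self.primary

    def node_for_read(self):
        """Spread reads over live replicas; fall back to the primary."""
        live = [n for n in self.replicas if n not in self.down]
        if not live:
            return self.node_for_write()
        node = live[self._next % len(live)]
        self._next += 1
        return node


if __name__ == "__main__":
    router = ReadWriteRouter("db0", ["db1", "db2"])
    print("write ->", router.node_for_write())
    print("reads ->", [router.node_for_read() for _ in range(4)])
    router.mark_down("db1")
    print("read after db1 fails ->", router.node_for_read())
```

Reads scale with the number of replicas, but every update still lands on one
node, which is the limitation mentioned above.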

Michael Will

On Thursday 19 August 2004 08:46 am, Roberto Melo Cavalcante wrote:
> Hi everybody!
> 
> I'm a newbie in cluster-related topics. I really tried to search your 
> archives, but there are too many to search them all. My question is not 
> exactly about clusters, but about the availability of PostgreSQL 
> on them. Sorry for that. I did not ask this question on a PostgreSQL 
> mailing list, but so far I've searched its site for many days without 
> finding a clear answer. So please, be patient.
> 
> I'd like to know if anyone has successfully installed PostgreSQL on 
> a Beowulf cluster?
> If so, does PostgreSQL really take advantage of the cluster? I mean, 
> does it see the entire cluster as one machine?
> 
> Thanks.
> 
> Roberto Melo Cavalcante

-- 
Michael Will, Linux Sales Engineer
NEWS: We have moved to a larger iceberg :-)
NEWS: 300 California St., San Francisco, CA.
Tel:  415-954-2822  Toll Free:  888-PENGUIN
Fax:  415-954-2899 
www.penguincomputing.com



