[Beowulf] High Performance for Large Database

Laurence Liew laurence at scalablesystems.com
Tue Nov 9 17:08:45 PST 2004


Hi

Thanks for the information - very interesting.

I believe PVFS works better with larger I/O requests because of its per-request overhead...
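
To get a rough feel for how request size affects read throughput on a given filesystem, something like the sketch below could be used. It is only an illustration, not the benchmark Felix ran: the function names (read_in_blocks, compare, make_test_file) and the block sizes are my own assumptions, and on a local disk the OS page cache will dominate the timings, so the file would need to live on the filesystem under test (e.g. a PVFS mount) and be larger than memory for the numbers to mean much.

```python
import os
import tempfile
import time


def read_in_blocks(path, block_size):
    """Read a whole file with fixed-size requests.

    Returns (bytes_read, number_of_requests).
    """
    total = 0
    requests = 0
    # buffering=0 disables Python's own buffer, so each read() call
    # is passed to the OS as one request of at most block_size bytes.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
            requests += 1
    return total, requests


def compare(path, block_sizes):
    """Time a full sequential read of `path` for each block size.

    Returns {block_size: (bytes_read, requests, elapsed_seconds)}.
    """
    results = {}
    for bs in block_sizes:
        start = time.perf_counter()
        total, requests = read_in_blocks(path, bs)
        elapsed = time.perf_counter() - start
        results[bs] = (total, requests, elapsed)
    return results


def make_test_file(size_bytes):
    """Create a temporary file of random data (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


if __name__ == "__main__":
    # 4 KB mirrors the database read size mentioned in the quoted
    # message; the larger sizes are arbitrary comparison points.
    path = make_test_file(8 * 1024 * 1024)
    try:
        results = compare(path, [4096, 65536, 1048576])
        for bs, (total, requests, elapsed) in results.items():
            print(f"block={bs:>8} B  requests={requests:>5}  time={elapsed:.4f} s")
    finally:
        os.remove(path)
```

The point of the comparison is the request count: the same amount of data takes far more round trips at 4 KB than at 1 MB, which is where a fixed per-request cost would add up.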

You may wish to try GFS (open sourced by Red Hat after buying 
Sistina)... it may give better performance.

Laurence

Felix Rauch Valenti wrote:
> On Wed, 27 Oct 2004 09:29:58 +0800, Laurence Liew
> <laurenceliew at yahoo.com.sg> wrote:
> [...]
> 
>>3. Try running Postgresql on a cluster filesystem like PVFS - it is not
>>guaranteed as it probably fails the ACID test for a SQL compliant
>>database. The basic idea is that if we cannot parallelise the database -
>>we make the underlying IO parallel and hence boost the IO performance of
>>the system.. and any applications that run on them.. and this includes
>>Postgresql.
> 
> 
> I tried this as part of my dissertation (I'm not a database person though).
> 
> We basically compared the performance of three different
> configurations: A single-node Oracle, Oracle on top of PVFS, and
> Oracle on top of a distributed-devices system.
> 
> More specifically, we tried:
> - Oracle running on a single node with a single SCSI disk.
> - Oracle running on a single node, accessing its data files on a PVFS
> with 6 servers interconnected by Gigabit Ethernet.
> - Oracle running on a single node, accessing its data files on a
> RAID0, whose 3 constituting partitions were accessed by a special
> protocol (similar in its idea to network block devices) over Gigabit
> Ethernet.
> 
> We ran the experiments (TPC-D benchmarks) a few years ago. The results
> were in a nutshell: The performance of the above PVFS configuration
> was very low, most likely because the database's 4-KByte reads were to
> too small. While the configuration with distributed devices was much
> better, it was not significantly faster than the single-node
> configuration.
> 
> To compare, we also tried the TP-Lite query-distribution middleware
> (which distributes the queries to 3 Oracle servers over Gigabit
> Ethernet), and the performance was best for most cases.
> 
> If you are interested in more details (please forgive me the
> advertisement), you might want to have a look at chapter 8 of my
> thesis [1] or an upcoming paper titled "OS Support for a Commodity
> Database on PC Clusters -- Distributed Devices vs. Distributed File
> Systems" to be published at the 16th Australasian Database Conference
> (the final version is unfortunately not yet ready).
> 
> - Felix

-- 
==========================================
Visit us at Supercomputing2004. Booth #400
==========================================

Laurence Liew, CTO		Email: laurence at scalablesystems.com
Scalable Systems Pte Ltd	Web  : http://www.scalablesystems.com
(Reg. No: 200310328D)
7 Bedok South Road		Tel  : 65 6827 3953
Singapore 469272		Fax  : 65 6827 3922



