[Beowulf] High Performance for Large Database
Felix Rauch Valenti felix.rauch.valenti at gmail.com
Mon Nov 8 20:41:52 PST 2004
- Previous message: [Beowulf] Re: torus versus (fat) tree topologies
- Next message: [Beowulf] High Performance for Large Database
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Wed, 27 Oct 2004 09:29:58 +0800, Laurence Liew <laurenceliew at yahoo.com.sg> wrote:
[...]
> 3. Try running Postgresql on a cluster filesystem like PVFS - it is not
> guaranteed as it probably fails the ACID test for a SQL compliant
> database. The basic idea is that if we cannot parallelise the database -
> we make the underlying IO parallel and hence boost the IO performance of
> the system.. and any applications that run on them.. and this includes
> Postgresql.

I tried this as part of my dissertation (I'm not a database person, though). We compared the performance of three different configurations: a single-node Oracle, Oracle on top of PVFS, and Oracle on top of a distributed-devices system. More specifically, we tried:

- Oracle running on a single node with a single SCSI disk.
- Oracle running on a single node, accessing its data files on a PVFS with 6 servers interconnected by Gigabit Ethernet.
- Oracle running on a single node, accessing its data files on a RAID0 whose 3 constituent partitions were accessed by a special protocol (similar in its idea to network block devices) over Gigabit Ethernet.

We ran the experiments (TPC-D benchmarks) a few years ago. The results, in a nutshell: the performance of the above PVFS configuration was very low, most likely because the database's 4-KByte reads were too small. While the configuration with distributed devices was much better, it was not significantly faster than the single-node configuration. For comparison, we also tried the TP-Lite query-distribution middleware (which distributes the queries to 3 Oracle servers over Gigabit Ethernet), and its performance was the best in most cases.

If you are interested in more details (please forgive the advertisement), you might want to have a look at chapter 8 of my thesis [1] or an upcoming paper titled "OS Support for a Commodity Database on PC Clusters -- Distributed Devices vs. Distributed File Systems", to be published at the 16th Australasian Database Conference (the final version is unfortunately not yet ready).

- Felix
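
As a rough illustration of why small reads may gain little from striped storage, here is a minimal sketch of generic RAID0-style address mapping. It is not the setup measured above and not PVFS's actual layout: the function names, the 64-KiB chunk size, and the round-robin placement are all assumptions chosen for illustration. Under these assumptions, an aligned 4-KByte read falls inside a single chunk and therefore touches only one device, so a single such request cannot be served in parallel.

```python
# Generic striping model (hypothetical names; chunk size is an assumption).

def raid0_locate(offset, n_devices=3, chunk_size=64 * 1024):
    """Map a logical byte offset to (device index, byte offset on that device)."""
    chunk = offset // chunk_size            # which logical chunk holds the byte
    device = chunk % n_devices              # chunks are placed round-robin
    device_offset = (chunk // n_devices) * chunk_size + offset % chunk_size
    return device, device_offset


def devices_touched(offset, length, n_devices=3, chunk_size=64 * 1024):
    """Return the set of devices a read of `length` bytes (> 0) has to contact."""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return {c % n_devices for c in range(first, last + 1)}


if __name__ == "__main__":
    # A single aligned 4-KByte read stays on one device.
    print(devices_touched(0, 4 * 1024))
    # A read spanning three full chunks is spread over all three devices.
    print(devices_touched(0, 3 * 64 * 1024))
```

In this model, only requests larger than one chunk, or many concurrent small requests, can keep several devices busy at once.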
