[Beowulf] Lustre failover
cap at nsc.liu.se
Wed Sep 10 09:09:32 PDT 2008
On Wednesday 10 September 2008, Mark Hahn wrote:
> >> active/active seems strange to me - it implies that the bottleneck
> >> is the OSS (OST server), rather than the disk itself. and a/a means
> >> each OSS has to do more locking for the shared disk, which would seem
> >> to make the problem worse...
> > No, you can do active/active with several systems
> >    Raid1
> >    /   \
> > OSS1   OSS2
> >    \   /
> >    Raid2
> > (Raid1 and Raid2 are hardware raid systems).
> > Now OSS1 will primarily serve Raid1 and OSS2 will primarily serve Raid2.
> > So
> yes, I know - that's how HP SFS is set up. the OP was talking
> active-active, though, meaning that IO at any instant can go to either OSS
> and still make it onto a particular raid. otherwise it's active/passive,
> what SFS does.
I have a real hard time understanding how Lustre could manage an active/active
OST. This is based on the fact that an OST is essentially an ldiskfs (ext4)
filesystem on a device, and this setup does not work in a situation where more
than one entity modifies the data.
I think that what the Lustre manual is referring to is a setup with two OSTs on
a pair of servers. In this config one server would be active for one OST and
passive for the other (and vice versa).
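
To make the pairing concrete, here is a minimal Python sketch of that config.
It is only a toy model, not Lustre code: the names OST1/OST2, the OST class and
serving_map() are made up for illustration, and I am assuming OST1 sits on
Raid1 and OST2 on Raid2. The point it shows is that each OST is served by
exactly one OSS at any time, even though both servers are busy.

```python
from dataclasses import dataclass


@dataclass
class OST:
    name: str
    primary: str    # OSS that normally serves this OST
    secondary: str  # OSS that takes over if the primary is down


def serving_map(osts, alive):
    """Map each OST name to the single OSS serving it (None if both are down)."""
    result = {}
    for ost in osts:
        if ost.primary in alive:
            result[ost.name] = ost.primary
        elif ost.secondary in alive:
            result[ost.name] = ost.secondary
        else:
            result[ost.name] = None
    return result


# Hypothetical layout: OSS1 is active for OST1 and passive for OST2,
# OSS2 is active for OST2 and passive for OST1.
osts = [
    OST("OST1", primary="OSS1", secondary="OSS2"),
    OST("OST2", primary="OSS2", secondary="OSS1"),
]

print(serving_map(osts, {"OSS1", "OSS2"}))  # normal operation
print(serving_map(osts, {"OSS2"}))          # OSS1 has failed
```

In normal operation the load is spread over both servers; after a failure the
surviving server carries both OSTs. At no point do two servers write to the
same ldiskfs device.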