[Beowulf] Lustre failover

Bernd Schubert bs at q-leap.de
Wed Sep 10 07:22:13 PDT 2008


On Wednesday 10 September 2008 15:02:17 Mark Hahn wrote:
> > With OST servers it is possible to have a load-balanced active/active
> > configuration.
> > Each node is the primary node for a group of OSTs, and the failover
> > node for other
>
> ...
>
> > Anyone done this on a production system?
>
> we have a number of HP's Lustre (SFS) clusters, which use
> dual-homed disk arrays, but in active/passive configuration.
> it works reasonably well.
>
> > Experiences? Comments?
>
> active/active seems strange to me - it implies that the bottleneck
> is the OSS (OST server), rather than the disk itself.  and a/a means
> each OSS has to do more locking for the shared disk, which would seem
> to make the problem worse...


No, you can do active/active with several systems:

      Raid1
     /     \
OSS1        OSS2
    \      /
     Raid2


(Raid1 and Raid2 are hardware raid systems).

Now OSS1 will primarily serve Raid1 and OSS2 will primarily serve Raid2, so 
you have an active/active situation. We usually do this with even more 
hardware raid systems, mirrored as software raid1, for optimal high 
availability.
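To make the crossed pairing above concrete, here is roughly how the failover relationship could be declared when formatting the two OSTs with mkfs.lustre. The filesystem name, NIDs, and device paths are placeholders, and exact options vary by Lustre version -- treat this as a sketch, not our production config:

```shell
# On OSS1: format the OST backed by Raid1.
# OSS2 (NID oss2@tcp) is registered as this OST's failover node.
mkfs.lustre --fsname=testfs --ost --mgsnode=mgs@tcp \
    --failnode=oss2@tcp /dev/raid1

# On OSS2: format the OST backed by Raid2, with OSS1 as its failover node.
mkfs.lustre --fsname=testfs --ost --mgsnode=mgs@tcp \
    --failnode=oss1@tcp /dev/raid2
```

Note that in normal operation each OSS mounts only its own OST, so each RAID still has exactly one active server at a time; only after a failure does the surviving node mount both. "Active/active" here refers to the pair of servers, not to shared-disk locking on a single OST.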


Cheers,
Bernd

-- 
Bernd Schubert
Q-Leap Networks GmbH


