[Beowulf] Lustre Upgrades

Jeff Johnson jeff.johnson at aeoncomputing.com
Mon Jul 23 10:58:20 PDT 2018


Paul,

How big are your ldiskfs volumes, and what type of hardware are they on?
Running e2fsck (the ldiskfs-aware build) is wise, and it can be run on
multiple targets in parallel. It could finish within a couple of days; the
time depends on the volume size and the underlying hardware.
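
As a rough sketch of running the checks in parallel, something like the
Python below would start one e2fsck per target at the same time. The helper
names are made up for illustration, the device paths in the usage comment
are hypothetical, and it assumes the ldiskfs-aware e2fsck is the one on
PATH and that the targets are unmounted. By default it uses -f -n, so the
pass is forced and read-only.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_fsck_cmd(device, read_only=True):
    """Build an e2fsck command line for one unmounted ldiskfs target."""
    # -f forces a check even if the filesystem is marked clean.
    # -n opens the device read-only and answers "no" to every prompt;
    # -p instead repairs problems that are safe to fix automatically.
    return ["e2fsck", "-f", "-n" if read_only else "-p", device]


def run_one(device, read_only=True, runner=subprocess.run):
    """Run e2fsck on one device and return (device, exit_code)."""
    result = runner(build_fsck_cmd(device, read_only))
    return device, result.returncode


def check_in_parallel(devices, max_workers=4, read_only=True,
                      runner=subprocess.run):
    """Check several devices concurrently; returns {device: exit_code}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda d: run_one(d, read_only, runner), devices)
        return dict(results)


# Usage (hypothetical device paths, substitute your own):
#   codes = check_in_parallel(["/dev/mapper/ost0000", "/dev/mapper/ost0001"])
#   e2fsck exit code 0 means no errors; 4 means errors left uncorrected.
```

Threads are enough here because each worker just waits on its own e2fsck
process. max_workers should be sized to what the storage can sustain.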

Going from 2.5.34 to 2.10.4 is a significant jump. I would make sure a
stepped upgrade isn't advised. I know there have been stepped upgrades in
the past, but I'm not sure about going between these two versions.
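
If you script an inventory of which versions are in play, note that dotted
versions like these have to be compared numerically, not as strings: as
text, "2.10.4" sorts before "2.5.34". A minimal Python sketch with
hypothetical helper names follows. It says nothing about which upgrade
paths are actually supported; that has to come from the Lustre release
notes.

```python
def parse_version(text):
    """Turn a dotted version such as '2.10.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_older(a, b):
    """True if version string a is numerically older than b."""
    return parse_version(a) < parse_version(b)


def minor_gap(a, b):
    """Gap in the second version field if the first fields match, else None."""
    va, vb = parse_version(a), parse_version(b)
    if va[0] != vb[0]:
        return None
    return abs(va[1] - vb[1])


# Usage:
#   is_older("2.5.34", "2.10.4")   # numeric comparison of the two versions
#   minor_gap("2.5.34", "2.10.4")  # how many minor releases apart they are
```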

--Jeff

On Mon, Jul 23, 2018 at 10:34 AM, Paul Edmon <pedmon at cfa.harvard.edu> wrote:

> Yeah, we've found out firsthand that it's problematic, as we have been
> seeing issues :).  Hence the urge to upgrade.
>
> We've begun exploring this but we wanted to reach out to other people who
> may have gone through the same thing to get their thoughts.  We also need
> to figure out how significant an outage this will be.  If it takes a day
> or two of full outage to do the upgrade, that is more acceptable than a
> week.  We also wanted to know if people had experienced data
> loss/corruption in the process and any other kinks.
>
> We were planning on playing around on VMs to test the upgrade path before
> committing to upgrading our larger systems.  One question we had was
> whether we need to run e2fsck before/after the upgrade, as waiting for it
> to complete could add significant time to the outage.
>
> -Paul Edmon-
>
> On 07/23/2018 01:18 PM, Jeff Johnson wrote:
>
> You're running 2.10.4 clients against 2.5.34 servers? I believe there are
> notable lnet attrs that don't exist in 2.5.34. Maybe a Whamcloud wiz might
> chime in but I think that version mismatch might be problematic.
>
> You can do a testbed upgrade to test taking an ldiskfs volume from 2.5.34
> to 2.10.4, just to be conservative.
>
> --Jeff
>
>
> On Mon, Jul 23, 2018 at 10:05 AM, Paul Edmon <pedmon at cfa.harvard.edu>
> wrote:
>
>> My apologies, I meant 2.5.34, not 2.6.34.  We'd like to get up to 2.10.4,
>> which is what our clients are running.  Recently we upgraded our cluster to
>> CentOS 7, which necessitated the client upgrade.  Our storage servers,
>> though, stayed behind on 2.5.34.
>>
>> -Paul Edmon-
>>
>> On 07/23/2018 01:00 PM, Jeff Johnson wrote:
>>
>> Paul,
>>
>> 2.6.34 is a kernel version. What version of Lustre are you at now? Some
>> updates are easier than others.
>>
>> --Jeff
>>
>> On Mon, Jul 23, 2018 at 8:59 AM, Paul Edmon <pedmon at cfa.harvard.edu>
>> wrote:
>>
>>> We have some old large scale Lustre installs that are running 2.6.34 and
>>> we want to get these up to the latest version of Lustre.  I was curious if
>>> people in this group have any experience with doing this and if they could
>>> share it.  How do you handle upgrades like this?  How much time does it
>>> take?  What are the pitfalls?  How do you manage it with minimal customer
>>> interruption?  Should we just write off upgrading and stand up new servers
>>> that are on the correct version (in which case we need to transfer the
>>> several PBs worth of data over to the new system)?
>>>
>>> Thanks for your wisdom.
>>>
>>> -Paul Edmon-
>>>
>>> _______________________________________________
>>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>>> To change your subscription (digest mode or unsubscribe) visit
>>> http://www.beowulf.org/mailman/listinfo/beowulf
>>>
>>
>>
>>
>> --
>> ------------------------------
>> Jeff Johnson
>> Co-Founder
>> Aeon Computing
>>
>> jeff.johnson at aeoncomputing.com
>> www.aeoncomputing.com
>> t: 858-412-3810 x1001   f: 858-412-3845
>> m: 619-204-9061
>>
>> 4170 Morena Boulevard, Suite C - San Diego, CA 92117
>>
>> High-Performance Computing / Lustre Filesystems / Scale-out Storage
>>
>>
>
>


-- 
------------------------------
Jeff Johnson
Co-Founder
Aeon Computing

jeff.johnson at aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x1001   f: 858-412-3845
m: 619-204-9061

4170 Morena Boulevard, Suite C - San Diego, CA 92117

High-Performance Computing / Lustre Filesystems / Scale-out Storage