<div dir="auto">This seems very relevant<div dir="auto"><br></div><div dir="auto"><a href="https://security.googleblog.com/2018/01/more-details-about-mitigations-for-cpu_4.html?m=1">https://security.googleblog.com/2018/01/more-details-about-mitigations-for-cpu_4.html?m=1</a><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 4 Jan 2018 11:49 pm, "Jörg Saßmannshausen" <<a href="mailto:sassy-work@sassy.formativ.net">sassy-work@sassy.formativ.net</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Dear all,<br>

<br>

that was the question I was pondering all day today, and I tried to read<br>

and digest any information I could get.<br>

<br>

In the end, I contacted my friend at CERT and proposed the following:<br>

- upgrade the head node/login node (call it what you like), as that one is exposed<br>

to the outside world via ssh<br>

- do not upgrade the compute nodes for now, until we have more information about<br>

the impact of the patch(es).<br>

<br>

It would not be the first time a patch opened up another can of worms. What<br>

I am hoping for is finding a middle way between security and performance. IF<br>

the patch(es) are safe to apply, I can still roll them out to the compute<br>

nodes without losing too much uptime. IF there is a problem regarding<br>

performance, it only affects the head node, which I can ignore on that cluster.<br>

<br>

As always, your mileage will vary, especially as different clusters have<br>

different purposes.<br>

<br>

What I would like to know is: how about compensation? For me that is the same<br>

as the VW scandal last year. We, the users, have been deceived. Especially if<br>

the 30% performance loss that has been mooted is not a special corner case<br>

but is seen often in HPC. Some of the chemistry code I am supporting relies<br>

on disc I/O, some on InfiniBand, and some runs entirely in<br>

memory.<br>

<br>

These are my 2 cents. If somebody has a better idea, please let me know.<br>

<br>

All the best from a rainy and windy London<br>

<br>

Jörg<br>

<br>

<br>

On Wednesday, 3 January 2018, 13:56:50 GMT, Remy Dernat wrote:<br>

> Hi,<br>

> I renamed that thread because IMHO there is another issue related to that<br>

> threat. Should we upgrade our systems and lose a significant amount of<br>

> XFlops... ? What should be considered:   - the risk  - your user population<br>

> (size / type / average "knowledge" of hacking techs...)  - the isolation<br>

> level from the outside (internet)<br>

><br>

> So here is my question: if this is not confidential, what will you do?<br>

> I would not patch our little local cluster, contrary to all of our other<br>

> servers. Indeed, there is another "little" risk. If our strategy is to<br>

> always upgrade/patch, in this particular case you can lose many users who<br>

> will complain about performance... So another question: what is your global<br>

> strategy about upgrades on your clusters ? Do you upgrade it as often as<br>

> you can ? One upgrade every X months (due to the downtime issue) ... ?<br>

><br>

> Thanks,<br>

> Best regards, Rémy.<br>

><br>

> -------- Original message --------From: John Hearns via Beowulf<br>

> <<a href="mailto:beowulf@beowulf.org">beowulf@beowulf.org</a>> Date: 03/01/2018  09:48  (GMT+01:00) To: Beowulf<br>

> Mailing List <<a href="mailto:beowulf@beowulf.org">beowulf@beowulf.org</a>> Subject: Re: [Beowulf] Intel CPU design<br>

> bug & security flaw - kernel fix imposes performance penalty Thanks Chris.<br>

> In the past there have been Intel CPU 'bugs' trumpeted, but generally these<br>

> are fixed with a microcode update. This looks different, as it is a<br>

> fundamental part of the chip's architecture. However, the Register article<br>

> says: "It allows normal user programs – to discern to some extent the<br>

> layout or contents of protected kernel memory areas" I guess the phrase "to<br>

> some extent" is the vital one here. Are there any security exploits which<br>

> use this information? I guess it is inevitable that one will be engineered<br>

> now that this is known about. The question I am really asking is should we<br>

> worry about this for real-world systems. And I guess the answer is that if<br>

> the kernel developers are worried enough then yes we should be too.<br>

> Comments please.<br>

><br>

><br>

><br>

> On 3 January 2018 at 06:56, Greg Lindahl <<a href="mailto:lindahl@pbm.com">lindahl@pbm.com</a>> wrote:<br>

><br>

> On Wed, Jan 03, 2018 at 02:46:07PM +1100, Christopher Samuel wrote:<br>

> > There appears to be no microcode fix possible and the kernel fix will<br>

> ><br>

> > incur a significant performance penalty, people are talking about in the<br>

> ><br>

> > range of 5%-30% depending on the generation of the CPU. :-(<br>

><br>

> The performance hit (at least for the current patches) is related to<br>

><br>

> system calls, which HPC programs using networking gear like OmniPath<br>

><br>

> or InfiniBand don't do much of.<br>

><br>

><br>

><br>

> -- greg<br>

><br>

><br>

><br>

><br>

><br>

> ______________________________<wbr>_________________<br>

><br>

> Beowulf mailing list, <a href="mailto:Beowulf@beowulf.org">Beowulf@beowulf.org</a> sponsored by Penguin Computing<br>

><br>

> To change your subscription (digest mode or unsubscribe) visit<br>

> <a href="http://www.beowulf.org/mailman/listinfo/beowulf" rel="noreferrer" target="_blank">http://www.beowulf.org/<wbr>mailman/listinfo/beowulf</a><br>

<br>


</blockquote></div></div>