[Beowulf] Lustre on google cloud

Chris Dagdigian dag at sonsorol.org
Thu Aug 1 04:11:24 PDT 2019


Getting off topic here but the Capital One data breach was not the 
result of a cloud provider failure or cloud provider security hole. The 
only audits that would have identified the problem would have been 
client-side audits.

It was a failure of the "shared responsibility model". AWS is very clear 
about the boundaries of what they are responsible for vs what the 
client/user is responsible for, and in this situation the failure was 
clearly on the Capital One side, although there is wiggle room to perhaps 
blame an as-yet-unnamed WAF vendor (see below).

The short summary is:

  - A commercial WAF (web application firewall) appliance (and not the 
AWS managed WAF service) was either misconfigured or actually had a 
security vulnerability
  - Capital One was using this WAF appliance to protect an internet 
facing service
  - The attacker was able to either gain a shell on the WAF or exploit a 
vulnerability that let her read the EC2 instance metadata on the WAF 
appliance
  - Inside EC2 instance metadata were the constantly rotating IAM EC2 
instance-role credentials that the WAF itself used to talk to AWS APIs

So the attacker was able to steal the rotating credential set used by 
the WAF out of instance metadata and use those transient API keys to go 
hunting in S3 buckets for private data.

This is where Capital One will be getting some serious side eye:

1) It is legit that a WAF may need S3 service access if (for instance) 
it dumps logs there, but why did its permission set include read access 
to buckets hosting sensitive data? In a least-priv model the permissions 
given to the WAF appliance should have been "you can only access the log 
bucket and nothing else".

2) S3 buckets hosting sensitive data probably should have had source-IP 
access rules applied to them. The attacker apparently did not steal the 
data via the WAF -- she used the WAF to pull the API keys out and then 
accessed S3 via other methods using the ripped keys. This meant that she 
was accessing from IPs that pretty clearly were not internal or 
Capital One managed.

3) This is in the hindsight-20/20 category, but if they were running 
sophisticated security tooling they presumably should have been able to 
detect that an API key pair was suddenly acting "out of character" and 
accessing things that it had never touched in the past.  This is the "we 
have the capability to alert on anomalous behavior" feature that a lot 
of people want or think they can buy ready to go off the shelf.  I sort 
of give them a pass on this because right now this part of the security 
industry is full of bullshit vendors selling "AI" solutions that will 
magically solve all your problems.  This is still a very hard problem in 
2019 -- finding the anomalous needle in a haystack without crushing your 
analysts under a wave of false positives.
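
To make points 1 and 2 concrete, here is a rough sketch of what the two 
policies might look like, built as plain Python dicts in the standard 
IAM JSON policy layout. The bucket names and the CIDR range are made up 
for illustration, and I am assuming the WAF only needs to write log 
objects; nothing here is taken from the actual Capital One setup.

```python
import json

# Hypothetical names and ranges, for illustration only.
LOG_BUCKET = "example-waf-logs"
SENSITIVE_BUCKET = "example-sensitive-data"
TRUSTED_CIDRS = ["203.0.113.0/24"]  # documentation-only address range


def waf_role_policy(log_bucket):
    """Identity policy for the WAF instance role: write logs, nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "WriteLogsOnly",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{log_bucket}/*",
            }
        ],
    }


def source_ip_bucket_policy(bucket, trusted_cidrs):
    """Bucket policy that denies requests from outside the trusted ranges."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideTrustedNetworks",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {
                    "NotIpAddress": {"aws:SourceIp": trusted_cidrs}
                },
            }
        ],
    }


print(json.dumps(waf_role_policy(LOG_BUCKET), indent=2))
print(json.dumps(
    source_ip_bucket_policy(SENSITIVE_BUCKET, TRUSTED_CIDRS), indent=2))
```

With the first policy, stolen WAF credentials could write to the log 
bucket and do nothing else. The second is a blunt instrument: a blanket 
deny like this can lock out your own admins too, and traffic arriving 
through a VPC endpoint would need a different condition key, so treat 
it as a starting point rather than something to paste in.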

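The "out of character" check in point 3 can be sketched in its most 
naive form as a first-seen test: remember which buckets each key has 
touched historically and flag anything new. The event tuples and names 
below are invented for illustration; in practice the input would come 
from something like CloudTrail S3 data events, and this is nowhere near 
what a real detection product does.

```python
from collections import defaultdict


def build_baseline(events):
    """Map each key id to the set of buckets it touched in the past."""
    baseline = defaultdict(set)
    for key_id, bucket in events:
        baseline[key_id].add(bucket)
    return baseline


def flag_new_access(baseline, events):
    """Return events where a key touches a bucket not in its baseline."""
    return [
        (key_id, bucket)
        for key_id, bucket in events
        if bucket not in baseline.get(key_id, set())
    ]


# Hypothetical history: the WAF role has only ever written logs.
history = [
    ("waf-role", "example-waf-logs"),
    ("waf-role", "example-waf-logs"),
    ("app-role", "example-sensitive-data"),
]

# Hypothetical recent activity: the WAF role suddenly reads elsewhere.
recent = [
    ("waf-role", "example-waf-logs"),
    ("waf-role", "example-sensitive-data"),
]

baseline = build_baseline(history)
alerts = flag_new_access(baseline, recent)
print(alerts)
```

Even this toy version shows why the problem is hard: every new 
deployment, new bucket or new job trips the same alarm, which is 
exactly the wave of false positives mentioned above.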

Chris



> Jonathan Aquilina <mailto:jaquilina at eagleeyet.net>
> August 1, 2019 at 1:05 AM
> Hi Gerald,
>
> I think the question is how do these cloud providers let such 
> misconfigurations get through to production systems. Aren't audits 
> carried out to ensure that this doesn’t happen?
>
> Regards,
> Jonathan
>
> -----Original Message-----
> From: Beowulf <beowulf-bounces at beowulf.org> On Behalf Of Gerald Henriksen
> Sent: Thursday, 1 August 2019 02:46
> To: Beowulf at beowulf.org
> Subject: Re: [Beowulf] Lustre on google cloud
>
>
> Not sure what the Capital One data breach has to do with the cloud, it 
> was (yet again?) misconfigured software that allowed the theft.
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin 
> Computing To change your subscription (digest mode or unsubscribe) 
> visit https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
> Jonathan Aquilina <mailto:jaquilina at eagleeyet.net>
> July 31, 2019 at 12:10 AM
>
> Hi Jon,
>
> They now have Lustre through FSx or whatever AWS has called it. I am 
> not sure you guys have heard about the Capital One data breach but at 
> times I'm still rather wary of the cloud.
>
> Regards,
>
> Jonathan
>
> *From:*Jonathan Engwall <engwalljonathanthereal at gmail.com>
> *Sent:* Wednesday, 31 July 2019 01:03
> *To:* Douglas Eadline <deadline at eadline.org>
> *Cc:* Jonathan Aquilina <jaquilina at eagleeyet.net>; Beowulf Mailing 
> List <Beowulf at beowulf.org>; Chris Samuel <chris at csamuel.org>
> *Subject:* Re: [Beowulf] Lustre on google cloud
>
> AWS has a host of free tier services you should blend together. 
> Elastic Beanstalk and Lambda (AWS proprietary lambda) can move lots of 
> data below a cost level.
>
> Your volume will automatically cause billing obviously. I have a 
> friend at AWS. Maybe something new is going on, I can check up with him.
>
> On Mon, Jul 29, 2019, 11:24 AM Douglas Eadline <deadline at eadline.org 
> <mailto:deadline at eadline.org>> wrote:
>
>
>
> Douglas Eadline <mailto:deadline at eadline.org>
> July 29, 2019 at 2:04 PM
>> What would be the reason for getting such large data sets back on premise?
>> Why not leave them in the cloud for example in an S3 bucket on amazon or
>> google data store.
> I think this touches on the ownership issue I have seen some
> people mention (I think Addison Snell or i360). That is, you own
> the data but not the infrastructure.
>
> To use the "data lake" analogy, you start
> out creating a swimming pool in the cloud. You own
> the water, but it is in someone else's pool. Manageable.
> At some point your little pool becomes a big lake. Moving the lake,
> for any number of reasons, becomes a really big issue and possibly
> unmanageable.
>
> "For any number of reasons" can be cost, performance, access,
> etc. and the issues you never imagined (a black swan as it were)
>
> Just like everything else, it all depends ... (and how risk averse
> you are).
>
> --
> Doug
>
>
>
>> Regards,
>> Jonathan
>>
>> -----Original Message-----
>> From: Beowulf <beowulf-bounces at beowulf.org> On Behalf Of Chris Samuel
>> Sent: Sunday, 28 July 2019 03:36
>> To: beowulf at beowulf.org
>> Subject: Re: [Beowulf] Lustre on google cloud
>>
>> On Friday, 26 July 2019 4:46:56 AM PDT John Hearns via Beowulf wrote:
>>
>>> Terabyte scale data movement into or out of the cloud is not scary in
>>> 2019.
>>> You can move data into and out of the cloud at basically the line rate
>>> of your internet connection as long as you take a little care in
>>> selecting and tuning your firewalls and inline security devices.
>>> Pushing  1TB/day etc.
>>> into the cloud these days is no big deal and that level of volume is
>>> now normal for a ton of different markets and industries.
>> Whilst this is true as Chris points out this does not mean that there
>> won't be data transport costs imposed by the cloud provider (usually for
>> egress).
>>
>> All the best,
>> Chris
>> --
>>    Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
>>
>>
>>
