[Beowulf] Data Destruction

Jörg Saßmannshausen sassy-work at sassy.formativ.net
Wed Sep 29 15:41:08 UTC 2021


Dear all,

This is an interesting discussion and very timely for me, as we are currently 
setting up a new HPC facility, using OpenStack throughout so we can build a 
Data Safe Haven with it as well.
The question of data security has also come up in various conversations, both 
internal and with industrial partners. 
Here I actually asked one collaboration partner what they understand by 
"data at rest":
- the drive has been turned off
- the data is not being accessed

The former is easy: we simply encrypt all drives, one way or another. 
However, that means that while the drive is on and unlocked, the data is 
readable in the clear. 

The latter is a bit more complicated, as you need to decrypt the 
files/folders whenever you want to access them. This, however, in addition to 
the drive encryption itself, should potentially give you the maximum security. 
When you want to destroy that data, deleting the encrypted container *and* the 
access key, i.e. the piece you need to decrypt it, like a Yubikey, should in 
my humble opinion be enough for most data. 
If you need more, shred the drive and don't use fancy stuff like RAID or a PFS. 
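
To make the "delete the container *and* the key" idea concrete, here is a 
minimal sketch in Python. It is only an illustration under loud assumptions: 
a one-time pad stands in for real disk or container encryption (LUKS or 
similar) so that the example needs nothing beyond the standard library, the 
key sits in an ordinary file rather than on a hardware token like a Yubikey, 
and the file names and helper functions (create_container, read_container, 
destroy) are hypothetical. Note also that os.remove() only unlinks a file; it 
does not scrub the underlying blocks, which is exactly why the key should live 
somewhere you can destroy reliably.

```python
import os
import secrets
import tempfile
from typing import Tuple


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR two equal-length byte strings (one-time pad encrypt/decrypt)."""
    if len(data) != len(key):
        raise ValueError("one-time pad key must match the data length")
    return bytes(d ^ k for d, k in zip(data, key))


def create_container(directory: str, plaintext: bytes) -> Tuple[str, str]:
    """Encrypt plaintext and store ciphertext and key in separate files.

    Hypothetical layout: in practice the key would not sit next to the
    container but on a separate device or token.
    """
    key = secrets.token_bytes(len(plaintext))
    container_path = os.path.join(directory, "container.bin")
    key_path = os.path.join(directory, "access.key")
    with open(container_path, "wb") as fh:
        fh.write(xor_bytes(plaintext, key))
    with open(key_path, "wb") as fh:
        fh.write(key)
    return container_path, key_path


def read_container(container_path: str, key_path: str) -> bytes:
    """Decrypt the container; this only works while the key still exists."""
    with open(container_path, "rb") as c, open(key_path, "rb") as k:
        return xor_bytes(c.read(), k.read())


def destroy(container_path: str, key_path: str) -> None:
    """Remove the key first, then the container.

    Once the key is gone, any leftover copy of the ciphertext (snapshot,
    backup, RAID remnant) can no longer be decrypted.
    """
    for path in (key_path, container_path):
        if os.path.exists(path):
            os.remove(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        container, key = create_container(tmp, b"sensitive record")
        print(read_container(container, key))
        destroy(container, key)
        print(os.path.exists(container), os.path.exists(key))
```

The order in destroy() is deliberate: the key is the small piece you can 
actually get rid of for certain, so it goes first, and the container is 
removed afterwards as housekeeping.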

If you still need more, don't store the data at all but print it out on paper 
and destroy it by means of incineration. :D

How about that?

All the best from a sunny London

Jörg

On Wednesday, 29 September 2021, 15:57:17 BST, Skylar Thompson wrote:
> In this case, we've successfully pushed back with the granting agency (US
> NIH, generally, for us) that it's just not feasible to guarantee that the
> data are truly gone on a production parallel filesystem. The data are
> encrypted at rest (including offsite backups), which has been sufficient
> for our purposes. We'll then just use something like GNU shred(1) to do a
> best-effort secure delete.
> 
> In addition to RAID, other confounding factors to be aware of are snapshots
> and cached data.
> 
> On Wed, Sep 29, 2021 at 10:52:33AM -0400, Paul Edmon via Beowulf wrote:
> > I guess the question is for a parallel filesystem how do you make sure you
> > have 0'd out the file without borking the whole filesystem since you are
> > spread over a RAID set and could be spread over multiple hosts.
> > 
> > -Paul Edmon-
> > 
> > On 9/29/2021 10:32 AM, Scott Atchley wrote:
> > > For our users that have sensitive data, we keep it encrypted at rest and
> > > in movement.
> > > 
> > > For HDD-based systems, you can perform a secure erase per NIST
> > > standards. For SSD-based systems, the extra writes from the secure erase
> > > > will contribute to the wear on the drives and possibly their eventual
> > > wearing out. Most SSDs provide an option to mark blocks as zero without
> > > having to write the zeroes. I do not think that it is exposed up to the
> > > PFS layer (Lustre, GPFS, Ceph, NFS) and is only available at the ext4 or
> > > XFS layer.
> > > 
> > > > On Wed, Sep 29, 2021 at 10:15 AM Paul Edmon <pedmon at cfa.harvard.edu> wrote:
> > >     The former.  We are curious how to selectively delete data from a
> > > >     parallel filesystem.  For example we commonly use Lustre, Ceph,
> > >     and Isilon in our environment.  That said if other types allow for
> > >     easier destruction of selective data we would be interested in
> > >     hearing about it.
> > >     
> > >     -Paul Edmon-
> > >     
> > >     On 9/29/2021 10:06 AM, Scott Atchley wrote:
> > > >     Are you asking about selectively deleting data from a parallel
> > > >     file system (PFS) or destroying drives after removal from the
> > > >     system either due to failure or system decommissioning?
> > > >     
> > > >     For the latter, DOE does not allow us to send any non-volatile
> > > >     media offsite once it has had user data on it. When we are done
> > > >     with drives, we have a very big shredder.
> > > >     
> > > > >     On Wed, Sep 29, 2021 at 9:59 AM Paul Edmon via Beowulf <beowulf at beowulf.org> wrote:
> > > > >         Occasionally we get DUA (Data Use Agreement) requests for
> > > > >         sensitive data that require data destruction (e.g. NIST
> > > > >         800-88). We've been struggling with how to handle this in
> > > > >         an era of distributed filesystems and disks.  We were
> > > > >         curious how other people handle requests like this.
> > > > >         What types of filesystems do people generally use for
> > > > >         this and how do people ensure destruction?  Do these types
> > > > >         of DUAs preclude certain storage technologies from
> > > > >         consideration or are there creative ways to comply using
> > > > >         more common scalable filesystems?
> > > >         
> > > >         Thanks in advance for the info.
> > > >         
> > > >         -Paul Edmon-
> > > >         
> > > >         _______________________________________________
> > > >         Beowulf mailing list, Beowulf at beowulf.org
> > > >         <mailto:Beowulf at beowulf.org> sponsored by Penguin Computing
> > > >         To change your subscription (digest mode or unsubscribe)
> > > >         visit https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
> > > >         <https://beowulf.org/cgi-bin/mailman/listinfo/beowulf>