[Beowulf] SSDs for HPC?

Prentice Bisbal prentice.bisbal at rutgers.edu
Mon Apr 7 11:53:42 PDT 2014


Beowulfers,

What are the current opinions on SSDs in the local nodes of HPC clusters? 
In this case, I'm including Hadoop in HPC, since I think I need to 
provide that capability. How I will provide that is a whole 'nother 
discussion, so let's not discuss that here. While looking into this, 
this is what I've come across from my own research and discussions with 
others.

1. Using SSDs on local nodes for HDFS isn't really useful because SSDs 
are still too small. Also, since 'big data' is mostly large sequential reads, 
the performance advantage of SSDs over spinning disks isn't as significant 
as it would be with a lot of random I/O. In other words, the large size of a 
spinning disk is more valuable than the speed of SSDs in this use case.

2. SSDs used as local scratch disk can significantly speed up 
applications that write to intermediate files, and remove the burden on 
the parallel filesystem at the same time, but not many users will take 
advantage of this.

3. The best place for SSDs is on the parallel filesystem, where they can 
be used as a burst buffer.

4. SSDs wearing out. Is that still a concern, or are lifespans getting 
better? I think Jim Lux once did calculations on this list to show that 
with wear-leveling and everything else, even if you wrote to an SSD 
constantly, it would still outlast the average lifespan of a cluster.
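
For anyone who wants to redo the kind of estimate mentioned in item 4 with 
their own numbers, here is a minimal Python sketch. It is not Jim's 
calculation; the function name and every parameter (capacity, rated 
program/erase cycles, sustained write rate, write amplification, duty 
cycle) are placeholders I am assuming for illustration, and it assumes 
ideal wear-leveling. Take the real figures from the drive's datasheet and 
from your own I/O monitoring.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def ssd_lifetime_years(capacity_gb, pe_cycles, write_mb_per_s,
                       write_amplification=1.0, duty_cycle=1.0):
    """Rough years until the rated P/E cycles are used up.

    Assumes ideal wear-leveling (every cell is written equally often).
    All inputs are hypothetical placeholders, not vendor data:
      capacity_gb         -- drive capacity in GB (10^9 bytes)
      pe_cycles           -- rated program/erase cycles per cell
      write_mb_per_s      -- host write rate while writing, in MB/s
      write_amplification -- flash bytes written per host byte (>= 1)
      duty_cycle          -- fraction of the time the node is writing
    """
    # Total host bytes the drive can absorb before reaching its rating.
    total_host_bytes = capacity_gb * 1e9 * pe_cycles / write_amplification
    # Host bytes actually written per year at the given rate and duty cycle.
    host_bytes_per_year = write_mb_per_s * 1e6 * duty_cycle * SECONDS_PER_YEAR
    return total_host_bytes / host_bytes_per_year


if __name__ == "__main__":
    # Example inputs only; substitute measured values.
    for duty in (1.0, 0.1, 0.01):
        years = ssd_lifetime_years(400, 3000, 100, duty_cycle=duty)
        print(f"duty cycle {duty:>4}: about {years:.1f} years")
```

The answer is very sensitive to the write rate and duty cycle you plug in, 
so it is worth measuring how much a typical job actually writes to local 
scratch before drawing conclusions either way.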

Okay. Have at it!

-- 
Prentice



