[Beowulf] since we are talking about file systems ...

Robert G. Brown rgb at phy.duke.edu
Sun Jan 22 10:23:32 PST 2006


On Sun, 22 Jan 2006, PS wrote:

> Indexing is the key;  observe how Google accesses millions of files in split 
> seconds; this could easily be achieved in a PC file system.

I think that you mean the right thing, but you're saying it in a very
confusing way.

1) Google doesn't access millions of files in a split second; AFAIK it
accesses a relatively small number of hash files (on its "index
server") that lead to URLs in a split second WITHOUT actually
traversing millions of alternatives (as you say, indexing is the
key:-).  File access latency on a physical disk makes the former all
but impossible without highly specialized kernel hacks/hooks, ramdisks,
caches, disk arrays, and so on.  Even bandwidth would be a limitation
if one assumes block I/O with a minimum block size of 4K -- 4K x 1M ->
4 Gigabytes/second (note BYTES, not bits), which exceeds the bandwidth
of pretty much any physical medium except maybe memory.
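
Just to spell out that back-of-envelope number (the 4K block size and
the one-million-accesses-per-second rate are the illustrative figures
above, not measurements of anything real):

  # Back-of-envelope check: bandwidth needed to touch a million files
  # per second if each access costs at least one 4 KB block read.
  block_size_bytes = 4 * 1024        # assumed minimum block I/O per file
  accesses_per_second = 1_000_000    # "millions of files in a split second"

  bandwidth = block_size_bytes * accesses_per_second
  print(f"{bandwidth / 1e9:.1f} GB/s")   # ~4.1 GB/s -- BYTES, not bits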

2) It cannot "easily" be achieved in a PC file system, if by that you
mean building an actual filesystem (at the kernel level) that supports
this sort of access.  There is a lot more to a scalable, robust,
journaling filesystem than directory lookup capabilities.  A lot of
Google's speed comes from being able to use substantial parallelism on
a distributed server environment with lots of data replication and
redundancy, something that is impossible for a PC filesystem with
latency and bandwidth bottlenecks at several points in the dataflow
pathway toward what is typically a single physical disk on a single
channel (PCI or whatever).

I think that what you mean (correctly) is that this is something "most"
user/programmers would be better off doing in userspace on top of a
general purpose, known reliable/robust/efficient PC filesystem, using
hashes customized to the application.  When I first read your reply,
though, I read it very differently, as saying that it would be easy to
build a Linux filesystem that actually permits millions of files per
second to be accessed, and that this is what Google does operationally.
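
To make the userspace approach concrete, here is a minimal sketch using
a plain dbm index that maps keys to file paths.  Everything in it (the
key scheme, the paths, the choice of dbm) is just an assumption for
illustration -- not how Google or any particular filesystem actually
does it:

  import dbm
  import os

  # Build a persistent userspace index once: key -> file path.
  # One pass over the tree at build time, roughly O(1) lookups afterward.
  def build_index(index_path, data_root):
      with dbm.open(index_path, "c") as index:
          for dirpath, _dirnames, filenames in os.walk(data_root):
              for name in filenames:
                  # Hypothetical key scheme: the bare filename.  A real
                  # application would hash whatever identifies the record.
                  index[name] = os.path.join(dirpath, name)

  # Later lookups go straight to the stored path; no directory traversal.
  def lookup(index_path, key):
      with dbm.open(index_path, "r") as index:
          try:
              return index[key]      # path, returned as bytes
          except KeyError:
              return None

  # Hypothetical usage:
  #   build_index("archive.idx", "/data/archive")
  #   print(lookup("archive.idx", "record-000042"))

The point is simply that the indexing lives in the application; the
underlying filesystem only has to be reliable and reasonably fast at
opening a path it is handed.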

    rgb

>
>
> Paul
>
>
> Joe Landman wrote:
>
>> Methinks I lost lots of folks with my points ...
>> 
>> Major thesis is that on well designed hardware/software/filesystems, 50000 
>> files is not a problem for accesses (though from a management point of view 
>> it is a nightmare).  For poorly designed/implemented file systems it is a 
>> nightmare.
>> 
>> Way back when in the glory days of SGI, I seem to remember xfs being tested 
>> with millions of files per directory (though don't hold me to that old 
>> memory).  Call this hearsay at this moment.
>> 
>> A well designed and implemented file system shouldn't bog you down as you 
>> scale out in size, even if you shouldn't.  Its sort of like your car.  If 
>> you go beyond 70 MPH somewhere in the US that supports such speeds, your 
>> transmission shouldn't just drop out because you hit 71 MPH.
>> 
>> Graceful degradation is a good thing.
>> 
>> Joe
>> 
>> 
>

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu




