Locality and caching in parallel/distributed file systems
Joseph Landman landman at scalableinformatics.com
Tue Dec 3 03:46:29 PST 2002
On Tue, 2002-12-03 at 19:57, Andrew Fant wrote:
> Morning all,
> Lately, I have been thinking a lot about parallel filesystems in
> the most unrigorous way possible. Knowing that PVFS simply stripes the
> data across the participating filesystems, I was wondering if anyone had
> tried to apply caching technology and file migration capacities to a
> parallel/distributed filesystem in a manner analogous to SGI's ccNUMA
> memory architecture. That is, distributing files in the FS to various
> nodes, keeping track of where the accesses are coming from, and moving
> the file to another node if that is where some suitable percentage of the

cough cough <avaki> cough cough...

Distributed parallel file systems require distributed data and local-speed access to make any sense. I am sure others may disagree, but any file system in which you need to shuttle metadata about will generally not scale well (unless you have NUMAlink-like speed/latency, which pushes the scaling wall way out, but it is still there).

Cluster file systems have been the rage in the past as one of the next great things. I guess I advocate waiting and seeing on this, as I have not yet seen a scalable distributed file system (and if someone knows of one that is not too painful, please let me know).

My definition of a scalable distributed file system is, BTW, one that connects to every compute node and gives local I/O speed to simultaneous reads and writes (to the same or different files) across the single namespace. This definition may not be in line with others, but it is what I use to understand the issues.

The idea in building any scalable resource (net, computing, disk, etc.) is to avoid single points of information flow. Maintaining metadata for file systems represents exactly that. You get hot-spot formation, and you start having to do interesting gymnastics to overcome it (if it is possible to overcome at all).

Data motion is rapidly becoming one of the hardest issues to deal with. Good thread start there, Andy!
--
Joseph Landman, Ph.D.
Scalable Informatics LLC
email: landman at scalableinformatics.com
web: http://scalableinformatics.com
voice: +1 734 612 4615
fax: +1 734 398 5774
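
The scheme raised in the quoted question (track where accesses to a file come from, then move the file once some node accounts for a suitable share of them) could be sketched roughly as below. This is only an illustration and not how PVFS, ccNUMA, or any product mentioned in the thread works. The class name `AccessTracker`, the `threshold` and `min_accesses` parameters, and the node and path names are all hypothetical.

```python
from collections import Counter, defaultdict


class AccessTracker:
    """Toy access-driven file migration policy (hypothetical, for illustration)."""

    def __init__(self, threshold=0.6, min_accesses=10):
        self.threshold = threshold        # assumed share of accesses that triggers a move
        self.min_accesses = min_accesses  # assumed minimum sample size before deciding
        self.home = {}                    # path -> node currently holding the file
        self.counts = defaultdict(Counter)  # path -> per-node access counts

    def place(self, path, node):
        """Record the node that initially holds the file."""
        self.home[path] = node

    def record(self, path, node):
        """Count one access; return the new home node if a migration is triggered, else None."""
        counts = self.counts[path]
        counts[node] += 1
        total = sum(counts.values())
        if total < self.min_accesses:
            return None
        top_node, top_count = counts.most_common(1)[0]
        if top_node != self.home[path] and top_count / total >= self.threshold:
            self.home[path] = top_node
            counts.clear()  # start a fresh observation window after moving
            return top_node
        return None


if __name__ == "__main__":
    tracker = AccessTracker(threshold=0.6, min_accesses=5)
    tracker.place("/data/a", "node0")
    for _ in range(5):
        moved_to = tracker.record("/data/a", "node1")
    print(moved_to, tracker.home["/data/a"])
```

Note that this sketch keeps every access count and every file location in one object. In a real cluster that bookkeeping is itself metadata that has to be stored and consulted somewhere, which is the single point of information flow, and the hot-spot risk, that the reply above warns about.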
More information about the Beowulf mailing list
