Unexplained I/O errors
Steven Timm timm at fnal.gov
Tue Jul 17 08:19:01 PDT 2001
Hi everyone,

We are currently burning in a new cluster and seeing the following problem: a number of files, usually contiguous in the same directory, that ls will list as being there, but for which ls -l shows "Input/output error". An fsck of the system gets rid of the I/O errors but also gets rid of the files. There is no error message on the console, nor in /var/log/messages, to indicate any disk controller problems. The problem appears to get worse over time: over a period of a few days, the majority of our 136 machines exhibit these errors.

Our configuration: Supermicro 370DLE motherboard, 2x1000MHz Pentium III, 512 MB RAM, Seagate system disk (30 GB) and CD-ROM on IDE primary, 2x40GB IBM drives on IDE secondary.

hda: ST330620A, ATA DISK drive
hdb: CD-ROM 48X/AKH, ATAPI CDROM drive
hdc: IC35L040AVER07-0, ATA DISK drive
hdd: IC35L040AVER07-0, ATA DISK drive

I/O errors happen only on the system disk. We swapped out a large number of IDE cables for the system disk, replacing them with a better grade, with no luck. We have downgraded a few machines to the 2.2.16 kernel, and this appears to be OK, but it is a bit early to tell. We have also pulled the CD-ROMs off of a few machines, and this also appears to be stable, but we need more data yet.

Any idea what could be causing all of this?

Steve

------------------------------------------------------------------
Steven C. Timm (630) 840-8525 timm at fnal.gov http://home.fnal.gov/~timm/
Fermilab Computing Division/Operating Systems Support
Scientific Computing Support Group--Computing Farms Operations
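The symptom described above (an entry that ls lists but that ls -l cannot stat) could be surveyed across nodes with something like the following. This is only a rough sketch, not the tooling used at the site: the function name find_unstatable and the choice to walk from a given root are assumptions made for illustration. It relies only on Python's standard os.walk and os.lstat calls.

```python
import errno
import os
import sys


def find_unstatable(root):
    """Return (path, errno) pairs for entries that are listed in a
    directory but fail when stat'ed, as with the ls vs. ls -l symptom."""
    bad = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                os.lstat(path)  # same kind of call that ls -l needs to succeed
            except OSError as exc:
                bad.append((path, exc.errno))
    return bad


if __name__ == "__main__":
    # Hypothetical usage: python find_unstatable.py /some/directory
    for path, code in find_unstatable(sys.argv[1]):
        label = "EIO" if code == errno.EIO else errno.errorcode.get(code, str(code))
        print(label, path)
```

Running this periodically on each node and counting the EIO lines would give a crude measure of how quickly the problem spreads, and of whether the kernel downgrade or the CD-ROM removal actually helps.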
More information about the Beowulf mailing list
