Disk reliability (Was: Node cloning)
Josip Loncaric josip at icase.edu
Wed Apr 11 20:40:04 PDT 2001
Donald Becker wrote:
>
> On Wed, 11 Apr 2001, Robert G. Brown wrote:
>
> > I would assume the "erase" option is really a name for a new low level
> > reformat that fixes the latter kind of error and MIGHT even help with
>
> When they say "heal", they actually mean "remap to substitute disk
> blocks reserved for this purpose".  They must have thought that the
> concept of remapping disk blocks was too confusing.

I've found a few web pages which may be of interest.

The low level format on modern drives is created at the factory,
possibly using very precise disk drive servo track writing machines.
This process cannot be duplicated by any utility commands sent to the
hard drive.  However, each sector ID contains a flag indicating whether
the sector is defective or not.  What IBM's Drive Fitness Test and
similar tools do is not low level formatting but defect detection,
remapping of defective sectors and zero-fill of the data areas.

In a typical hard drive, embedded servo bursts (written at the factory)
are used to guide the disk heads (if those servo signals are erased, the
drive needs to be replaced).  They are followed by a gap, then the
sector ID, sync pattern, data area, ECC field and another gap.

In the mid-1990s, IBM developed the No-ID sector format, which uses the
disk space more efficiently (by up to 30%).  The embedded servo bursts
are still used to provide the servo signals, but the ID fields are
stored in solid state memory rather than taking space from each sector.
Also, improved servo tracking algorithms have reduced the problems
caused by increased vibration at 7200rpm.

http://www.storage.ibm.com/oem/tech/noid.htm
http://www.storage.ibm.com/hardsoft/diskdrdl/technolo/truetrack.htm
http://www.pcguide.com/ref/hdd/geom/tracksSector-c.html
http://www.pcguide.com/ref/hdd/geom/formatDefect-c.html
http://www.pcguide.com/ref/hdd/geom/formatUtilities-c.html

Bad sectors on a disk are, well, bad.  You do not want them, and if you
can get a good disk instead, doing so is a good idea.  However, there
are also reasonably good software solutions, which primarily apply when
(1) replacing 25% of the slightly troubled disks in your cluster is a
pain and (2) the data on these disks is replicated 64 times with at
least 75% of the copies being good.  A regular application of
'e2fsck -c ...' and suitable 'rsync -ac ...' commands can keep such a
cluster operating with reasonable confidence.

Finally, keep in mind that if 25-35% of brand new IDE disks can develop
bad blocks, a similar percentage of the replacements could also develop
bad blocks.  A zero tolerance policy will mean at least an hour of a
system administrator's time per incident to swap the disk, reload the
software and do the paperwork for the replacement.  This would be
repeated every time another unit develops a bad block.  The software
alternative (e2fsck -c ...) can be automated and can keep the entire
system operational until a group of seriously defective drives can be
replaced together.

While most IDE drives can work flawlessly, it bears noting that cheap
IDE drives are designed for lighter duty than expensive SCSI models.
The IDE drives are typically designed for 11 hours/day operation, but
Beowulf clusters operate them 24 hours/day, 365 days a year, for years
on end.  [Moreover, some IDE drive servos are designed to calibrate
certain parameters at powerup, so if the drive is never powered down,
its mechanical parameters might drift away from the calibration.]  For a
list of SCSI/IDE differences, see
http://www.storage.ibm.com/techsup/hddtech/ide_raid.htm

Sincerely,
Josip

--
Dr. Josip Loncaric, Research Fellow               mailto:josip at icase.edu
ICASE, Mail Stop 132C           PGP key at http://www.icase.edu./~josip/
NASA Langley Research Center             mailto:j.loncaric at larc.nasa.gov
Hampton, VA 23681-2199, USA    Tel. +1 757 864-2192  Fax +1 757 864-6134
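
The rsync-based repair described above depends on knowing which copies are still good, and the post does not say how that is decided. A minimal sketch, assuming per-node checksums of one replicated file have already been collected, is a majority vote: take the most common checksum as the reference and list the nodes that disagree, so they can be refreshed with 'rsync -ac ...' from a good node. The function name find_bad_replicas, the node names and the checksum strings are all hypothetical; the 0.75 threshold mirrors the "at least 75% of the copies being good" condition.

```python
from collections import Counter


def find_bad_replicas(checksums, min_good_fraction=0.75):
    """Pick a reference checksum by majority vote.

    checksums maps a (hypothetical) node name to the checksum of that
    node's copy of one file. Returns the reference checksum and the
    sorted list of nodes whose copy differs from it. Raises ValueError
    if no checksum is shared by at least min_good_fraction of the nodes.
    """
    if not checksums:
        raise ValueError("no replicas given")
    counts = Counter(checksums.values())
    reference, votes = counts.most_common(1)[0]
    if votes < min_good_fraction * len(checksums):
        raise ValueError("no checksum is shared by enough replicas")
    bad = sorted(node for node, c in checksums.items() if c != reference)
    return reference, bad


# Hypothetical example: 64 nodes, 3 of which hold a damaged copy.
replicas = {f"node{i:02d}": "abc123" for i in range(64)}
for damaged in ("node05", "node17", "node40"):
    replicas[damaged] = "corrupt-" + damaged

reference, to_resync = find_bad_replicas(replicas)
print(reference, to_resync)
```

The nodes returned in to_resync are the candidates for an 'rsync -ac ...' refresh; if the vote fails, the files need manual inspection rather than an automatic copy.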
More information about the Beowulf mailing list
