Hard disk, optical, and tape drives are electromechanical devices: hybrids of electronic and mechanical components. The inertia of their moving parts limits how quickly the media can be accessed and data retrieved. The mechanical portions of a drive are subject to wear and tear and are also the critical limiting factor for performance. Even though they can be physically improved up to a point, such mechanical improvements usually come at an increased component cost.
Failure prediction has long been the Holy Grail of systems administrators. The downtime caused by unanticipated outages weighs heavily on the bottom line as well as on overall performance. To better understand how failure prediction can be performed, we next present some techniques used to manage defects on drives.
Read Error Recovery
Magnetic tape, hard disk, optical, and magneto-optical drives each implement a vendor-specific read error management strategy, but the general approach is similar: most drives run a sequence of recovery techniques, from the simplest to the most sophisticated.
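The escalation described above can be sketched as a "ladder" of recovery steps tried in order. This is only an illustrative model, not any vendor's firmware: the step names (plain re-read, re-read with head offset, extended ECC correction) and the simulated success conditions are assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple

# A recovery step takes a block address and returns the data, or None on failure.
RecoveryStep = Callable[[int], Optional[bytes]]


def read_with_recovery(
    lba: int, steps: List[Tuple[str, RecoveryStep]]
) -> Tuple[Optional[bytes], Optional[str], int]:
    """Try each step in order, simplest first.

    Returns (data, name of the step that succeeded, number of steps tried).
    If every step fails, data and name are None.
    """
    attempts = 0
    for name, step in steps:
        attempts += 1
        data = step(lba)
        if data is not None:
            return data, name, attempts
    return None, None, attempts


# Hypothetical, simulated recovery steps: each one "recovers" a wider range
# of blocks than the previous one. Real drives use vendor-specific techniques.
def plain_reread(lba: int) -> Optional[bytes]:
    return b"ok" if lba < 100 else None


def offset_reread(lba: int) -> Optional[bytes]:
    return b"ok" if lba < 200 else None


def extended_ecc(lba: int) -> Optional[bytes]:
    return b"ok" if lba < 300 else None


LADDER: List[Tuple[str, RecoveryStep]] = [
    ("plain re-read", plain_reread),
    ("re-read with head offset", offset_reread),
    ("extended ECC correction", extended_ecc),
]
```

Calling `read_with_recovery(150, LADDER)` in this model succeeds on the second step. The number of steps needed is the kind of signal a monitoring tool could log per block, since blocks that need deeper recovery are more marginal.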
Read/Write Reallocation
The SCSI standard defines Automatic Write Reallocation and Automatic Read Reallocation, enabled by the AWRE and ARRE bits of the Read-Write Error Recovery mode page. Automatic write reallocation was implemented long ago on WORM drives, using the SCSI Write and Verify command. Once the data is written, a Verify is performed, generally with the head servos (focus, tracking, phase) slightly degraded, in order to detect marginal sectors/blocks based on ECC errors. Those sectors/blocks are then reallocated to a pool of alternate sectors, and a reallocation table is updated to reflect the change. Following the reallocation, the disk-write task continues until it is complete.
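The write, verify, and reallocate cycle can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not real drive firmware: the `SimulatedDisk` class, its `marginal` set (standing in for sectors that fail the degraded-servo verify), and the spare-pool addresses are all hypothetical.

```python
from typing import Dict, Iterable, List, Set


class SimulatedDisk:
    """Toy model of write-verify with automatic reallocation (hypothetical)."""

    def __init__(self, marginal: Iterable[int], spare_start: int, spare_count: int):
        self.marginal: Set[int] = set(marginal)   # physical sectors that fail verify
        self.store: Dict[int, bytes] = {}         # physical sector -> data
        self.spares: List[int] = list(range(spare_start, spare_start + spare_count))
        self.realloc_table: Dict[int, int] = {}   # logical sector -> spare sector

    def _physical(self, logical: int) -> int:
        # Follow the reallocation table if the sector was remapped.
        return self.realloc_table.get(logical, logical)

    def _verify(self, physical: int, data: bytes) -> bool:
        # Stand-in for the ECC-based verify pass: marginal sectors always fail.
        return physical not in self.marginal and self.store.get(physical) == data

    def write_verify(self, logical: int, data: bytes) -> int:
        """Write, verify, and remap to a spare on failure. Returns the physical sector used."""
        physical = self._physical(logical)
        self.store[physical] = data
        while not self._verify(physical, data):
            if not self.spares:
                raise OSError("spare pool exhausted")
            physical = self.spares.pop(0)
            self.realloc_table[logical] = physical  # update the reallocation table
            self.store[physical] = data
        return physical

    def read(self, logical: int) -> bytes:
        return self.store[self._physical(logical)]


# Example: sector 2 is marginal, so it is remapped and the write task continues.
disk = SimulatedDisk(marginal={2}, spare_start=1000, spare_count=4)
for lba, payload in enumerate([b"a", b"b", b"c", b"d"]):
    disk.write_verify(lba, payload)
```

After this run the reallocation table maps logical sector 2 to spare 1000, and reads of sector 2 are served transparently from the spare. In this model, the size of `realloc_table` and the remaining length of `spares` are the quantities one would watch over time as indicators of media degradation.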
Built-in Diagnostic Tests
Most magnetic tape, hard disk, optical, and magneto-optical drives, as well as libraries, implement a set of tests built into the firmware. These tests can be invoked from a front panel or from host-based applications. Drive Self-Test (DST) is an HDD industry standard that was adopted by all major PC OEMs. Some vendors have developed enhanced built-in tests.