Technology

The emergence of ATA drives as a serious alternative for enterprise storage promises to significantly reduce storage acquisition costs. Serial ATA amplifies this trend by adding features such as hot-plug capability, higher bandwidth, and thin cabling. However, the RAID technologies currently used with SCSI and Fibre Channel storage are ill-suited to the lower-cost ATA arena. The pervasive use of disk write-back caching and the high cost of NVRAM-based solutions undermine the reliability and price advantages of the ATA platform.

Write-back Caches

ATA drives are slower than their SCSI and Fibre Channel counterparts in both seek time and rotational speed (RPM). To compensate, ATA drive manufacturers have implemented write-back caches on the drives to improve write performance. With write-back caching turned on by default, an ATA drive can signal the completion of a write sooner than if it had to wait until the data was completely transferred to the disk media. However, although write-back caching provides a performance boost, data corruption may occur after a power failure if cached data has not been flushed to the disk media. We have conducted experiments to demonstrate the effect of leaving write-back caches enabled. See our report on write-back cache experiments for details.

RAID5 Atomic Parity Updates

One of the challenges in implementing a reliable RAID5 solution is guaranteeing that data and parity are updated atomically, especially when power fails in the middle of an update. High-end hardware RAID5 controllers solve this problem with battery-backed non-volatile RAM (NVRAM), which ensures parity consistency. Solutions without NVRAM log the needed information to other regions of the disk before performing the actual updates, typically at the expense of performance. See an introduction to RAID5 for more discussion of RAID5 atomic parity updates.
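To make the atomic parity update problem concrete, the following is a minimal Python sketch of XOR parity on a single stripe. The block contents and the `xor_blocks` helper are illustrative assumptions, not part of any product described here. It shows that a lost block can be rebuilt from the surviving blocks and parity, and that a data write that reaches the disk without its matching parity write leaves the stripe silently inconsistent.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Return the bytewise XOR of equally sized blocks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus one parity block (toy 4-byte blocks).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(*data)

# Consistent stripe: the block on a failed disk is rebuilt from the rest.
rebuilt = xor_blocks(data[0], data[2], parity)
assert rebuilt == data[1]

# Non-atomic update: new data reaches disk 0, but power fails before
# the parity block is rewritten, so the parity on disk is now stale.
data_after_crash = [b"ZZZZ", data[1], data[2]]
stale_parity = parity

# If disk 1 fails later, reconstruction from stale parity yields wrong
# contents, and nothing in the stripe itself reveals the error.
corrupt = xor_blocks(data_after_crash[0], data_after_crash[2], stale_parity)
assert corrupt != data[1]
```

NVRAM and on-disk logging both exist to close this window: they record enough information to finish or undo the update after power returns.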
SR5 Software RAID

SR5 is a patented RAID5 technology based on a log-structured methodology. It guarantees atomic updates of all RAID5 parity stripes without using expensive NVRAM or disk-based logging techniques that degrade performance. In addition, it operates with the disk write-back cache turned on, yet always ensures ordered writes to the RAID5 disk array.

With the Auto Write Cache technology, SR5 provides the following guarantee. Suppose an application writes blocks in the order B(i), B(i+1), B(i+2), ... to the block device exposed by SR5. After recovery from a power failure, if SR5 declares B(j) to be the last valid block written to the disk media, then SR5 guarantees that all previous blocks B(j-1), B(j-2), ... are already on the disk media. The block device that SR5 exposes has this property regardless of what runs on top of it, whether a file system or another application that interacts directly with the block device.

SR5 adopts a flexible data structure that maps between logical and physical addresses, that is, between block M of the RAID device and the combination of a physical disk and a block K within that disk. This flexible data structure allows a parity stripe to span a variable number of disks. On an array of N disks, SR5 does not always use parity stripes that span all N disks; when there are too few requests to fill a stripe across N disks, it uses a stripe that spans fewer than N disks while continuing to protect against the failure of one disk. When a disk in an N-disk array fails and has not been replaced, SR5 tries to use the free space on the remaining active disks to rebuild a new array of N-1 disks; after the rebuild, the array can again tolerate one disk failure. The flexible data structure also supports removing and adding disks.

Another unique feature of SR5 is its tolerance of transient failures.
For example, if a disk is disconnected (e.g., someone pulls the wrong disk) and reconnected some time later, SR5 can recognize the disk and continue to use the relevant information on it. This feature is particularly useful when disks are connected over a network, which is typically subject to transient errors (the network disappears and reappears). We will later enhance SR5 to support disks connected over a network.

In essence, SR5 offers a flexible approach to data protection using a software-only technique. SR5 also leverages Intel's MMX™ technology for fast XOR computations. In our tests, SR5 on an Intel Pentium 4 system outperformed hardware RAID5 solutions that use special XOR engines for parity computations. SR5 requires only standard drive controllers to connect the hard disks, so there is no need to spend money on expensive RAID hardware.

SR5 is currently available on Linux. We are now releasing a Linux version of the SR5 software for end users. We have also conducted experiments to compare SR5 performance with several existing Linux RAID5 solutions. Please follow the links below for more information:
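The ordered-write guarantee described for SR5 (if B(j) is declared the last valid block, every earlier block is on the media) can be illustrated with a generic log-recovery sketch. This is not SR5's patented method, whose internals are not described here; it is a common technique assumed for illustration only. Each record carries a sequence number and a CRC, and recovery accepts the longest unbroken prefix of intact records. The record layout and the names `make_record`, `parse_record`, and `recover` are hypothetical.

```python
import zlib
from typing import List, Optional


def make_record(seq: int, payload: bytes) -> bytes:
    """Frame a payload as: 8-byte sequence, 4-byte length, payload, 4-byte CRC."""
    header = seq.to_bytes(8, "big") + len(payload).to_bytes(4, "big")
    crc = zlib.crc32(header + payload).to_bytes(4, "big")
    return header + payload + crc


def parse_record(raw: Optional[bytes], expected_seq: int) -> Optional[bytes]:
    """Return the payload if raw is an intact record numbered expected_seq."""
    if raw is None or len(raw) < 16:
        return None
    header, body, crc = raw[:12], raw[12:-4], raw[-4:]
    if int.from_bytes(header[:8], "big") != expected_seq:
        return None
    if int.from_bytes(header[8:12], "big") != len(body):
        return None
    if zlib.crc32(header + body).to_bytes(4, "big") != crc:
        return None
    return body


def recover(slots: List[Optional[bytes]]) -> List[bytes]:
    """Accept records in order and stop at the first missing or damaged one."""
    valid = []
    for seq, raw in enumerate(slots):
        payload = parse_record(raw, seq)
        if payload is None:
            break
        valid.append(payload)
    return valid


# Blocks 0..4 were issued in order, but the drive's write-back cache
# persisted them out of order: block 2 was lost and block 4 was torn.
records = [make_record(i, f"block-{i}".encode()) for i in range(5)]
on_media = [records[0], records[1], None, records[3], records[4][:10]]

survivors = recover(on_media)
assert survivors == [b"block-0", b"block-1"]
```

Block 3 reached the media intact but is still discarded, because accepting it would break the rule that every block before the last valid one is present.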
Last update: October 27, 2003. Copyright © 2003 Boon Storage Technologies, Inc. All Rights Reserved.