| RAID
(redundant array of independent
disks; originally redundant array
of inexpensive disks) is
a way of storing the same data
in different places (thus, redundantly)
on multiple hard disks. By placing
data on multiple disks, I/O operations
can overlap in a balanced way,
improving performance. Because an array
of multiple disks has a lower mean time
between failures (MTBF) than a single
disk, storing data redundantly is what
provides fault tolerance.
A RAID appears to the operating
system to be a single logical
hard disk. RAID employs the technique
of striping, which involves partitioning
each drive's storage space into
units ranging from a sector (512
bytes) up to several megabytes.
The stripes of all the disks are
interleaved and addressed in order.
In a single-user system where
large records, such as medical
or other scientific images, are
stored, the stripes are typically
set up to be small (perhaps 512
bytes) so that a single record
spans all disks and can be accessed
quickly by reading all disks at
the same time.
In a multi-user system, better
performance requires establishing
a stripe wide enough to hold the
typical or maximum size record.
This allows overlapped disk I/O
across drives.
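The interleaved addressing described above can be sketched as a mapping from a logical block number to a disk and an offset on that disk. This is a minimal illustration, not a description of any particular controller: the function name `locate_block` and the parameter `stripe_unit_blocks` (the stripe size, counted in blocks) are hypothetical names introduced here.

```python
def locate_block(logical_block: int, num_disks: int, stripe_unit_blocks: int = 1):
    """Map a logical block number to (disk index, block offset on that disk).

    Assumes simple round-robin striping with no parity: consecutive stripe
    units are placed on consecutive disks, wrapping around after the last one.
    """
    stripe_unit = logical_block // stripe_unit_blocks  # which stripe unit overall
    within_unit = logical_block % stripe_unit_blocks   # position inside that unit
    disk = stripe_unit % num_disks                     # round-robin disk choice
    row = stripe_unit // num_disks                     # stripe row on that disk
    return disk, row * stripe_unit_blocks + within_unit


# Small stripes: consecutive blocks land on different disks, so one large
# record is spread over all drives and can be read from them in parallel.
print([locate_block(b, num_disks=4) for b in range(6)])

# Larger stripes: a short record stays on one disk, leaving the other
# drives free to serve other requests at the same time.
print([locate_block(b, num_disks=4, stripe_unit_blocks=2) for b in range(6)])
```

With a stripe of one block, six consecutive blocks touch all four disks; with a stripe of two blocks, each pair of blocks stays together on one disk.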
Redundant Arrays of Independent
(or Inexpensive) Disks, or RAID,
is an evolving technology that
offers significant advantages
in storage capacity, performance,
and reliability to firms that
have requirements for more information
than can be readily stored and
accessed on a single personal
computer. A RAID system comprises
two main components: an array
of two or more disks and a RAID
controller. The RAID controller
is an electronic device that provides
the interface between the host
computer and the array of disks.
It makes the array of disks look
like one very large, very fast,
very reliable disk to the host
computer. From the viewpoint of
the host computer, this large
virtual disk operates seamlessly
and transparently just like any
other disk; it does not require
changes to the computer’s
operating system or application
software.
RAID systems provide large amounts
of storage by making the data
held on several disks readily
available to the host computer.
RAID systems may contain as many
as 75 disks similar to disks used
in personal computers. With current
technology each disk may provide
10 gigabytes (billion bytes) or
more of information. Thus, RAID
systems may contain several hundred
gigabytes for computer databases,
computer networks, video production
and editing, prepress, medical
imaging and other applications.
Because they are purely electronic,
devices such as CPUs and networks
have continued to improve in
performance at a rapid pace.
Unfortunately, the electro-mechanical
design of computer disks has limited
their performance growth. Indeed,
microprocessor performance has
been doubling about every two
years, while disk performance
has taken 10 years to double.
The electronic controllers in
RAID systems overcome this limitation
by striping data across the array
of disks and by using parallel
data paths. Striping data simply
means that when the host computer
sends information to the RAID
system, the controller writes
a portion of that information
(a stripe) on each of several
disks. Thus, the data is distributed
across the disks rather than being
written only on one disk. The
RAID controller also uses parallel
data paths, so that it can perform
the operations of reading and
writing information to several
disks simultaneously. With these
capabilities, a RAID system can
write information to the disk
array or read information from
the disk array at speeds as high
as 35 megabytes (million bytes)
per second. In contrast, with
a single disk the transfer rate
is only about 10 megabytes per
second.
RAID systems also provide high
reliability and data availability
through a technique called parity
checking. In this scheme, when
the RAID controller writes information
on the disks, it also writes redundant
information called parity bits.
These parity bits can be computed
in parallel with other operations,
so that the parity calculation itself
adds little performance penalty.
This parity information
has the fascinating property that
the RAID controller can re-compute
the information that was on a
disk should the disk or its connections
fail. Advanced RAID systems will
reconstruct the data from a failed
disk onto a spare disk, so that
the RAID system continues to operate
at high performance without loss
of data even if one of the component
disks fails or is removed from
the system!
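The recovery property of parity can be shown in a few lines. The sketch below assumes simple XOR parity over equal-sized blocks (the recovery method named for RAID-3 in the list of levels); the helper name `xor_blocks` and the sample block contents are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks, each holding one block of the same stripe.
data = [b"RAID", b"disk", b"fail"]

# The controller writes this extra block to the parity disk.
parity = xor_blocks(data)

# Suppose the second disk fails: XOR the surviving data blocks with the
# parity block to recompute what was stored on the lost disk.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

assert rebuilt == data[1]
```

This works because XOR-ing a value with itself cancels it out, so combining the parity with every surviving block leaves only the missing block. A single parity block of this kind can recover from the loss of one disk, not two.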
With increasing demands for mass
storage capacity, performance,
and reliability in their computer
systems, many firms are adopting
RAID technology to complement
their computer systems. RAID systems
keep large transaction databases
online, they provide real-time
video and information for broadcast,
and they provide rapid access
to large electronic files. With
the advantages RAID systems offer,
they are being used in an increasing
number of business and scientific
applications.
There are at least nine types
of RAID plus a non-redundant array
(RAID-0): |
- RAID-0. This technique has striping but no redundancy of data. It offers the best performance but no fault-tolerance.
- RAID-1. This type is also known as disk mirroring and consists of at least two drives that duplicate the storage of data. There is no striping. Read performance is improved since both disks can be read at the same time. Write performance is the same as for single disk storage. RAID-1 provides the best performance and the best fault-tolerance in a multi-user system.
- RAID-2. This type uses striping across disks with some disks storing error checking and correcting (ECC) information. It has no advantage over RAID-3.
- RAID-3. This type uses striping and dedicates one drive to storing parity information. The embedded error checking (ECC) information is used to detect errors. Data recovery is accomplished by calculating the exclusive OR (XOR) of the information recorded on the other drives. Since an I/O operation addresses all drives at the same time, RAID-3 cannot overlap I/O. For this reason, RAID-3 is best for single-user systems with long record applications.
- RAID-4. This type uses large stripes, which means you can read records from any single drive. This allows you to take advantage of overlapped I/O for read operations. Since all write operations have to update the parity drive, no I/O overlapping is possible. RAID-4 offers no advantage over RAID-5.
- RAID-5. This type includes a rotating parity array, which addresses the write limitation in RAID-4, so all read and write operations can be overlapped. RAID-5 stores parity information but not redundant data (the parity information can nevertheless be used to reconstruct data). RAID-5 requires at least three and usually five disks for the array. It's best for multi-user systems in which performance is not critical or which do few write operations.
- RAID-6. This type is similar to RAID-5 but includes a second parity scheme that is distributed across different drives and thus offers extremely high fault- and drive-failure tolerance. There are few or no commercial examples currently.
- RAID-7. This type includes a real-time embedded operating system as a controller, caching via a high-speed bus, and other characteristics of a stand-alone computer. One vendor offers this system.
- RAID-10. This type offers an array of stripes in which each stripe is a RAID-1 array of drives. This offers higher performance than RAID-1 but at much higher cost.
- RAID-53. This type offers an array of stripes in which each stripe is a RAID-3 array of disks. This offers higher performance than RAID-3 but at much higher cost.
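The difference between the dedicated parity drive of RAID-3 and RAID-4 and the rotating parity of RAID-5 can be pictured with a small layout sketch. The function names `parity_disk` and `layout` are hypothetical, and the rotation order used here (parity starts on the last disk and moves one disk to the left per stripe row) is an assumption for illustration; actual implementations may rotate differently.

```python
def parity_disk(stripe_row: int, num_disks: int, rotating: bool = True) -> int:
    """Return the index of the disk that holds parity for a given stripe row."""
    if not rotating:
        # Dedicated parity drive: every row puts parity on the last disk.
        return num_disks - 1
    # Rotating parity: shift the parity position by one disk per row.
    return (num_disks - 1 - stripe_row) % num_disks


def layout(rows: int, num_disks: int, rotating: bool = True):
    """Build a table of 'D' (data) and 'P' (parity) markers, one list per row."""
    table = []
    for row in range(rows):
        p = parity_disk(row, num_disks, rotating)
        table.append(["P" if disk == p else "D" for disk in range(num_disks)])
    return table


print("Dedicated parity drive:")
for row in layout(4, 4, rotating=False):
    print(" ".join(row))

print("Rotating parity:")
for row in layout(4, 4, rotating=True):
    print(" ".join(row))
```

In the dedicated layout every write must update the same parity disk, which is the bottleneck the RAID-4 entry describes. In the rotating layout, writes to different stripe rows update parity on different disks, so they can overlap.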