| RAID
(redundant array of independent
disks; originally redundant array
of inexpensive disks) is
a way of storing the same data
in different places (thus, redundantly)
on multiple hard disks. By placing
data on multiple disks, I/O operations
can overlap in a balanced way,
improving performance. Since using multiple
disks lowers the array's mean time
between failures (MTBF), storing
data redundantly is what provides
fault tolerance.
A RAID appears to the operating
system to be a single logical
hard disk. RAID employs the technique
of striping, which involves partitioning
each drive's storage space into
units ranging from a sector (512
bytes) up to several megabytes.
The stripes of all the disks are
interleaved and addressed in order.
In a single-user system where
large records, such as medical
or other scientific images, are
stored, the stripes are typically
set up to be small (perhaps 512
bytes) so that a single record
spans all disks and can be accessed
quickly by reading all disks at
the same time.
In a multi-user system, better
performance requires establishing
a stripe wide enough to hold the
typical or maximum size record.
This allows overlapped disk I/O
across drives.
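To make the interleaving concrete, here is a minimal Python sketch that maps a logical byte offset to a disk and an offset on that disk. The function names `locate` and `disks_touched`, their parameters, and the round-robin layout are assumptions made for illustration; an actual controller may arrange its stripes differently.

```python
def locate(offset, stripe_size, num_disks):
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Assumes stripe units are assigned to disks round-robin, which is an
    illustrative layout and not taken from any particular controller.
    """
    stripe_unit = offset // stripe_size   # which stripe unit, counting from 0
    disk = stripe_unit % num_disks        # units rotate across the disks
    row = stripe_unit // num_disks        # full rows of units before this one
    return disk, row * stripe_size + offset % stripe_size


def disks_touched(offset, length, stripe_size, num_disks):
    """Return the set of disks that a read of `length` bytes at `offset` uses."""
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    return {unit % num_disks for unit in range(first, last + 1)}
```

Under these assumptions, a 2,048-byte record on four disks with 512-byte stripes touches every disk, so one request is served by all drives at once. With 64-kilobyte stripes the same record sits on a single disk, which leaves the other drives free to serve other users' requests.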
Redundant Arrays of Independent
(or Inexpensive) Disks, or RAID,
is an evolving technology that
offers significant advantages
in storage capacity, performance,
and reliability to firms that
have requirements for more information
than can be readily stored and
accessed on a single personal
computer. A RAID system comprises
two main components: an array
of two or more disks and a RAID
controller. The RAID controller
is an electronic device that provides
the interface between the host
computer and the array of disks.
It makes the array of disks look
like one very large, very fast,
very reliable disk to the host
computer. From the viewpoint of
the host computer, this large
virtual disk operates seamlessly
and transparently just like any
other disk; it does not require
changes to the computer’s
operating system or application
software.
RAID systems provide large amounts
of storage by making the data
on several disks readily
available to the host computer.
RAID systems may contain as many
as 75 disks similar to disks used
in personal computers. With current
technology each disk may provide
10 gigabytes (billion bytes) or
more of information. Thus, RAID
systems may contain several hundred
gigabytes for computer databases,
computer networks, video production
and editing, prepress, medical
imaging and other applications.
Because they are purely electronic,
devices such as CPUs and networks
have continued to improve in
performance at a rapid
pace. Unfortunately, the electro-mechanical
design of computer disks has limited
their performance growth. Indeed,
microprocessor performance has
been doubling about every two
years, while disk performance
has taken 10 years to double.
The electronic controllers in
RAID systems overcome this limitation
by striping data across the array
of disks and by using parallel
data paths. Striping data simply
means that when the host computer
sends information to the RAID
system, the controller writes
a portion of that information
(a stripe) on each of several
disks. Thus, the data is distributed
across the disks rather than being
written only on one disk. The
RAID controller also uses parallel
data paths, so that it can perform
the operations of reading and
writing information to several
disks simultaneously. With these
capabilities, a RAID system can
write information to the disk
array or read information from
the disk array at speeds as high
as 35 megabytes (million bytes)
per second. In contrast, with
a single disk the transfer rate
is only about 10 megabytes per
second.
RAID systems also provide high
reliability and data availability
through a technique called parity
checking. In this scheme, when
the RAID controller writes information
on the disks, it also writes redundant
information called parity bits.
These parity bits can be computed
in parallel to other operations,
so that RAID systems suffer no
performance penalty when computing
parity. This parity information
has the useful property that
the RAID controller can re-compute
the information that was on a
disk, should the disk or its connections
fail, from the data and parity on
the remaining disks. Advanced RAID systems will
reconstruct the data from a failed
disk onto a spare disk, so that
the RAID system continues to operate
at high performance without loss
of data even if one of the component
disks fails or is removed from
the system!
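The recovery property can be illustrated with a short Python sketch that treats parity as the byte-wise exclusive OR (XOR) of the data blocks. The helper names `xor_blocks` and `rebuild` are hypothetical, and the blocks are assumed to be the same length; this is a simplified model, not a controller implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a sequence of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity):
    """Recompute the one missing block from the survivors and the parity."""
    return xor_blocks([*surviving_blocks, parity])


# Three data blocks, one per disk, plus a parity block on a fourth disk.
data = [b"RAID", b"disk", b"data"]
parity = xor_blocks(data)

# Pretend the second disk failed and rebuild its contents.
recovered = rebuild([data[0], data[2]], parity)
```

Because XOR-ing a value with itself cancels out, combining the surviving blocks with the parity block leaves exactly the missing block. This works for the loss of any one disk, but not for two at once.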
With increasing demands for mass
storage capacity, performance,
and reliability in their computer
systems, many firms are adopting
RAID technology to complement
their computer systems. RAID systems
keep large transaction databases
online, provide real-time
video and information for broadcast,
and provide rapid access
to large electronic files. With
the advantages RAID systems offer,
they are being used in an increasing
number of business and scientific
applications.
There are at least nine types
of RAID plus a non-redundant array
(RAID-0):
RAID-0.
This technique has striping
but no redundancy of data.
It offers the best performance
but no fault-tolerance. |
RAID-1. This type is also
known as disk mirroring and
consists of at least two drives
that duplicate the storage
of data. There is no striping.
Read performance is improved
since both disks can be read
at the same time. Write performance
is the same as for single
disk storage. RAID-1 provides
the best performance and the
best fault-tolerance in a
multi-user system. |
RAID-2. This type uses striping
across disks with some disks
storing error checking and
correcting (ECC) information.
It has no advantage over RAID-3.
|
RAID-3. This type uses striping
and dedicates one drive to
storing parity information.
The embedded error checking
and correcting (ECC) information is used
to detect errors. Data recovery
is accomplished by calculating
the exclusive OR (XOR) of
the information recorded on
the other drives. Since an
I/O operation addresses all
drives at the same time, RAID-3
cannot overlap I/O. For this
reason, RAID-3 is best for
single-user systems with long
record applications. |
RAID-4.
This type uses large stripes,
which means you can read records
from any single drive. This
allows you to take advantage
of overlapped I/O for read
operations. Since all write
operations have to update
the parity drive, no overlapping
of write I/O is possible. RAID-4 offers
no advantage over RAID-5.
|
RAID-5.
This type includes a rotating
parity array, thus addressing
the write limitation in RAID-4.
All read and write operations
can therefore be overlapped. RAID-5
stores parity information
rather than a second copy of
the data, but the parity information
can be used to reconstruct data.
RAID-5 requires at least three
and usually five disks for
the array. It's best for multi-user
systems in which performance
is not critical or which do
few write operations. |
RAID-6.
This type is similar to RAID-5
but includes a second parity
scheme that is distributed
across different drives and
thus offers extremely high
fault- and drive-failure tolerance.
There are few or no commercial
examples currently. |
RAID-7. This type includes
a real-time embedded operating
system as a controller, caching
via a high-speed bus, and
other characteristics of a
stand-alone computer. One
vendor offers this system.
|
RAID-10. This type offers
an array of stripes in which
each stripe is a RAID-1 array
of drives. This offers higher
performance than RAID-1 but
at much higher cost. |
RAID-53. This type offers
an array of stripes in which
each stripe is a RAID-3 array
of disks. This offers higher
performance than RAID-3 but
at much higher cost. |
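The difference between the dedicated parity drive of RAID-4 and the rotating parity of RAID-5 can be sketched in Python as follows. The function names `parity_disk` and `layout` are hypothetical, and the particular rotation order (parity starting on the last disk and moving one disk to the left per row) is an assumption; real products may rotate parity in a different order.

```python
def parity_disk(row, num_disks, rotating):
    """Return the index of the disk that holds parity for a stripe row.

    rotating=False models a dedicated parity drive (as in RAID-4);
    rotating=True models parity that moves to a different disk on each
    row (as in RAID-5), using an assumed right-to-left rotation.
    """
    if rotating:
        return (num_disks - 1 - row) % num_disks
    return num_disks - 1


def layout(rows, num_disks, rotating):
    """Build a table of 'D' (data) and 'P' (parity) cells, one list per row."""
    table = []
    for row in range(rows):
        p = parity_disk(row, num_disks, rotating)
        table.append(["P" if d == p else "D" for d in range(num_disks)])
    return table


for line in layout(3, 3, rotating=True):
    print(" ".join(line))
```

With a dedicated drive every write updates the same parity disk, so writes queue up behind it. With rotation, writes to different rows update parity on different disks and can proceed at the same time.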