RAID stands for ‘Redundant Arrays of Independent Disks’ or ‘Redundant Arrays of Inexpensive Disks’. Use of this technology has grown rapidly in recent years. This post discusses in detail what RAID is, along with its types, implementation, architecture and applications.
What is RAID
RAID provides data protection and data consistency. It combines multiple small, inexpensive disk drives into an array of disk drives that the computer treats as a single storage unit (drive).
Fig. 1 – Introduction to Redundant Arrays of Independent Disks
This technology plays a vital role in storing large amounts of data while preserving data integrity. It helps recover data in real time when a hard drive fails. In other words, it either divides or duplicates the work of one hard disk across multiple disks.
Dividing the work improves performance, while duplicating it creates data redundancy in case of a drive failure. The RAID mode, also called the RAID level, is chosen to suit the requirements of the application. For example, when the mode is set to “RAID 0”, the system splits the data evenly between two or more disks.
Fig. 2 – Image of Redundant Arrays of Inexpensive Disks Configuration
RAID Configuration Levels
Each level represents a specific configuration of the disk array. Only a few configurations are practical for most processing systems, so RAID 0, 1, 3, 5 and 6 are discussed below.
- RAID – 0 (Non-Redundant Configuration)
- RAID – 1 (Mirrored Configuration)
- RAID – 3 (Bit-Interleaved Parity)
- RAID – 5 (Block-Interleaved Distributed-Parity)
- RAID – 6 (P+Q Redundancy)
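As a rough comparison of the levels listed above, the sketch below estimates how many disks' worth of capacity remain for user data. It assumes equally sized disks and two-way mirroring for RAID 1; the function name `usable_disks` and the minimum disk counts are illustrative assumptions, not part of any RAID tool.

```python
def usable_disks(level: int, n_disks: int) -> float:
    """Return how many disks' worth of capacity hold user data.

    Assumes equally sized disks and, for RAID 1, two-way mirroring.
    """
    if level == 0:
        minimum, usable = 2, n_disks          # no redundancy at all
    elif level == 1:
        minimum, usable = 2, n_disks / 2      # every disk has a mirror copy
    elif level in (3, 5):
        minimum, usable = 3, n_disks - 1      # one disk's worth of parity
    elif level == 6:
        minimum, usable = 4, n_disks - 2      # P and Q redundancy
    else:
        raise ValueError(f"RAID level {level} is not covered in this post")
    if n_disks < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} disks here")
    return usable


for level in (0, 1, 3, 5, 6):
    print(f"RAID {level} with 4 disks: {usable_disks(level, 4)} disks of data")
```

The numbers only describe capacity; they say nothing about performance, which depends on the workload and the level chosen.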
RAID – 0 (Non-Redundant Configuration)
This is the fastest RAID mode, which stripes the data evenly across the disks. Data striping refers to the distribution of data over multiple disks so that they appear as a single, large disk. This configuration offers the best ‘write’ performance because it does not employ redundancy at all, so no redundant information has to be updated. It does not necessarily offer the best ‘read’ performance, since a redundant scheme such as mirroring can serve a read from whichever copy is quicker to reach.
If one physical disk in the array fails, the data in the array is lost. This type of configuration is preferred in super-computing environments where performance and capacity, rather than reliability, are the primary concerns. Fig. 3 shows a Non-Redundant Configuration where the data is striped evenly across three disks.
Fig. 3 – Non-Redundant Configuration
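The striping described above can be sketched in a few lines. This is a minimal illustration that treats each disk as an in-memory byte string; the function names `stripe` and `unstripe` and the 2-byte striping unit are assumptions made for the example.

```python
def stripe(data: bytes, n_disks: int, unit: int) -> list[bytes]:
    """Distribute data round-robin over n_disks in chunks of `unit` bytes."""
    disks = [bytearray() for _ in range(n_disks)]
    for start in range(0, len(data), unit):
        disks[(start // unit) % n_disks] += data[start:start + unit]
    return [bytes(d) for d in disks]


def unstripe(disks: list[bytes], unit: int, total_len: int) -> bytes:
    """Reassemble the original data by visiting the disks in the same order."""
    out = bytearray()
    offsets = [0] * len(disks)
    disk = 0
    while len(out) < total_len:
        out += disks[disk][offsets[disk]:offsets[disk] + unit]
        offsets[disk] += unit
        disk = (disk + 1) % len(disks)
    return bytes(out)


data = b"ABCDEFGHIJ"
disks = stripe(data, n_disks=3, unit=2)
print(disks)  # [b'ABGH', b'CDIJ', b'EF']
print(unstripe(disks, unit=2, total_len=len(data)) == data)  # True
```

Because no copy or parity exists, losing any one of the three byte strings makes the original data unrecoverable.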
RAID – 1 (Mirrored Configuration)
This type of configuration uses twice as many disks as a Non-Redundant disk array. It is also called the Mirrored configuration, because every write is mirrored on two separate disk systems. This ensures that there are always two copies of the information.
If a disk fails, the other copy is used to service requests. It is widely used in database applications. This is a secure mode because no data is lost as long as one copy survives. Fig. 4 shows data mirrored on disk 1 and disk 2. If disk 1 fails, the data can be retrieved from disk 2.
Fig. 4 – Mirrored Configuration
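The mirroring behaviour can be sketched as a toy model. The class `MirroredPair` below is hypothetical and keeps each disk as a Python dictionary; it only illustrates that writes go to both copies and that reads fall back to the surviving copy.

```python
class MirroredPair:
    """Toy two-disk mirror: each disk maps a block number to its contents."""

    def __init__(self) -> None:
        self.disks = [{}, {}]
        self.failed = [False, False]

    def write(self, block: int, data: bytes) -> None:
        # Every write is applied to each disk that is still working.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block] = data

    def fail(self, index: int) -> None:
        # Simulate a drive failure: its contents are gone.
        self.failed[index] = True
        self.disks[index].clear()

    def read(self, block: int) -> bytes:
        # Serve the read from the first disk that is still working.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[block]
        raise IOError("both copies are lost")


mirror = MirroredPair()
mirror.write(7, b"payroll record")
mirror.fail(0)                      # disk 1 in Fig. 4 fails
print(mirror.read(7))               # still served from the second disk
```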
RAID – 3 (Bit-Interleaved Parity)
In this configuration, data is conceptually interleaved bit-wise over the data disks to increase the speed of access to blocks on disk storage. The architecture has two or more data disks along with an Error Correcting Code (ECC) disk, also called the parity disk, which contains the Exclusive-OR of the data from the other disks.
Each read request accesses all the data disks, and each write request accesses all the data disks and the parity disk, so only one request can be serviced at a time. The parity disk holds only parity and no data. If a disk drive fails, its data can be recovered by taking the Exclusive-OR of the remaining drives and the ECC drive. This type of architecture is found in applications that require high bandwidth but not high I/O rates.
Fig. 5 – RAID – 3 Configuration
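The Exclusive-OR reconstruction described above can be shown with a short sketch. It works on small byte strings rather than interleaved bits (XOR is bit-wise either way), and the helper name `xor_blocks` and the sample values are assumptions for illustration.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equally sized blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks and the parity computed from them.
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_blocks(data)

# Suppose the second data disk fails: XOR the survivors with the parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```

The same function computes the parity and rebuilds a lost disk, because XOR-ing a value twice cancels it out.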
RAID – 5 (Block-Interleaved Distributed Parity)
Disk arrays with this type of configuration have the best small read, large read and large write performance of any redundant disk array. This architecture was introduced to improve on the write performance of the Mirrored Configuration and the Bit-Interleaved Parity system. Data is interleaved in blocks and the parity is distributed over all the disks, so read and write requests can be performed in parallel.
When large requests are serviced in a Block-Interleaved Distributed-Parity array, disk conflicts are reduced because each disk is accessed once as the striping units are traversed sequentially.
Fig. 6 – Block-Interleaved Distributed Parity Configuration
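One way to place the parity so that sequential striping units touch each disk once is the layout commonly called left-symmetric. The sketch below assumes that layout (the article does not name one) and uses hypothetical labels `D0, D1, ...` for data units and `P0, P1, ...` for parity units.

```python
def left_symmetric_layout(n_disks: int, n_stripes: int) -> list[list[str]]:
    """Return one row per stripe, one column per disk."""
    rows = []
    for stripe in range(n_stripes):
        # Parity moves one disk to the left on every stripe.
        parity_disk = (n_disks - 1 - stripe) % n_disks
        row = [""] * n_disks
        row[parity_disk] = f"P{stripe}"
        # Data units start just after the parity disk and wrap around.
        for j in range(n_disks - 1):
            unit = stripe * (n_disks - 1) + j
            row[(parity_disk + 1 + j) % n_disks] = f"D{unit}"
        rows.append(row)
    return rows


for row in left_symmetric_layout(n_disks=4, n_stripes=4):
    print("  ".join(f"{cell:>3}" for cell in row))
```

With four disks, data unit k lands on disk k modulo 4, so any four consecutive units are spread over four different disks, and each disk holds exactly one parity unit per four stripes.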
RAID – 6 (P+Q Redundancy)
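This is also called the P + Q Redundancy configuration. Its disk arrays are similar to Block-Interleaved Distributed-Parity disk arrays, but they store two independent pieces of redundant information, P and Q, so the array can survive the failure of two disks. It performs small write operations using a read-modify-write procedure, which costs more disk accesses than in RAID 5 because both P and Q have to be updated. It also sets aside two disks’ worth of capacity for redundancy instead of one. RAID 6 is now widely supported by hardware controllers and software RAID implementations.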
Fig. 7 – RAID 6 Configuration
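The read-modify-write update of P and Q can be sketched as follows. The article does not say how Q is computed, so this example assumes one common construction: P is the plain XOR of the data, and Q is a weighted sum over the finite field GF(2^8) with the polynomial 0x11D. All function names are illustrative, and each disk is reduced to a single byte to keep the sketch short.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) using the polynomial 0x11D (assumed)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow2(exponent: int) -> int:
    """Return 2 raised to `exponent` in GF(2^8)."""
    value = 1
    for _ in range(exponent):
        value = gf_mul(value, 2)
    return value


def compute_pq(data: list[int]) -> tuple[int, int]:
    """Full recomputation: P is the XOR, Q weights disk i by 2**i."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow2(i), d)
    return p, q


def small_write(p: int, q: int, index: int, old: int, new: int) -> tuple[int, int]:
    """Read-modify-write: update P and Q from the old and new data only."""
    delta = old ^ new
    return p ^ delta, q ^ gf_mul(gf_pow2(index), delta)


data = [0x12, 0x34, 0x56, 0x78]
p, q = compute_pq(data)
p2, q2 = small_write(p, q, index=2, old=data[2], new=0x9A)
data[2] = 0x9A
print((p2, q2) == compute_pq(data))  # True
```

The small write reads the old data, old P and old Q, then writes the new data, new P and new Q, without touching the other data disks.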
RAID can be implemented in two ways:
- Hardware RAID System
- Software RAID System
Hardware RAID System
The processing of the disk array is offloaded to a dedicated processor on the hardware, referred to as the RAID controller, which manages the configuration. It can be used with any operating system. Most commercial applications adopt this approach.
Software RAID System
This approach uses the normal disk controllers available on the motherboard together with the operating system. A software driver is loaded to let the system communicate with the disk drives. It is less expensive than the hardware approach.
Fig. 8 shows the typical architecture of a hardware implementation. It consists of components such as:
- SCSI Card/ SATA Drives
- RAID Controller
Fig. 8 – Redundant Arrays of Inexpensive Disks Architecture
RAID operations are carried out from the CPU (Central Processing Unit) through the dedicated SCSI/SATA card and drives.
SCSI Card/SATA Drives
SCSI is an acronym for Small Computer System Interface. The SCSI card is used to control the Redundant Array of Independent Disks, and it coordinates the devices on the SCSI bus with the computer.
The RAID controller directs data in and out of the storage devices. It is designed to support drive formats such as SATA and SCSI, and it can also be built into a server’s motherboard.
The RAID configuration decides how the data is distributed on the disks and how many disks are used. The disk array stores the data, which the system accesses through the controller.
The RAID controller keeps a set of instructions in its cache for each drive in the array.
How does RAID Work
According to the SCSI specification, communication on the bus is asynchronous, and nearly 256 SCSI commands can be relayed to any given device at once. When a ‘write’ or ‘read’ request is issued, it is passed to the RAID controller as a set of IDE or SCSI commands. Based on the configuration in use, the controller translates the addresses in these commands into physical storage locations and sends the resulting commands to the disk array.
Fig. 9 – Sequence of Events for Read Request
The system sees the array as a single disk. The controller maintains a cache, which fills as the requests are translated, and schedules the execution of the requests issued. Once a request has been serviced, the controller sends an acknowledgement signal back along the connection interface. Block Acknowledgement (BA) is used instead of sending an Ack for every single frame. The BA contains a bitmap of 64*16 bits. Disk failure is handled using FEC (Forward Error Correction) and BEC (Backward Error Correction) techniques.
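The translation step performed by the controller can be sketched for the simplest case. The example assumes a RAID 0 array with a striping unit of one block; the names `PhysicalRequest` and `translate` are hypothetical and do not correspond to any real controller interface.

```python
from typing import NamedTuple


class PhysicalRequest(NamedTuple):
    disk: int    # which disk in the array receives the command
    block: int   # block number on that disk


def translate(logical_block: int, count: int, n_disks: int) -> list[PhysicalRequest]:
    """Map a logical request onto per-disk requests (RAID 0, one-block units)."""
    requests = []
    for lb in range(logical_block, logical_block + count):
        requests.append(PhysicalRequest(disk=lb % n_disks, block=lb // n_disks))
    return requests


# The host asks for 4 blocks starting at logical block 5 on a 3-disk array.
for request in translate(logical_block=5, count=4, n_disks=3):
    print(request)
```

The host only ever sees logical block numbers; the per-disk commands are produced inside the controller, which is why the array appears as a single disk.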
Applications of RAID
The applications include:
- It is widely used in Data Warehousing.
- It is used in Video Streaming applications.
- It is extensively used for small block applications such as Web Servers and transaction-oriented Databases.
- It is also implemented for high-end servers.
- It is used in Gaming systems.
Advantages of RAID
The advantages are:
- Transfer of large sequential files and graphic images is easier.
- Hardware based implementation is more robust.
- Software based implementation is cost-effective.
- High performance and data protection can be achieved, depending on the RAID level chosen.
- Fault tolerance capacity is high.
- They require less power.
- Controller logic is built-in which helps in error detection and correction functions.
Disadvantages of RAID
The disadvantages include:
- RAID does not replace backups, so backup software is still a must.
- Mapping logical blocks onto physical locations is complex.
- Data chunk size affects the performance of disk array.