System Resilience, High Availability, QoS, and Fault Tolerance
In this part of the tutorial, we'll look at System Resilience, High Availability, QoS, and Fault Tolerance. As a CISSP candidate, you need to be able to differentiate between these concepts:
- System Resilience: The ability to recover quickly from a failure. For example, if Site 1 goes down, Site 2 immediately becomes operational, or if a disk drive fails, a spare drive is quickly added to the storage pool. System resilience includes eliminating single points of failure from the design of critical systems.
- Quality of Service (QoS): A technology that enables specified services to receive a higher quality of service than other services. Service providers therefore need to determine which of the services they offer to customers has the highest priority. For example, Voice over Internet Protocol (VoIP) systems are typically prioritized so that sufficient network bandwidth is always available, avoiding traffic delay and degradation of voice quality. Other services (such as web browsing) are prioritized at a lower level because they are less sensitive to delay. Net neutrality rules bear on this practice: where they apply, they generally restrict ISPs from giving preferential treatment to a specified set of customers or a specified service on the internet, and where they have been repealed, ISPs have more freedom to do so.
- High Availability: The use of multiple redundant systems so that a single failure causes no downtime or degradation. High availability is usually implemented with clustered systems, and it has two modes: (1) Active-active mode: both systems are running and immediately available. (2) Active-passive mode: one system is active, while the other is on standby but can become active, usually within a matter of seconds.
- Fault Tolerance: The ability of a system to suffer a fault but continue to operate. A system gains this capability through redundant components, such as additional disks within a redundant array of inexpensive disks (RAID), multiple power supplies, multiple network interface cards (NICs), or additional servers within a failover cluster configuration.
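
To make the QoS idea more concrete, here is a minimal sketch of strict-priority scheduling, in which queued VoIP packets are always sent before queued web packets. The class name `PriorityScheduler`, the `PRIORITY` table, and the packet labels are hypothetical and used only for illustration. Real network devices implement QoS in hardware or in the operating system's traffic control layer, not in application code like this.

```python
import heapq
from itertools import count

# Hypothetical traffic classes: a lower number means a higher priority.
PRIORITY = {"voip": 0, "web": 1}


class PriorityScheduler:
    """Toy strict-priority queue: higher-priority traffic always leaves first."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker that keeps FIFO order within a class

    def enqueue(self, traffic_class, packet):
        # Entries sort by priority first, then by arrival order.
        entry = (PRIORITY[traffic_class], next(self._seq), packet)
        heapq.heappush(self._heap, entry)

    def dequeue(self):
        _, _, packet = heapq.heappop(self._heap)
        return packet

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    scheduler = PriorityScheduler()
    scheduler.enqueue("web", "web-1")
    scheduler.enqueue("voip", "voice-1")
    scheduler.enqueue("web", "web-2")
    scheduler.enqueue("voip", "voice-2")
    while len(scheduler):
        print(scheduler.dequeue())
```

Even though the web packets arrive first, both voice packets are dequeued ahead of them. Strict priority is only one possible policy. It can starve low-priority traffic, which is why real deployments often combine it with rate limits or weighted queuing.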
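
The active-passive mode described above can also be sketched in a few lines. This is a simplified illustration that assumes a two-node cluster. The names `Node`, `ActivePassiveCluster`, and `healthy` are hypothetical, and real cluster software detects failures with heartbeats and quorum rather than by catching an exception on a request.

```python
class Node:
    """Hypothetical server that can be marked unhealthy to simulate a fault."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class ActivePassiveCluster:
    """Sends requests to the active node and promotes the standby on failure."""

    def __init__(self, active, standby):
        self.active = active
        self.standby = standby

    def handle(self, request):
        try:
            return self.active.handle(request)
        except ConnectionError:
            # Failover: swap roles, then retry the request once on the new
            # active node. If that node is also down, the error propagates.
            self.active, self.standby = self.standby, self.active
            return self.active.handle(request)


if __name__ == "__main__":
    cluster = ActivePassiveCluster(Node("node-a"), Node("node-b"))
    print(cluster.handle("request-1"))  # served by node-a
    cluster.active.healthy = False      # simulate a fault on the active node
    print(cluster.handle("request-2"))  # served by node-b after failover
```

The caller never sees the fault on the first node, which is the fault-tolerance property: the system suffers a fault but continues to operate. The standby is itself a single point of failure once it has been promoted, so the failed node must be repaired to restore redundancy.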