Security Architecture · SY0-701 · Task 3.4

High availability — SY0-701

Learn what high availability (HA) means in security architecture, how it differs from fault tolerance and resilience, and how it is tested on SY0-701.

WHAT IT IS

High availability (HA) is "a failover feature to ensure availability during device or component interruptions." (NIST SP 800-113, via NIST Glossary)

Availability itself is "ensuring timely and reliable access to and use of information." (FIPS 200, derived from 44 U.S.C. § 3542, via NIST Glossary)

HA achieves that goal specifically through failover — the capability to switch over automatically (typically without human intervention or warning) to a redundant or standby system upon the failure or abnormal termination of the previously active system. (CNSSI 4009-2015 / NIST SP 800-53 Rev. 5, via NIST Glossary)

Mental model

Think of HA as a relay race handoff that the runners execute automatically the moment the lead runner trips. The baton (the service) keeps moving without the crowd (the users) noticing a drop. The relay team exists precisely so no single runner can stop the race.

The key properties that make the relay work:

  • Redundancy — a standby runner is always staged and ready.
  • Automatic failover — the handoff happens without a coach calling a timeout.
  • Continued service — the race does not pause; authorized users retain access.
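
The three properties above can be sketched in a few lines of Python. This is a minimal illustration, not a production pattern: the `Node` and `ActiveStandbyPair` classes, the `healthy` flag (standing in for a real health check), and the request string are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A hypothetical service instance; `healthy` stands in for a real health check."""
    name: str
    healthy: bool = True

    def handle(self, request: str) -> str:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


@dataclass
class ActiveStandbyPair:
    """Redundancy: a standby node is staged alongside the active one."""
    active: Node
    standby: Node
    failovers: int = 0

    def handle(self, request: str) -> str:
        try:
            return self.active.handle(request)
        except ConnectionError:
            # Automatic failover: promote the standby without operator action.
            self.active, self.standby = self.standby, self.active
            self.failovers += 1
            # Continued service: retry the same request on the new active node.
            return self.active.handle(request)


pair = ActiveStandbyPair(active=Node("primary"), standby=Node("secondary"))
first = pair.handle("GET /login")
pair.active.healthy = False  # simulate failure of the active node
second = pair.handle("GET /login")
```

The caller issues the same `handle` call before and after the failure; only the node answering it changes. If the standby is also down, the retry raises, which is the point at which an HA design has run out of redundancy.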

When to use it

The exam frequently places HA next to two concepts that sound similar but operate at a different scope or timing. Use this table to keep them distinct.

| Concept | NIST-grounded definition (summary) | Primary goal | When service resumes |
| --- | --- | --- | --- |
| High Availability | Failover feature to ensure availability during device or component interruptions (NIST SP 800-113) | Prevent perceivable service interruption | Automatically, during the interruption |
| Fault Tolerance | A property of a system that allows proper operation even if components fail (NISTIR 8202) | Continue correct operation through component failure | Continuously; no transition needed |
| Resilience | Ability to operate under adverse conditions and recover to an effective operational posture in a time frame consistent with mission needs (NIST SP 800-39) | Maintain essential capability and recover | After degradation; may involve partial recovery |

The practical distinction: fault tolerance means the system never stops working even as parts fail; high availability means the system switches to a standby so quickly that service is effectively uninterrupted; resilience is the broader property of surviving and recovering from adversity, which may include a period of degraded operation.

COMMON MISCONCEPTION

Candidates often treat high availability and fault tolerance as synonyms because both involve redundancy. They are not the same.

Fault tolerance is grounded in the property of a system allowing proper operation even if components fail — the system keeps running through failure, internally. High availability is grounded in the failover mechanism — a separate standby takes over upon failure or abnormal termination. A fault-tolerant system does not require a switchover; a high-availability system does. Confusing the two can lead a candidate to select the wrong architectural control when a question asks which mechanism specifically uses a standby and an automatic switchover.
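
One way to see the difference is to put the two behaviors side by side. The sketch below is an illustration under stated assumptions: majority voting is used as one example of fault-tolerant operation (it is not the only form), `None` models a failed component, and the names `fault_tolerant_read` and `HighAvailabilityReader` are hypothetical.

```python
from collections import Counter
from typing import Optional

Replica = Optional[str]  # None models a failed component


def fault_tolerant_read(replicas: list[Replica]) -> str:
    """Every replica is live at once; surviving copies mask the failed ones."""
    answers = [value for value in replicas if value is not None]
    if not answers:
        raise RuntimeError("all components failed")
    # No role change occurs: the most common surviving answer is returned.
    return Counter(answers).most_common(1)[0][0]


class HighAvailabilityReader:
    """One copy serves at a time; a failure triggers a switchover to a standby."""

    def __init__(self, replicas: list[Replica]) -> None:
        self.replicas = replicas
        self.active_index = 0
        self.switchovers = 0

    def read(self) -> str:
        while self.replicas[self.active_index] is None:
            if self.active_index + 1 == len(self.replicas):
                raise RuntimeError("no standby left")
            # Failover: move the active role to the next standby.
            self.active_index += 1
            self.switchovers += 1
        return self.replicas[self.active_index]


copies = [None, "balance=100", "balance=100"]  # the first component has failed
ft_value = fault_tolerant_read(copies)
ha = HighAvailabilityReader(copies)
ha_value = ha.read()
```

Both reads return the same value, but only the HA reader records a switchover (`ha.switchovers` is 1). The fault-tolerant read has no active or standby role to change, which mirrors the distinction the exam is testing.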

How it shows up on the exam

The cognitive target is application: given a described scenario, identify which mechanism — HA, fault tolerance, or resilience — is the appropriate control or the one already in use.

Signal phrases to watch for:

  • "automatically switches to a standby" → points to HA (failover, NIST SP 800-113 / CNSSI 4009-2015)
  • "continues to operate even as components fail" → points to fault tolerance (NISTIR 8202)
  • "recover to an effective operational posture" or "operate under adverse conditions" → points to resilience (NIST SP 800-39)
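
As a study aid, the signal phrases above can be written down as a simple lookup. This is only a sketch for self-quizzing: the `SIGNAL_PHRASES` table and `classify` function are hypothetical, the match is a plain substring test, and real exam items will paraphrase rather than repeat these phrases exactly.

```python
SIGNAL_PHRASES = {
    "automatically switches to a standby": "high availability",
    "continues to operate even as components fail": "fault tolerance",
    "recover to an effective operational posture": "resilience",
    "operate under adverse conditions": "resilience",
}


def classify(scenario: str) -> list[str]:
    """Return the concepts whose signal phrases appear in the scenario text."""
    text = scenario.lower()
    found: list[str] = []
    for phrase, concept in SIGNAL_PHRASES.items():
        if phrase in text and concept not in found:
            found.append(concept)
    return found


scenario = "When the primary firewall fails, traffic automatically switches to a standby unit."
matches = classify(scenario)
```

An empty result means no listed phrase matched, so the scenario has to be read for the underlying property (failover, continued operation, or recovery) rather than for keywords.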

Candidates sometimes answer from intuition ("redundancy = high availability") without reading whether the scenario describes an automatic switchover or continuous operation through failure. Slowing down to identify which grounded property the scenario describes — failover, continued operation, or recovery — is the reliable path through these items.

Related concepts

  • Recovery Sites — the physical or cloud locations a failover switches to
  • Geographic Dispersion — distributing components across locations to reduce single-point-of-failure risk
  • Backups — data copies that support recovery but are distinct from the automatic-switchover mechanism of HA

Sources

Every claim on this page traces to the public exam blueprint and official documentation.
