HA heartbeat
The HA heartbeat allows cluster units to communicate with each other. The heartbeat consists of hello packets that every cluster unit sends at regular intervals from its heartbeat interfaces. The hello packets describe the state of the sending unit (including its communication sessions), and the other cluster units use them to keep the cluster synchronized. While the cluster is operating, the HA heartbeat confirms that all cluster units are functioning normally.
Key aspects of the HA heartbeat include:
- Synchronization: The heartbeat link transports session information, configuration changes, kernel routing tables, and IPsec keys to ensure all units are aligned.
- Communication Protocols: Standard heartbeats utilize specific Layer 2 Ethernet frames (EtherTypes 0x8890, 0x8891, and 0x8893) to identify different types of synchronization traffic.
- Addressing: The cluster automatically creates virtual interfaces in a hidden VDOM (vsys_ha) and assigns them link-local IPv4 addresses (169.254.0.x) to facilitate communication.
- Failover Detection: The system monitors the heartbeat interval and lost threshold; if a peer stops sending packets for a specified duration, it is declared dead, triggering a failover.
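
The failover detection rule above can be sketched in Python. This is a minimal illustration rather than FortiOS code: it assumes the "specified duration" equals the heartbeat interval multiplied by the lost threshold, and the class name `PeerMonitor`, its fields, and the sample values used below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PeerMonitor:
    """Tracks hello packets from one peer (hypothetical helper, not a FortiOS API)."""

    interval_ms: int        # assumed time between hello packets, in milliseconds
    lost_threshold: int     # assumed number of consecutive missed hellos tolerated
    last_hello_ms: int = 0  # timestamp of the most recent hello received

    @property
    def dead_after_ms(self) -> int:
        # Assumption: the peer is declared dead after interval * threshold of silence.
        return self.interval_ms * self.lost_threshold

    def record_hello(self, now_ms: int) -> None:
        # Called whenever a hello packet arrives from the peer.
        self.last_hello_ms = now_ms

    def is_dead(self, now_ms: int) -> bool:
        # True once the silence has lasted at least the dead-peer timeout.
        return now_ms - self.last_hello_ms >= self.dead_after_ms
```

For example, `PeerMonitor(interval_ms=200, lost_threshold=6)` treats a peer as dead after 1200 ms without a hello packet. Lowering either value shortens failover time but makes the cluster more sensitive to brief delays on the heartbeat link.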
The following sections provide detailed configuration guidelines and behavioral insights for HA heartbeat:
| Topic | Summary |
|---|---|
| | The standard configuration for physical appliances where Layer 2 Ethernet frames are used on dedicated interfaces to broadcast heartbeat and synchronization data. |
| | An alternative configuration for cloud and virtual environments that do not support Layer 2 broadcasts, instead utilizing Layer 3 IP packets to maintain the cluster connection. Once Unicast communication is established, the focus shifts to synchronization; because cluster members often reside in different physical locations or availability zones, they require unique settings that must be excluded from synchronization. |
| | Tailors cluster behavior to specific network environments by adjusting interval thresholds for precise failover control, modifying startup wait times for stability in high-latency sites, and applying encryption to prevent injection attacks. |
| | The throughput capacity needed for the heartbeat link, which is primarily driven by session synchronization traffic and can surge significantly during failover events or when a new unit joins the cluster. |
| | The use of specific, non-standard Layer 2 protocol identifiers (such as 0x8890 through 0x8893) to distinguish between different HA functions like hello packets, session sync, and configuration management. |
| | The automatic assignment of virtual Link-Local IPv4 addresses (169.254.0.x) to hidden internal interfaces, allowing cluster members to route management traffic and logs between units despite using Layer 2 transport. |
| | The diagnostic process of verifying cluster communication, often involving packet sniffing for specific EtherTypes to ensure that intermediate switches are not blocking the Layer 2 frames required for HA. |
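
When reviewing a packet capture during troubleshooting, it can help to filter frames by EtherType. The following Python sketch shows one way to do that on raw Ethernet frames. The EtherType values are the ones listed in this section; the function names are hypothetical, and the handling of a single 802.1Q VLAN tag is an added assumption for captures taken on tagged switch ports.

```python
import struct

# EtherTypes named in this section for standard HA heartbeat traffic.
HA_ETHERTYPES = {0x8890, 0x8891, 0x8893}
VLAN_TPID = 0x8100  # 802.1Q tag identifier


def ethertype_of(frame: bytes) -> int:
    """Return the EtherType of a raw Ethernet frame, skipping one VLAN tag if present."""
    if len(frame) < 14:
        raise ValueError("frame is shorter than an Ethernet header")
    # Bytes 0-11 hold the destination and source MAC addresses; bytes 12-13 the EtherType.
    (etype,) = struct.unpack_from("!H", frame, 12)
    if etype == VLAN_TPID:
        if len(frame) < 18:
            raise ValueError("frame is shorter than a VLAN-tagged Ethernet header")
        # A 4-byte 802.1Q tag pushes the real EtherType to bytes 16-17.
        (etype,) = struct.unpack_from("!H", frame, 16)
    return etype


def is_ha_frame(frame: bytes) -> bool:
    """True if the frame carries one of the HA heartbeat EtherTypes."""
    return ethertype_of(frame) in HA_ETHERTYPES
```

If frames with these EtherTypes leave one unit but never appear in a capture on the peer, an intermediate switch is a likely place to look.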