Replacing a failed cluster unit
This procedure describes how to safely remove a failed unit from a FortiGate HA cluster and integrate a replacement (or repaired) unit. The steps keep the cluster stable and prevent the new unit from taking over as the primary unit before it is ready. The process does not interrupt cluster operations unless network cabling must change to accommodate the new hardware.
Prerequisites for the replacement unit
The replacement or repaired unit must meet these strict criteria to successfully join the cluster:
-
Hardware consistency: You must maintain identical hardware configurations across all cluster members, including the same model, generation, RAM, and number of hard disks/SSDs.
-
Firmware build: The replacement must run the exact same firmware build as the active cluster.
-
Licensing: All units must have the same level of licensing.
-
Operating mode: The unit must be in the same mode (NAT or Transparent) as the cluster.
-
VDOM mode: The unit must be in single VDOM mode initially.
-
Certificates: You may install third-party certificates on the primary unit before forming the cluster; these will synchronize to the replacement unit once it joins.
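Several of these criteria can be checked from the CLI before the swap. As a rough sketch, running the command below on both the active cluster unit and the replacement, then comparing the output side by side, covers the firmware, operating mode, and VDOM checks. The exact field labels in the output vary between FortiOS releases, so the fields named in the comments are an assumption; the `#` lines are annotations only and can be dropped before pasting.

```
# Run on the active primary and on the replacement, then compare:
#   firmware version and build number
#   operation mode (NAT or Transparent)
#   virtual domain configuration
#   current HA mode
# (field names are assumed and may differ between releases)
get system status
```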
Step 1: Prepare for removal (failover check)
Before physically disconnecting the failed unit, you must confirm the status of the cluster:
-
Check cluster health: Ensure the remaining unit is healthy and has taken over all traffic.
-
Manual failover: If the "failed" unit is still partially responsive or acting as the primary unit, manually trigger a failover (by adjusting HA priority or disconnecting its heartbeat interface) to ensure the healthy unit is the primary before you remove the hardware. This prevents traffic drops during the physical swap.
-
Physical disconnection: Once the healthy unit is confirmed as primary, disconnect the failed unit from both the cluster heartbeat and the network.
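A minimal sketch of the failover check, run on the surviving unit before anything is unplugged. The priority value 250 is an illustrative assumption. Whether a priority change alone causes a failover depends on the cluster's override setting and the uptime difference between the units, so confirm the outcome with the status command rather than assuming it.

```
# Confirm which unit is currently primary
get system ha status

# Optional: raise the priority of the healthy unit
# (250 is a placeholder value, not a recommendation)
config system ha
    set priority 250
end

# Re-check each unit's role before disconnecting the failed unit
get system ha status
```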
Step 2: RMA transfer and registration
If you are replacing a unit through RMA, you must transfer the registration and services so the new unit can access the correct licenses.
-
Log in to the FortiCloud Support portal.
-
Go to Products > Product List and select the Serial Number (or HA vSN) of the failed unit.
-
In the Registration widget, click RMA Transfer.
-
Enter the serial number of the new replacement unit and complete the wizard.
-
Register and apply any specific license keys to the replacement unit to match the cluster's licensing level.
Step 3: Configure the replacement unit (ensure secondary status)
To prevent the new unit from accidentally becoming the primary unit and overwriting your production configuration with a blank one, follow this safety-first configuration:
-
Isolate the unit: Perform the initial configuration using the Console or Management port while the unit is not connected to the network.
-
Set low priority: Set the HA priority of the replacement unit to a value significantly lower than the existing primary. (For example, set the primary to 200 and the replacement to 100.)
-
Match HA Settings: Configure the HA group ID, group name, and password to match the existing cluster.
-
Override disabled: Ensure `set override disable` is configured in the HA settings. With override disabled, the unit with the longest uptime (the existing primary) remains the primary unit.
-
Mode sync: If the cluster is in transparent mode, you must switch the replacement unit to transparent mode using the CLI/GUI before connecting it to the cluster to avoid broadcast loops or traffic interruptions.
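The settings in this step map onto the CLI roughly as follows. This is a sketch, not the cluster's actual configuration: the group ID, group name, password, HA mode, heartbeat interface names, and management addresses are all placeholders that must be replaced with the values used by the existing cluster. The `#` lines are annotations only.

```
# Only if the cluster runs in transparent mode: switch the operating mode first.
# The management IP and gateway are placeholder addresses.
config system settings
    set opmode transparent
    set manageip 192.0.2.10/255.255.255.0
    set gateway 192.0.2.1
end

# HA settings on the isolated replacement unit.
# Every value except "override disable" is a placeholder;
# copy the real ones from the existing cluster.
config system ha
    set group-id 10
    set group-name "example-cluster"
    set mode a-p
    set password example-password
    set hbdev "port3" 50 "port4" 50
    set priority 100
    set override disable
end
```

Keeping the replacement's priority below the primary's and leaving override disabled are complementary safeguards, so the replacement should join as a secondary unit.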
Step 4: Connect the replacement
-
Cabling: Connect the heartbeat (HA) interfaces first.
-
Power on: Turn on the replacement unit.
-
Monitor synchronization: Use the command `get system ha status` on the primary to see the new unit join. The cluster automatically synchronizes the configuration from the primary to the replacement unit.
-
Data cables: Once the units are in sync (the Configuration Status shows as in-sync), connect the remaining data/network cables.
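As a sketch of the monitoring step, the commands below can be run on the primary after the heartbeat cables are connected. `diagnose sys ha checksum cluster` is not part of the procedure above; it is included as an additional check, on the assumption that matching checksums across units indicate a synchronized configuration.

```
# Shows the cluster members, their roles, and the configuration sync state
get system ha status

# Additional check (not in the procedure above):
# compare configuration checksums across the cluster units
diagnose sys ha checksum cluster
```

Connect the data cables only after both checks agree that the replacement is in sync.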