HA primary unit selection criteria
In an FGCP HA setup, cluster members negotiate to determine which unit becomes the primary unit when they connect to the HA cluster. Once a primary unit is identified, all other members become subordinate (or secondary) members.
When does primary unit selection occur?
Primary unit selection occurs whenever a new unit joins the HA cluster or the primary unit leaves the HA cluster. It also occurs whenever a monitored interface status changes.
This can occur when:
- Two or more units initially form a new HA cluster.
- A new unit joins an existing HA cluster.
- A device failover takes place, where the primary unit fails due to a device failure.
- A link failover takes place, where a monitored interface on any unit either fails or is restored.
Relevant configurations
Configurations that can impact HA primary unit selection are listed below:
Priority | A value between 0-255 assigned to this unit. A higher number indicates higher priority. By default, priority is 128. Priority value does not get synchronized to other HA members.
Monitor | Interface(s) to check for a physical link failure.
Override | Enable to prioritize priority value over uptime in HA primary unit selection. Disable to prioritize uptime over priority value. This setting is disabled by default.
From CLI:
config system ha
    set priority <integer>
    set monitor <interface list>
    set override {enable | disable}
end
From GUI:
On the System > HA page:
Primary unit selection criteria
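If the HA override setting is disabled on all cluster members, the primary unit is selected based on the following order:
Connected monitored interfaces > HA uptime > Priority > Serial number
If the HA override setting is enabled on all cluster members, the primary unit is selected based on the following order:
Connected monitored interfaces > Priority > HA uptime > Serial number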
For each criterion, if the value is the same on the units being compared, it is considered a tie and the next criterion is evaluated.
For the HA uptime criterion:
- If the difference between HA uptime is more than five (5) minutes (300 seconds), the cluster unit that has been operating longer becomes the primary unit.
- If the difference between HA uptime is less than five (5) minutes (300 seconds), the criterion is considered a tie.
- If a monitored interface fails on an HA unit, its HA uptime is reset to zero (0).
- If a cluster member restarts, its HA uptime is reset to zero (0).
In some documents, the terms MUPS and MPUS, which are based on the first letter of each criterion, are used to describe the order in which the criteria are considered during the HA primary unit selection process.
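The ordering described above can be sketched in a few lines of Python. This is an illustrative model only, not FortiOS code: the names `Unit` and `select_primary` are hypothetical, MUPS is read as Monitor, Uptime, Priority, Serial number and MPUS as Monitor, Priority, Uptime, Serial number, and the sketch assumes that more connected monitored interfaces wins, that the higher serial number (compared as a string) wins, and that an uptime difference of exactly the margin counts as a tie.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    serial: str
    monitored_up: int  # number of connected monitored interfaces
    ha_uptime: int     # HA uptime in seconds
    priority: int      # 0-255, a higher number means higher priority


def _cmp(x, y) -> int:
    """Return 1 if x wins, -1 if y wins, 0 on a tie."""
    return (x > y) - (x < y)


def _cmp_uptime(a: Unit, b: Unit, margin: int) -> int:
    # Only a difference larger than the margin decides the criterion.
    # Treating "exactly equal to the margin" as a tie is an assumption.
    if abs(a.ha_uptime - b.ha_uptime) <= margin:
        return 0
    return _cmp(a.ha_uptime, b.ha_uptime)


def select_primary(a: Unit, b: Unit, override: bool, margin: int = 300) -> Unit:
    """Pick the primary between two units; `override` is the cluster-wide setting."""
    monitor = _cmp(a.monitored_up, b.monitored_up)
    uptime = _cmp_uptime(a, b, margin)
    priority = _cmp(a.priority, b.priority)
    serial = _cmp(a.serial, b.serial)  # assumption: higher serial wins

    if override:
        order = (monitor, priority, uptime, serial)  # MPUS
    else:
        order = (monitor, uptime, priority, serial)  # MUPS

    # The first criterion that is not a tie decides the result.
    for result in order:
        if result != 0:
            return a if result > 0 else b
    return a  # identical on every criterion (not expected in practice)


old_low_prio = Unit("FG101FTK19xxxxx7", monitored_up=2, ha_uptime=1000, priority=100)
new_high_prio = Unit("FG101FTK19xxxxx8", monitored_up=2, ha_uptime=0, priority=200)
print(select_primary(old_low_prio, new_high_prio, override=False).serial)
print(select_primary(old_low_prio, new_high_prio, override=True).serial)
```

With override disabled, the longer-running unit is chosen because the uptime difference exceeds the margin. With override enabled, the higher-priority unit is chosen because priority is evaluated before uptime.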
Viewing the role of the unit
After HA primary unit selection has completed, you can view the HA role of each unit in various ways.
- In the GUI, go to System > HA to view the members in the cluster and the role for each member.
- From the CLI, run get system ha status. The role of each unit is displayed:
    # get system ha status
    …
    Primary: FG101FTK19xxxxx7, HA operating index = 0
    Secondary: FG101FTK19xxxxx8, HA operating index = 1
- Similarly, from the CLI, run diagnose sys ha status. The role of each unit is displayed.
Viewing how the primary unit was selected
You can use the get system ha status command to see how the primary unit was selected. The output of this command contains a section called Primary selected using that shows a history of how the primary unit was selected.
# get system ha status
HA Health Status:
    WARNING: FG101FTK19xxxxx7 has hbdev down;
    WARNING: FG101FTK19xxxxx8 has hbdev down;
Model: FortiGate-101F
Mode: HA A-A
Group Name: FGT_HA
Group ID: 0
Debug: 0
Cluster Uptime: 5 days 8h:30m:57s
Cluster state change time: 2024-04-12 02:25:05
Primary selected using:
    <2024/04/12 02:25:05> vcluster-1: FG101FTK19xxxxx7 is selected as the primary because its override priority is larger than peer member FG101FTK19xxxxx8.
    <2024/04/12 02:25:04> vcluster-1: FG101FTK19xxxxx7 is selected as the primary because it's the only member in the cluster.
    <2024/04/12 02:13:34> vcluster-1: FG101FTK19xxxxx7 is selected as the primary because its override priority is larger than peer member FG101FTK19xxxxx8.
    <2024/04/12 02:09:28> vcluster-1: FG101FTK19xxxxx7 is selected as the primary because it's the only member in the cluster.
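If you collect this output from several clusters, the selection history can be pulled out with a short script. The following Python sketch is only an illustration: the function name `parse_selection_history` is hypothetical, the pattern is derived solely from the sample entries shown above, and it assumes each history entry sits on its own line. Other FortiOS versions may word the entries differently.

```python
import re

# Pattern based only on the sample history entries shown above.
HISTORY_LINE = re.compile(
    r"<(?P<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})>\s+"
    r"(?P<vcluster>vcluster-\d+):\s+"
    r"(?P<serial>\S+) is selected as the primary because "
    r"(?P<reason>.+?)\.?\s*$"
)


def parse_selection_history(text: str) -> list[dict]:
    """Return one dict per 'Primary selected using' entry, newest first as printed."""
    events = []
    for line in text.splitlines():
        match = HISTORY_LINE.search(line)
        if match:
            events.append(match.groupdict())
    return events


SAMPLE = """\
Primary selected using:
    <2024/04/12 02:25:05> vcluster-1: FG101FTK19xxxxx7 is selected as the primary because its override priority is larger than peer member FG101FTK19xxxxx8.
    <2024/04/12 02:25:04> vcluster-1: FG101FTK19xxxxx7 is selected as the primary because it's the only member in the cluster.
"""

for event in parse_selection_history(SAMPLE):
    print(event["time"], event["serial"], "->", event["reason"])
```

Each returned dictionary carries the timestamp, the virtual cluster, the selected serial number, and the stated reason, which makes it easy to spot repeated failovers.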
Comparing the HA uptime between cluster members
You can use the CLI command diagnose sys ha dump-by group to display the age difference of the units in a cluster. This command also displays information about a number of HA-related parameters for each cluster unit.
For example, consider a cluster of two FortiGate units. Entering the diagnose sys ha dump-by group command from the primary unit CLI displays information similar to the following:
# diagnose sys ha dump-by group
...
vcluster_nr=1
vcluster-1: start_time=1712913904(2024-04-12 02:25:04), state/o/chg_time=2(work)/2(work)/1712913904(2024-04-12 02:25:04)
    pingsvr_flip_timeout/expire=3600s/0s
    'FG101FTK19xxxxx8': ha_prio/o=1/1, link_failure=0, pingsvr_failure=0, flag=0x00000000, mem_failover=0, uptime/reset_cnt=0/2
    'FG101FTK19xxxxx7': ha_prio/o=0/0, link_failure=0, pingsvr_failure=0, flag=0x00000001, mem_failover=0, uptime/reset_cnt=189/2
The last two lines of the output display status information about each cluster unit, including the uptime. The uptime is the age difference in seconds between the two units in the cluster.
In the example, the primary unit (FG101FTK19xxxxx7) shows an uptime of 189 and the subordinate unit (FG101FTK19xxxxx8) shows an uptime of 0, so the age difference between the two units is 189 seconds. The age difference is less than five (5) minutes (less than 300 seconds), so age has no effect on primary unit selection.
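The comparison above can be reproduced from the raw output. The Python sketch below is an illustration under stated assumptions: the helper names `unit_uptimes` and `uptime_is_tie` are hypothetical, the pattern is based only on the two unit lines in the sample output, and a difference exactly equal to the margin is treated as a tie.

```python
import re

# Matches the per-unit lines in the sample 'diagnose sys ha dump-by group' output.
UNIT_LINE = re.compile(
    r"'(?P<serial>[^']+)':\s+ha_prio/o=\d+/\d+.*?"
    r"uptime/reset_cnt=(?P<uptime>\d+)/(?P<resets>\d+)"
)


def unit_uptimes(text: str) -> dict[str, int]:
    """Map each unit serial number to its reported uptime value in seconds."""
    return {m["serial"]: int(m["uptime"]) for m in UNIT_LINE.finditer(text)}


def uptime_is_tie(text: str, margin: int = 300) -> bool:
    """True when the spread of uptime values does not exceed the margin."""
    values = unit_uptimes(text).values()
    return max(values) - min(values) <= margin


SAMPLE = """\
'FG101FTK19xxxxx8': ha_prio/o=1/1, link_failure=0, pingsvr_failure=0, flag=0x00000000, mem_failover=0, uptime/reset_cnt=0/2
'FG101FTK19xxxxx7': ha_prio/o=0/0, link_failure=0, pingsvr_failure=0, flag=0x00000001, mem_failover=0, uptime/reset_cnt=189/2
"""

print(unit_uptimes(SAMPLE))
print(uptime_is_tie(SAMPLE))
```

For the sample output the two values are 0 and 189, so the difference stays within the default 300-second margin and the uptime criterion is a tie.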
Changing the cluster age difference margin
You can change the cluster age difference margin using the following command:
config system ha
    set ha-uptime-diff-margin <margin>
end
Where <margin> can be from 1 to 65535 seconds (default = 300).
Resetting the uptime of a unit
For debugging purposes, you may want to reset an HA member’s uptime without restarting the unit or changing the status of a monitored interface.
To manually change the uptime:
# diagnose sys ha reset-uptime
The command resets the HA age internally and does not affect the uptime displayed for cluster units by the diagnose sys ha dump-by all-vcluster command. It also does not affect the time displayed on the Dashboard or in the cluster members list.