Primary unit selection with override enabled

6.0.0
HA override is disabled by default. When override is disabled, a cluster still renegotiates when an event occurs that affects primary unit selection, such as a change in device priority or a disconnected monitored interface.

Note: For a virtual cluster configuration, override is enabled by default for both virtual clusters when you enable virtual cluster 2. For more information, see Virtual clustering.

In most cases you should keep override disabled to reduce how often the cluster negotiates. Frequent negotiations may cause frequent traffic interruptions.

However, if you want to make sure that the same cluster unit always operates as the primary unit, and you are less concerned about frequent cluster negotiation, you can set its device priority higher than that of the other cluster units and enable override.

To enable override, connect to each cluster unit CLI (using the execute ha manage command) and use the config system ha CLI command to enable override.
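As a minimal sketch of this step, the following commands enable override on the unit you are logged in to, then connect to another cluster unit and repeat the change there. The cluster index 1 is a hypothetical value; entering execute ha manage ? lists the indexes available in your cluster, and some FortiOS versions expect additional arguments such as an administrator name. Lines beginning with # are annotations for the reader, not commands to type.

```
# On the unit you are currently logged in to
config system ha
    set override enable
end

# Connect to another cluster unit (index 1 is an assumed example)
execute ha manage 1

# Repeat the change on that unit
config system ha
    set override enable
end
```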

For override to be effective, you must also set the device priority highest on the cluster unit that you want to always be the primary unit. To increase the device priority, from the CLI use the config system ha command and increase the value of the priority keyword to a number higher than the default priority of 128.
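For illustration, a sketch of raising the device priority on the unit that should always be the primary unit. The value 200 is only an example; any value higher than the priorities of the other cluster units has the same effect.

```
config system ha
    set priority 200
end
```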

You can also increase the device priority from the GUI by going to System > HA. Select Edit for the cluster unit whose priority you want to change and set Device Priority to a number higher than 128.

Note: The override setting and device priority value are not synchronized to all cluster units. You must enable override and adjust device priority manually and separately for each cluster unit.

With override enabled, the cluster unit with the highest device priority always becomes the primary unit. Whenever an event occurs that may affect primary unit selection, the cluster negotiates. For example, when override is enabled a cluster renegotiates when you change the device priority of any cluster unit or when you add a new unit to a cluster.

Override and primary unit selection

Enabling override changes the order of primary unit selection. If override is enabled, primary unit selection considers device priority before age and serial number. This means that if you set the device priority higher on one cluster unit and enable override, this cluster unit becomes the primary unit even if its age and serial number are lower than those of other cluster units.

As when override is disabled, primary unit selection with override enabled checks for connected monitored interfaces first. So if interface monitoring is enabled, the cluster unit with the most disconnected monitored interfaces cannot become the primary unit, even if the unit has the highest device priority.

If all monitored interfaces are connected (or interface monitoring is not enabled) and the device priority of all cluster units is the same, then age and serial number determine primary unit selection.
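Because monitored interfaces are checked before device priority, it can help to see where interface monitoring is configured. The following is a sketch only; port1 and port2 are hypothetical interface names, so substitute the interfaces that carry traffic in your cluster.

```
config system ha
    # port1 and port2 are assumed example interface names
    set monitor "port1" "port2"
end
```

The line beginning with # is an annotation for the reader, not a command to type.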

Controlling primary unit selection using device priority and override

To configure one cluster unit to always become the primary unit, set its device priority higher than the device priorities of the other cluster units and enable override on all cluster units.

Using this configuration, when the cluster is operating normally the primary unit is always the unit with the highest device priority. If the primary unit fails, the cluster renegotiates to select another cluster unit to be the primary unit. If the failed primary unit recovers, starts up again, and rejoins the cluster, the cluster renegotiates because override is enabled. Because the restarted unit has the highest device priority, it once again becomes the primary unit.

In the same situation with override disabled, because the age of the failed primary unit is lower than the age of the other cluster units, when the failed primary unit rejoins the cluster it does not become the primary unit. Instead, even though the failed primary unit may have the highest device priority, it becomes a subordinate unit because its age is lower than the age of all the other cluster units.

Points to remember about primary unit selection when override is enabled

Some points to remember about primary unit selection when override is enabled:

  • The FGCP compares primary unit selection criteria in the following order: Failed Monitored Interfaces > Device Priority > Age > Serial number. The selection process stops at the first criterion that selects one cluster unit.
  • Negotiation and primary unit selection are triggered whenever an event occurs that may affect primary unit selection. For example, negotiation occurs when you change the device priority, when you add a new unit to a cluster, if a cluster unit fails, or if a monitored interface fails.
  • Device priority is considered before age. Otherwise, age is handled the same way as when override is disabled.

Configuration changes can be lost if override is enabled

In some cases, when override is enabled and you make configuration changes to an HA cluster these changes can be lost. For example, consider the following sequence:

  1. A cluster of two FortiGates is operating with override enabled.
    • FGT-A: Primary unit with device priority 200 and with override enabled
    • FGT-B: Subordinate unit with device priority 100 and with override disabled
    • If both units are operating, FGT-A always becomes the primary unit because FGT-A has the highest device priority.
  2. FGT-A fails and FGT-B becomes the new primary unit.
  3. The administrator makes configuration changes to the cluster.
    The configuration changes are made to FGT-B because FGT-B is operating as the primary unit. These configuration changes are not synchronized to FGT-A because FGT-A is not operating.
  4. FGT-A is restored and starts up again.
  5. The cluster renegotiates and FGT-A becomes the new primary unit.
  6. The cluster recognizes that the configurations of FGT-A and FGT-B are not the same.
  7. The configuration of FGT-A is synchronized to FGT-B.
    The configuration is always synchronized from the primary unit to the subordinate units.
  8. The cluster is now operating with the same configuration as FGT-A. The configuration changes made to FGT-B have been lost.

The solution

When override is enabled, you can prevent configuration changes from being lost by doing the following:

  • Verify that all cluster units are operating before making configuration changes (from the GUI go to System > HA to view the cluster members list or from the FortiOS CLI enter get system ha status).
  • Make sure the device priority of the primary unit is set higher than the device priorities of all other cluster units before making configuration changes.
  • Disable override either permanently or until all configuration changes have been made and synchronized to all cluster units.
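The checks above can be sketched from the CLI as follows. The get system ha status command is the one named in the list; diagnose sys ha checksum cluster is included on the assumption that it is available in your FortiOS build, as a way to compare configuration checksums across units before and after making changes. Because the override setting is not synchronized, the disable step would need to be repeated on each cluster unit. Lines beginning with # are annotations for the reader, not commands to type.

```
# Confirm that all cluster units are operating
get system ha status

# Assumed to be available: compare configuration checksums across units
diagnose sys ha checksum cluster

# Disable override until the changes have been made and synchronized
config system ha
    set override disable
end
```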

Override and disconnecting a unit from a cluster

A similar scenario to that described above may occur when override is enabled and you use the Disconnect from Cluster option from the GUI or the execute ha disconnect command from the CLI to disconnect a cluster unit from a cluster.

Configuration changes made to the cluster can be lost when you reconnect the disconnected unit to the cluster. You should make sure that the device priority of the disconnected unit is lower than the device priority of the current primary unit. Otherwise, when the disconnected unit joins the cluster, if override is enabled, the cluster renegotiates and the disconnected unit may become the primary unit. If this happens, the configuration of the disconnected unit is synchronized to all other cluster units and any configuration changes made between when the unit was disconnected and reconnected are lost.
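As a sketch of the precaution described above, you could lower the device priority on the disconnected unit before reconnecting it. The value 50 is a hypothetical example; it only needs to be lower than the device priority of the current primary unit.

```
config system ha
    set priority 50
end
```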

Delaying how quickly the primary unit rejoins the cluster when override is enabled

In some cases, when override is enabled and the unit designated to be the primary unit rejoins the cluster after it has rebooted, it will become the primary unit too soon and cause traffic disruption. This can happen, for example, if one of the FortiGate interfaces gets its address using PPPoE. If the backup unit is operating as the primary unit and processing traffic, when the primary unit comes up it may need a short time to get a new IP address from the PPPoE server. If the primary unit takes over the cluster before it has an IP address, traffic will be disrupted until the primary unit gets its address.

You can resolve this problem by using the following command to add a wait time. In this example the wait time is 10 seconds. The wait time range is 0 to 3600 seconds and the default wait time is 0 seconds.

config system ha
    set override-wait-time 10
end

With this wait time configured, after the primary unit is up and running it has 10 seconds to synchronize sessions, get IP addresses from PPPoE and DHCP servers, and so on. After 10 seconds, the primary unit sends gratuitous ARP packets and all traffic to the cluster is sent to the new primary unit. You can adjust the wait time according to the conditions on your network.

The override wait time can only be configured when HA override is enabled, and it is only activated after a unit boots up. For example, it is not activated after a failover triggered by a monitored interface, or when HA is changed from standalone mode to A-P or A-A mode.

During the override wait time, after the FortiGate has booted up, its HA priority is effectively zero. After the wait time expires, the FortiGate resumes its configured HA priority.
