Handbook

Synchronizing the configuration

6.0.0
Doc ID: 4afb0436-a998-11e9-81a4-00505692583a:641401

Synchronizing the configuration

The FGCP uses a combination of incremental and periodic synchronization to make sure that the configuration of all cluster units is synchronized to that of the primary unit.

The following settings are not synchronized between cluster units:

  • The FortiGate host name
  • GUI Dashboard widgets
  • HA override
  • HA device priority
  • The virtual cluster priority
  • The HA priority setting for a ping server (or dead gateway detection) configuration
  • The system interface settings of the HA reserved management interface
  • The HA default route for the reserved management interface, set using the ha-mgmt-interface-gateway option of the config system ha command

Most subscriptions and licenses are not synchronized, as each FortiGate must be licensed individually. FortiToken Mobile licenses are an exception: they are registered to the primary unit and synchronized to the subordinate units.

The primary unit synchronizes all other configuration settings, including the other HA configuration settings.

All synchronization activity takes place over the HA heartbeat link using TCP/703 and UDP/703 packets.

Disabling automatic configuration synchronization

In some cases, you may want to use the following command to disable automatic synchronization of the primary unit configuration to all cluster units:

config system ha

set sync-config disable

end

When this option is disabled, the cluster no longer synchronizes configuration changes. If a device failure occurs, the new primary unit may not have the same configuration as the failed primary unit. As a result, the new primary unit may process sessions differently or may not function on the network in the same way.

In most cases you should not disable automatic configuration synchronization. However, if you have disabled this feature you can use the execute ha synchronize command to manually synchronize a subordinate unit’s configuration to that of the primary unit.

You must enter execute ha synchronize commands from the subordinate unit that you want to synchronize with the primary unit. Use the execute ha manage command to access a subordinate unit CLI.

For example, to access the first subordinate unit and force a synchronization at any time, even if automatic synchronization is disabled, enter:

execute ha manage 0

execute ha synchronize start

You can use the following command to stop a synchronization that is in progress.

execute ha synchronize stop

Incremental synchronization

When you log into the cluster GUI or CLI to make configuration changes, you are actually logging into the primary unit. All of your configuration changes are first made to the primary unit. Incremental synchronization then immediately synchronizes these changes to all of the subordinate units.

When you log into a subordinate unit CLI (for example using execute ha manage) all of the configuration changes that you make to the subordinate unit are also immediately synchronized to all cluster units, including the primary unit, using the same process.

Incremental synchronization also synchronizes other dynamic configuration information such as the DHCP server address lease database, routing table updates, IPsec SAs, MAC address tables, and so on. See DHCP and PPPoE compatibility for more information about DHCP server address lease synchronization and Synchronizing kernel routing tables for information about routing table updates.

Whenever a change is made to a cluster unit configuration, incremental synchronization sends the same configuration change to all other cluster units over the HA heartbeat link. An HA synchronization process running on each cluster unit receives the configuration change and applies it to the cluster unit. The HA synchronization process makes the configuration change by entering a CLI command that appears to be entered by the administrator who made the original configuration change.

Synchronization takes place silently, and no log messages are recorded about the synchronization activity. However, log messages can be recorded by the cluster units when the synchronization process enters CLI commands. You can see these log messages on the subordinate units if you enable event logging, set the minimum severity level to Information, and then check the event log messages written by the cluster units when you make a configuration change.

You can also see these log messages on the primary unit if you make configuration changes from a subordinate unit.

Periodic synchronization

Incremental synchronization makes sure that as an administrator makes configuration changes, the configurations of all cluster units remain the same. However, a number of factors could cause one or more cluster units to go out of sync with the primary unit. For example, if you add a new unit to a functioning cluster, the configuration of this new unit will not match the configuration of the other cluster units. It is not practical to use incremental synchronization to change the configuration of the new unit.

Periodic synchronization is a mechanism that looks for synchronization problems and fixes them. Every minute the cluster compares the configuration file checksum of the primary unit with the configuration file checksums of each of the subordinate units. If all subordinate unit checksums are the same as the primary unit checksum, all cluster units are considered synchronized.

If one or more of the subordinate unit checksums does not match the primary unit checksum, the subordinate unit configuration is considered out of sync with the primary unit. The checksum of the out-of-sync subordinate unit is checked again every 15 seconds, in case the configurations are out of sync only because an incremental synchronization sequence has not completed. If the checksums still do not match after 5 checks, the out-of-sync subordinate unit retrieves the configuration from the primary unit, reloads its configuration, and resumes operating as a subordinate unit with the same configuration as the primary unit.

The configuration of the subordinate unit is reset in this way because when a subordinate unit configuration gets out of sync with the primary unit configuration there is no efficient way to determine what the configuration differences are and to correct them. Resetting the subordinate unit configuration becomes the most efficient way to resynchronize the subordinate unit.
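The periodic check described above amounts to a small state machine: compare checksums every minute, re-check an out-of-sync unit every 15 seconds, and pull the full configuration after 5 failed checks. The following Python sketch illustrates that logic only; it is not FortiOS code, and get_checksum and resync are hypothetical stand-ins for the checksum exchange and configuration retrieval that happen over the heartbeat link.

```python
import time

CHECK_INTERVAL = 60      # the full cluster comparison runs every minute (caller's schedule)
RECHECK_INTERVAL = 15    # an out-of-sync unit is re-checked every 15 seconds
MAX_RECHECKS = 5         # after 5 failed checks the unit reloads its configuration

def periodic_sync_check(get_checksum, resync):
    """Illustrative sketch of the periodic synchronization decision logic.

    get_checksum(unit) -> str and resync(unit) -> None are hypothetical
    stand-ins, not real FortiOS interfaces.
    """
    primary = get_checksum("primary")
    if get_checksum("subordinate") == primary:
        return "in sync"
    # Out of sync: re-check in case an incremental synchronization
    # sequence simply has not completed yet.
    for _ in range(MAX_RECHECKS):
        time.sleep(RECHECK_INTERVAL)
        if get_checksum("subordinate") == primary:
            return "in sync"
    # Still mismatched after 5 checks: retrieve the whole configuration
    # from the primary unit and reload it.
    resync("subordinate")
    return "resynchronized"
```

As the text notes, the full-reload fallback exists because there is no efficient way to compute and apply only the configuration differences.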

Synchronization requires that all cluster units run the same FortiOS firmware build. If some cluster units are running different firmware builds, then unstable cluster operation may occur and the cluster units may not be able to synchronize correctly.

note icon Re-installing the firmware build running on the primary unit forces the primary unit to upgrade all cluster units to the same firmware build.

Console messages when configuration synchronization succeeds

When a cluster first forms, or when a new unit is added to a cluster as a subordinate unit, the following messages appear on the CLI console to indicate that the unit joined the cluster and had its configuration synchronized with the primary unit.

slave's configuration is not in sync with master's, sequence:0

slave's configuration is not in sync with master's, sequence:1

slave's configuration is not in sync with master's, sequence:2

slave's configuration is not in sync with master's, sequence:3

slave's configuration is not in sync with master's, sequence:4

slave starts to sync with master

logout all admin users

slave succeeded to sync with master

Console messages when configuration synchronization fails

If you connect to the console of a subordinate unit that is out of synchronization with the primary unit, messages similar to the following are displayed.

slave is not in sync with master, sequence:0. (type 0x3)

slave is not in sync with master, sequence:1. (type 0x3)

slave is not in sync with master, sequence:2. (type 0x3)

slave is not in sync with master, sequence:3. (type 0x3)

slave is not in sync with master, sequence:4. (type 0x3)

global compared not matched

If synchronization problems occur, the console message sequence may be repeated over and over again. The messages all include a type value (in the example, type 0x3). The type value can help Fortinet Support diagnose the synchronization problem.

HA out of sync object messages and the configuration objects that they reference
Out of Sync Message Configuration Object
HA_SYNC_SETTING_CONFIGURATION = 0x03 /data/config
HA_SYNC_SETTING_AV = 0x10
HA_SYNC_SETTING_VIR_DB = 0x11 /etc/vir
HA_SYNC_SETTING_SHARED_LIB = 0x12 /data/lib/libav.so
HA_SYNC_SETTING_SCAN_UNIT = 0x13 /bin/scanunitd
HA_SYNC_SETTING_IMAP_PRXY = 0x14 /bin/imapd
HA_SYNC_SETTING_SMTP_PRXY = 0x15 /bin/smtp
HA_SYNC_SETTING_POP3_PRXY = 0x16 /bin/pop3
HA_SYNC_SETTING_HTTP_PRXY = 0x17 /bin/thttp
HA_SYNC_SETTING_FTP_PRXY = 0x18 /bin/ftpd
HA_SYNC_SETTING_FCNI = 0x19 /etc/fcni.dat
HA_SYNC_SETTING_FDNI = 0x1a /etc/fdnservers.dat
HA_SYNC_SETTING_FSCI = 0x1b /etc/sci.dat
HA_SYNC_SETTING_FSAE = 0x1c /etc/fsae_adgrp.cache
HA_SYNC_SETTING_IDS = 0x20 /etc/ids.rules
HA_SYNC_SETTING_IDSUSER_RULES = 0x21 /etc/idsuser.rules
HA_SYNC_SETTING_IDSCUSTOM = 0x22
HA_SYNC_SETTING_IDS_MONITOR = 0x23 /bin/ipsmonitor
HA_SYNC_SETTING_IDS_SENSOR = 0x24 /bin/ipsengine
HA_SYNC_SETTING_NIDS_LIB = 0x25 /data/lib/libips.so
HA_SYNC_SETTING_WEBLISTS = 0x30
HA_SYNC_SETTING_CONTENTFILTER = 0x31 /data/cmdb/webfilter.bword
HA_SYNC_SETTING_URLFILTER = 0x32 /data/cmdb/webfilter.urlfilter
HA_SYNC_SETTING_FTGD_OVRD = 0x33 /data/cmdb/webfilter.fgtd-ovrd
HA_SYNC_SETTING_FTGD_LRATING = 0x34 /data/cmdb/webfilter.fgtd-ovrd
HA_SYNC_SETTING_EMAILLISTS = 0x40
HA_SYNC_SETTING_EMAILCONTENT = 0x41 /data/cmdb/spamfilter.bword
HA_SYNC_SETTING_EMAILBWLIST = 0x42 /data/cmdb/spamfilter.emailbwl
HA_SYNC_SETTING_IPBWL = 0x43 /data/cmdb/spamfilter.ipbwl
HA_SYNC_SETTING_MHEADER = 0x44 /data/cmdb/spamfilter.mheader
HA_SYNC_SETTING_RBL = 0x45 /data/cmdb/spamfilter.rbl
HA_SYNC_SETTING_CERT_CONF = 0x50 /etc/cert/cert.conf
HA_SYNC_SETTING_CERT_CA = 0x51 /etc/cert/ca
HA_SYNC_SETTING_CERT_LOCAL = 0x52 /etc/cert/local
HA_SYNC_SETTING_CERT_CRL = 0x53 /etc/cert/crl
HA_SYNC_SETTING_DB_VER = 0x55
HA_GET_DETAIL_CSUM = 0x71
HA_SYNC_CC_SIG = 0x75 /etc/cc_sig.dat
HA_SYNC_CC_OP = 0x76 /etc/cc_op
HA_SYNC_CC_MAIN = 0x77 /etc/cc_main
HA_SYNC_FTGD_CAT_LIST = 0x7a /migadmin/webfilter/ublock/ftgd/data/

Comparing checksums of cluster units

You can use the diagnose sys ha checksum show command to compare the configuration checksums of all cluster units. The output of this command shows checksums labeled global and all as well as checksums for each of the VDOMs including the root VDOM. The get system ha-nonsync-csum command can be used to display similar information; however, this command is intended to be used by FortiManager.

The primary unit and subordinate unit checksums should be the same. If they are not you can use the execute ha synchronize start command to force a synchronization.
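Because the per-zone values should match between units, comparing two saved command outputs can be automated. This is a minimal Python sketch (illustrative only, not a FortiOS tool) that parses the checksum section of `diagnose sys ha checksum show` output in the format shown below and reports zones whose values differ:

```python
def parse_checksums(output: str) -> dict:
    """Return {zone: checksum} from the section after the 'checksum' marker."""
    zones = {}
    in_checksum = False
    for line in output.splitlines():
        line = line.strip()
        if line == "checksum":
            in_checksum = True
            continue
        if in_checksum and ": " in line:
            zone, value = line.split(": ", 1)
            zones[zone] = value
    return zones

def out_of_sync_zones(primary_out: str, subordinate_out: str) -> list:
    """List zones (global, root, per-VDOM, all) whose checksums differ."""
    p = parse_checksums(primary_out)
    s = parse_checksums(subordinate_out)
    return [zone for zone in p if s.get(zone) != p[zone]]
```

Any zone the helper reports would be a candidate for `execute ha synchronize start` on the affected subordinate unit.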

The following command output is for the primary unit of a cluster that does not have multiple VDOMs enabled:

diagnose sys ha checksum show

is_manage_master()=1, is_root_master()=1

debugzone

global: a0 7f a7 ff ac 00 d5 b6 82 37 cc 13 3e 0b 9b 77

root: 43 72 47 68 7b da 81 17 c8 f5 10 dd fd 6b e9 57

all: c5 90 ed 22 24 3e 96 06 44 35 b6 63 7c 84 88 d5

checksum

global: a0 7f a7 ff ac 00 d5 b6 82 37 cc 13 3e 0b 9b 77

root: 43 72 47 68 7b da 81 17 c8 f5 10 dd fd 6b e9 57

all: c5 90 ed 22 24 3e 96 06 44 35 b6 63 7c 84 88 d5

The following command output is for a subordinate unit of the same cluster:

diagnose sys ha checksum show

is_manage_master()=0, is_root_master()=0

debugzone

global: a0 7f a7 ff ac 00 d5 b6 82 37 cc 13 3e 0b 9b 77

root: 43 72 47 68 7b da 81 17 c8 f5 10 dd fd 6b e9 57

all: c5 90 ed 22 24 3e 96 06 44 35 b6 63 7c 84 88 d5

checksum

global: a0 7f a7 ff ac 00 d5 b6 82 37 cc 13 3e 0b 9b 77

root: 43 72 47 68 7b da 81 17 c8 f5 10 dd fd 6b e9 57

all: c5 90 ed 22 24 3e 96 06 44 35 b6 63 7c 84 88 d5

The following example shows using this command for the primary unit of a cluster with multiple VDOMs. Two VDOMs named test and Eng_vdm have been added.

From the primary unit:

config global

diagnose sys ha checksum show

is_manage_master()=1, is_root_master()=1

debugzone

global: 65 75 88 97 2d 58 1b bf 38 d3 3d 52 5b 0e 30 a9

test: a5 16 34 8c 7a 46 d6 a4 1e 1f c8 64 ec 1b 53 fe

root: 3c 12 45 98 69 f2 d8 08 24 cf 02 ea 71 57 a7 01

Eng_vdm: 64 51 7c 58 97 79 b1 b3 b3 ed 5c ec cd 07 74 09

all: 30 68 77 82 a1 5d 13 99 d1 42 a3 2f 9f b9 15 53

checksum

global: 65 75 88 97 2d 58 1b bf 38 d3 3d 52 5b 0e 30 a9

test: a5 16 34 8c 7a 46 d6 a4 1e 1f c8 64 ec 1b 53 fe

root: 3c 12 45 98 69 f2 d8 08 24 cf 02 ea 71 57 a7 01

Eng_vdm: 64 51 7c 58 97 79 b1 b3 b3 ed 5c ec cd 07 74 09

all: 30 68 77 82 a1 5d 13 99 d1 42 a3 2f 9f b9 15 53

From the subordinate unit:

config global

diagnose sys ha checksum show

is_manage_master()=0, is_root_master()=0

debugzone

global: 65 75 88 97 2d 58 1b bf 38 d3 3d 52 5b 0e 30 a9

test: a5 16 34 8c 7a 46 d6 a4 1e 1f c8 64 ec 1b 53 fe

root: 3c 12 45 98 69 f2 d8 08 24 cf 02 ea 71 57 a7 01

Eng_vdm: 64 51 7c 58 97 79 b1 b3 b3 ed 5c ec cd 07 74 09

all: 30 68 77 82 a1 5d 13 99 d1 42 a3 2f 9f b9 15 53

checksum

global: 65 75 88 97 2d 58 1b bf 38 d3 3d 52 5b 0e 30 a9

test: a5 16 34 8c 7a 46 d6 a4 1e 1f c8 64 ec 1b 53 fe

root: 3c 12 45 98 69 f2 d8 08 24 cf 02 ea 71 57 a7 01

Eng_vdm: 64 51 7c 58 97 79 b1 b3 b3 ed 5c ec cd 07 74 09

all: 30 68 77 82 a1 5d 13 99 d1 42 a3 2f 9f b9 15 53

How to diagnose HA out of sync messages

This section describes how to use the diagnose sys ha checksum show and diagnose debug commands to diagnose the cause of HA out of sync messages.

If HA synchronization is not successful, use the following procedures on each cluster unit to find the cause.

To determine why HA synchronization does not occur
  1. Connect to each cluster unit CLI by connecting to the console port.
  2. Enter the following commands to enable debugging and display HA out of sync messages.

    diagnose debug enable

    diagnose debug console timestamp enable

    diagnose debug application hatalk -1

    diagnose debug application hasync -1

    Collect the console output and compare the out of sync messages with the information in the table HA out of sync object messages and the configuration objects that they reference.

  3. Enter the following commands to turn off debugging.

    diagnose debug disable

    diagnose debug reset

To determine what part of the configuration is causing the problem

If the previous procedure displays messages that include sync object 0x03 (HA_SYNC_SETTING_CONFIGURATION = 0x03), there is a synchronization problem with the configuration. Use the following steps to determine the part of the configuration that is causing the problem.

If your cluster consists of two cluster units, use this procedure to capture the configuration checksums for each unit. If your cluster consists of more than two cluster units, repeat this procedure for all cluster units that returned messages that include the 0x03 sync object.

  1. Connect to each cluster unit CLI by connecting to the console port.
  2. Enter the following command to enable debug output:

    diagnose debug enable

  3. Enter the following command to stop HA synchronization.

    execute ha sync stop

  4. Enter the following command to display configuration checksums.

    diagnose sys ha checksum show global

  5. Copy the output to a text file.
  6. Repeat for all affected units.
  7. Compare the text file from the primary unit with the text file from each cluster unit to find the checksums that do not match.

    You can use a diff function to compare text files.

  8. Repeat for the root VDOM:

    diagnose sys ha checksum show root

  9. Repeat for all VDOMS (if multiple VDOM configuration is enabled):

    diagnose sys ha checksum show <vdom-name>

  10. You can also use the grep option to display checksums for only parts of the configuration.

    For example, to display system-related configuration checksums in the root VDOM or log-related checksums in the global configuration:

    diagnose sys ha checksum show root | grep system

    diagnose sys ha checksum show global | grep log

    Generally, the first non-matching checksum is the cause of the synchronization problem.

  11. Attempt to remove/change the part of the configuration that is causing the problem. You can do this by making configuration changes from the primary unit or subordinate unit CLI.
  12. Enter the following commands to restart HA synchronization and stop debugging:

    execute ha sync start

    diagnose debug disable

    diagnose debug reset
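Step 7 of the procedure above suggests using a diff function to compare the saved text files. A minimal Python sketch using the standard library (the helper name and the string-based interface are illustrative):

```python
import difflib

def checksum_diff(primary_text: str, subordinate_text: str) -> list:
    """Unified diff of two saved checksum captures.

    Per the procedure above, the first changed line is usually the part
    of the configuration causing the synchronization problem.
    """
    return list(difflib.unified_diff(
        primary_text.splitlines(),
        subordinate_text.splitlines(),
        fromfile="primary", tofile="subordinate", lineterm=""))
```

Lines prefixed with `-`/`+` in the output mark checksums that differ between the primary and subordinate captures; identical lines are suppressed except for context.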

Recalculating the checksums to resolve out of sync messages

Sometimes an error can occur when checksums are being calculated by the cluster. As a result of this calculation error the CLI console could display out of sync error messages even though the cluster is otherwise operating normally. You can also sometimes see checksum calculation errors in diagnose sys ha checksum command output when the checksums listed in the debugzone output don’t match the checksums in the checksum part of the output.

One solution to this problem could be to re-calculate the checksums. The re-calculated checksums should match and the out of sync error messages should stop appearing.

You can use the following command to re-calculate HA checksums:

diagnose sys ha checksum recalculate [<vdom-name> | global]

Just entering the command without options recalculates all checksums. You can specify a VDOM name to just recalculate the checksums for that VDOM. You can also enter global to recalculate the global checksum.

Determining what is causing a configuration synchronization problem

There are twenty-five FortiOS modules that have their configurations synchronized. It can be difficult to find the cause of a synchronization problem with so much data to analyze. You can use the following diagnose commands to more easily find modules that may be causing synchronization problems.

diagnose sys ha hasync-stats {all | most-recent [<seconds>] | by-object [<number>]}

all displays the synchronization activity for all modules since the hasync process started running (usually, since the cluster started up).

most-recent [<seconds>] displays the most recently occurring synchronization events. You can include a time in seconds to display recent events that occurred during the time interval. If you don't include the number of seconds, the command displays the most recent events in the last 5 seconds. This option can be used to determine the module or modules that are currently synchronizing or attempting to synchronize. If no modules are currently synchronizing, the command just displays the most recent synchronization events.

by-object [<number>] displays the synchronization activity of a specific module, where <number> is the module number in the range 1 to 25. To display a list of all 25 modules and their numbers, enter:

diagnose sys ha hasync-stats by-object ?

To display the most recent activity, enter:

diagnose sys ha hasync-stats most-recent 10
current-time/jiffies=2018-03-28 13:01:42/1148242:

hasync-obj=2(arp):
    epoll_handler=1(ev_arp_handler): start=1522256500.354400(2018-03-28 13:01:40), end=1522256500.354406(2018-03-28 13:01:40), total=0.000006/1699
hasync-obj=5(config):
    timer=0(check_sync_status), add=1141764(2018-03-28 13:01:26), expire=1142764(2018-03-28 13:01:36), end=1142764(2018-03-28 13:01:36), del=0(), total_call=1143
hasync-obj=8(time):
    obj_handler=0(packet): start=1522256497.851550(2018-03-28 13:01:37), end=1522256497.851570(2018-03-28 13:01:37), total=0.000020/381
    timer=0(sync_timer), add=1140106(2018-03-28 13:01:10), expire=1143106(2018-03-28 13:01:40), end=1143106(2018-03-28 13:01:40), del=0(), total_call=381
hasync-obj=21(hastats):
    obj_handler=0(packet): start=1522256499.760934(2018-03-28 13:01:39), end=1522256499.760936(2018-03-28 13:01:39), total=0.000002/2285
    timer=0(hastats_timer), add=1142556(2018-03-28 13:01:34), expire=1143056(2018-03-28 13:01:39), end=1143056(2018-03-28 13:01:39), del=0(), total_call=2286

The last few lines of this output show activity for the hastats module, which is module 21. You can use the following command to see more information about synchronization activity for this module:

diagnose sys ha hasync-stats by-object 21

Synchronizing the configuration

The FGCP uses a combination of incremental and periodic synchronization to make sure that the configuration of all cluster units is synchronized to that of the primary unit.

The following settings are not synchronized between cluster units:

  • The FortiGate host name
  • GUI Dashboard widgets
  • HA override
  • HA device priority
  • The virtual cluster priority
  • The HA priority setting for a ping server (or dead gateway detection) configuration
  • The system interface settings of the HA reserved management interface
  • The HA default route for the reserved management interface, set using the ha-mgmt-interface-gateway option of the config system ha command

Most subscriptions and licenses are not synchronized, as each FortiGate must be licensed individually. FortiToken Mobile is an exception; they are registered to the primary unit and synchronized to the slaves.

The primary unit synchronizes all other configuration settings, including the other HA configuration settings.

All synchronization activity takes place over the HA heartbeat link using TCP/703 and UDP/703 packets.

Disabling automatic configuration synchronization

In some cases you may want to use the following command to disable automatic synchronization of the primary unit configuration to all cluster units.

config system ha

set sync-config disable

end

When this option is disabled the cluster no longer synchronizes configuration changes. If a device failure occurs, the new primary unit may not have the same configuration as the failed primary unit. As a result, the new primary unit may process sessions differently or may not function on the network in the same way.

In most cases you should not disable automatic configuration synchronization. However, if you have disabled this feature you can use the execute ha synchronize command to manually synchronize a subordinate unit’s configuration to that of the primary unit.

You must enter execute ha synchronize commands from the subordinate unit that you want to synchronize with the primary unit. Use the execute ha manage command to access a subordinate unit CLI.

For example, to access the first subordinate unit and force a synchronization at any time, even if automatic synchronization is disabled enter:

execute ha manage 0

execute ha synchronize start

You can use the following command to stop a synchronization that is in progress.

execute ha synchronize stop

Incremental synchronization

When you log into the cluster GUI or CLI to make configuration changes, you are actually logging into the primary unit. All of your configuration changes are first made to the primary unit. Incremental synchronization then immediately synchronizes these changes to all of the subordinate units.

When you log into a subordinate unit CLI (for example using execute ha manage) all of the configuration changes that you make to the subordinate unit are also immediately synchronized to all cluster units, including the primary unit, using the same process.

Incremental synchronization also synchronizes other dynamic configuration information such as the DHCP server address lease database, routing table updates, IPsec SAs, MAC address tables, and so on. See DHCP and PPPoE compatability for more information about DHCP server address lease synchronization and Synchronizing kernel routing tables for information about routing table updates.

Whenever a change is made to a cluster unit configuration, incremental synchronization sends the same configuration change to all other cluster units over the HA heartbeat link. An HA synchronization process running on the each cluster unit receives the configuration change and applies it to the cluster unit. The HA synchronization process makes the configuration change by entering a CLI command that appears to be entered by the administrator who made the configuration change in the first place.

Synchronization takes place silently, and no log messages are recorded about the synchronization activity. However, log messages can be recorded by the cluster units when the synchronization process enters CLI commands. You can see these log messages on the subordinate units if you enable event logging and set the minimum severity level to Information and then check the event log messages written by the cluster units when you make a configuration change.

You can also see these log messages on the primary unit if you make configuration changes from a subordinate unit.

Periodic synchronization

Incremental synchronization makes sure that as an administrator makes configuration changes, the configurations of all cluster units remain the same. However, a number of factors could cause one or more cluster units to go out of sync with the primary unit. For example, if you add a new unit to a functioning cluster, the configuration of this new unit will not match the configuration of the other cluster units. Its not practical to use incremental synchronization to change the configuration of the new unit.

Periodic synchronization is a mechanism that looks for synchronization problems and fixes them. Every minute the cluster compares the configuration file checksum of the primary unit with the configuration file checksums of each of the subordinate units. If all subordinate unit checksums are the same as the primary unit checksum, all cluster units are considered synchronized.

If one or more of the subordinate unit checksums is not the same as the primary unit checksum, the subordinate unit configuration is considered out of sync with the primary unit. The checksum of the out of sync subordinate unit is checked again every 15 seconds. This re-checking occurs in case the configurations are out of sync because an incremental configuration sequence has not completed. If the checksums do not match after 5 checks the subordinate unit that is out of sync retrieves the configuration from the primary unit. The subordinate unit then reloads its configuration and resumes operating as a subordinate unit with the same configuration as the primary unit.

The configuration of the subordinate unit is reset in this way because when a subordinate unit configuration gets out of sync with the primary unit configuration there is no efficient way to determine what the configuration differences are and to correct them. Resetting the subordinate unit configuration becomes the most efficient way to resynchronize the subordinate unit.

Synchronization requires that all cluster units run the same FortiOS firmware build. If some cluster units are running different firmware builds, then unstable cluster operation may occur and the cluster units may not be able to synchronize correctly.

note icon Re-installing the firmware build running on the primary unit forces the primary unit to upgrade all cluster units to the same firmware build.

Console messages when configuration synchronization succeeds

When a cluster first forms, or when a new unit is added to a cluster as a subordinate unit, the following messages appear on the CLI console to indicate that the unit joined the cluster and had its configuring synchronized with the primary unit.

slave's configuration is not in sync with master's, sequence:0

slave's configuration is not in sync with master's, sequence:1

slave's configuration is not in sync with master's, sequence:2

slave's configuration is not in sync with master's, sequence:3

slave's configuration is not in sync with master's, sequence:4

slave starts to sync with master

logout all admin users

slave succeeded to sync with master

Console messages when configuration synchronization fails

If you connect to the console of a subordinate unit that is out of synchronization with the primary unit, messages similar to the following are displayed.

slave is not in sync with master, sequence:0. (type 0x3)

slave is not in sync with master, sequence:1. (type 0x3)

slave is not in sync with master, sequence:2. (type 0x3)

slave is not in sync with master, sequence:3. (type 0x3)

slave is not in sync with master, sequence:4. (type 0x3)

global compared not matched

If synchronization problems occur the console message sequence may be repeated over and over again. The messages all include a type value (in the example type 0x3). The type value can help Fortinet Support diagnose the synchronization problem.

HA out of sync object messages and the configuration objects that they reference
Out of Sync Message Configuration Object
HA_SYNC_SETTING_CONFIGURATION = 0x03 /data/config
HA_SYNC_SETTING_AV = 0x10
HA_SYNC_SETTING_VIR_DB = 0x11 /etc/vir
HA_SYNC_SETTING_SHARED_LIB = 0x12 /data/lib/libav.so
HA_SYNC_SETTING_SCAN_UNIT = 0x13 /bin/scanunitd
HA_SYNC_SETTING_IMAP_PRXY = 0x14 /bin/imapd
HA_SYNC_SETTING_SMTP_PRXY = 0x15 /bin/smtp
HA_SYNC_SETTING_POP3_PRXY = 0x16 /bin/pop3
HA_SYNC_SETTING_HTTP_PRXY = 0x17 /bin/thttp
HA_SYNC_SETTING_FTP_PRXY = 0x18 /bin/ftpd
HA_SYNC_SETTING_FCNI = 0x19 /etc/fcni.dat
HA_SYNC_SETTING_FDNI = 0x1a /etc/fdnservers.dat
HA_SYNC_SETTING_FSCI = 0x1b /etc/sci.dat
HA_SYNC_SETTING_FSAE = 0x1c /etc/fsae_adgrp.cache
HA_SYNC_SETTING_IDS = 0x20 /etc/ids.rules
HA_SYNC_SETTING_IDSUSER_RULES = 0x21 /etc/idsuser.rules
HA_SYNC_SETTING_IDSCUSTOM = 0x22
HA_SYNC_SETTING_IDS_MONITOR = 0x23 /bin/ipsmonitor
HA_SYNC_SETTING_IDS_SENSOR = 0x24 /bin/ipsengine
HA_SYNC_SETTING_NIDS_LIB = 0x25 /data/lib/libips.so
HA_SYNC_SETTING_WEBLISTS = 0x30
HA_SYNC_SETTING_CONTENTFILTER = 0x31 /data/cmdb/webfilter.bword
HA_SYNC_SETTING_URLFILTER = 0x32 /data/cmdb/webfilter.urlfilter
HA_SYNC_SETTING_FTGD_OVRD = 0x33 /data/cmdb/webfilter.fgtd-ovrd
HA_SYNC_SETTING_FTGD_LRATING = 0x34 /data/cmdb/webfilter.fgtd-ovrd
HA_SYNC_SETTING_EMAILLISTS = 0x40
HA_SYNC_SETTING_EMAILCONTENT = 0x41 /data/cmdb/spamfilter.bword
HA_SYNC_SETTING_EMAILBWLIST = 0x42 /data/cmdb/spamfilter.emailbwl
HA_SYNC_SETTING_IPBWL = 0x43 /data/cmdb/spamfilter.ipbwl
HA_SYNC_SETTING_MHEADER = 0x44 /data/cmdb/spamfilter.mheader
HA_SYNC_SETTING_RBL = 0x45 /data/cmdb/spamfilter.rbl
HA_SYNC_SETTING_CERT_CONF = 0x50 /etc/cert/cert.conf
HA_SYNC_SETTING_CERT_CA = 0x51 /etc/cert/ca
HA_SYNC_SETTING_CERT_LOCAL = 0x52 /etc/cert/local
HA_SYNC_SETTING_CERT_CRL = 0x53 /etc/cert/crl
HA_SYNC_SETTING_DB_VER = 0x55
HA_GET_DETAIL_CSUM = 0x71
HA_SYNC_CC_SIG = 0x75 /etc/cc_sig.dat
HA_SYNC_CC_OP = 0x76 /etc/cc_op
HA_SYNC_CC_MAIN = 0x77 /etc/cc_main
HA_SYNC_FTGD_CAT_LIST = 0x7a /migadmin/webfilter/ublock/ftgd/ data/

Comparing checksums of cluster units

You can use the diagnose sys ha checksum show command to compare the configuration checksums of all cluster units. The output of this command shows checksums labeled global and all as well as checksums for each of the VDOMs including the root VDOM. The get system ha-nonsync-csum command can be used to display similar information; however, this command is intended to be used by FortiManager.

The primary unit and subordinate unit checksums should be the same. If they are not you can use the execute ha synchronize start command to force a synchronization.

The following command output is for the primary unit of a cluster that does not have multiple VDOMs enabled:

diagnose sys ha checksum show

is_manage_master()=1, is_root_master()=1

debugzone

global: a0 7f a7 ff ac 00 d5 b6 82 37 cc 13 3e 0b 9b 77

root: 43 72 47 68 7b da 81 17 c8 f5 10 dd fd 6b e9 57

all: c5 90 ed 22 24 3e 96 06 44 35 b6 63 7c 84 88 d5

checksum

global: a0 7f a7 ff ac 00 d5 b6 82 37 cc 13 3e 0b 9b 77

root: 43 72 47 68 7b da 81 17 c8 f5 10 dd fd 6b e9 57

all: c5 90 ed 22 24 3e 96 06 44 35 b6 63 7c 84 88 d5

The following command output is for a subordinate unit of the same cluster:

diagnose sys ha checksum show

is_manage_master()=0, is_root_master()=0

debugzone

global: a0 7f a7 ff ac 00 d5 b6 82 37 cc 13 3e 0b 9b 77

root: 43 72 47 68 7b da 81 17 c8 f5 10 dd fd 6b e9 57

all: c5 90 ed 22 24 3e 96 06 44 35 b6 63 7c 84 88 d5

checksum

global: a0 7f a7 ff ac 00 d5 b6 82 37 cc 13 3e 0b 9b 77

root: 43 72 47 68 7b da 81 17 c8 f5 10 dd fd 6b e9 57

all: c5 90 ed 22 24 3e 96 06 44 35 b6 63 7c 84 88 d5
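Comparing outputs like the two above by hand is error-prone. Instead, you can capture each unit's output to a file and diff the checksum lines. The following shell sketch assumes hypothetical capture files named primary.txt and subordinate.txt; the sample data is taken from the matching outputs above.

```shell
# Sketch: compare captured "diagnose sys ha checksum show" output from two
# cluster units. primary.txt and subordinate.txt stand in for output you
# copied from each unit's console session.
cat > primary.txt <<'EOF'
global: a0 7f a7 ff ac 00 d5 b6 82 37 cc 13 3e 0b 9b 77
root: 43 72 47 68 7b da 81 17 c8 f5 10 dd fd 6b e9 57
all: c5 90 ed 22 24 3e 96 06 44 35 b6 63 7c 84 88 d5
EOF
cat > subordinate.txt <<'EOF'
global: a0 7f a7 ff ac 00 d5 b6 82 37 cc 13 3e 0b 9b 77
root: 43 72 47 68 7b da 81 17 c8 f5 10 dd fd 6b e9 57
all: c5 90 ed 22 24 3e 96 06 44 35 b6 63 7c 84 88 d5
EOF

# The is_manage_master() header legitimately differs between units, so
# compare only the lines that carry a checksum.
grep ':' primary.txt > primary.filtered
grep ':' subordinate.txt > subordinate.filtered
if diff primary.filtered subordinate.filtered > /dev/null; then
  status="in sync"
else
  status="out of sync"
fi
echo "$status"
```

If the units are out of sync, running diff without the redirect prints the differing lines, pointing directly at the VDOM whose configuration needs attention.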

The following example shows the output of this command for the primary unit of a cluster with multiple VDOMs enabled. Two VDOMs have been added, named test and Eng_vdm.

From the primary unit:

config global

diagnose sys ha checksum show

is_manage_master()=1, is_root_master()=1

debugzone

global: 65 75 88 97 2d 58 1b bf 38 d3 3d 52 5b 0e 30 a9

test: a5 16 34 8c 7a 46 d6 a4 1e 1f c8 64 ec 1b 53 fe

root: 3c 12 45 98 69 f2 d8 08 24 cf 02 ea 71 57 a7 01

Eng_vdm: 64 51 7c 58 97 79 b1 b3 b3 ed 5c ec cd 07 74 09

all: 30 68 77 82 a1 5d 13 99 d1 42 a3 2f 9f b9 15 53

checksum

global: 65 75 88 97 2d 58 1b bf 38 d3 3d 52 5b 0e 30 a9

test: a5 16 34 8c 7a 46 d6 a4 1e 1f c8 64 ec 1b 53 fe

root: 3c 12 45 98 69 f2 d8 08 24 cf 02 ea 71 57 a7 01

Eng_vdm: 64 51 7c 58 97 79 b1 b3 b3 ed 5c ec cd 07 74 09

all: 30 68 77 82 a1 5d 13 99 d1 42 a3 2f 9f b9 15 53

From the subordinate unit:

config global

diagnose sys ha checksum show

is_manage_master()=0, is_root_master()=0

debugzone

global: 65 75 88 97 2d 58 1b bf 38 d3 3d 52 5b 0e 30 a9

test: a5 16 34 8c 7a 46 d6 a4 1e 1f c8 64 ec 1b 53 fe

root: 3c 12 45 98 69 f2 d8 08 24 cf 02 ea 71 57 a7 01

Eng_vdm: 64 51 7c 58 97 79 b1 b3 b3 ed 5c ec cd 07 74 09

all: 30 68 77 82 a1 5d 13 99 d1 42 a3 2f 9f b9 15 53

checksum

global: 65 75 88 97 2d 58 1b bf 38 d3 3d 52 5b 0e 30 a9

test: a5 16 34 8c 7a 46 d6 a4 1e 1f c8 64 ec 1b 53 fe

root: 3c 12 45 98 69 f2 d8 08 24 cf 02 ea 71 57 a7 01

Eng_vdm: 64 51 7c 58 97 79 b1 b3 b3 ed 5c ec cd 07 74 09

all: 30 68 77 82 a1 5d 13 99 d1 42 a3 2f 9f b9 15 53

How to diagnose HA out of sync messages

This section describes how to use the diagnose sys ha checksum show and diagnose debug commands to diagnose the cause of HA out of sync messages.

If HA synchronization is not successful, use the following procedures on each cluster unit to find the cause.

To determine why HA synchronization does not occur
  1. Connect to each cluster unit's CLI through the console port.
  2. Enter the following commands to enable debugging and display HA out of sync messages.

    diagnose debug enable

    diagnose debug console timestamp enable

    diagnose debug application hatalk -1

    diagnose debug application hasync -1

    Collect the console output and compare the out of sync messages with the information in the table HA out of sync object messages and the configuration objects that they reference.

  3. Enter the following commands to turn off debugging.

    diagnose debug disable

    diagnose debug reset

To determine what part of the configuration is causing the problem

If the previous procedure displays messages that include sync object 0x03 (HA_SYNC_SETTING_CONFIGURATION = 0x03), there is a synchronization problem with the configuration. Use the following steps to determine the part of the configuration that is causing the problem.

If your cluster consists of two cluster units, use this procedure to capture the configuration checksums for each unit. If your cluster consists of more than two cluster units, repeat this procedure for all cluster units that returned 0x03 sync object messages.

  1. Connect to each cluster unit's CLI through the console port.
  2. Enter the following command to turn on terminal capture:

    diagnose debug enable

  3. Enter the following command to stop HA synchronization.

    execute ha sync stop

  4. Enter the following command to display configuration checksums.

    diagnose sys ha checksum show global

  5. Copy the output to a text file.
  6. Repeat for all affected units.
  7. Compare the text file from the primary unit with the text file from each cluster unit to find the checksums that do not match.

    You can use a diff function to compare text files.

  8. Repeat for the root VDOM:

    diagnose sys ha checksum show root

  9. Repeat for all VDOMS (if multiple VDOM configuration is enabled):

    diagnose sys ha checksum show <vdom-name>

  10. You can also use the grep option to display only the checksums for parts of the configuration.

    For example, to display system-related configuration checksums in the root VDOM or log-related checksums in the global configuration:

    diagnose sys ha checksum show root | grep system

    diagnose sys ha checksum show global | grep log

    Generally it is the first non-matching checksum that is the cause of the synchronization problem.

  11. Attempt to remove/change the part of the configuration that is causing the problem. You can do this by making configuration changes from the primary unit or subordinate unit CLI.
  12. Enter the following commands to restart HA synchronization and stop debugging:

    execute ha sync start

    diagnose debug disable

    diagnose debug reset
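The grep filtering in step 10 can be tried offline against a captured listing. In this shell sketch, checksums.txt and its contents are illustrative stand-ins; the exact object names in real diagnose sys ha checksum show output vary by configuration and release.

```shell
# Sketch: filter a captured per-object checksum listing for one subsystem.
# checksums.txt and its contents are hypothetical stand-ins for real
# "diagnose sys ha checksum show root" output.
cat > checksums.txt <<'EOF'
system.global: 11 22 33 44
system.interface: 55 66 77 88
log.syslogd.setting: 99 aa bb cc
EOF

# Keep only the system-related entries, mirroring "| grep system".
system_lines=$(grep '^system' checksums.txt)
echo "$system_lines"
```

Running the same filter on captures from each unit, then diffing the results, narrows the mismatch down to a single configuration object.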

Recalculating the checksums to resolve out of sync messages

Sometimes an error can occur when the cluster calculates checksums. As a result of this calculation error, the CLI console could display out of sync error messages even though the cluster is otherwise operating normally. You can also sometimes spot checksum calculation errors in diagnose sys ha checksum command output: the checksums listed in the debugzone section don't match those in the checksum section of the output.

One solution to this problem could be to re-calculate the checksums. The re-calculated checksums should match and the out of sync error messages should stop appearing.

You can use the following command to re-calculate HA checksums:

diagnose sys ha checksum recalculate [<vdom-name> | global]

Just entering the command without options recalculates all checksums. You can specify a VDOM name to just recalculate the checksums for that VDOM. You can also enter global to recalculate the global checksum.

Determining what is causing a configuration synchronization problem

There are twenty-five FortiOS modules that have their configurations synchronized. It can be difficult to find the cause of a synchronization problem with so much data to analyze. You can use the following diagnose commands to more easily find modules that may be causing synchronization problems.

diagnose sys ha hasync-stats {all | most-recent [<seconds>] | by-object [<number>]}

all displays the synchronization activity for all modules since the hasync process started running (usually, since the cluster started up).

most-recent [<seconds>] displays the most recently occurring synchronization events. You can include a time in seconds to display recent events that occurred during the time interval. If you don't include the number of seconds, the command displays the most recent events in the last 5 seconds. This option can be used to determine the module or modules that are currently synchronizing or attempting to synchronize. If no modules are currently synchronizing, the command just displays the most recent synchronization events.

by-object [<number>] displays the synchronization activity of a specific module, where <number> is the module number in the range 1 to 25. To display a list of all 25 modules and their numbers enter:

diagnose sys ha hasync-stats by-object ?

To display the most recent activity, enter:

diagnose sys ha hasync-stats most-recent 10
current-time/jiffies=2018-03-28 13:01:42/1148242:

hasync-obj=2(arp):
    epoll_handler=1(ev_arp_handler): start=1522256500.354400(2018-03-28 13:01:40), end=1522256500.354406(2018-03-28 13:01:40), total=0.000006/1699
hasync-obj=5(config):
    timer=0(check_sync_status), add=1141764(2018-03-28 13:01:26), expire=1142764(2018-03-28 13:01:36), end=1142764(2018-03-28 13:01:36), del=0(), total_call=1143
hasync-obj=8(time):
    obj_handler=0(packet): start=1522256497.851550(2018-03-28 13:01:37), end=1522256497.851570(2018-03-28 13:01:37), total=0.000020/381
    timer=0(sync_timer), add=1140106(2018-03-28 13:01:10), expire=1143106(2018-03-28 13:01:40), end=1143106(2018-03-28 13:01:40), del=0(), total_call=381
hasync-obj=21(hastats):
    obj_handler=0(packet): start=1522256499.760934(2018-03-28 13:01:39), end=1522256499.760936(2018-03-28 13:01:39), total=0.000002/2285
    timer=0(hastats_timer), add=1142556(2018-03-28 13:01:34), expire=1143056(2018-03-28 13:01:39), end=1143056(2018-03-28 13:01:39), del=0(), total_call=2286

The last few lines of this output show activity for the hastats module, which is module 21. You can use the following command to see more information about synchronization activity for this module:

diagnose sys ha hasync-stats by-object 21
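When you need a by-object number, you can pull the module numbers and names out of captured hasync-stats output rather than reading them by eye. This shell sketch uses a hypothetical capture file named hasync.txt, seeded with lines matching the sample output above.

```shell
# Sketch: list the module number and name of every hasync-obj entry in
# captured "diagnose sys ha hasync-stats" output. hasync.txt is a
# hypothetical capture file; the sample lines mirror the output above.
cat > hasync.txt <<'EOF'
hasync-obj=2(arp):
hasync-obj=5(config):
hasync-obj=8(time):
hasync-obj=21(hastats):
EOF

# Split each hasync-obj line on '=', '(' and ')' to isolate the module
# number and its name.
modules=$(awk -F'[=()]' '/^hasync-obj=/ { print $2, $3 }' hasync.txt)
echo "$modules"
```

For the sample above this lists four modules (2 arp, 5 config, 8 time, 21 hastats); the number in the first column is what you pass to diagnose sys ha hasync-stats by-object.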