Using high availability (HA)

Example: Failover scenarios

Configuration settings that are not synchronized

All configuration settings on the primary unit are synchronized to the secondary unit, except the following:

HA settings not synchronized

Operation mode

You must set the operation mode (gateway, transparent, or server) of each HA group member before configuring HA.

Host name

The host name distinguishes members of the cluster. For details, see Host name.

Static route

Static routes are not synchronized because the HA units may be in different networks (see Configuring static routes).

Interface configuration

(gateway and server mode only)

Each FortiMail unit in the HA group must be configured with different network interface settings for connectivity purposes. For details, see Configuring the network interfaces.

Exceptions include some active-passive HA settings which affect the interface configuration for failover purposes. These settings are synchronized. For details, see Virtual IP Address.

Management IP address

(transparent mode only)

Each FortiMail unit in the HA group should be configured with different management IP addresses for connectivity purposes. For details, see About the management IP.

SNMP system information

Each FortiMail unit in the HA group will have its own SNMP system information, including the Description, Location, and Contact. For details, see Configuring the network interfaces.

RAID configuration

RAID settings are hardware-dependent and determined at boot time by looking at the drives (for software RAID) or the controller (hardware RAID), and are not stored in the system configuration. Therefore, they are not synchronized.

Main HA configuration

The main HA configuration, which includes the HA mode of operation (such as primary or secondary), is not synchronized because this configuration must be different on the primary and secondary units. For details, see Configuring the HA mode and group.

HA Daemon configuration

The following HA daemon settings are not synchronized:

Shared password
Backup mail data directories
Backup MTA queue directories

You must add the shared HA password to each unit in the HA group. All units in the HA group must use the same shared password to identify the group.

Since the mail data and MTA queue backup settings are not synchronized, to use this feature, you must enable it on both the primary and secondary units.

In active-passive HA, the synchronized HA daemon options determine how often the secondary unit tests the primary unit and how the secondary unit synchronizes configuration and mail data. Because the HA daemon settings on the secondary unit control how the HA daemon operates, in a functioning HA group you change the HA daemon configuration on the secondary unit to change that behavior. The HA daemon settings on the primary unit do not affect the operation of the HA daemon.

HA service monitoring configuration

In active-passive HA, the HA service monitoring configuration is not synchronized. The remote service monitoring configuration on the secondary unit controls how the secondary unit checks the operation of the primary unit. The local services configuration on the primary unit controls how the primary unit tests its own operation. For details, see Configuring service-based failover.

Note: You might want to have a different service monitoring configuration on the primary and secondary units. For example, after a failover you may not want service monitoring to operate until you have fixed the problems that caused the failover and have restarted normal operation of the HA group.

Product name and icon

The product names and icons under System > Customization > Appearance are not synchronized. All other appearance settings are synchronized.

Config-only HA

In config-only HA, the following settings are not synchronized:

the local domain name
default certificate
iSCSI initiator name
iSCSI ID for remote storage
SNMP settings
IP pools (see Configuring IP pools)
the quarantine report host name (see Web release host name/IP)
IBE settings of base URL, Help content URL, and About content URL
Centralized quarantine client IP address
Centralized IBE client IP address
Starting with the 5.4.0 release, all system-level, domain-level, and user-level block/safe lists are synchronized. Before the 5.4.0 release, user-level block/safe lists were not synchronized, but system-level and domain-level block/safe lists were. Before the 5.0.2 release, domain-level block/safe lists were not automatically synchronized either.

Note that user data is synchronized at predefined time intervals, not in real time.
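The version rules for block/safe lists above can be summarized in a short Python sketch. This is only an illustration of the text, not FortiMail code: the function name and the tuple representation of the release number are assumptions, and the reading that system-level lists are synchronized in releases before 5.0.2 is inferred from the wording above.

```python
from typing import Set, Tuple


def synced_block_safe_list_levels(version: Tuple[int, int, int]) -> Set[str]:
    """Return the block/safe list levels that config-only HA synchronizes.

    `version` is a hypothetical (major, minor, patch) tuple for the release.
    """
    if version >= (5, 4, 0):
        # 5.4.0 and later: system, domain, and user levels are all synchronized.
        return {"system", "domain", "user"}
    if version >= (5, 0, 2):
        # 5.0.2 up to (not including) 5.4.0: user-level lists are not synchronized.
        return {"system", "domain"}
    # Before 5.0.2: domain-level lists are not automatically synchronized either.
    return {"system"}


print(sorted(synced_block_safe_list_levels((5, 3, 9))))
```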


Synchronization of MTA queue directories after a failover

During normal operation, email messages are in one of three states:

being received or sent by the primary unit
waiting to be delivered in the mail queue
stored on the primary unit’s mail data directories (email quarantines, email archives, and email inboxes of server mode)

When normal operation of an active-passive HA group is interrupted and a failover occurs, sending and receiving are interrupted. The delivery attempt fails, and the sender usually retries sending the email message. However, stored messages remain in the primary unit’s mail data directories.

You usually should configure HA to synchronize the stored mail data to prevent loss of email messages, but you usually will not want to regularly synchronize the mail queue. This is because, to prevent loss of email messages in the failed primary unit, FortiMail units in active-passive HA use the following failover mechanism:

If the failed primary unit’s effective HA operating mode is failed, a sequence similar to the following occurs automatically when the problem that caused the failure is corrected.

The secondary unit detects the failure of the primary unit, and becomes the new primary unit.
The former primary unit restarts, detects the new primary unit, and becomes a secondary unit.

You may have to manually restart the failed primary unit.

The former primary unit pushes its mail queue to the new primary unit.

This synchronization occurs through the heartbeat link between the primary and secondary units, and prevents duplicate email messages from forming in the primary unit’s mail queue.

The new primary unit delivers email in its mail queues, including email messages synchronized from the new secondary unit.

As a result, as long as the failed primary unit can restart, no email is lost from the mail queue.
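The failover sequence described above can be pictured with a small Python simulation. It is a conceptual sketch only: the `Unit` class, the function names, and the idea of skipping duplicates by message ID are assumptions made for illustration and do not describe how FortiMail implements queue synchronization.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Unit:
    name: str
    effective_mode: str  # "primary", "secondary", or "failed"
    queue: List[str] = field(default_factory=list)


def fail_over(old_primary: Unit, old_secondary: Unit) -> None:
    """The secondary unit detects the failure and becomes the new primary."""
    old_primary.effective_mode = "failed"
    old_secondary.effective_mode = "primary"


def recover(former_primary: Unit, new_primary: Unit) -> None:
    """The former primary restarts as a secondary and pushes its mail queue."""
    former_primary.effective_mode = "secondary"
    for message_id in former_primary.queue:
        # Assumed duplicate check, mirroring "prevents duplicate email messages".
        if message_id not in new_primary.queue:
            new_primary.queue.append(message_id)
    former_primary.queue.clear()


a = Unit("unit-a", "primary", ["msg-1", "msg-2"])
b = Unit("unit-b", "secondary")
fail_over(a, b)
b.queue.append("msg-3")  # mail accepted by the new primary during the outage
recover(a, b)
print(b.effective_mode, b.queue)  # primary ['msg-3', 'msg-1', 'msg-2']
```

In this model no queued message is lost as long as `recover` can run, which corresponds to the failed primary unit being able to restart.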

Even if you choose to synchronize the mail queue, because its contents change very rapidly and synchronization is periodic, there is a chance that some email in these directories will not be synchronized at the exact moment a failover occurs.

See also

Getting HA information using SNMP

About logging, alert email and SNMP in HA

To configure logging and alert email, configure the primary unit and enable HA events. When the configuration changes are synchronized to the secondary units, all FortiMail units in the HA group record their own separate log messages and send separate alert email messages. Log data is not synchronized. For details on configuring logging and viewing log messages, see Logs, reports and alerts.

To distinguish alert email from each member of the HA cluster, configure a different host name for each member. For details, see Host name.

To use SNMP, configure each cluster member separately and enable HA events for the community. If you enable SNMP for all units, they can all send SNMP traps. Additionally, you can use an SNMP server to monitor the primary and secondary units for HA settings, such as the HA configured and effective mode of operation. For details on SNMP, see Configuring the network interfaces.

To aid in quick discovery and diagnosis of network problems, consider configuring SNMP, Syslog, and/or alert email to monitor the HA cluster for failover messages.

See also

Getting HA information using SNMP

You can use an SNMP manager to get information about how FortiMail HA is operating. The FortiMail MIB (fortimail.mib) and the FortiMail trap MIB (fortimail.trap.mib) include the HA fields listed below.

FortiMail MIB fields

MIB Field	Description
	fortimail.mib
fmlHAEventId	Provides the ID of the most recent HA event.
fmlHAUnitIp	Provides the IP address of the port1 interface of the FortiMail unit on which an HA event occurred.
fmlHAEventReason	Provides the description of the reason for the HA event.
fmlHAMode	Provides the HA configured mode of operation that you configured the FortiMail unit to operate in (either as primary or secondary).
fmlHAEffectiveMode	Provides the effective HA mode of operation (applies to active-passive HA only), either as the primary unit or as the secondary unit. The effective HA mode of operation matches the configured mode of operation unless a failure has occurred.
	fortimail.trap.mib
fmlTrapHAEvent	Provides the FortiMail HA trap that is sent when an HA event occurs. This trap includes the contents of the `fmlSysSerial`, `fmlHAEventId`, `fmlHAUnitIp`, and `fmlHAEventReason` MIB fields.
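As a rough illustration of how a monitoring script might use these fields, the Python sketch below compares `fmlHAMode` with `fmlHAEffectiveMode` after an SNMP manager has already retrieved them. The dictionary layout, the function name, and the string values "primary" and "secondary" are assumptions; the MIB's actual value encoding is not described here, so consult the MIB files before relying on it.

```python
from typing import Dict


def describe_ha_state(fields: Dict[str, str]) -> str:
    """Summarize HA state from already-retrieved MIB field values.

    `fields` maps MIB field names to string values (an assumed representation).
    """
    configured = fields["fmlHAMode"]
    effective = fields.get("fmlHAEffectiveMode")
    if effective is None:
        # The effective mode applies to active-passive HA only.
        return f"configured={configured}; no effective mode reported"
    if configured == effective:
        return f"normal: operating as {configured}"
    return f"attention: configured as {configured} but operating as {effective}"


print(describe_ha_state({"fmlHAMode": "primary", "fmlHAEffectiveMode": "secondary"}))
```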

How to use HA

In general, to enable and configure HA, you should perform the following:

If the HA cluster will use FortiGuard Antivirus and/or FortiGuard Antispam services, license all FortiMail units in the HA group for the FortiGuard Antispam and FortiGuard Antivirus services, and register them with the Fortinet Technical Support web site, https://support.fortinet.com/.
Physically connect the FortiMail units that will be members of the HA cluster.

You must connect at least one of their network interfaces for heartbeat and synchronization traffic between members of the cluster. For reliability reasons, Fortinet recommends that you connect both a primary and a secondary heartbeat interface, and that they be connected directly or through a dedicated switch that is not connected to your overall network.

For config-only clusters, configure each member of the cluster to store mail data on a NAS server that supports NFS connections (active-passive groups may also use a NAS server, but do not require it). For details, see Selecting the mail data storage location.
On each member of the cluster:

Enable the HA mode that you want to use (either active-passive or config-only) and select whether the individual member will act as a primary unit or secondary unit within the cluster. For information about the differences between the HA modes, see About high availability.
Configure the local IP addresses of the primary and secondary heartbeat and synchronization network interfaces.
For active-passive clusters, configure the behavior on failover, and how the network interfaces should be configured for whichever FortiMail unit is currently acting as the primary unit. Additionally, if the FortiMail units store mail data on a NAS, disable mail data synchronization between members.
For config-only clusters, if the FortiMail unit is a primary unit, configure the IP addresses of its secondary units; if the FortiMail unit is a secondary unit, configure the IP address of its primary unit.

For details, see Configuring the HA mode and group.

If the HA cluster is active-passive and you want to trigger failover when hardware or a service fails, even if the heartbeat connection is still functioning, configure service monitoring. For details, see Configuring service-based failover.

Monitor the status of each cluster member. For details, see Monitoring the HA status. To monitor HA events through log messages and/or alert email, you must first enable logging of HA activity events. For details, see Logs, reports and alerts.
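Before working through these steps, it can help to sanity-check the planned settings. The Python sketch below is a hypothetical planning aid, not a FortiMail tool: the `PlannedUnit` class and `check_plan` function are invented for illustration, and the rules it checks (distinct host names, one shared password, a valid combination of HA modes) are a reading of the requirements stated in this chapter.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlannedUnit:
    host_name: str
    ha_mode: str  # "primary", "secondary", "config-primary", or "config-secondary"
    shared_password: str


def check_plan(units: List[PlannedUnit]) -> List[str]:
    """Return a list of problems found in a planned HA group (empty if none)."""
    problems: List[str] = []
    names = [u.host_name for u in units]
    if len(set(names)) != len(names):
        problems.append("host names must differ between members")
    if len({u.shared_password for u in units}) > 1:
        problems.append("all members must use the same shared password")
    modes = sorted(u.ha_mode for u in units)
    active_passive = modes == ["primary", "secondary"]
    config_only = (
        len(modes) >= 2
        and modes.count("config-primary") == 1
        and modes.count("config-secondary") == len(modes) - 1
    )
    if not (active_passive or config_only):
        problems.append("HA modes do not form an active-passive or config-only group")
    return problems


plan = [
    PlannedUnit("fml-a", "primary", "example-password"),
    PlannedUnit("fml-b", "secondary", "example-password"),
]
print(check_plan(plan))  # []
```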

Monitoring the HA status

The Status tab in the High Availability submenu shows the configured HA mode of operation of a FortiMail unit in an HA group. You can also manually initiate synchronization and reset the HA mode of operation. A reset may be required if a FortiMail unit’s effective HA mode of operation differs from its configured HA mode of operation, such as after a failover when a configured primary unit is currently acting as a secondary unit.

For FortiMail units operating as secondary units, the Status tab also lets you view the status and schedule of the HA synchronization daemon.

Appearance of the Status tab varies by:

whether the HA group is active-passive or config-only
whether the FortiMail unit is configured as a primary unit or secondary unit
whether a failover has occurred (active-passive only)

If HA is disabled, this tab displays:

HA mode is currently disabled

Before you can use the Status tab, you must first enable and configure HA. For details, see Configuring the HA mode and group.

To view the HA mode of operation status, go to System > High Availability > Status.

Viewing HA status

GUI item	Description
Configured Operating Mode	Displays the HA operating mode that you configured, either: Primary: Configured to be the primary unit of an active-passive group. Secondary: Configured to be the secondary unit of an active-passive group. Config-primary: Configured to be the primary unit of a config-only group. Config-secondary: Configured to be a secondary unit of a config-only group. For information on configuring the HA operating mode, see HA mode. After a failure, the FortiMail unit may not be acting in its configured HA operating mode. For details, see Using high availability (HA).
Effective Operating Mode	Displays the mode that the unit is currently operating in, either: Primary: Acting as primary unit. Secondary: Acting as secondary unit. Off: For primary units, this indicates that service/interface monitoring has detected a failure and has taken the primary unit offline, triggering failover. For secondary units, this indicates that synchronization has failed once; a subsequent failure will trigger failover. For details, see On failure and Using high availability (HA). Failed: Service/network interface monitoring has detected a failure and the diagnostic connection is currently determining whether the problem has been corrected or failover is required. For details, see On failure. The configured HA operating mode matches the effective operating mode unless a failure has occurred. For example, after a failover, a FortiMail unit configured to operate as a secondary unit could be acting as a primary unit. For explanations of combinations of configured and effective HA modes of operation, see Monitoring the HA status. For information on restoring the FortiMail unit to an effective HA operating mode that matches the configured operating mode, see Using high availability (HA). This option appears only if the FortiMail unit is a member of an active-passive HA group.
Detail Status	This table is viewable, when HA is configured, by all HA units (primary/secondary, and config-primary/config-secondary): IP: IP address of HA cluster members. SN: Serial number of HA cluster member. Secondary: Displays the configuration synchronization status of the secondary/config-secondary unit. Primary: Displays the configuration synchronization status of the primary/config-primary unit. Status: Displays whether or not the HA cluster is synchronized. Last Seen: Displays the last time the primary unit’s HA daemon checked to make sure that the secondary unit is operating correctly. Monitoring occurs through the heartbeat link between the primary and secondary units. For details, see HA base port.
Action	Displays the actions you can take, depending on the context: Start configuration sync: Click to manually initiate synchronization of the configurations. For information on items that are not synchronized, see Configuration settings that are not synchronized. Switch to secondary/primary mode: Option depends on HA unit's role; click to manually switch the effective HA operating mode of the primary unit so that it becomes a secondary unit, or vice versa. Restart the HA system: Click to restart HA processes after they have been halted due to detection of a failure by service monitoring. For details, see On failure, Configuring service-based failover, and Restarting the HA processes on a stopped primary unit. This option appears only if the FortiMail unit is configured to operate as the primary unit, but its effective HA operating mode is off.

Combinations of configured and effective HA modes of operation

Configured operating mode	Effective operating mode	Description
Primary	Primary	Normal for the primary unit of an active-passive HA group.
Secondary	Secondary	Normal for the secondary unit of an active-passive HA group.
Primary	Off	The primary unit has experienced a failure, or the FortiMail unit is in the process of switching to operating in HA mode. HA processes and email processing are stopped.
Secondary	Off	The secondary unit has detected a failure, or the FortiMail unit is in the process of switching to operating in HA mode. After the secondary unit starts up and connects with the primary unit to form an HA group, the first configuration synchronization may fail in special circumstances. To prevent both the secondary and primary units from simultaneously acting as primary units, the effective HA mode of operation becomes off. If subsequent synchronization fails, the secondary unit’s effective HA mode of operation becomes primary.
Primary	Failed	The remote service monitoring or local network interface monitoring on the primary unit has detected a failure, and will attempt to connect to the other FortiMail unit. If the problem that caused the failure has been corrected, the effective HA mode of operation switches from failed to secondary, or to match the configured HA mode of operation, depending on the On failure setting. Additionally, if the HA group is operating in transparent mode, and if the effective HA mode of operation changes to failed, the network interface IP/netmask on the secondary unit displays bridging (waiting for recovery). For details, see Configuring the network interfaces.
Primary	Secondary	The primary unit has experienced a failure but then returned to operation. When the failure occurred, the unit configured to be the secondary unit became the primary unit. When the unit configured to be the primary unit restarted, it detected the new primary unit and so switched to operating as the secondary unit.
Secondary	Primary	The secondary unit has detected that the FortiMail unit configured to be the primary unit failed. When the failure occurred, the unit configured to be the secondary unit became the primary unit.
Config primary	N/A	Normal for the primary unit of a config-only HA group.
Config secondary	N/A	Normal for the secondary unit of a config-only HA group.
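The combinations in the table above lend themselves to a simple lookup, for example in a monitoring script. The Python sketch below is an illustration only: the dictionary, the function name, and the shortened descriptions are paraphrases introduced here, and `None` stands in for the N/A effective mode of config-only groups.

```python
from typing import Dict, Optional, Tuple

# Keys are (configured mode, effective mode); None represents "N/A".
HA_STATE_SUMMARY: Dict[Tuple[str, Optional[str]], str] = {
    ("primary", "primary"): "normal primary unit",
    ("secondary", "secondary"): "normal secondary unit",
    ("primary", "off"): "primary failed or is switching into HA mode; email processing stopped",
    ("secondary", "off"): "secondary detected a failure or is switching into HA mode",
    ("primary", "failed"): "monitoring detected a failure; waiting for recovery",
    ("primary", "secondary"): "failover occurred; configured primary rejoined as secondary",
    ("secondary", "primary"): "failover occurred; configured secondary is acting as primary",
    ("config-primary", None): "normal config-only primary",
    ("config-secondary", None): "normal config-only secondary",
}


def summarize(configured: str, effective: Optional[str] = None) -> str:
    """Look up a short description for a configured/effective mode pair."""
    key = (
        configured.lower().replace(" ", "-"),
        effective.lower() if effective else None,
    )
    return HA_STATE_SUMMARY.get(key, "combination not listed in the table")


print(summarize("Primary", "Secondary"))
```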

About logging, alert email and SNMP in HA

Storing mail data on a NAS server

Configuring the HA mode and group

Example: Failover scenarios

Restarting the HA processes on a stopped primary unit

If you configured service monitoring on an active-passive HA group (see Configuring service-based failover) and either the primary unit or the secondary unit detects a service failure on the primary unit, the primary unit changes its effective HA mode of operation to off, stops processing email, and halts all of its HA processes.

After resolving the problem that caused the failure, use the following steps to restart the HA processes on the stopped primary unit. Resolving the problem could be as simple as reconnecting a cable, for example to the port2 network interface.

To restart a stopped primary unit

Log in to the web-based manager of the primary unit.
Go to System > High Availability > Status.
Under Action, click Restart the HA system.

The primary unit restarts and rejoins the HA group.

If a failover has occurred due to processes being stopped on the primary unit, and the secondary unit is currently acting as the primary unit, you can restore the primary and secondary units to acting in their configured roles. For details, see Using high availability (HA).

See also

Monitoring the HA status

Configuring the HA mode and group

The Configuration tab in the System > High Availability submenu lets you configure the high availability (HA) options, including:

enabling HA
selecting whether the HA group is active-passive or config-only in style
whether this individual FortiMail unit will act as a primary unit or a secondary unit in the cluster
network interfaces that will be used for heartbeat and synchronization
service monitor

For config-only HA, if the FortiMail unit is operating in server mode, you must store mail data externally, on a NAS server. Failure to store mail data externally could result in mailboxes and other data scattered over multiple FortiMail units. For details on configuring NAS, see Storing mail data on a NAS server and Selecting the mail data storage location.

For an explanation of active-passive and config-only, see About high availability.

HA settings, with the exception of Virtual IP Address settings, are not synchronized and must be configured separately on each primary and secondary unit.

You must maintain the physical link between the heartbeat and synchronization network interfaces. These connections enable cluster members to detect the responsiveness of other members, and to synchronize data. If they are interrupted, normal operation will be interrupted and, for active-passive HA groups, a failover will occur. For more information on heartbeat and synchronization, see About the heartbeat and synchronization.

For an active-passive HA group, or a config-only HA group consisting of only two FortiMail units, directly connect the heartbeat network interfaces using a crossover Ethernet cable. For a config-only HA group consisting of more than two FortiMail units, connect the heartbeat network interfaces through a switch, and do not connect this switch to your overall network.

To configure HA options

Go to System > High Availability > Configuration.

The appearance of the sections and the options in them vary greatly with your choice in the Mode of operation drop-down list.

Configure the following sections, as applicable:

Configuring the primary HA options
Configuring the primary configuration IP
Configuring the advanced options
Configuring the secondary system options
Storing mail data on a NAS server
Configuring interface monitoring
Configuring service-based failover

Click Apply.

Configuring the primary HA options

Go to System > High Availability > Configuration and click the arrow to expand the HA configuration section, if needed. The options presented vary greatly depending on your choice in the Mode of operation drop-down list.

HA main options

GUI item	Description
HA mode	Enables or disables HA, selects active-passive or config-only HA, and selects the initial configured role of this FortiMail unit in the HA group. Off: The FortiMail unit is not operating in HA mode. Primary: The FortiMail unit is the primary unit in an active-passive HA group. Secondary: The FortiMail unit is the secondary unit in an active-passive HA group. Config-primary: The FortiMail unit is the primary unit in a config-only HA group. Config-secondary: The FortiMail unit is a secondary unit in a config-only HA group.
On failure	Select one of the following behaviors of the primary unit when it detects a failure, such as on a power failure or from service/interface monitoring. switch off: Do not process email or join the HA group until you manually select the effective operating mode (see Using high availability (HA)). wait for recovery then restore original role: On recovery, the failed primary unit’s effective HA mode of operation resumes its configured primary role. This also means that the secondary unit needs to give back the primary role to the primary unit. This behavior may be useful if the cause of failure is temporary and rare, but may cause problems if the cause of failure is permanent or persistent. wait for recovery then restore secondary role: On recovery, the failed primary unit’s effective HA mode of operation becomes secondary, and the secondary unit continues to assume the primary role. The primary unit then synchronizes the content of its MTA queue directories with the current primary unit. The new primary unit can then deliver email that existed in the former primary unit’s MTA queue at the time of the failover. For information on manually restoring the FortiMail unit to acting in its configured HA mode of operation, see Using high availability (HA). In most cases, you should select the wait for recovery then restore secondary role option. This option appears only if HA mode is primary.
Shared password	Enter an HA password for the HA group. You must configure the same Shared password value on both the primary and secondary units.
Enable centralized monitor	Enable or disable the central statistics service. Once enabled, administrators on the primary HA unit can monitor the state and activity of each HA cluster member, including CPU, memory, and disk usage, email throughput, and other statistic summaries. This feature can also be enabled in the CLI by enabling `central-statistics` under `config system ha`. For more information, see the FortiMail CLI Reference and Centrally monitoring the HA cluster.
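The effect of the three On failure options on the recovered primary unit can be summarized in a few lines of Python. This is a sketch of the table's description only: the function name is hypothetical, and the option strings are copied from the GUI labels above.

```python
def effective_mode_after_recovery(on_failure: str) -> str:
    """Return the failed primary unit's effective HA mode once it recovers."""
    outcomes = {
        # Stays off until an administrator manually selects the effective mode.
        "switch off": "off",
        # The recovered unit takes the primary role back from the secondary unit.
        "wait for recovery then restore original role": "primary",
        # The recovered unit rejoins as secondary; the other unit stays primary.
        "wait for recovery then restore secondary role": "secondary",
    }
    try:
        return outcomes[on_failure]
    except KeyError:
        raise ValueError(f"unknown On failure option: {on_failure!r}") from None


print(effective_mode_after_recovery("wait for recovery then restore secondary role"))
```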

Configuring the primary configuration IP

If you are configuring the unit as the secondary unit in a config-only group, go to System > High Availability > Configuration to configure the primary IP address.

In the Primary IP address field, enter the IP of the primary heartbeat network interface of the primary unit. The secondary unit synchronizes only with this primary unit’s IP address.

Configuring the advanced options

Go to System > High Availability > Configuration to configure the advanced options.

For config-only groups, just the HA base port option appears.

HA advanced options

GUI item	Description
Synchronize mail data directory	Synchronize system quarantine, email archives, email users’ mailboxes (server mode only), preferences, and per-recipient quarantines. Unless the HA cluster stores its mail data on a NAS server, you should configure the HA cluster to synchronize mail directories. If mail data changes frequently, you can manually initiate a data synchronization when significant changes are complete. For details, see Using high availability (HA).
Synchronize MTA queue directory	Synchronize the mail queue of the FortiMail unit. For more information on the mail queue, see Managing the mail queue. Caution: If the primary unit experiences a hardware failure and you cannot restart it, and if this option is disabled, MTA queue directory data could be lost. Note: Enabling this option can affect the FortiMail unit’s performance, because periodic synchronization of the mail queue can be processor and bandwidth-intensive. Additionally, because the content of the MTA queue directories is very dynamic, periodically synchronizing MTA queue directories between FortiMail units may not guarantee against loss of all email in those directories. Even if MTA queue directory synchronization is disabled, after a failover, a separate synchronization mechanism may successfully prevent loss of MTA queue data. For details, see Synchronization of MTA queue directories after a failover.
HA base port	Enter the first of four TCP port numbers that will be used for: the heartbeat signal synchronization control data synchronization configuration synchronization Note: For active-passive groups, in addition or alternatively to configuring the heartbeat, you can configure service monitoring. For details, see Configuring service-based failover. Note: In addition to automatic immediate and periodic configuration synchronization, you can also manually initiate synchronization. For details, see Using high availability (HA).
Heartbeat lost threshold	Enter the total span of time, in seconds, for which the primary unit can be unresponsive before it triggers a failover and the secondary unit assumes the role of the primary unit. The heartbeat will continue to check for availability once per second. To prevent premature failover when the primary unit is simply experiencing very heavy load, configure a total threshold of three (3) seconds or more to allow the secondary unit enough time to confirm unresponsiveness by sending additional heartbeat signals. Note: If the failure detection time is too short, the secondary unit may falsely detect a failure during periods of high load. Caution: If the failure detection time is too long, the primary unit could fail and a delay in detecting the failure could mean that email is delayed or lost. Decrease the failure detection time if email is delayed or lost because of an HA failover.
Remote services as heartbeat	Enable to use remote service monitoring as a secondary HA heartbeat. If enabled and both the primary and secondary heartbeat links fail or become disconnected, if remote service monitoring still detects that the primary unit is available, a failover will not occur. Note: The remote service check is only applicable for temporary heartbeat link failures. If the HA process restarts due to system reboot or HA daemon reboot, physical heartbeat connections will be checked first. If the physical connections are not found, the remote service monitoring no longer takes effect. Note: Using remote services as heartbeat provides HA heartbeat only, not synchronization. To avoid synchronization problems, you should not use remote service monitoring as a heartbeat for extended periods. This feature is intended only as a temporary heartbeat solution that operates until you reestablish a normal primary or secondary heartbeat link.
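The interaction between the Heartbeat lost threshold and Remote services as heartbeat options can be modeled with a short Python sketch. It assumes one heartbeat result per second, as described above; the function and parameter names are hypothetical, and the real HA daemon's decision logic may differ.

```python
from typing import Sequence


def should_fail_over(
    heartbeat_ok: Sequence[bool],
    lost_threshold_s: int,
    remote_service_ok: bool = False,
) -> bool:
    """Decide whether the secondary unit would trigger a failover.

    heartbeat_ok holds one heartbeat result per second, oldest first.
    remote_service_ok models "Remote services as heartbeat" still seeing the primary.
    """
    if lost_threshold_s < 1:
        raise ValueError("threshold must be at least 1 second")
    recent = heartbeat_ok[-lost_threshold_s:]
    # The heartbeat is lost only if every check in the whole window failed.
    heartbeat_lost = len(recent) == lost_threshold_s and not any(recent)
    return heartbeat_lost and not remote_service_ok


print(should_fail_over([True, False, False, False], lost_threshold_s=3))  # True
```

With a threshold of three seconds, a single missed heartbeat under heavy load does not cause a failover, which is the reason the table recommends three seconds or more.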

Configuring the secondary system options

This section appears only when the mode of operation is set to config-primary under System > High Availability > Configuration.

HA peer options

GUI item	Description
IP address	Double-click in order to modify, then enter the IP address of the primary network interface on that secondary unit.
Create	Click to add a secondary unit to the list of Peer systems, then double-click its IP address. The primary unit synchronizes only with secondary units in the list of Peer systems.
Delete	Click the row corresponding to a peer IP address, then click this button to remove that secondary unit from the HA group.

See also

About logging, alert email and SNMP in HA

Storing mail data on a NAS server

Example: Failover scenarios

Storing mail data on a NAS server

For FortiMail units operating in server mode as a config-only HA group, you must store mail data on a NAS server instead of locally. If mail data is stored locally, email users’ messages and other mail data could be scattered across multiple FortiMail units.

Even if your FortiMail units are not operating in server mode with config-only HA, however, storing mail data on a NAS server may have a number of benefits for your organization. For example, backing up your NAS server regularly can help prevent loss of mail data. Also, if your FortiMail unit experiences a temporary failure, you can still access the mail data on the NAS server. When the FortiMail unit restarts, it can usually continue to access and use the mail data stored on the NAS server.

For config-only HA groups using a network attached storage (NAS) server, only the primary unit sends quarantine reports to email users. The primary unit also acts as a proxy between email users and the NAS server when email users use FortiMail webmail to access quarantined email and to configure their own Bayesian filters.

For active-passive HA groups, the primary unit reads and writes all mail data to and from the NAS server in the same way as a standalone unit. If a failover occurs, the new primary unit uses the same NAS server for mail data. The new primary unit can access all mail data that the original primary unit stored on the NAS server. So if you are using a NAS server to store mail data, after a failover, the new primary unit continues operating with no loss of mail data.

If the FortiMail unit is a member of an active-passive HA group, and the HA group stores mail data on a remote NAS server, disable mail data synchronization to prevent duplicate mail data traffic.

For instructions on storing mail data on a NAS server, see Selecting the mail data storage location.
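The storage guidance in this section reduces to a few rules, sketched below in Python. The function name, parameter names, and returned strings are hypothetical; the rules themselves restate this section and the Synchronize mail data directory option, and the final fallback is an assumption for cases the text does not cover.

```python
def mail_data_advice(ha_style: str, operation_mode: str, uses_nas: bool) -> str:
    """Suggest a mail data setting for an HA group.

    ha_style: "active-passive" or "config-only"
    operation_mode: "gateway", "transparent", or "server"
    """
    if ha_style == "config-only" and operation_mode == "server" and not uses_nas:
        # Otherwise mailboxes could be scattered across multiple units.
        return "store mail data on a NAS server"
    if ha_style == "active-passive" and uses_nas:
        # Both units use the same NAS, so synchronizing would duplicate traffic.
        return "disable mail data synchronization"
    if ha_style == "active-passive":
        return "enable mail data synchronization"
    return "no change needed"  # assumed default for cases not covered above


print(mail_data_advice("active-passive", "gateway", uses_nas=True))
```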

See also