Using high availability (HA)
FortiMail units can be configured to act in a high availability (HA) cluster or group of clusters to increase processing capacity and/or availability, so that your overall deployment uptime is preserved even if some individual hardware or software fails. Deployments may require changes to the network topology or DNS records to achieve this, depending on the HA mode.
This section contains the following topics:
- About HA types
- About HA modes
- About HA heartbeat and synchronization
- Configuring HA
- Monitoring HA status
About HA types
Supported FortiMail HA deployment types are:
- Member HA: Multiple FortiMail units work together in one HA pair or cluster.
- Group HA: Multiple HA clusters work together in a group.
For example, if you have one data center to protect, you only need one cluster. However, if you have multiple data centers for redundancy or capacity, then you can join the clusters together to form an HA group.
Each cluster in an HA group has its own HA mode. At the HA group level, there is also an HA mode that defines throughput or failover amongst the clusters. Depending on your throughput or failover requirements, you can mix the HA modes at the member HA and group HA level.
See also
About HA modes
FortiMail HA clusters and groups of clusters can operate with HA mode as either active-passive or active-active. The HA mode determines network topology, fault tolerance, and total throughput.
| Active-passive HA | Active-active HA |
|---|---|
| 2 FortiMail units or clusters | 2-24 FortiMail units or clusters |
| Deployed behind a switch | Deployed either behind a load balancer or with multiple DNS |
| Both configuration* and data synchronized^ | Only configuration* synchronized |
| Only primary unit or HA group processes email | All units process email |
| No data loss^ when hardware fails | Data loss when hardware fails |
| No increased processing capacity | Increased processing capacity |
* For exceptions, see Settings that are not synchronized by HA.
^ For exceptions, see Synchronization of MTA queue directories after a failover.
Active-passive member HA operating in gateway mode
Active-active member HA operating in gateway mode
Group HA with a mix of active-active and active-passive clusters
When a FortiMail unit or cluster fails, its email traffic is interrupted. SMTP clients usually handle this gracefully, and restart a new connection. Traffic is redirected away from the point of failure by different methods that vary by HA mode:
- Active-passive: A secondary becomes primary and starts using the Virtual IP address (or Virtual IPv6 address) and/or Virtual hostname. It then uses ARP to notify the nearby OSI Layer 2 switch or router about the link change, and they automatically redirect traffic.
- Active-active: The load balancer health check detects a failure and stops distributing traffic to failed FortiMail units or clusters. Only live FortiMail units continue to receive traffic.
Traffic for other IP addresses, such as administrative connections on the management network interfaces, may continue to reach the failed unit or cluster, depending on your network topology. This can be used to troubleshoot the cause of failure, or to reconfigure HA.
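As an illustration of the active-active health check described above, the following Python sketch probes each member's SMTP port and keeps only the members that answer with a 220 greeting. It is not a FortiMail or load balancer API. The function names, port 25, and the timeout are assumptions chosen for the example.

```python
import socket


def smtp_banner_ok(banner: bytes) -> bool:
    """An SMTP server that is ready for mail greets clients with reply code 220."""
    return banner.startswith(b"220")


def probe_smtp(host: str, port: int = 25, timeout: float = 3.0) -> bool:
    """Hypothetical health check: connect, read the greeting, and test it."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            return smtp_banner_ok(conn.recv(512))
    except OSError:
        # Refused, unreachable, or timed out: treat the member as failed.
        return False


def live_members(members, probe=probe_smtp):
    """Return only the members that pass the probe, preserving their order."""
    return [member for member in members if probe(member)]
```

A real load balancer usually applies its own interval and retry settings on top of a probe like this, so a single missed greeting does not remove a member immediately.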
See also
About HA heartbeat and synchronization
About logging, alert email, and SNMP for HA
Storing mail data from HA clusters on a NAS server
Mixing HA modes
Mix the cluster's HA mode and the group's HA mode, if needed for your use case. For example, to increase service uptime and reduce data loss risks, you could join active-passive HA clusters together into an active-passive group. This reduces risk of data loss a little more than a standalone active-passive cluster. However, it also reduces throughput to a fraction of what is possible, because only the primary unit of the primary cluster will be actively processing email (for example, 1/4 of your total FortiMail units if the group has two clusters of two units each). Instead, you may prefer to mix HA modes for a balance of availability and performance: active-passive clusters, in an active-active group.
Synchronization varies by HA mode. Therefore a mix of modes can change which settings are synchronized.
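The throughput trade-off of mixing modes can be worked through with a short Python sketch. It assumes every cluster in the group has the same number of units and uses the rule from About HA modes: in active-passive only the primary processes email, and in active-active all units do. The function and constant names are hypothetical.

```python
from fractions import Fraction

ACTIVE_PASSIVE = "active-passive"
ACTIVE_ACTIVE = "active-active"


def units_processing_email(clusters, units_per_cluster, member_mode, group_mode):
    """Count the units that actively process email for a mode combination."""
    # At group level, active-passive leaves only the primary cluster active.
    active_clusters = clusters if group_mode == ACTIVE_ACTIVE else 1
    # At member level, active-passive leaves only the primary unit active.
    active_units = units_per_cluster if member_mode == ACTIVE_ACTIVE else 1
    return active_clusters * active_units


def processing_fraction(clusters, units_per_cluster, member_mode, group_mode):
    """Fraction of all units in the group that process email."""
    active = units_processing_email(clusters, units_per_cluster, member_mode, group_mode)
    return Fraction(active, clusters * units_per_cluster)
```

For two clusters of two units each, active-passive clusters in an active-passive group give 1/4, while the same clusters in an active-active group give 1/2.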
About HA heartbeat and synchronization
HA heartbeat and synchronization traffic is used to:
- monitor for failure of the primary unit in the HA cluster or group of clusters (a health check)
- synchronize configuration changes from the primary unit to secondaries (and in groups of clusters, from the primary cluster to the secondaries). For exceptions, see Settings that are not synchronized by HA.
- (active-passive only, and only if enabled) synchronize the mail queue, FortiMail system mail directory, and user home directories. For exceptions, see Storing mail data from HA clusters on a NAS server.
Synchronization intervals vary.
If configuration synchronization did not occur when expected, or if you have inadvertently de-synchronized the secondary unit’s configuration (for example, if a cable was accidentally disconnected), then you can manually initiate synchronization on either the primary unit or the secondary unit.
Periodically, secondary units verify configuration synchronization. If it's not in sync, then secondary units pull changes from the primary, and reload the configuration. In active-active HA, block list and safe list changes are also pushed from the secondary to the primary unit, and then synchronized to all other secondary units.
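The periodic verification can be pictured with the following Python sketch. The source does not say how a secondary decides that it is out of sync, so comparing SHA-256 digests of the configuration text is purely an assumption for illustration, as are the function names.

```python
import hashlib


def config_digest(config_text: str) -> str:
    """Digest used here as a stand-in for 'is the configuration identical?'."""
    return hashlib.sha256(config_text.encode("utf-8")).hexdigest()


def verify_and_pull(secondary_config: str, primary_config: str):
    """Return (configuration the secondary should run, whether it must reload)."""
    if config_digest(secondary_config) == config_digest(primary_config):
        # Already in sync: keep the current configuration, no reload.
        return secondary_config, False
    # Out of sync: pull the primary's configuration and reload it.
    return primary_config, True
```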
Due to the introduction of primary backup in active-active HA in FortiMail 7.4.0, communication between the secondary units is also required. In config-only HA before the 7.4.0 release, it was not required.
Heartbeats from the primary to secondaries must not be interrupted. Exceptions include when:
- the primary reboots, or you enter the execute reload command in the CLI. If the primary unit reboots or reloads its configuration, then it signals to the secondary unit to wait for additional time.
- Remote services as heartbeat is enabled.
Remote service monitoring does not provide synchronization, and therefore is not a complete, long-term replacement for the heartbeat. Heartbeat links that are disrupted should be fixed as soon as possible.
If the heartbeat signal is lost, or if the primary detects failure via another method (hard drive monitoring, interface monitoring, or remote service monitoring), then behavior varies by HA mode:
- Active-passive: A secondary unit or cluster becomes the new primary, and starts processing email ("failover").
- Active-active: If Primary backup has been selected, then your preferred secondary unit or cluster will take over the role of the primary (Effective becomes Primary) for the purpose of configuration synchronization. All units continue to process email, except the failed unit.
If a Primary backup is not selected, or if the new primary also fails, then each secondary continues as a secondary. However, with no designated primary unit, changes to the configuration are not synchronized anymore. Repair the primary or reconfigure the HA cluster to form a heartbeat with a new primary.
Some failure causes are temporary. Depending on Action on failure, even if the primary can recover, it might not automatically return to the HA cluster or group of clusters and reclaim its role. You can manually trigger this with Restore.
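The mode-dependent outcomes above can be summarized as a small decision function. This Python sketch only restates the documented behavior. It is not how FortiMail implements failover, and the enum and function names are hypothetical.

```python
from enum import Enum


class HaMode(Enum):
    ACTIVE_PASSIVE = "active-passive"
    ACTIVE_ACTIVE = "active-active"


def outcome_after_primary_failure(mode: HaMode, primary_backup_selected: bool = False) -> str:
    """Describe what happens when the primary is detected as failed."""
    if mode is HaMode.ACTIVE_PASSIVE:
        return "failover: a secondary becomes the new primary and processes email"
    if primary_backup_selected:
        return ("primary backup becomes the effective primary for configuration "
                "synchronization; remaining units keep processing email")
    return ("secondaries keep processing email; configuration synchronization "
            "stops until a primary is restored")
```

In every case, whether a recovered primary reclaims its role depends on Action on failure, which this sketch does not model.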
See also
About HA port numbers and protocols
About logging, alert email, and SNMP for HA
Settings that are not synchronized by HA
Storing mail data from HA clusters on a NAS server
Synchronization of MTA queue directories after a failover
About HA port numbers and protocols
The default protocol and port numbers for HA heartbeat, synchronization, and service monitoring communications are configurable. See HA base port, the control packet setting in the FortiMail CLI Reference, and Appendix C: Port Numbers.
If a firewall is between the primary and secondary FortiMail unit, then verify that the firewall policy allows HA port numbers. Blocked HA ports can cause incorrect failover and synchronization failure.
Settings that are not synchronized by HA
All settings on the primary unit are synchronized to the secondary unit, except the following:
| Settings | Explanation |
|---|---|
| Licenses | FortiGuard Antivirus, FortiGuard Antispam, and other service subscription and feature licenses are specific to each FortiMail unit, and are not synchronized, regardless of HA mode. |
| Operation mode | You must set the operation mode (gateway, transparent, or server) of each FortiMail unit before they join HA. Many settings vary by operation mode, and therefore configurations cannot be synchronized if the operation mode is different. |
| Host name | Different host names are used to distinguish members of the HA cluster when connecting to the GUI and to indicate which unit failed. For details, see Host name. |
| Static route | Static routes are not synchronized because some or all of the network interfaces on each FortiMail unit in the HA cluster may be connected to different subnets. See also Configuring static routes. |
| Interface configuration (gateway and server mode only) | Administrator connections to the GUI/CLI, alert email, and many other features require that you configure at least one network interface with an IP address. For details, see Configuring the network interfaces. Exceptions include virtual IP addresses on active-passive HA. Virtual IP addresses are synchronized because, upon failover, the secondary unit must start to use them. This mechanism allows the new primary unit to receive connections instead of the failed primary unit. See Virtual IP address (or Virtual IPv6 address). |
| Management IP address (transparent mode only) | Each FortiMail unit in the HA cluster should be configured with different management IP addresses for GUI and CLI connectivity purposes. For details, see About the management IP. |
| SNMP system information | Each FortiMail unit in the HA cluster will have its own SNMP system information, including the Description, Location, and Contact. For details, see Configuring SNMP queries and traps. |
| RAID configuration | RAID settings are hardware-dependent and determined at boot time by looking at the drives (for software RAID) or the controller (hardware RAID), and are not stored in the system configuration. Therefore, they are not synchronized. |
| Some HA settings | |
| Product name and icon | The product name and icon under System > Customization > Appearance are not synchronized. All other appearance settings are synchronized. |
| Miscellaneous settings | In active-active HA, the following settings are not synchronized: All system, domain, and user level block/safe lists are synchronized. |
See also
About HA heartbeat and synchronization
Synchronization of MTA queue directories after a failover
During normal operation in active-passive HA, email is either:
- being received or sent by the primary FortiMail unit or cluster
- waiting to be delivered in the mail queue
- stored in the primary’s mail data directories (quarantines, email archives, and, for server mode, email inboxes)
When a failure occurs, sending and receiving is interrupted. The delivery attempt fails. Usually, the sender retries. However, stored email remains in the mail data directories.
To prevent data loss when a primary fails, you usually should enable Synchronize mail data directory (unless NAS storage is used), but do not need to enable Synchronize MTA queue directory. This is because of an automatic recovery mechanism in FortiMail HA failover.
- The secondary unit detects that the primary unit has failed, and becomes the new primary.
- If the failed unit can reboot, it detects the new primary unit.
- The former primary unit pushes its mail queue to the new primary unit. This synchronization occurs through the heartbeat link between the primary and secondary units, and prevents duplicate email messages from forming in the primary unit’s mail queue.
- The new primary unit delivers email in its mail queues, including email messages synchronized from the new secondary unit.
As a result, if the failed primary unit can restart, no email is lost from the mail queue.
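The recovery steps above can be sketched as a queue merge. The source states only that the push prevents duplicates. Modeling each queue as a dictionary keyed by a message ID is an assumption made for this Python illustration, and the names are hypothetical.

```python
def merge_mail_queues(new_primary_queue: dict, former_primary_queue: dict) -> dict:
    """Combine the pushed queue with the new primary's queue without duplicates.

    Both queues map a message ID to the queued message (assumed representation).
    """
    merged = dict(new_primary_queue)
    for message_id, message in former_primary_queue.items():
        # A message already queued on the new primary is not added twice.
        merged.setdefault(message_id, message)
    return merged
```

For example, if message "m2" is present in both queues, the merged queue contains it once, together with any messages that only the former primary held.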
See also
About HA heartbeat and synchronization
Storing mail data from HA clusters on a NAS server
Storing mail data from HA clusters on a NAS server
In active-active HA, if FortiMail units are operating in server mode, you must store mail data centrally on a network attached storage (NAS) server — not on each FortiMail unit. Otherwise users’ email and other data could be scattered across multiple FortiMail units, and it won't be available when they connect to another.
For other HA and operating modes, it also may be better to store mail data on a NAS server.
For example, regular NAS server backups help to prevent mail data loss, even if a FortiMail unit has hardware failure. Also, during a temporary failure of a FortiMail unit, you can still access the mail data on the NAS server. When the FortiMail unit restarts, it can usually continue to access and use the mail data stored on the NAS server.
For active-active HA with a NAS server, only the primary unit sends quarantine reports to email users. The primary unit also acts as a proxy between email users and the NAS server when email users use FortiMail webmail to access quarantined email.
For active-passive HA, the primary unit stores all mail data on the NAS server in the same way as a standalone unit. If a failover occurs, the new primary unit also uses the same NAS server, and continues operating with no loss of mail data.
If FortiMail units are in active-passive HA, and store mail data on a remote NAS server, disable Synchronize mail data directory to reduce redundant network traffic and save bandwidth.
For instructions on storing mail data on a NAS server, see Selecting the mail data storage location.
See also
About HA heartbeat and synchronization
Synchronization of MTA queue directories after a failover
About logging, alert email, and SNMP for HA
For faster discovery and diagnosis of network problems that have caused an HA failover, you can configure SNMP, Syslog, and/or alert email to monitor FortiMail HA.
To configure logging and alert email, configure the primary unit and enable HA events. When the configuration changes are synchronized to secondary units, all FortiMail units in the HA cluster or group of clusters record their own separate log messages and send separate alert email messages. Log data is not synchronized.
To distinguish alert email from each FortiMail unit in HA, configure a different host name for each. For details, see Host name.
To use SNMP to monitor HA failover, configure each cluster member to enable HA events for the SNMP community, such as:
See also
Configuring SNMP queries and traps
About HA heartbeat and synchronization
Configuring HA
Depending on your HA deployment scenario, use the following procedures to deploy either the member or group type of HA.
After you configure HA, usually administrators connect only to the primary unit. Changes made to the primary unit are synchronized to the secondary units. See About HA heartbeat and synchronization.
Exceptions include:
- viewing log messages recorded about a secondary unit on its own hard disk
- configuring settings that are not synchronized (see Settings that are not synchronized by HA)
Deploying member HA
The following procedures describe how to set up a FortiMail pair or cluster in the member HA type.
- Register all FortiMail units in the HA cluster with the Fortinet Technical Support web site:
  If you use licensed features such as centralized HA monitoring, FortiGuard Antivirus, and/or FortiGuard Antispam, you must purchase and register licenses for each unit.
  You can mix different models in FortiMail HA. However:
  - All FortiMail units must have the same firmware version and operation mode.
  - Capacity and FortiMail maximum configuration values are limited by the least powerful model.
- Design a network topology that avoids a single point of failure.
  For example, if there is only one router or firewall or ISP link, and it fails, then service downtime will occur even if the FortiMail HA cluster and email servers are still operating normally. To avoid this risk, if possible, all devices and links should be redundant. (Connect each FortiMail unit to two gateway routers, etc. You may need more network cables and devices to achieve this.)
- Connect the network interfaces that will be used for HA heartbeat and synchronization. At least one heartbeat link is required.
  For example, you could use a network cable to directly connect FortiMail A's port2 to FortiMail B's port2.
  To minimize failovers and sync disruptions, create two heartbeat links. Either:
  - Directly link each pair of heartbeat ports with an Ethernet crossover cable.
  - Connect each pair through an isolated, dedicated local switch.
  This guarantees bandwidth and lower latency for the synchronization and heartbeat, even if one cable is accidentally disconnected. For better reliability, also enable Remote services as heartbeat.
  Don't use DHCP IP addresses for heartbeat links. DHCP can be the default or common for VM instances in cloud deployments, but DHCP can disrupt the HA heartbeat link when an IP address has not been assigned yet by the DHCP server, such as:
  - during firmware upgrades
  - if DHCP clients have an IP address conflict
  - if DHCP reservations fail
  Use static IP addresses instead.
  Don't disconnect heartbeat links once HA is enabled. If the heartbeat is interrupted, then the secondary will assume that the primary has failed, and become the new primary. If no failure has actually occurred, however, both FortiMail units will be operating as primary units at the same time (a "split brain"). This disrupts synchronization and could cause scattered data. In active-passive HA, it can also cause IP address conflicts. To correct the role on a unit that should be secondary, click Restore.
- If you will use active-passive HA with gateway or server operation mode, add a Virtual IP address (or Virtual IPv6 address) and Virtual hostname to the network interface on the primary unit that receives email connections.
  On internal DNS servers, update records to use this virtual IP address, not the physical IP address.
  On public DNS servers, records should still use the public IP address. If your router or firewall applies NAT, this IP address may be on their WAN or gateway interface, not the virtual IP address on FortiMail.
  Wait for the DNS records to propagate to non-authoritative DNS servers before you enable HA. This prevents service disruptions.
  Topology with virtual IP address for active-passive HA
- If you will use active-active HA, configure storage of mail data on a NAS server. See Storing mail data from HA clusters on a NAS server. (Active-passive HA can also benefit from a NAS server, but does not require it.)
- If you will use remote service monitoring (SMTP etc.), then enable those services on the heartbeat network interfaces. See Mail access.
-
On the FortiMail unit that will be the primary in the HA cluster, go to System > High Availability > Configuration and:
-
Configure the following:
GUI item
Description
Enable or disable HA.
Select Member. See About HA types.
Select either Active-Active or Active-Passive. See About HA modes.
Select what the primary unit will do after it fails (if it can recover), either:
- Switch off immediately — Do not automatically rejoin the HA cluster. To manually rejoin it to the cluster with its configured Member role, click Restore.
- Wait for recovery — Automatically rejoin the cluster, but the Effective becomes Secondary. To manually restore the FortiMail unit to acting in its configured Member role, click Restore.
- Wait for recovery and switch to configured role — Automatically rejoin the cluster, but the Effective becomes Primary again. The secondary unit that was temporarily acting as primary also automatically becomes Secondary again. This option may be useful if the cause of failure is temporary and rare, but may cause problems if the cause of failure is recurring, resulting in many extra role changes.
Tip: In most cases, you should select Wait for recovery.
Enter a password for this HA cluster.
Before FortiMail units in the HA cluster synchronize with each other, they verify that they have the same password. This prevents them from accidentally synchronizing with the wrong cluster. Therefore you must enter the same HA password on all of them.
-
Expand the Member section. For each FortiMail unit in the HA cluster, click New and configure the following:
GUI item
Description
Enter the name of this unit in the HA cluster.
Select the role of the FortiMail unit in the HA cluster, either Primary or Secondary.
Each FortiMail unit's role in the HA cluster is not synchronized because this distinguishes the primary and secondary units.
Effects of the role vary by HA mode. See About HA modes.
Use current device
Click to automatically fill out the following fields with the current device information.
Enter the IP address of the network interface that will listen for the heartbeat and synchronization.
Alternatively, to define a heartbeat interface, instead use Host name.
If you want more heartbeat interfaces, click + and then add those IP addresses.
Alternatively, if you are currently configuring the device that you are adding to the table, click Use Current Device.
Note: You must also bring up and then enable Heartbeat status on the interface. If it is disabled, but the IP address is configured here, then HA will detect that the heartbeat link has failed.
Enter the hostname of the network interface that will listen for the heartbeat and synchronization.
Alternatively, to define a heartbeat interface, instead use IPv4 address (or IPv6 address).
Note: You must also bring up and then enable Heartbeat status on the interface. If it is disabled, but the hostname is configured here, then HA will detect that the heartbeat link has failed.
Tip: Use a hostname to define the heartbeat interface (not an IP address) in environments where IP addresses change often, such as with VMs and containers.
Heartbeat hostnames might not be the same as the SMTP relay/proxy hostname (Host name in mail settings) and virtual hostname for active-passive HA (Virtual hostname). If they are the same, however, then you can click Use Current Device to automatically paste the MTA hostname into this field.
If HA mode is Active-Active, then there can be many secondary units. Enable this setting if Member role is Secondary, and you want to select this member to become the new primary when a failure is detected.
Note: Usually you should have a primary backup. Otherwise configuration synchronization will be interrupted upon failure. See About HA heartbeat and synchronization.
Comment
Optional. Enter a descriptive comment.
- If the HA mode is active-passive, configure the Virtual IP address (or Virtual IPv6 address) that will transfer upon failover.
-
If the HA cluster stores mail data on NAS, disable Synchronize mail data directory.
-
Optionally, configure:
- Click Apply.
-
-
Repeat the previous steps for secondary units.
Except for Shared password, Member role, and the IP address or hostname of the primary that the secondary is connecting to, skip most settings.
-
If the HA mode is active-active, configure the load balancer with either remote service monitoring or interface monitoring to detect failed FortiMail units, and to redirect and balance connections among available FortiMail units.
-
Monitor the status of each cluster member. For details, see Monitoring HA status, Logs, reports, and alerts, and Centrally monitoring the HA cluster.
See also
About HA heartbeat and synchronization
Settings that are not synchronized by HA
Deploying group HA
The following procedures describe how to set up group HA with multiple FortiMail unit clusters.
-
Register all of the FortiMail units as described in Deploying member HA.
-
Connect the network interfaces as described in Deploying member HA.
-
On the primary cluster's primary unit:
-
Go to System > High Availability > Configuration.
-
Configure the HA settings as described in Deploying member HA.
-
From Type, select Group.
The Group section becomes available.
HA mode and Action on failure now apply across the group of clusters (not to this primary unit's individual cluster). If required, reconfigure those settings.
-
Expand the Group section.
-
Click New. Configure the following settings, and then click OK and Apply.
Repeat this step for each primary and secondary cluster in the group of clusters.
GUI item
Description
Enter the name of the HA cluster.
Select the cluster's role, either Off, Primary, or Secondary.
Select the group of clusters' mode, either Active-Active (A-A) or Active-Passive (A-P).
Comment
Optional. Enter a description or comment.
-
-
On other units:
-
Go to System > High Availability > Configuration.
- If you recently changed HA settings on the primary cluster's primary unit, then click the Refresh icons next to the Group and Member sections to get the current entries.
-
Click Join an existing HA cluster.
-
Configure the following settings:
GUI item
Description
Enter the IP address of the primary cluster's primary unit.
Shared password
Enter the Shared password that was configured on the primary unit.
Enter the unit's name in the cluster.
Enable this option.
Select which cluster to join.
-
Click Confirm and Join.
This option is only available if HA is not already configured on the unit. If HA has been configured before, cancel and go to the next step instead.
-
-
If member HA is already configured on the unit, and you want it to join group HA:
-
Go to System > High Availability > Configuration.
-
From Type, select Group.
-
Expand the Member section.
-
Double-click to edit the member.
-
Configure the following:
GUI item
Description
Group name
Select which HA group to join.
This setting is available only if Type is Group.
-
Click OK, and then click Apply.
The primary unit in the primary HA cluster will collect and populate the HA information on other primary units in the secondary HA clusters, which will then propagate the information to their secondary units.
-
See also
Advanced Option section
-
Go to System > High Availability > Configuration.
-
Expand the Advanced Option section.
-
Configure the following and then click Apply:
GUI item
Description
Enable if the HA cluster does not store its mail data on a NAS server, and you need to use HA communications to synchronize its system quarantine, per-recipient quarantines, email archives, email users’ preferences, and (server mode only) mailboxes.
This setting applies only if HA mode is Active-Passive.
You can manually initiate a data synchronization whenever significant changes occur. See Start configuration sync.
Enable if you want to synchronize the mail queue with FortiMail units in the HA cluster.
This setting applies only if HA mode is Active-Passive.
If the primary unit experiences a hard drive failure and you cannot restart it, and if this option is disabled, MTA queue directory data could be lost.
If you enable this option, it can reduce performance, and is not guaranteed to prevent data loss. Mail queue directories are very dynamic. Many email messages could be added to the queue between each sync.
If you disable this option, data loss might not occur, either. After a failover, when the unit rejoins the cluster, a separate synchronization mechanism occurs. This often restores the mail queue. For details, see Synchronization of MTA queue directories after a failover and Managing the mail queues.
Enter the first of multiple port numbers (see Appendix C: Port Numbers) that will be used for:
- heartbeat signals
- synchronization control
- data synchronization
- configuration synchronization
In addition to a lost heartbeat, other unresponsive network services and hardware failure can also be used to trigger failover. For details, see Service Monitor section and About HA heartbeat and synchronization.
In addition to automatic immediate and periodic configuration synchronization, you can also manually initiate synchronization. For details, see Start configuration sync.
Enter the amount of time, in seconds, that a primary unit can be unresponsive until HA detects a failure and performs the action in Action on failure.
To determine the best heartbeat threshold, monitor your FortiMail unit's performance. Examine how long each high system resource usage lasts. Configure a threshold that is longer than most peak usage. This gives the secondary unit enough time to accurately confirm unresponsiveness, and avoid unnecessary failovers. (Heartbeat responses may be slow during peak load.) See also Using the dashboard, Centrally monitoring the HA cluster, and Troubleshoot resource issues.
If you have service level agreements (SLA), then you may be required to keep this time short. If the failure detection time is too long, email delivery could be delayed or fail until HA detects the failure. This reduces service uptime.
Enable to avoid the Action on failure action if the heartbeat links (see Interface section) temporarily fail, but service monitoring such as for SMTP (see Service Monitor section) detects that the primary unit is still available.
The Action on failure action can still occur if the HA process restarts due to system reboot or HA daemon restart. Then it examines the physical heartbeat links first. If they are not found, then failure is detected.
This setting provides an extra HA heartbeat only, not synchronization. To avoid synchronization problems, do not use remote service monitoring as a heartbeat for a long time. This feature is intended only as a temporary heartbeat until you reestablish a normal primary or secondary heartbeat link.
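The guidance for choosing a heartbeat threshold ("longer than most peak usage") can be turned into a rough calculation. In this Python sketch, the observed peak durations, the coverage percentage that stands in for "most", and the safety margin are all assumptions that you would replace with measurements and policy from your own deployment.

```python
import math


def suggest_heartbeat_threshold(peak_durations_s, coverage_pct=90, margin_s=1):
    """Suggest a threshold (seconds) longer than coverage_pct of observed peaks."""
    if not peak_durations_s:
        raise ValueError("need at least one observed peak duration")
    ordered = sorted(peak_durations_s)
    # Index of the smallest observation that covers coverage_pct of the samples.
    index = math.ceil(coverage_pct * len(ordered) / 100) - 1
    return ordered[max(index, 0)] + margin_s
```

With observed peaks of 2, 3, 4, 5, and 30 seconds, a coverage of 80 percent ignores the 30-second outlier and suggests 6 seconds. If an SLA requires faster failure detection than the suggested value, the SLA limit takes priority and some unnecessary failovers during peak load become more likely.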
Interface section
This section configures the HA behavior of network interfaces on this FortiMail unit, especially whether they have a:
- heartbeat (see About HA heartbeat and synchronization)
- virtual IP address or virtual hostname
- interface monitor
In a basic HA deployment, the heartbeat interface provides a basic signal to other HA group members about the health of the primary FortiMail unit. However, you can use additional signals. Interface monitoring periodically tests the local network interfaces on the primary unit. If a malfunctioning interface is detected, HA performs the action configured in Action on failure. This can include reconfiguring network interfaces to move the Virtual IP address (or Virtual IPv6 address) and Virtual hostname onto the new primary unit.
- Configure the interface monitoring interval and failure detection threshold. See Service Monitor section.
-
Go to System > High Availability > Configuration.
-
Expand the Interface section.
-
Select a row for a network interface in the table, and then click Edit.
-
Configure the following settings:
GUI item
Description
Enable if this interface will listen for HA heartbeat and synchronization communications.
You must enable this option on at least one of the network interfaces that you defined for the unit in IPv4 address (or IPv6 address). Otherwise HA will detect a failure.
Displays the name of the network interface that you are configuring.
Optionally, you can click the name to view or configure its settings. See also Configuring the network interfaces.
Enter a virtual IP address and netmask that the primary unit will have on this network interface. Upon failure detection, the secondary will become the new primary and start to use the virtual IP address.
For gateway mode and server mode, DNS records should be configured to point to the virtual IP address, not the physical IP addresses. See also About HA modes, Configuring the network interfaces, and About IPv6 Support.
This setting is available only if HA mode is Active-Passive.
Enter a virtual hostname.
Similar to behavior with the virtual IP address, the virtual hostname belongs to the current primary unit. Upon failover, the secondary unit becomes the new primary unit, and so it starts to use the virtual hostname instead.
This setting is available only if HA mode is Active-Passive.
Enable to monitor the network interface for failure. Connection interval and retries occur according to the interface monitoring settings in Service Monitor section.
Service Monitor section
In the simplest HA deployments, failed FortiMail units are detected by an interrupted heartbeat. However, HA can also detect failure of hardware and network services. Heartbeats detect the general responsiveness of a primary unit, but do not test each daemon (for example, POP3 or webmail service), hard drive, and physical network port used by non-heartbeat traffic. Therefore you can add hardware and service monitoring to be more specific. Alternatively, if the heartbeat link is briefly disconnected, service monitoring can prevent an unnecessary failover by temporarily acting as a secondary heartbeat.
With service monitoring, the secondary unit connects to the SMTP, POP3, and/or web service (HTTP) on the primary unit to detect failure. For server mode, IMAP service can also be monitored.
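As a rough illustration of what connecting to a service to detect failure can look like, the sketch below opens a TCP connection and reports whether it succeeded within a timeout. It is only a minimal example under stated assumptions: the function name `service_is_reachable` is hypothetical, and the check is at the TCP level only, whereas the product's own service monitoring may test more than that.

```python
import socket


def service_is_reachable(host: str, port: int, timeout_s: float) -> bool:
    """Return True if a TCP connection to host:port opens within timeout_s.

    Hypothetical helper for illustration; it does not speak SMTP, POP3,
    IMAP, or HTTP, it only confirms that something accepts connections.
    """
    try:
        # create_connection raises OSError (including timeouts and
        # "connection refused") when the service cannot be reached.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```

For example, a call such as `service_is_reachable("192.0.2.10", 25, timeout_s=10)` would probe SMTP on a primary unit at a placeholder documentation address.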
With local network interface monitoring and hard drive monitoring, the primary unit monitors its own network interfaces and hard drives. Hard drive monitoring tests that the local hard drive is still accessible, and disk space exists for mail data. If the hard disk is not responsive, or if the mail data disk is 95% full, then a failure is detected.
Network interface monitoring tests all network interfaces where:
- Status is enabled (the network interface is up)
- Enable port monitor is enabled
Alert email, log messages, and SNMP traps (if configured) indicate the specific cause.
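The hard drive test described above combines two conditions: the disk must respond, and the mail data disk must be below 95% full. The following is a minimal sketch of that rule, not the product's implementation. The function names are hypothetical, and treating "95% full" as "used space at or above 95% of capacity" is an assumption.

```python
import shutil

FULL_THRESHOLD = 0.95  # from the text: failure when the mail data disk is 95% full


def disk_is_too_full(used_bytes: int, total_bytes: int,
                     threshold: float = FULL_THRESHOLD) -> bool:
    """Return True if used space is at or above the threshold (assumed >=)."""
    if total_bytes <= 0:
        # No usable capacity reported; treat it as a failure.
        return True
    return used_bytes / total_bytes >= threshold


def mail_disk_failed(path: str) -> bool:
    """Return True if the disk holding `path` is unresponsive or too full."""
    try:
        usage = shutil.disk_usage(path)
    except OSError:
        # The path could not be queried, so the disk counts as not accessible.
        return True
    return disk_is_too_full(usage.used, usage.total)
```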
To configure hardware and service monitoring
- Go to System > High Availability > Configuration.
- Expand the Service Monitor section.
- Select a row in the table and click Edit.
For Remote SMTP, Remote IMAP, Remote POP, and Remote HTTP services, configure the following and click OK:
GUI item
Description
Enable
Enable or disable monitoring for the service.
Name
Displays the service name.
Enter the listening port number of the service on the primary unit and (active-active HA only) secondary. See also Appendix C: Port Numbers and Mail access.
Enter the amount of time in seconds to wait for a response when service monitoring tries to connect.
Enter the amount of time in seconds between each try.
Enter the number of consecutive unsuccessful tries that indicates a failure.
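The last setting above counts consecutive unsuccessful tries, which implies that one successful try in between starts the count again. The sketch below shows that behavior under this assumption; the class name `ConsecutiveFailureCounter` is hypothetical and is not part of any FortiMail interface.

```python
class ConsecutiveFailureCounter:
    """Track tries and report a failure after N unsuccessful tries in a row."""

    def __init__(self, retries: int) -> None:
        if retries < 1:
            raise ValueError("retries must be at least 1")
        self.retries = retries
        self.failures = 0

    def record(self, success: bool) -> bool:
        """Record one try; return True once the failure threshold is reached."""
        if success:
            # Assumption: a successful try resets the consecutive count.
            self.failures = 0
        else:
            self.failures += 1
        return self.failures >= self.retries
```

With a threshold of 3, two failed tries followed by a success do not indicate a failure; three failed tries in a row do.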
For interface monitoring, configure the following and click OK (to specify which ports are monitored, see Interface section):
GUI item
Description
Enter the amount of time in seconds between each try.
Enter the number of consecutive unsuccessful tries that indicates a failure.
For local hard drive monitoring, configure the following and click OK:
GUI item
Description
Enable
Enable or disable monitoring of the local hard drive.
Enter the amount of time in seconds between each try.
Enter the number of consecutive unsuccessful tries that indicates a failure.
See also
About HA heartbeat and synchronization
About logging, alert email, and SNMP for HA
Monitoring HA status
After you configure HA (see Configuring HA), to view the current roles and synchronization status of the HA group, go to System > High Availability > Status. You can also manually initiate some HA actions, such as Sync and Failover.
Most information is automatically populated after the primary unit connects to this unit, and that unit joins the HA cluster. Then HA statuses such as Status are kept up-to-date via the heartbeat.
| GUI item | Description |
|---|---|
| Type | Displays the configured Type. |
| Mode | Displays the configured HA mode. |
| Refresh (button) | Click to get the newest data and display it on System > High Availability > Status. If the display does not refresh, you may need to click Clear Cache first. |
| Failover (button) | Select a FortiMail unit or cluster, and then click this button to manually trigger a failover. |
| Restore (button) | Select a FortiMail unit or cluster, and then click to manually restart HA and reset Effective to match the unit's initially configured Member role. Caution: When a failed unit reboots, don't click Restore until it finishes synchronizing its mail queue and other data with the current primary. If this recovery mechanism is interrupted, data could be lost. For details, see Status and Synchronize mail data directory. |
| Sync (button) | Select a FortiMail unit or cluster, and then click to manually initiate configuration synchronization with other FortiMail units in the HA cluster or group of clusters. See also Settings that are not synchronized by HA. |
| (button) | Click to reload the heartbeat daemon and its status data to show current information. |
| Name | Name of the unit and, if there are multiple HA clusters, the Group name. |
| SN | Serial number. |
| IP | |
| Version | Firmware version. A FortiMail unit must run the same firmware version as the other members in order to join the HA group, so that the configuration can be synchronized. Exceptions are during updates. See Upgrading firmware on HA units. |
| Configured | See Combinations of configured and effective HA role. In active-active HA, the secondary unit that is the Primary backup (if configured) will display Secondary, like other secondary units. |
| Effective | See Combinations of configured and effective HA role. After a failure has been detected, this status may not match the initially configured Member role. To return to that role, click Restore. |
| Status | Displays the status of HA cluster joining, heartbeat, and synchronization. See also Combinations of configured and effective HA role. |
| | Amount of time that the HA cluster member has been operational. |
| | When this FortiMail unit’s HA daemon last communicated with the others in the HA group to make sure that they are available. See also Heartbeat lost threshold and HA base port. |
See also
Centrally monitoring the HA cluster
About HA heartbeat and synchronization
About logging, alert email, and SNMP for HA
Combinations of configured and effective HA role
To adapt when it detects a heartbeat or synchronization failure, a FortiMail HA unit may no longer be operating in its initially configured Member role.
Combinations of the Configured and Effective columns on System > High Availability > Status indicate whether the unit has joined the HA cluster and is operating normally. The Status column may provide troubleshooting information.
| Configured | Effective | Result |
|---|---|---|
| Primary | Primary | Normal for the primary. |
| Secondary | Secondary | Normal for the secondary. In active-active HA, however, this can also happen if the primary has failed. (Most of the secondaries continue to show Secondary. Only the unit where you enabled Primary backup has Effective showing Primary.) |
| Primary | Discovering | Initial HA configuration is complete. The primary is now trying to connect with other HA units to form a heartbeat link. |
| Primary | Registering | Heartbeat connection succeeded and the unit is joining the cluster. |
| Primary | Unknown | Initial HA configuration was not able to complete. Therefore the unit could not try to join an HA cluster or group. For example, if the primary is defined, but not the other units, then HA cannot form a heartbeat link yet. This situation should correct itself once all units are configured. |
| Primary | Off | Either the: and the heartbeat and configuration synchronization are currently stopped. After the secondary joins an HA cluster or group, some causes such as network interruptions could cause the first configuration synchronization to fail. To prevent both the secondary and primary from simultaneously acting as primary ("split brain"), Effective temporarily becomes Off. If the next synchronization fails again, then the secondary's Effective becomes Primary. To restart HA processes and return the unit to the originally configured role, click Restore. |
| Primary | Hold Off | The primary is rebooting or upgrading firmware. It asked to wait longer than the usual Heartbeat lost threshold so that the reboot can complete. If the primary does not return, then the secondary performs the action in Action on failure or Primary backup. |
| Primary | Failed | Remote service monitoring, local hard drive monitoring, or network interface monitoring has detected a failure. If operating in transparent mode, then on System > Network > Interface, the network interface IP/Netmask on the secondary displays Bridging (waiting for recovery). When you correct the failure, Effective changes to either Secondary or Primary, depending on Action on failure. |
| Primary | Secondary | The primary failed. A secondary automatically became the new primary. When the failed unit restarted, it detected that there was already a primary in the HA cluster or group, and so now the failed unit is the new secondary. If you want the failed unit to return to acting as the primary, click Restore. |
| Secondary | Primary | The secondary detected that the primary failed, and then the secondary became the new primary. If you want it to return to acting as the secondary, click Restore. |
| Secondary | Secondary (No Primary) | The secondary detected that the primary failed, but it was not configured as Primary backup. Therefore configuration synchronization cannot occur until you either repair the primary, or manually configure a secondary to become the new primary. This occurs only if HA mode is Active-active. |
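When reading status values programmatically (for example, after copying them from the Status page into a script), the combinations in the table above can be treated as a lookup keyed on the configured and effective roles. The following is only a sketch: the dictionary and function names are hypothetical, and the summaries are shortened paraphrases of the Result column, not official status strings.

```python
# Keys are (Configured, Effective); values paraphrase the Result column.
ROLE_COMBINATIONS = {
    ("Primary", "Primary"): "Normal for the primary.",
    ("Secondary", "Secondary"): "Normal for the secondary.",
    ("Primary", "Discovering"): "Trying to form a heartbeat link.",
    ("Primary", "Registering"): "Heartbeat connected; joining the cluster.",
    ("Primary", "Unknown"): "Initial HA configuration could not complete.",
    ("Primary", "Off"): "Heartbeat and synchronization are stopped.",
    ("Primary", "Hold Off"): "Primary is rebooting or upgrading firmware.",
    ("Primary", "Failed"): "Monitoring detected a failure.",
    ("Primary", "Secondary"): "Failed primary rejoined as the new secondary.",
    ("Secondary", "Primary"): "Secondary took over after the primary failed.",
    ("Secondary", "Secondary (No Primary)"):
        "Primary failed and this unit is not the Primary backup.",
}


def describe_roles(configured: str, effective: str) -> str:
    """Return a short summary for a Configured/Effective combination."""
    return ROLE_COMBINATIONS.get(
        (configured, effective), "Combination not listed in the table."
    )
```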
See also