Performance SLA overview
Performance SLAs consist of three parts: health checks, SLA targets, and link status.
Health checks
A health check is defined by a probe mode, protocol, and server. These three options specify what resource is being evaluated and how the evaluation is done. Each health check should be configured specifically for its resource, so the probe mode, protocol, and server should be tailored to the particular service. For example, the health check for a VoIP service will differ from one for a database replication service.
Performance SLA participants are the interfaces that will be evaluated for a given health check. They must be SD-WAN member interfaces, but do not have to belong to the same zone. Select only the interfaces that you expect the service communications to use. For example, a health check for a corporate resource might only use the overlay to access the service, so you would add only the VPN interfaces as participants.
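As a minimal sketch of the corporate resource example, participants are listed in the CLI by their SD-WAN member sequence numbers. The health check name, the server address, and the assumption that members 3 and 4 are the VPN overlay interfaces are all hypothetical:

```
config system sdwan
    config health-check
        edit "corp_app"
            # Hypothetical address of the corporate resource
            set server "198.51.100.10"
            # Assumed here: members 3 and 4 are the VPN overlay interfaces
            set members 3 4
        next
    end
end
```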
There are six predefined performance SLA profiles for newly created VDOMs or factory reset FortiGate devices: AWS, DNS, FortiGuard, Gmail, Google Search, and Office 365. These profiles provide Fortinet-recommended settings for common services. To complete the performance SLA configuration, add the participants for the service. You can adjust the default settings to suit your needs.
Probe mode
The probe mode can be set to active, passive, or prefer passive.
In active mode, the FortiGate sends a packet of the type specified by the protocol setting towards the defined server. This allows you to evaluate the path to the destination server using the protocol that matches the service provided by the server. Active probing does add some overhead in the form of health check probes (and additional configurations to define the probe type and server), but it has the benefit of constantly measuring the performance of the path to the server. This can be beneficial when reviewing historical data.
In passive mode, session information captured by firewall policies is used to determine latency, jitter, and packet loss. This has the added benefit of not generating additional traffic, and does not require the performance SLA to define a specific server for measurement. Instead, the SD-WAN rule must define the traffic to evaluate, and the firewall policy permitting the traffic must have passive WAN health measurement enabled. See Passive WAN health measurement and Passive health-check measurement by internet service and application for more information.
Prefer passive mode is a combination of active and passive modes. Health is measured using traffic when there is traffic, and using probes when there is no traffic. A protocol and server must be configured.
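The probe mode is selected in the CLI with the detect-mode option of the health check. The following sketch shows prefer passive mode together with the firewall policy setting that passive measurement relies on. The health check name, server address, member IDs, and policy ID are hypothetical, and the SD-WAN rule that defines the traffic to evaluate is omitted:

```
config system sdwan
    config health-check
        edit "corp_app"
            # Options are active, passive, and prefer-passive
            set detect-mode prefer-passive
            # Prefer passive mode still needs a protocol and server for probing
            set protocol ping
            set server "198.51.100.10"
            set members 3 4
        next
    end
end
config firewall policy
    # Hypothetical policy ID; use the policy that permits the measured traffic
    edit 1
        set passive-wan-health-measurement enable
    next
end
```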
Protocol
Health checks support a variety of protocols and protocol-specific options. The most commonly used protocols (ping, HTTP, and DNS) can be configured in the GUI when creating a new performance SLA on the Network > SD-WAN > Performance SLAs page. The following protocols and options can be configured in the CLI using the set protocol <option> parameter:
- ping: Use PING to test the link with the server.
- tcp-echo: Use TCP echo to test the link with the server.
- udp-echo: Use UDP echo to test the link with the server.
- http: Use HTTP-GET to test the link with the server.
- twamp: Use TWAMP to test the link with the server.
- dns: Use a DNS query to test the link with the server. The FortiGate sends a DNS query for an A record and can check that the response matches the expected IP address.
- tcp-connect: Use a full TCP connection to test the link with the server. The method to measure the quality of the TCP connection can be half-open or half-close.
- ftp: Use FTP to test the link with the server. The FTP mode can be passive or port.
SD-WAN health checks can generate traffic that becomes quite high as deployments grow. Take this into consideration when setting DoS policy thresholds. For details on setting DoS policy thresholds, refer to DoS protection.
To use UDP-echo and TCP-echo as health checks:
config system sdwan
    set status enable
    config health-check
        edit "h4_udp1"
            set protocol udp-echo
            set port 7
            set server <server>
        next
        edit "h4_tcp1"
            set protocol tcp-echo
            set port 7
            set server <server>
        next
        edit "h6_udp1"
            set addr-mode ipv6
            set server "2032::12"
            set protocol udp-echo
            set port 7
        next
    end
end
To use DNS as a health check, and define the IP address that the response must match:
config system sdwan
    set status enable
    config health-check
        edit "h4_dns1"
            set protocol dns
            set dns-request-domain "ip41.forti2.com"
            set dns-match-ip 1.1.1.1
        next
        edit "h6_dns1"
            set addr-mode ipv6
            set server "2000::15.1.1.4"
            set protocol dns
            set port 53
            set dns-request-domain "ip61.xxx.com"
        next
    end
end
To use TCP Open (SYN/SYN-ACK) and TCP Close (FIN/FIN-ACK) to verify connections:
config system sdwan
    set status enable
    config health-check
        edit "h4_tcpconnect1"
            set protocol tcp-connect
            set port 443
            set quality-measured-method {half-open | half-close}
            set server <server>
        next
        edit "h6_tcpconnect1"
            set addr-mode ipv6
            set server "2032::13"
            set protocol tcp-connect
            set port 444
            set quality-measured-method {half-open | half-close}
        next
    end
end
To use active or passive mode FTP to verify connections:
config system sdwan
    set status enable
    config health-check
        edit "h4_ftp1"
            set protocol ftp
            set port 21
            set user "root"
            set password ***********
            set ftp-mode {passive | port}
            set ftp-file "1.txt"
            set server <server>
        next
        edit "h6_ftp1"
            set addr-mode ipv6
            set server "2032::11"
            set protocol ftp
            set port 21
            set user "root"
            set password ***********
            set ftp-mode {passive | port}
            set ftp-file "2.txt"
        next
    end
end
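HTTP is listed among the supported protocols but has no CLI example here. The following is a minimal sketch of an HTTP health check that requests a URL path and expects a string in the response. The health check name, server, path, and match string are hypothetical and should be replaced with values that reflect the actual service:

```
config system sdwan
    set status enable
    config health-check
        edit "h4_http1"
            set protocol http
            set port 80
            # Hypothetical URL path requested with HTTP-GET
            set http-get "/status"
            # Hypothetical string expected in the server response
            set http-match "OK"
            set server <server>
        next
    end
end
```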
Health check probe packets support DSCP markers for accurate link performance evaluation for high priority applications. This allows the probe packet to match the real traffic it is providing measurements for, including how that traffic is shaped by upstream devices based on the DSCP markers.
To mark health check packets with DSCP:
config system sdwan
    config health-check
        edit <name>
            set diffservcode <6-bits_binary, 000000-111111>
            set protocol <option>
        next
    end
end
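For instance, voice traffic is commonly marked with the Expedited Forwarding class, DSCP 46, which is 101110 in binary. A sketch of a probe carrying that marking might look like the following, where the health check name and server address are hypothetical:

```
config system sdwan
    config health-check
        edit "voip_probe"
            # 101110 = DSCP 46 (Expedited Forwarding)
            set diffservcode 101110
            set protocol ping
            # Hypothetical address of the VoIP service
            set server "198.51.100.20"
        next
    end
end
```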
Server
An IP address or FQDN can be defined as the server that the probe packets will be sent to. Up to two servers can be defined this way. When two servers are provided, both must fail in order for the health check to fail. This is to avoid a scenario where one remote server is down and causes a false positive that the link is down. The FortiGate can still use the interface associated with this health check to reach the remaining healthy server.
The purpose of the server is not simply to measure the health of the link, but rather the health of the path to a resource. It is highly recommended to use an IP address or FQDN that reflects the resource so the traffic path is considered.
A server can only be used in one performance SLA at any given time.
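As a sketch of the two-server case, both servers are given on the same set server line. The health check name and both addresses are hypothetical. With this configuration, the health check fails only when neither server responds:

```
config system sdwan
    config health-check
        edit "corp_app"
            # Two hypothetical servers; both must fail for the check to fail
            set server "198.51.100.10" "203.0.113.10"
        next
    end
end
```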
SLA targets
SLA targets are a set of constraints that are used in SD-WAN rules to control the paths that traffic takes. The constraints are:
- Latency threshold: latency for SLA to make a decision, in milliseconds (0 - 10000000, default = 5).
- Jitter threshold: jitter for SLA to make a decision, in milliseconds (0 - 10000000, default = 5).
- Packet loss threshold: packet loss for SLA to make a decision, in percentage (0 - 100, default = 0).
These settings should be specific to the service whose performance is being considered. You should attempt to configure the constraints to be just under the maximum values for the application or service to function well. For example, if your application requires less than 100 ms latency, then you should configure the SLA target to be 90 ms. Misconfiguring these settings will cause the performance SLA to lose value. If the values are too tight, then you may have traffic flipping between links before necessary. If the values are too loose, then performance may be impacted and the FortiGate will do nothing about it.
In the GUI, one SLA target can be configured, but additional targets can be configured in the CLI. Once a second target is configured in the CLI, additional targets can be configured from the GUI. Multiple SLA targets can be configured where a server provides multiple services that have different values for acceptable performance. For example, Google provides a DNS service and entertainment services (YouTube), so it is necessary to configure multiple SLA targets in this case since you can only configure a server in one performance SLA.
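A minimal sketch of two SLA targets under one health check follows. The health check name, server, member IDs, and all threshold values are hypothetical, and were chosen only to show a tighter target next to a looser one. An SD-WAN rule can then reference the target ID that matches the service it steers:

```
config system sdwan
    config health-check
        edit "multi_service"
            set server <server>
            set members 1 2
            config sla
                # Target 1: latency only, set just under an assumed 100 ms limit
                edit 1
                    set link-cost-factor latency
                    set latency-threshold 90
                next
                # Target 2: looser latency, plus jitter and packet loss limits
                edit 2
                    set link-cost-factor latency jitter packet-loss
                    set latency-threshold 150
                    set jitter-threshold 30
                    set packetloss-threshold 2
                next
            end
        next
    end
end
```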
Link status
The Link Status section of the performance SLA configuration consists of three settings that determine how frequently the link is evaluated, and the requirements for it to be considered valid or invalid:
- Check interval: the interval in which the FortiGate checks the interface, in milliseconds (500 - 3600000, default = 500).
- Failures before inactive: the number of failed status checks before the interface shows as inactive (1 - 3600, default = 5). This setting helps prevent flapping, where the system continuously transfers traffic back and forth between links.
- Restore link after: the number of successful status checks before the interface shows as active (1 - 3600, default = 5). This setting also helps prevent flapping.
When a participant becomes inactive, the performance SLA causes the FortiGate to withdraw all static routes associated with that interface. If multiple static routes use the same interface, all of them are withdrawn while the health check is failing.
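In the CLI, the three link status settings correspond to the interval, failtime, and recoverytime options of the health check, and the static route behavior is controlled by update-static-route. The sketch below uses hypothetical values. With a 1000 ms interval, a participant would be marked inactive after roughly 5 seconds of consecutive failures (5 x 1000 ms) and restored after roughly 10 seconds of consecutive successes (10 x 1000 ms), ignoring probe timeouts:

```
config system sdwan
    config health-check
        edit <name>
            # Check interval, in milliseconds
            set interval 1000
            # Failures before inactive
            set failtime 5
            # Restore link after
            set recoverytime 10
            # Withdraw static routes for a participant that becomes inactive
            set update-static-route enable
        next
    end
end
```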