Embedded SD-WAN SLA status in ICMP probes
In SD-WAN hub-and-spoke topologies, each spoke can embed the SLA status that it has determined into the ICMP probes that it sends to the hub. The SLA status is either in SLA or out of SLA. The hub uses the SLA status received from each spoke to manage route priority for hub-to-spoke traffic.
SLA priority can also be embedded in ICMP probes. See Embedded SD-WAN SLA priorities in ICMP probes for more information.
Spoke-initiated speed tests can also use the embedded information. While a speed test runs, the spoke continually embeds an out-of-SLA status into ICMP probes, regardless of the SLA calculation. The hub can use the received out-of-SLA status to manage route priorities and detour hub-to-spoke user traffic to other tunnels.
For configuration on a hub, the config system sdwan command includes these options:
config system sdwan
    config health-check
        edit <entry>
            set sla-id-redistribute <ID>
            config sla
                edit <entry>
                    set link-cost-factor {latency | jitter | packet-loss | mos | remote}
                next
            end
        next
    end
end
set sla-id-redistribute <ID> | Select the ID from the SLA subtable. The selected SLA's priority value will be distributed into the routing table (0 - 32, default = 0).
set link-cost-factor {latency | jitter | packet-loss | mos | remote} | Criteria on which to base link selection.
The hub can also work in a hybrid mode. If set sla-id-redistribute is not configured on the spoke, the hub can use its own SLA settings to determine the route priority.
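The hybrid behavior can be pictured with a small Python sketch. This is an illustration only, not FortiOS logic: the function name, the use of latency as the only criterion, and the `<=` comparison are assumptions, and the priority values of 100 and 200 are borrowed from the hub example later in this topic.

```python
from typing import Optional

# Hypothetical values mirroring "set priority-in-sla" / "set priority-out-sla".
PRIORITY_IN_SLA = 100
PRIORITY_OUT_SLA = 200


def route_priority(embedded_in_sla: Optional[bool],
                   latency_ms: float,
                   hub_latency_threshold_ms: float) -> int:
    """Pick a route priority for one overlay toward one spoke.

    embedded_in_sla is None when the probe carried measurements only
    (no SLA status from the spoke). In that case this sketch judges the
    received latency against the hub's own threshold (an assumption).
    """
    if embedded_in_sla is None:
        in_sla = latency_ms <= hub_latency_threshold_ms
    else:
        in_sla = embedded_in_sla
    return PRIORITY_IN_SLA if in_sla else PRIORITY_OUT_SLA


# The spoke's verdict wins when present; otherwise the hub decides.
print(route_priority(False, 0.25, 50.0))  # spoke says out of SLA
print(route_priority(None, 60.0, 50.0))   # hub compares 60 ms to its 50 ms limit
```

The design point is only that a status supplied by the spoke takes precedence, and the hub's own thresholds act as the fallback.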
Example
The SD-WAN topology with ADVPN and BGP neighbor on loopback is used for the following two examples.
SD-WAN settings on the spoke relevant to the examples:
config system sdwan
    set speedtest-bypass-routing enable
    config health-check
        edit <name>
            set embed-measured-health enable
            set sla-id-redistribute <id>
            config sla
                edit 1
                    <desired SLA thresholds>
                next
            end
        next
    end
end
SD-WAN settings on the hub relevant to the examples:
config system sdwan
    config health-check
        edit <name>
            set detect-mode remote
            set sla-id-redistribute <num>
            config sla
                edit <num>
                    set link-cost-factor remote
                    set priority-in-sla <value>
                    set priority-out-sla <value>
                next
            end
        next
    end
end
Path selection based on embedded SLA status
In this example, spokes are configured to embed the SLA status in ICMP probes, and the hub is configured to use the information. With this configuration:
- The spoke's SLA information (packet-loss, latency, jitter, and mos) and SLA status (in SLA or out of SLA) on each overlay are embedded into ICMP probes and transported to the hub.
- The hub sets priorities on IKE routes over different overlays based on the SLA status received on each overlay.
- On the hub, BGP routes, which are used to guide hub-to-spoke traffic, rely on IKE routes to be resolved to tunnels and inherit the priorities from the IKE routes.
- The hub can steer traffic to spokes by resolved BGP routes with different inherited priorities.
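The mapping from received SLA status to route priority can be sketched in Python as follows. This is a simplified model rather than FortiOS code: the function names are hypothetical, and the values 100 and 200 mirror the priority-in-sla and priority-out-sla settings used on the hub in this example. A lower priority value is preferred.

```python
PRIORITY_IN_SLA = 100   # mirrors "set priority-in-sla 100"
PRIORITY_OUT_SLA = 200  # mirrors "set priority-out-sla 200"


def ike_route_priorities(received_status):
    """Map each overlay's received SLA status ("in" or "out") to a priority."""
    return {
        overlay: PRIORITY_IN_SLA if status == "in" else PRIORITY_OUT_SLA
        for overlay, status in received_status.items()
    }


def preferred_overlays(priorities):
    """Return the overlays with the lowest priority value (lower is preferred)."""
    best = min(priorities.values())
    return sorted(o for o, p in priorities.items() if p == best)


# Hypothetical case: one spoke reports out of SLA on EDGE_T1 only.
spoke = ike_route_priorities({"EDGE_T1": "out", "EDGE_T2": "in"})
print(spoke)                      # {'EDGE_T1': 200, 'EDGE_T2': 100}
print(preferred_overlays(spoke))  # ['EDGE_T2']
```

When both overlays report in SLA, both keep priority 100 and `preferred_overlays` returns both of them.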
To embed SD-WAN SLA status and priorities in ICMP probes:
-
Configure SD-WAN and BGP on Spoke-1 (Branch1_A_FGT):
-
Configure SD-WAN:
config system sdwan
    set status enable
    set speedtest-bypass-routing enable
    config zone
        edit "overlay"
        next
    end
    config members
        edit 4
            set interface "H1_T11"
            set zone "overlay"
            set source 172.31.0.65
            set priority 10
        next
        edit 5
            set interface "H1_T22"
            set zone "overlay"
            set source 172.31.0.65
            set priority 10
        next
    end
    config health-check
        edit "HUB"
            set server "172.31.100.100"
            set embed-measured-health enable
            set sla-id-redistribute 1
            set members 4 5
            config sla
                edit 1
                    set link-cost-factor latency
                    set latency-threshold 50
                next
            end
        next
    end
end
-
Configure BGP:
config router bgp set as 65001 set router-id 172.31.0.65 ...... config neighbor edit "172.31.0.1" ...... set remote-as 65001 set update-source "Loopback0" next end config network edit 1 set prefix 10.0.3.0 255.255.255.0 next end ...... end
-
View the health-check settings:
All overlays on Spoke-1 are in SLA.
# diagnose sys sdwan health-check Health Check(HUB): Seq(4 H1_T11): state(alive), packet-loss(0.000%) latency(0.252), jitter(0.025), mos(4.404), bandwidth-up(999998), bandwidth-dw(999998), bandwidth-bi(1999996) sla_map=0x1 Seq(5 H1_T22): state(alive), packet-loss(0.000%) latency(0.199), jitter(0.008), mos(4.404), bandwidth-up(999998), bandwidth-dw(999997), bandwidth-bi(1999995) sla_map=0x1
-
-
Configure SD-WAN and BGP on Spoke-2 (Branch2_FGT):
-
Configure SD-WAN:
config system sdwan
    set status enable
    set speedtest-bypass-routing enable
    config zone
        edit "overlay"
        next
    end
    config members
        edit 4
            set interface "H1_T11"
            set zone "overlay"
            set source 172.31.0.66
            set priority 10
        next
        edit 5
            set interface "H1_T22"
            set zone "overlay"
            set source 172.31.0.66
            set priority 10
        next
    end
    config health-check
        edit "HUB"
            set server "172.31.100.100"
            set embed-measured-health enable
            set sla-id-redistribute 1
            set members 4 5
            config sla
                edit 1
                    set link-cost-factor latency
                    set latency-threshold 70
                next
            end
        next
    end
end
-
Configure BGP:
config router bgp set as 65001 set router-id 172.31.0.66 ...... config neighbor edit "172.31.0.1" ...... set remote-as 65001 set update-source "Loopback0" next end config network edit 1 set prefix 10.0.4.0 255.255.255.0 next end ...... end
-
View the health-check settings:
All overlays on Spoke-2 are in SLA.
# diagnose sys sdwan health-check Health Check(HUB): Seq(4 H1_T11): state(alive), packet-loss(0.000%) latency(0.245), jitter(0.034), mos(4.404), bandwidth-up(999998), bandwidth-dw(999996), bandwidth-bi(1999994) sla_map=0x1 Seq(5 H1_T22): state(alive), packet-loss(0.000%) latency(0.191), jitter(0.007), mos(4.404), bandwidth-up(999998), bandwidth-dw(999997), bandwidth-bi(1999995) sla_map=0x1
-
-
Configure SD-WAN and BGP on the Hub (DC1_A_FGT):
-
Configure SD-WAN:
config system sdwan
    set status enable
    config zone
        edit "overlay"
        next
    end
    config members
        edit 1
            set interface "EDGE_T1"
            set zone "overlay"
        next
        edit 2
            set interface "EDGE_T2"
            set zone "overlay"
        next
    end
    config health-check
        edit "Remote_HC"
            set detect-mode remote
            set sla-id-redistribute 1
            set members 1 2
            config sla
                edit 1
                    set link-cost-factor remote
                    set priority-in-sla 100
                    set priority-out-sla 200
                next
            end
        next
    end
end
-
Configure BGP:
config router bgp set as 65001 set router-id 172.31.0.1 set recursive-inherit-priority enable ...... config neighbor-group edit "EDGE" set remote-as 65001 set update-source "Loopback0" set route-reflector-client enable next end config neighbor-range edit 1 set prefix 172.31.0.64 255.255.255.192 set neighbor-group "EDGE" next end ...... end
-
View the health-check settings:
In the following output:
- rmt_ver=0 indicates that only SLA information has been received.
- rmt_ver=1 indicates that SLA information and SLA status have been received.
- rmt_sla=out indicates that the received SLA status is out of SLA.
- rmt_sla=in indicates that the received SLA status is in SLA.
- EDGE_T1_0 is to H1_T11 on Spoke-1, and EDGE_T1_1 is to H1_T11 on Spoke-2.
- EDGE_T2_1 is to H1_T22 on Spoke-1, and EDGE_T2_0 is to H1_T22 on Spoke-2.
# diagnose sys sdwan health-check remote Remote Health Check: Remote_HC(1) Passive remote statistics of EDGE_T1(46): EDGE_T1_0(10.0.0.20): timestamp=06-11 14:50:33, src=172.31.0.65, latency=0.261, jitter=0.043, pktloss=0.000%, mos=4.404, SLA id=1(pass), rmt_ver=1, rmt_sla=in, rmt_prio=0 EDGE_T1_1(172.31.0.66): timestamp=06-11 14:50:33, src=172.31.0.66, latency=0.285, jitter=0.038, pktloss=0.000%, mos=4.404, SLA id=1(pass), rmt_ver=1, rmt_sla=in, rmt_prio=0 Remote Health Check: Remote_HC(2) Passive remote statistics of EDGE_T2(47): EDGE_T2_0(10.0.0.15): timestamp=06-11 14:50:33, src=172.31.0.66, latency=0.195, jitter=0.008, pktloss=0.000%, mos=4.404, SLA id=1(pass), rmt_ver=1, rmt_sla=in, rmt_prio=0 EDGE_T2_1(172.31.0.65): timestamp=06-11 14:50:33, src=172.31.0.65, latency=0.202, jitter=0.009, pktloss=0.000%, mos=4.404, SLA id=1(pass), rmt_ver=1, rmt_sla=in, rmt_prio=0
-
-
-
After each spoke's SLA status on its overlays is embedded in ICMP probes and transported to the hub, view the routing tables on the hub.
Based on the received in-SLA status, the hub sets a predefined priority of 100 on IKE routes over EDGE_T1 and EDGE_T2. Meanwhile, recursively resolved BGP routes inherit the priorities from those IKE routes.
-
On the hub, get the static routing table:
# get router info routing-table static Routing table for VRF=0 S 172.31.0.65/32 [15/0] via EDGE_T2 tunnel 172.31.0.65, [100/0] [15/0] via EDGE_T1 tunnel 10.0.0.20, [100/0] S 172.31.0.66/32 [15/0] via EDGE_T1 tunnel 172.31.0.66, [100/0] [15/0] via EDGE_T2 tunnel 10.0.0.15, [100/0]
-
On the hub, get the BGP routing table:
# get router info routing-table bgp Routing table for VRF=0 B 10.0.3.0/24 [200/0] via 172.31.0.65 (recursive via EDGE_T2 tunnel 172.31.0.65 [100]), 22:15:50 (recursive via EDGE_T1 tunnel 10.0.0.20 [100]), 22:15:50, [1/0] B 10.0.4.0/24 [200/0] via 172.31.0.66 (recursive via EDGE_T1 tunnel 172.31.0.66 [100]), 00:01:50 (recursive via EDGE_T2 tunnel 10.0.0.15 [100]), 00:01:50, [1/0]
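The way BGP routes inherit priorities from the IKE routes that resolve them can be modeled with a short Python sketch. The data structures and the `resolve` function are hypothetical; the addresses, overlay names, and priorities are taken from the routing tables above.

```python
# IKE routes on the hub: spoke loopback -> {overlay: priority}.
ike_routes = {
    "172.31.0.65": {"EDGE_T2": 100, "EDGE_T1": 100},
    "172.31.0.66": {"EDGE_T1": 100, "EDGE_T2": 100},
}
# BGP routes: prefix -> BGP next hop (the advertising spoke's loopback).
bgp_routes = {"10.0.3.0/24": "172.31.0.65", "10.0.4.0/24": "172.31.0.66"}


def resolve(prefix):
    """Resolve a BGP prefix to (overlay, inherited priority) pairs, best first."""
    paths = ike_routes[bgp_routes[prefix]]
    return sorted(paths.items(), key=lambda item: item[1])


print(resolve("10.0.3.0/24"))  # both overlays at priority 100

# Hypothetical change: the hub learns EDGE_T1 toward 172.31.0.65 is out of SLA.
ike_routes["172.31.0.65"]["EDGE_T1"] = 200
print(resolve("10.0.3.0/24"))  # EDGE_T2 (100) now sorts ahead of EDGE_T1 (200)
```

Because the BGP route carries no priority of its own in this model, changing the IKE route priority is enough to change which overlay the prefix prefers.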
-
-
Change the latency on Spoke-1 and Spoke-2, and view the results.
-
On Spoke-1, increase H1_T11's latency to 60 to make it out of SLA.
-
On Spoke-2, increase H1_T11's latency to 80 to make it out of SLA.
-
On Spoke-1, run a health check:
Branch1_A_FGT (root) (Interim)# diagnose sys sdwan health-check Health Check(HUB): Seq(4 H1_T11): state(alive), packet-loss(0.000%) latency(60.229), jitter(0.021), mos(4.373), bandwidth-up(999999), bandwidth-dw(999998), bandwidth-bi(1999997) sla_map=0x0 Seq(5 H1_T22): state(alive), packet-loss(0.000%) latency(0.210), jitter(0.012), mos(4.404), bandwidth-up(999998), bandwidth-dw(999997), bandwidth-bi(1999995) sla_map=0x1
-
On Spoke-2, run a health check:
Branch2_FGT (root) (Interim)# diagnose sys sdwan health-check Health Check(HUB): Seq(4 H1_T11): state(alive), packet-loss(0.000%) latency(80.227), jitter(0.024), mos(4.361), bandwidth-up(999998), bandwidth-dw(999998), bandwidth-bi(1999996) sla_map=0x0 Seq(5 H1_T22): state(alive), packet-loss(0.000%) latency(0.202), jitter(0.012), mos(4.404), bandwidth-up(999998), bandwidth-dw(999997), bandwidth-bi(1999995) sla_map=0x1
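The sla_map field in these outputs changes from 0x1 to 0x0 when a member falls out of SLA. A minimal Python sketch for reading it follows, assuming that sla_map is a bitmap in which bit (ID - 1) is set when the member meets the SLA with that ID. The function name is hypothetical, and the upper bound of 32 is taken from the SLA ID range listed earlier.

```python
def sla_ids_met(sla_map: int, max_sla_id: int = 32):
    """Return the SLA IDs whose bit is set, assuming bit (id - 1) maps to SLA id."""
    return [sla_id for sla_id in range(1, max_sla_id + 1)
            if sla_map & (1 << (sla_id - 1))]


print(sla_ids_met(0x1))  # [1]  H1_T22 still meets SLA 1
print(sla_ids_met(0x0))  # []   H1_T11 meets no SLA
```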
-
-
After the hub receives the updated SLA status, run a health check on the hub and view the routing tables.
The hub has updated the route priorities based on the predefined priority settings (set priority-in-sla 100 and set priority-out-sla 200).
-
Run a health check:
# diagnose sys sdwan health-check remote Remote Health Check: Remote_HC(1) Passive remote statistics of EDGE_T1(46): EDGE_T1_0(10.0.0.20): timestamp=06-11 15:35:29, src=172.31.0.65, latency=60.244, jitter=0.017, pktloss=0.000%, mos=4.373, SLA id=1(remote), rmt_ver=1, rmt_sla=out, rmt_prio=0 EDGE_T1_1(172.31.0.66): timestamp=06-11 15:35:29, src=172.31.0.66, latency=80.234, jitter=0.036, pktloss=0.000%, mos=4.361, SLA id=1(remote), rmt_ver=1, rmt_sla=out, rmt_prio=0 Remote Health Check: Remote_HC(2) Passive remote statistics of EDGE_T2(47): EDGE_T2_0(10.0.0.15): timestamp=06-11 15:35:29, src=172.31.0.66, latency=0.201, jitter=0.008, pktloss=0.000%, mos=4.404, SLA id=1(pass), rmt_ver=1, rmt_sla=in, rmt_prio=0 EDGE_T2_1(172.31.0.65): timestamp=06-11 15:35:29, src=172.31.0.65, latency=0.215, jitter=0.010, pktloss=0.000%, mos=4.404, SLA id=1(pass), rmt_ver=1, rmt_sla=in, rmt_prio=0
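When checking many tunnels, the per-tunnel entries in this output can be summarized with a script. The following Python sketch is a convenience example, not a Fortinet tool: the regular expression and function name are assumptions based only on the line format shown above, and the sample text is abbreviated from that output.

```python
import re

# Matches one child-tunnel entry, e.g. "EDGE_T1_0(10.0.0.20): timestamp=..., src=...".
# Header lines such as "EDGE_T1(46):" are skipped because 46 is not a dotted address.
ENTRY_RE = re.compile(
    r"(?P<tunnel>\w+)\((?P<tun_id>\d+(?:\.\d+){3})\):\s+timestamp=[^,]*,"
    r"\s+src=(?P<src>[\d.]+),.*?rmt_sla=(?P<rmt_sla>in|out)"
)


def parse_remote_health(text):
    """Return one dict per child tunnel with its source and received SLA status."""
    return [m.groupdict() for m in ENTRY_RE.finditer(text)]


sample = (
    "Passive remote statistics of EDGE_T1(46):\n"
    "EDGE_T1_0(10.0.0.20): timestamp=06-11 15:35:29, src=172.31.0.65, "
    "latency=60.244, SLA id=1(remote), rmt_ver=1, rmt_sla=out, rmt_prio=0\n"
    "EDGE_T2_1(172.31.0.65): timestamp=06-11 15:35:29, src=172.31.0.65, "
    "latency=0.215, SLA id=1(pass), rmt_ver=1, rmt_sla=in, rmt_prio=0\n"
)

for entry in parse_remote_health(sample):
    print(entry["tunnel"], entry["src"], entry["rmt_sla"])
```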
-
View the static routing table:
For EDGE_T1, the priority changed from 100 to 200 because it is out of SLA.
DC1_A_FGT (root) (Interim)# get router info routing-table static Routing table for VRF=0 S 172.31.0.65/32 [15/0] via EDGE_T2 tunnel 172.31.0.65, [100/0] [15/0] via EDGE_T1 tunnel 10.0.0.20, [200/0] S 172.31.0.66/32 [15/0] via EDGE_T2 tunnel 10.0.0.15, [100/0] [15/0] via EDGE_T1 tunnel 172.31.0.66, [200/0]
-
View the BGP routing table:
-
For 10.0.3.0/24, EDGE_T2 is preferred. Priority for EDGE_T1 changed from 100 to 200.
-
For 10.0.4.0/24, EDGE_T2 is preferred. Priority for EDGE_T1 changed from 100 to 200.
DC1_A_FGT (root) (Interim)# get router info routing-table bgp Routing table for VRF=0 B 10.0.3.0/24 [200/0] via 172.31.0.65 (recursive via EDGE_T2 tunnel 172.31.0.65 [100]), 22:31:22 (recursive via EDGE_T1 tunnel 10.0.0.20 [200]), 22:31:22, [1/0] B 10.0.4.0/24 [200/0] via 172.31.0.66 (recursive via EDGE_T2 tunnel 10.0.0.15 [100]), 00:08:22 (recursive via EDGE_T1 tunnel 172.31.0.66 [200]), 00:08:22, [1/0]
-
-
Speed test rerouting based on embedded SLA status
When spoke-initiated speed tests are enabled for this configuration, the hub uses the received out-of-SLA status to choose other routes during the speed test.
To configure speed tests:
-
On the hub, enable speed tests and allow them on the underlays and overlays.
-
Enable speed tests:
config system global set speedtest-server enable set speedtestd-ctrl-port 6000 set speedtestd-server-port 7000 end
-
Allow speed tests on the underlay:
config system interface edit "port1" ... set allowaccess ping speed-test ... next end
-
Allow speed tests on the underlay:
config system interface edit "port2" ... set allowaccess ping speed-test ... next end
-
Allow speed tests on the overlay, and specify a shaping profile:
config system interface edit "EDGE_T1" ... set allowaccess ping speed-test set type tunnel set egress-shaping-profile "profile_1" ... set interface "port1" next end
-
Allow speed tests on the overlay, and specify a shaping profile:
config system interface edit "EDGE_T2" ... set allowaccess ping speed-test set type tunnel set egress-shaping-profile "profile_1" ... set interface "port2" next end
-
View the shaping profile:
config firewall shaping-profile edit "profile_1" set default-class-id 2 config shaping-entries edit 1 set class-id 2 set priority low set guaranteed-bandwidth-percentage 10 set maximum-bandwidth-percentage 10 next edit 2 set class-id 3 set priority medium set guaranteed-bandwidth-percentage 30 set maximum-bandwidth-percentage 40 next edit 3 set class-id 4 set guaranteed-bandwidth-percentage 20 set maximum-bandwidth-percentage 50 next end next end
-
-
On Spoke-1, schedule speed tests:
config system speed-test-schedule edit "H1_T11" set mode TCP set schedules "speed-test" set dynamic-server enable set ctrl-port 6000 set server-port 7000 set update-shaper remote next edit "H1_T22" set mode UDP set schedules "speed-test" set dynamic-server enable set ctrl-port 6000 set server-port 7000 set update-shaper remote next end
Before the speed test starts on Spoke-1, route priorities are based on the in-SLA status received for both H1_T11 and H1_T22:
DC1_A_FGT (root) (Interim)# get router info routing-table bgp Routing table for VRF=0 B 10.0.3.0/24 [200/0] via 172.31.0.65 (recursive via EDGE_T2 tunnel 172.31.0.65 [100]), 00:24:14 (recursive via EDGE_T1 tunnel 10.0.0.20 [100]), 00:24:14, [1/0]
While the speed test is running on H1_T11 of Spoke-1, Spoke-1 constantly embeds NOK status (out-of-SLA status) into probes on H1_T11 and sends them to the hub. The hub then updates route priorities accordingly and detours hub-to-spoke traffic to H1_T22 so that it does not affect the speed test on H1_T11. EDGE_T2 is preferred, and the EDGE_T1 priority changed from 100 to 200.
DC1_A_FGT (root) (Interim)# get router info routing-table bgp Routing table for VRF=0 B 10.0.3.0/24 [200/0] via 172.31.0.65 (recursive via EDGE_T2 tunnel 172.31.0.65 [100]), 00:04:42 (recursive via EDGE_T1 tunnel 10.0.0.20 [200]), 00:04:42, [1/0]
While the speed test is running on H1_T22 of Spoke-1, Spoke-1 constantly embeds NOK status (out-of-SLA status) into probes on H1_T22 and sends them to the hub. The hub then updates route priorities accordingly and detours hub-to-spoke traffic to H1_T11 so that it does not affect the speed test on H1_T22. EDGE_T1 is preferred, and the EDGE_T2 priority changed from 100 to 200.
DC1_A_FGT (root) (Interim)# get router info routing-table bgp Routing table for VRF=0 B 10.0.3.0/24 [200/0] via 172.31.0.65 (recursive via EDGE_T1 tunnel 10.0.0.20 [100]), 00:00:05 (recursive via EDGE_T2 tunnel 172.31.0.65 [200]), 00:00:05, [1/0]
Once the speed test completes, the results are applied to the child tunnels on the hub through the egress shaping profile.
DC1_A_FGT (root) (Interim)# diagnose vpn tunnel list list all ipsec tunnel in vd 0 ------------------------------------------------------ name=EDGE_T1_0 ver=2 serial=22 172.31.1.1:0->172.31.3.1:0 nexthop=172.31.1.2 tun_id=10.0.0.20 tun_id6=::10.0.0.31 status=up dst_mtu=1500 weight=1 ....... dec: spi=9932cb0d esp=aes-gcm key=36 466db2b7ef257f0cf4b32ce79ef009485cbaececf0bad273b27c6a0a03d736557dfa15db ah=null key=0 enc: spi=dea67332 esp=aes-gcm key=36 4f10635db3f4b52f156d98291e0b9af21529a12233cb77c8f94d5a58027d26efcfe7d1ac ah=null key=0 dec:pkts/bytes=0/0, enc:pkts/bytes=1762/235516 npu_flag=03 npu_rgwy=172.31.3.1 npu_lgwy=172.31.1.1 npu_selid=26 dec_npuid=1 enc_npuid=1 npu_isaidx=626 npu_osaidx=39 egress traffic control: bandwidth=673982(kbps) lock_hit=0 default_class=2 n_active_class=3 class-id=2 allocated-bandwidth=67398(kbps) guaranteed-bandwidth=67398(kbps) max-bandwidth=67398(kbps) current-bandwidth=1(kbps) priority=low forwarded_bytes=173K dropped_packets=0 dropped_bytes=0 class-id=3 allocated-bandwidth=269592(kbps) guaranteed-bandwidth=202194(kbps) max-bandwidth=269592(kbps) current-bandwidth=0(kbps) priority=medium forwarded_bytes=0 dropped_packets=0 dropped_bytes=0 class-id=4 allocated-bandwidth=336990(kbps) guaranteed-bandwidth=134796(kbps) max-bandwidth=336990(kbps) current-bandwidth=0(kbps) priority=high forwarded_bytes=0 dropped_packets=0 dropped_bytes=0 ------------------------------------------------------ name=EDGE_T2_1 ver=2 serial=1c 172.31.1.5:0->172.31.3.5:0 nexthop=172.31.1.6 tun_id=172.31.0.65 tun_id6=::10.0.0.25 status=up dst_mtu=1500 weight=1 ....... 
dec: spi=9932cb0e esp=aes-gcm key=36 551bdf438cb62ba0bf2df131d5e7da4697dfbec5d4f1d60e876f049ef9ec29bae2deb3b1 ah=null key=0 enc: spi=dea67333 esp=aes-gcm key=36 1d94ae5e40f32d48f03555d1d76008b72f3662e6619c4fc16fc730b9bae8c7546b595acc ah=null key=0 dec:pkts/bytes=0/0, enc:pkts/bytes=1560/215488 npu_flag=03 npu_rgwy=172.31.3.5 npu_lgwy=172.31.1.5 npu_selid=27 dec_npuid=1 enc_npuid=1 npu_isaidx=627 npu_osaidx=40 egress traffic control: bandwidth=389154(kbps) lock_hit=0 default_class=2 n_active_class=3 class-id=2 allocated-bandwidth=38915(kbps) guaranteed-bandwidth=38915(kbps) max-bandwidth=38915(kbps) current-bandwidth=1(kbps) priority=low forwarded_bytes=166K dropped_packets=0 dropped_bytes=0 class-id=3 allocated-bandwidth=155661(kbps) guaranteed-bandwidth=116746(kbps) max-bandwidth=155661(kbps) current-bandwidth=0(kbps) priority=medium forwarded_bytes=0 dropped_packets=0 dropped_bytes=0 class-id=4 allocated-bandwidth=194576(kbps) guaranteed-bandwidth=77830(kbps) max-bandwidth=194576(kbps) current-bandwidth=0(kbps) priority=high forwarded_bytes=0 dropped_packets=0 dropped_bytes=0
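The per-class bandwidths in the tunnel list can be cross-checked against the percentages in shaping profile profile_1. The following Python sketch does this for EDGE_T1_0. The integer-division formula is an assumption, since the exact rounding used by the device is not stated here, so the check only requires agreement to within 1 kbps.

```python
measured_kbps = 673982  # "egress traffic control: bandwidth" on EDGE_T1_0

# class-id -> (guaranteed %, maximum %) from shaping profile "profile_1".
profile = {2: (10, 10), 3: (30, 40), 4: (20, 50)}

# class-id -> (guaranteed, maximum) in kbps as reported in the tunnel list.
reported = {2: (67398, 67398), 3: (202194, 269592), 4: (134796, 336990)}


def expected_kbps(total_kbps, percent):
    """Share of the measured bandwidth for a given percentage (assumed rounding)."""
    return total_kbps * percent // 100


for class_id, (g_pct, m_pct) in profile.items():
    guaranteed = expected_kbps(measured_kbps, g_pct)
    maximum = expected_kbps(measured_kbps, m_pct)
    rep_g, rep_m = reported[class_id]
    close = abs(guaranteed - rep_g) <= 1 and abs(maximum - rep_m) <= 1
    print(class_id, guaranteed, maximum, "matches within 1 kbps" if close else "differs")
```

The same check can be repeated for EDGE_T2_1 by substituting its measured bandwidth of 389154 kbps and its reported per-class values.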