Keeping sessions in established ADVPN shortcuts while they remain in SLA
In an SD-WAN hub-and-spoke configuration that uses ADVPN, traffic switches to the backup shortcut when the primary shortcut goes out of SLA. After the idle timeout, sessions prefer the primary parent tunnel and try to establish a new primary shortcut. However, because that shortcut is still out of SLA, traffic switches back to the backup shortcut, which causes unnecessary traffic interruption.
The sla-stickiness option keeps existing sessions on the established ADVPN shortcuts while they remain in SLA, instead of switching to a new link at every idle timeout. New sessions are routed through the primary shortcut if it is in SLA.
    config system sdwan
        config service
            edit <id>
                set mode sla
                set sla-stickiness {enable | disable}
            next
        end
    end
The sla-stickiness option can be applied in the following use cases.
Use case 1:
- The sessions will switch over to the backup shortcut due to the primary shortcut being out of SLA.
- After an idle timeout, the primary shortcut is torn down, and the routes will be reinstalled on the primary parent tunnel.
- When sla-stickiness is enabled, even though the primary parent tunnel is preferred, established ADVPN sessions will remain on the backup shortcut (stickiness) instead of switching to the primary parent tunnel.
- New sessions will be routed to the primary parent tunnel and trigger the primary shortcut, then traffic switches to the primary shortcut if it is in SLA.
Use case 2:
- The sessions will switch over to the backup shortcut due to the primary shortcut being out of SLA.
- After some time, the primary shortcut returns to being in SLA.
- When sla-stickiness is enabled, even though the primary shortcut is preferred, established ADVPN sessions will remain on the backup shortcut (stickiness) instead of switching to the primary shortcut.
- New sessions will be routed through the primary shortcut.
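Both use cases reduce to one decision rule, which the following Python sketch models. It is a simplified illustration and not FortiOS code: the `Member` class, the `select_member` function, the member ordering shown for use case 2, and the fallback used when no member is in SLA are all assumptions made for this example. Members are listed in the order the SD-WAN rule prefers them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Member:
    """Hypothetical model of an SD-WAN rule member (not a FortiOS structure)."""
    name: str          # for example "spoke12-p1_0"
    in_sla: bool       # True when the member currently meets the SLA target
    is_shortcut: bool  # True for an ADVPN shortcut, False for a parent tunnel


def select_member(members: List[Member],
                  current: Optional[str] = None,
                  sla_stickiness: bool = False) -> Optional[str]:
    """Return the member a session should use.

    members: rule members, most preferred first.
    current: member used by an established session, or None for a new session.
    """
    by_name = {m.name: m for m in members}
    cur = by_name.get(current) if current is not None else None
    # Stickiness: an established session stays on its shortcut while the
    # shortcut remains in SLA, even if a more preferred member exists.
    if sla_stickiness and cur is not None and cur.is_shortcut and cur.in_sla:
        return cur.name
    # Otherwise use the most preferred member that is in SLA.
    for member in members:
        if member.in_sla:
            return member.name
    # Assumption for this sketch: with nothing in SLA, use the first member.
    return members[0].name if members else None


# Use case 1: the primary shortcut was torn down after the idle timeout.
USE_CASE_1 = [
    Member("spoke11-p1", in_sla=True, is_shortcut=False),
    Member("spoke12-p1_0", in_sla=True, is_shortcut=True),
    Member("spoke12-p1", in_sla=True, is_shortcut=False),
]

# Use case 2: the primary shortcut is back in SLA (ordering is assumed).
USE_CASE_2 = [
    Member("spoke11-p1_0", in_sla=True, is_shortcut=True),
    Member("spoke11-p1", in_sla=True, is_shortcut=False),
    Member("spoke12-p1_0", in_sla=True, is_shortcut=True),
    Member("spoke12-p1", in_sla=True, is_shortcut=False),
]

if __name__ == "__main__":
    # Established session with stickiness stays on the backup shortcut.
    print(select_member(USE_CASE_1, "spoke12-p1_0", sla_stickiness=True))   # spoke12-p1_0
    # Without stickiness the same session moves to the primary parent tunnel.
    print(select_member(USE_CASE_1, "spoke12-p1_0", sla_stickiness=False))  # spoke11-p1
    # A new session in use case 2 takes the primary shortcut.
    print(select_member(USE_CASE_2, sla_stickiness=True))                   # spoke11-p1_0
```

In this model, stickiness applies only to established sessions that are already on a shortcut. New sessions always follow the normal preference order, which is why they go to the primary parent tunnel in use case 1 and to the primary shortcut in use case 2.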
Example configuration
The following example demonstrates using the sla-stickiness option in use case 1.
After an idle timeout occurs, existing sessions remain on the spoke12-p1_0 backup shortcut tunnel. New sessions try to create a shortcut over spoke11-p1, but fall back to spoke12-p1_0 when the new shortcut is detected to be out of SLA.
To configure shortcut stickiness for ADVPN shortcuts:
- Configure SD-WAN on the Spoke_1 FortiGate:
    config system sdwan
        set status enable
        config zone
            edit "virtual-wan-link"
            next
        end
        config members
            edit 1
                set interface "spoke11-p1"
            next
            edit 2
                set interface "spoke12-p1"
            next
        end
        config health-check
            edit "1"
                set server "9.0.0.1"
                set members 1 2
                config sla
                    edit 1
                    next
                end
            next
        end
        config service
            edit 1
                set name "1"
                set mode sla
                set sla-stickiness enable
                set dst "all"
                set src "10.1.100.0"
                config sla
                    edit "1"
                        set id 1
                    next
                end
                set priority-members 1 2
            next
        end
    end
- Verify the SD-WAN configuration.
  - Verify the health check status:
      # diagnose sys sdwan health-check
      Health Check(1):
      Seq(1 spoke11-p1): state(alive), packet-loss(0.000%) latency(0.368), jitter(0.051), mos(4.404), bandwidth-up(999999), bandwidth-dw(1000000), bandwidth-bi(1999999) sla_map=0x1
      Seq(2 spoke12-p1): state(alive), packet-loss(0.000%) latency(0.211), jitter(0.019), mos(4.404), bandwidth-up(999999), bandwidth-dw(999979), bandwidth-bi(1999978) sla_map=0x1
  - Verify the service status:
      # diagnose sys sdwan service4
      Service(1): Address Mode(IPV4) flags=0x2200 use-shortcut-sla sla-stickiness
       Tie break: cfg
        Gen(1), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(sla), sla-compare-order
        Members(2):
          1: Seq_num(1 spoke11-p1), alive, sla(0x1), gid(0), cfg_order(0), local cost(0), selected
          2: Seq_num(2 spoke12-p1), alive, sla(0x1), gid(0), cfg_order(1), local cost(0), selected
        Src address(1):
              10.1.100.0-10.1.100.255
        Dst address(1):
              0.0.0.0-255.255.255.255
    The SD-WAN service rule prefers the primary parent tunnel (spoke11-p1) over the backup parent tunnel (spoke12-p1) before shortcuts are established.
- Send traffic from PC-1 to PC-2 to trigger the primary shortcut. Verify the diagnostics.
  - Run a sniffer trace:
      # diagnose sniffer packet any 'host 192.168.5.44' 4
      interfaces=[any]
      filters=[host 192.168.5.44]
      14.878761 port2 in 10.1.100.22 -> 192.168.5.44: icmp: echo request
      14.878905 spoke11-p1 out 10.1.100.22 -> 192.168.5.44: icmp: echo request
      14.879842 spoke11-p1 in 192.168.5.44 -> 10.1.100.22: icmp: echo reply
      14.880082 port2 out 192.168.5.44 -> 10.1.100.22: icmp: echo reply
      15.879761 port2 in 10.1.100.22 -> 192.168.5.44: icmp: echo request
      15.879882 spoke11-p1_0 out 10.1.100.22 -> 192.168.5.44: icmp: echo request
      15.880433 spoke11-p1_0 in 192.168.5.44 -> 10.1.100.22: icmp: echo reply
      15.880496 port2 out 192.168.5.44 -> 10.1.100.22: icmp: echo reply
    The SD-WAN service rule sends traffic to the parent tunnel (spoke11-p1) initially, and then switches to the primary shortcut tunnel (spoke11-p1_0) once it is established.
  - Verify the service status:
      # diagnose sys sdwan service4
      Service(1): Address Mode(IPV4) flags=0x2200 use-shortcut-sla sla-stickiness
       Tie break: cfg
        Gen(2), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(sla), sla-compare-order
        Member sub interface(3):
          2: seq_num(1), interface(spoke11-p1):
             1: spoke11-p1_0(57)
        Members(3):
          1: Seq_num(1 spoke11-p1_0), alive, sla(0x1), gid(0), cfg_order(0), local cost(0), selected
          2: Seq_num(1 spoke11-p1), alive, sla(0x1), gid(0), cfg_order(0), local cost(0), selected
          3: Seq_num(2 spoke12-p1), alive, sla(0x1), gid(0), cfg_order(1), local cost(0), selected
        Src address(1):
              10.1.100.0-10.1.100.255
        Dst address(1):
              0.0.0.0-255.255.255.255
    The SD-WAN service rule prefers the primary shortcut tunnel (spoke11-p1_0) over other tunnels.
- Force the primary shortcut out of SLA. The traffic switches to the backup parent tunnel and triggers the backup shortcut. Verify the diagnostics.
  - Run a sniffer trace:
      # diagnose sniffer packet any 'host 192.168.5.44' 4
      interfaces=[any]
      filters=[host 192.168.5.44]
      20.588046 port2 in 10.1.100.22 -> 192.168.5.44: icmp: echo request
      20.588157 spoke12-p1 out 10.1.100.22 -> 192.168.5.44: icmp: echo request
      20.588791 spoke12-p1 in 192.168.5.44 -> 10.1.100.22: icmp: echo reply
      20.588876 port2 out 192.168.5.44 -> 10.1.100.22: icmp: echo reply
      21.589079 port2 in 10.1.100.22 -> 192.168.5.44: icmp: echo request
      21.589190 spoke12-p1_0 out 10.1.100.22 -> 192.168.5.44: icmp: echo request
      21.589661 spoke12-p1_0 in 192.168.5.44 -> 10.1.100.22: icmp: echo reply
      21.589733 port2 out 192.168.5.44 -> 10.1.100.22: icmp: echo reply
    When the primary shortcut tunnel goes out of SLA (spoke11-p1_0, alive, sla(0x0)), traffic reroutes to the backup parent tunnel (spoke12-p1) and then to the backup shortcut tunnel (spoke12-p1_0) once it is established.
  - Verify the service status:
      # diagnose sys sdwan service4
      Service(1): Address Mode(IPV4) flags=0x2200 use-shortcut-sla sla-stickiness
       Tie break: cfg
        Gen(23), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(sla), sla-compare-order
        Member sub interface(4):
          1: seq_num(1), interface(spoke11-p1):
             1: spoke11-p1_0(62)
          3: seq_num(2), interface(spoke12-p1):
             1: spoke12-p1_0(63)
        Members(4):
          1: Seq_num(1 spoke11-p1), alive, sla(0x1), gid(0), cfg_order(0), local cost(0), selected
          2: Seq_num(2 spoke12-p1_0), alive, sla(0x1), gid(0), cfg_order(1), local cost(0), selected
          3: Seq_num(2 spoke12-p1), alive, sla(0x1), gid(0), cfg_order(1), local cost(0), selected
          4: Seq_num(1 spoke11-p1_0), alive, sla(0x0), gid(0), cfg_order(0), local cost(0), selected
        Src address(1):
              10.1.100.0-10.1.100.255
        Dst address(1):
              0.0.0.0-255.255.255.255
    The backup shortcut tunnel (spoke12-p1_0) is now preferred.
- After an idle timeout, the primary shortcut is torn down. The primary parent tunnel is now preferred, but traffic is still kept on the backup shortcut due to sla-stickiness being enabled. Verify the diagnostics.
  - Verify the service status:
      # diagnose sys sdwan service4
      Service(1): Address Mode(IPV4) flags=0x2200 use-shortcut-sla sla-stickiness
       Tie break: cfg
        Gen(24), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(sla), sla-compare-order
        Member sub interface(3):
          3: seq_num(2), interface(spoke12-p1):
             1: spoke12-p1_0(63)
        Members(3):
          1: Seq_num(1 spoke11-p1), alive, sla(0x1), gid(0), cfg_order(0), local cost(0), selected
          2: Seq_num(2 spoke12-p1_0), alive, sla(0x1), gid(0), cfg_order(1), local cost(0), selected
          3: Seq_num(2 spoke12-p1), alive, sla(0x1), gid(0), cfg_order(1), local cost(0), selected
        Src address(1):
              10.1.100.0-10.1.100.255
        Dst address(1):
              0.0.0.0-255.255.255.255
  - Run a sniffer trace:
      # diagnose sniffer packet any 'host 192.168.5.44' 4
      interfaces=[any]
      filters=[host 192.168.5.44]
      1.065143 port2 in 10.1.100.22 -> 192.168.5.44: icmp: echo request
      1.065218 spoke12-p1_0 out 10.1.100.22 -> 192.168.5.44: icmp: echo request
      1.065471 spoke12-p1_0 in 192.168.5.44 -> 10.1.100.22: icmp: echo reply
      1.065508 port2 out 192.168.5.44 -> 10.1.100.22: icmp: echo reply
      2.066155 port2 in 10.1.100.22 -> 192.168.5.44: icmp: echo request
      2.066198 spoke12-p1_0 out 10.1.100.22 -> 192.168.5.44: icmp: echo request
      2.066442 spoke12-p1_0 in 192.168.5.44 -> 10.1.100.22: icmp: echo reply
      2.066480 port2 out 192.168.5.44 -> 10.1.100.22: icmp: echo reply
      3.067201 port2 in 10.1.100.22 -> 192.168.5.44: icmp: echo request
      3.067255 spoke12-p1_0 out 10.1.100.22 -> 192.168.5.44: icmp: echo request
      3.067507 spoke12-p1_0 in 192.168.5.44 -> 10.1.100.22: icmp: echo reply
      3.067544 port2 out 192.168.5.44 -> 10.1.100.22: icmp: echo reply
- Send new traffic from PC-1 to PC-2. The traffic is routed to the primary parent tunnel and triggers the primary shortcut, then switches to the primary shortcut if it is in SLA. Verify the connection.
  - Run a sniffer trace:
      # diagnose sniffer packet any 'host 192.168.5.4' 4
      interfaces=[any]
      filters=[host 192.168.5.4]
      17.120310 port2 in 10.1.100.22 -> 192.168.5.4: icmp: echo request
      17.120475 spoke11-p1 out 10.1.100.22 -> 192.168.5.4: icmp: echo request
      17.121096 spoke11-p1 in 192.168.5.4 -> 10.1.100.22: icmp: echo reply
      17.121151 port2 out 192.168.5.4 -> 10.1.100.22: icmp: echo reply
      18.121331 port2 in 10.1.100.22 -> 192.168.5.4: icmp: echo request
      18.121480 spoke11-p1_0 out 10.1.100.22 -> 192.168.5.4: icmp: echo request
      18.121954 spoke11-p1_0 in 192.168.5.4 -> 10.1.100.22: icmp: echo reply
      18.122007 port2 out 192.168.5.4 -> 10.1.100.22: icmp: echo reply
      ...
    At first, traffic goes to the primary parent tunnel, which triggers the primary shortcut to establish. The primary shortcut (spoke11-p1_0) is in SLA and new traffic flows through it.
      ...
      14.194066 port2 in 10.1.100.22 -> 192.168.5.4: icmp: echo request
      14.194247 spoke12-p1_0 out 10.1.100.22 -> 192.168.5.4: icmp: echo request
      14.194499 spoke12-p1_0 in 192.168.5.4 -> 10.1.100.22: icmp: echo reply
      14.194565 port2 out 192.168.5.4 -> 10.1.100.22: icmp: echo reply
      15.195093 port2 in 10.1.100.22 -> 192.168.5.4: icmp: echo request
      15.195174 spoke12-p1_0 out 10.1.100.22 -> 192.168.5.4: icmp: echo request
      15.195326 spoke12-p1_0 in 192.168.5.4 -> 10.1.100.22: icmp: echo reply
      15.195361 port2 out 192.168.5.4 -> 10.1.100.22: icmp: echo reply
    After the primary shortcut goes out of SLA, the traffic switches to the backup shortcut (spoke12-p1_0).
  - Verify the service status:
      # diagnose sys sdwan service4
      Service(1): Address Mode(IPV4) flags=0x2200 use-shortcut-sla sla-stickiness
       Tie break: cfg
        Gen(36), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(sla), sla-compare-order
        Member sub interface(4):
          1: seq_num(1), interface(spoke11-p1):
             1: spoke11-p1_0(67)
          3: seq_num(2), interface(spoke12-p1):
             1: spoke12-p1_0(66)
        Members(4):
          1: Seq_num(1 spoke11-p1), alive, sla(0x1), gid(0), cfg_order(0), local cost(0), selected
          2: Seq_num(2 spoke12-p1_0), alive, sla(0x1), gid(0), cfg_order(1), local cost(0), selected
          3: Seq_num(2 spoke12-p1), alive, sla(0x1), gid(0), cfg_order(1), local cost(0), selected
          4: Seq_num(1 spoke11-p1_0), alive, sla(0x0), gid(0), cfg_order(0), local cost(0), selected
        Src address(1):
              10.1.100.0-10.1.100.255
        Dst address(1):
              0.0.0.0-255.255.255.255
    New traffic switches back to the backup shortcut while the primary shortcut is still out of SLA.
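The verification steps above rely on reading two kinds of output by eye: the Members list of the service status and the per-packet lines of the sniffer trace. The following Python sketch shows one way to post-process saved copies of that output. It is not a Fortinet tool: the function names, the regular expressions, and the `lan_port` parameter are hypothetical, the patterns are inferred only from the sample output in this example and may not match other FortiOS versions, and treating any nonzero sla(...) value as "in SLA" is an assumption that fits the single SLA target used here.

```python
import re
from typing import List, NamedTuple


class MemberStatus(NamedTuple):
    position: int    # position in the Members list (preference order)
    seq_num: int     # SD-WAN member sequence number
    interface: str   # parent tunnel or shortcut name
    alive: bool
    in_sla: bool     # assumption: nonzero sla(...) bitmap means in SLA


# Matches lines such as:
#   4: Seq_num(1 spoke11-p1_0), alive, sla(0x0), gid(0), ...
_MEMBER_RE = re.compile(
    r"(\d+):\s+Seq_num\((\d+)\s+([^)\s]+)\),\s+(\w+),\s+sla\((0x[0-9a-fA-F]+)\)"
)

# Matches lines such as:
#   20.588157 spoke12-p1 out 10.1.100.22 -> 192.168.5.44: icmp: echo request
_PACKET_RE = re.compile(r"\d+\.\d+\s+(\S+)\s+(in|out)\s+\S+\s+->\s+\S+:")


def parse_members(service_output: str) -> List[MemberStatus]:
    """Extract the Members entries from saved service4 output."""
    members = []
    for pos, seq, intf, state, sla in _MEMBER_RE.findall(service_output):
        members.append(
            MemberStatus(int(pos), int(seq), intf, state == "alive", int(sla, 16) != 0)
        )
    return members


def egress_path(trace: str, lan_port: str = "port2") -> List[str]:
    """List the tunnels used to send packets out, in order of first use.

    Packets leaving on lan_port (replies to the client) are ignored and
    consecutive repeats of the same tunnel are collapsed.
    """
    path: List[str] = []
    for intf, direction in _PACKET_RE.findall(trace):
        if direction == "out" and intf != lan_port and (not path or path[-1] != intf):
            path.append(intf)
    return path


# Excerpts copied from the example output in this topic.
SERVICE_OUTPUT = """\
  Members(4):
    1: Seq_num(1 spoke11-p1), alive, sla(0x1), gid(0), cfg_order(0), local cost(0), selected
    2: Seq_num(2 spoke12-p1_0), alive, sla(0x1), gid(0), cfg_order(1), local cost(0), selected
    3: Seq_num(2 spoke12-p1), alive, sla(0x1), gid(0), cfg_order(1), local cost(0), selected
    4: Seq_num(1 spoke11-p1_0), alive, sla(0x0), gid(0), cfg_order(0), local cost(0), selected
"""

SNIFFER_TRACE = """\
20.588046 port2 in 10.1.100.22 -> 192.168.5.44: icmp: echo request
20.588157 spoke12-p1 out 10.1.100.22 -> 192.168.5.44: icmp: echo request
20.588791 spoke12-p1 in 192.168.5.44 -> 10.1.100.22: icmp: echo reply
20.588876 port2 out 192.168.5.44 -> 10.1.100.22: icmp: echo reply
21.589079 port2 in 10.1.100.22 -> 192.168.5.44: icmp: echo request
21.589190 spoke12-p1_0 out 10.1.100.22 -> 192.168.5.44: icmp: echo request
21.589661 spoke12-p1_0 in 192.168.5.44 -> 10.1.100.22: icmp: echo reply
21.589733 port2 out 192.168.5.44 -> 10.1.100.22: icmp: echo reply
"""

if __name__ == "__main__":
    out_of_sla = [m.interface for m in parse_members(SERVICE_OUTPUT) if not m.in_sla]
    print(out_of_sla)                  # ['spoke11-p1_0']
    print(egress_path(SNIFFER_TRACE))  # ['spoke12-p1', 'spoke12-p1_0']
```

Applied to the excerpts above, the first result identifies the primary shortcut as the only member out of SLA, and the second shows traffic moving from the backup parent tunnel to the backup shortcut.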