Fortinet white logo
Fortinet white logo
7.0.0

Testing and verification

Testing and verification

  1. Traffic flow during normal operation

  2. Traffic flow when Branch1 ISP1 is impaired

  3. Traffic flow when Branch1 ISP1 and ISP2 are impaired

Traffic flow during normal operation

These steps go through debug commands on the hub and branch FortiGates during normal operation. Simple traffic is generated to observe the steering that is occurring on the hub. The routing and session tables are shown to help understand the traffic flow. Artificial delays are added to show delays in the ISP links, but the delays do no exceed the health check's thresholds.

To test the traffic flow:
  1. On the Hub PC, start a continuous ping to the Branch1 PC at 192.168.136.10:

    hub1:~$ ping 192.168.136.10
    PING 192.168.136.10 (192.168.136.10) 56(84) bytes of data.
    64 bytes from 192.168.136.10: icmp_seq=1 ttl=61 time=74.0 ms
    64 bytes from 192.168.136.10: icmp_seq=2 ttl=61 time=68.7 ms
    ...
  2. On FGT-Branch1, check the status of the health check:

    FGT-Branch1 # diagnose sys sdwan health-check HUB1_HC
    Health Check(HUB1_HC):
    Seq(1 HUB1-VPN1): state(alive), packet-loss(0.000%) latency(66.847), jitter(0.139) sla_map=0x1
    Seq(2 HUB1-VPN2): state(alive), packet-loss(0.000%) latency(73.992), jitter(0.352) sla_map=0x1

    Both HUB1-VPN1 and HUB1-VPN2 are alive and below the threshold.

  3. On FGT-Branch2, check the status of the health check:

    FGT-Branch2 # diagnose sys sdwan health-check HUB1_HC
    Health Check(HUB1_HC):
    Seq(1 HUB1-VPN1): state(alive), packet-loss(0.000%) latency(77.116), jitter(0.206) sla_map=0x1
    Seq(2 HUB1-VPN2): state(alive), packet-loss(0.000%) latency(83.929), jitter(0.275) sla_map=0x1
    

    Both HUB1-VPN1 and HUB1-VPN2 are alive and below threshold.

  4. On FGT-HUB1, check the BGP routes learned from FGT-Branch1:

    FGT-HUB1 # get router info bgp network 192.168.136.0/24
    VRF 0 BGP routing table entry for 192.168.136.0/24
    Paths: (2 available, best #2, table Default-IP-Routing-Table)
      Advertised to non peer-group peers:
       192.168.132.2
      Advertised to peer-groups:
      VPN1 VPN2
      Original VRF 0
      Local, (Received from a RR-client)
        169.254.16.11 from 169.254.16.11 (192.168.136.1)
          Origin IGP metric 0, route tag 1, localpref 100, valid, internal, best
          Community: 65001:1
          Advertised Path ID: 1
           Last update: Thu Jul 22 09:38:32 2021
    
      Original VRF 0
      Local, (Received from a RR-client)
        169.254.17.11 from 169.254.17.11 (192.168.136.1)
          Origin IGP metric 0, route tag 2, localpref 100, valid, internal, best
          Community: 65001:2
          Advertised Path ID: 2
           Last update: Thu Jul 22 09:38:31 2021
    

    The route was received by FGT-HUB1 from BGP neighbors 169.254.16.11 and 169.254.17.11, which correspond to the two tunnels on FGT-Branch1. Each is advertising their respective communities for successful health checks, and the hub assigns them route-tags 1 and 2.

    Repeat the command for network 192.168.138.0/24 on Branch2 to get similar outputs.

  5. On FGT-HUB1, check all of the BGP routes installed in the routing table:

    FGT-HUB1 # get router info routing-table bgp
    Routing table for VRF=0
    B       192.168.128.0/24 [20/0] via 192.168.132.2, port4, 1d01h58m
    B       192.168.136.0/24 [200/0] via 169.254.16.11, VPN1, 1d01h47m
                             [200/0] via 169.254.17.11, VPN2, 1d01h47m
    B       192.168.138.0/24 [200/0] via 169.254.16.10, VPN1, 1d01h47m
                             [200/0] via 169.254.17.10, VPN2, 1d01h47m
  6. On FGT-HUB1, check the status of the SD-WAN service rules:

    FGT-HUB1 # diagnose sys sdwan service
    
    Service(10): Address Mode(IPV4) flags=0x200
      Gen(5), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(1):
        1: Seq_num(1 VPN1), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255
    
      Route tag address(2):
            192.168.136.0-192.168.136.255
            192.168.138.0-192.168.138.255
    
    
    Service(11): Address Mode(IPV4) flags=0x200
      Gen(5), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(1):
        1: Seq_num(2 VPN2), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255
    
      Route tag address(2):
            192.168.136.0-192.168.136.255
            192.168.138.0-192.168.138.255
    
    
    Service(12): Address Mode(IPV4) flags=0x200
      Gen(5), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Service disabled caused by no destination.
      Members(2):
        1: Seq_num(1 VPN1), alive, selected
        2: Seq_num(2 VPN2), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255

    SD-WAN service rules 10, 11, and 12 are shown:

    • Rule 10 matches traffic going to networks with BGP route-tag 1, currently 192.168.136.0/24 and 192.168.138.0/24. The SD-WAN rule uses manual mode to steer traffic to VPN1. Traffic that is being sent to both branches should match service rule 10 and be steered to VPN1.

    • Rule 11 matches traffic going to networks with BGP route-tag 2, currently also 192.168.136.0/24 and 192.168.138.0/24, but because this rule is lower that rule 10, traffic does not hit it.

  7. On FGT-HUB1, check the SD-WAN service rules in the policy route format:

    FGT-HUB1 # diagnose firewall proute list
    list route policy info(vf=root):
    …
    id=2130771978(0x7f01000a) vwl_service=10(ToBranches1) vwl_mbr_seq=1 dscp_tag=0xff 0xff flags=0x40 order-addr tos=0x00 tos_mask=0x00 protocol=0 sport=0-65535 iif=0 dport=1-65535 oif=10(VPN1)
    source(1): 0.0.0.0-255.255.255.255
    destination(2): 192.168.136.0-192.168.136.255 192.168.138.0-192.168.138.255
    hit_count=1 last_used=2021-07-23 11:02:46
    
    id=2130771979(0x7f01000b) vwl_service=11(ToBranches2) vwl_mbr_seq=2 dscp_tag=0xff 0xff flags=0x40 order-addr tos=0x00 tos_mask=0x00 protocol=0 sport=0-65535 iif=0 dport=1-65535 oif=11(VPN2)
    source(1): 0.0.0.0-255.255.255.255
    destination(2): 192.168.136.0-192.168.136.255 192.168.138.0-192.168.138.255
    hit_count=0 last_used=2021-07-22 09:27:22
    …

    Traffic hits SD-WAN service rule 10, steering it to VPN1, while no traffic hit rule 11.

  8. On both FGT-HUB1 and FGT-Branch1, check the session table to again confirm that traffic is hitting the right tunnel:

    • Hub:

      FGT-HUB1 # diagnose sys session filter dst 192.168.136.10
      FGT-HUB1 # diagnose sys session list
      
      session info: proto=1 proto_state=00 duration=1 expire=59 timeout=0 flags=00000000 socktype=0 sockport=0 av_idx=0 use=3
      origin-shaper=
      reply-shaper=
      per_ip_shaper=
      class_id=0 ha_id=0 policy_dir=0 tunnel=VPN1_1/ vlan_cos=0/255
      state=log may_dirty f00
      statistic(bytes/packets/allow_err): org=168/2/1 reply=168/2/1 tuples=2
      tx speed(Bps/kbps): 86/0 rx speed(Bps/kbps): 86/0
      orgin->sink: org pre->post, reply pre->post dev=6->10/10->6 gwy=169.254.16.11/192.168.132.2
      hook=pre dir=org act=noop 192.168.128.10:3422->192.168.136.10:8(0.0.0.0:0)
      hook=post dir=reply act=noop 192.168.136.10:3422->192.168.128.10:0(0.0.0.0:0)
      misc=0 policy_id=2 auth_info=0 chk_client_info=0 vd=0
      serial=00004d74 tos=ff/ff app_list=0 app=0 url_cat=0
      sdwan_mbr_seq=1 sdwan_service_id=10
      rpdb_link_id=ff00000a rpdb_svc_id=0 ngfwid=n/a
      npu_state=0x3040000
      total session 1
      
    • Branch:

      FGT-Branch1 # diagnose sys session filter dst 192.168.136.10
      FGT-Branch1 # diagnose sys session list
      
      session info: proto=1 proto_state=00 duration=61 expire=59 timeout=0 flags=00000000 socktype=0 sockport=0 av_idx=0 use=3
      origin-shaper=
      reply-shaper=
      per_ip_shaper=
      class_id=0 ha_id=0 policy_dir=0 tunnel=/HUB1-VPN1 vlan_cos=0/255
      state=log may_dirty npu f00 app_valid
      statistic(bytes/packets/allow_err): org=5124/61/1 reply=5124/61/1 tuples=2
      tx speed(Bps/kbps): 84/0 rx speed(Bps/kbps): 84/0
      orgin->sink: org pre->post, reply pre->post dev=9->6/6->9 gwy=192.168.136.10/169.254.16.1
      hook=pre dir=org act=noop 192.168.128.10:3422->192.168.136.10:8(0.0.0.0:0)
      hook=post dir=reply act=noop 192.168.136.10:3422->192.168.128.10:0(0.0.0.0:0)
      misc=0 policy_id=2 auth_info=0 chk_client_info=0 vd=0
      serial=0000287e tos=ff/ff app_list=2000 app=24466 url_cat=0
      sdwan_mbr_seq=0 sdwan_service_id=0
      rpdb_link_id=00000000 rpdb_svc_id=0 ngfwid=n/a
      npu_state=0x3041008
      total session 1
      

Traffic flow when Branch1 ISP1 is impaired

Impairment to Branch1 ISP1 is simulated by adding additional latency that exceeds the Branch1 health check's thresholds. A continuous ping is used to show the effects that this has on traffic flow.

To test the traffic flow:
  1. On the Hub PC, start a continuous ping to the Branch1 PC at 192.168.136.10.

  2. Impair the ISP1 link on Branch1. In this example, latency is increased to 180ms.

  3. On FGT-Branch1, check the status of the health check:

    FGT-Branch1 # diagnose sys sdwan health-check HUB1_HC
    Health Check(HUB1_HC):
    Seq(1 HUB1-VPN1): state(alive), packet-loss(0.000%) latency(186.852), jitter(0.156) sla_map=0x0
    Seq(2 HUB1-VPN2): state(alive), packet-loss(0.000%) latency(73.891), jitter(0.168) sla_map=0x1
    

    HUB1-VPN1 is above the latency threshold of 150ms. Its sla_map=0x0 indicates that the SLA is exceeded.

  4. On FGT-HUB1, check the routes learned from FGT-Branch1:

    FGT-HUB1 # get router info bgp network 192.168.136.0/24
    VRF 0 BGP routing table entry for 192.168.136.0/24
    Paths: (2 available, best #2, table Default-IP-Routing-Table)
      Advertised to non peer-group peers:
       192.168.132.2
      Advertised to peer-groups:
      VPN1 VPN2
      Original VRF 0
      Local, (Received from a RR-client)
        169.254.16.11 from 169.254.16.11 (192.168.136.1)
          Origin IGP metric 0, route tag 11, localpref 100, valid, internal, best
          Community: 65001:11
          Advertised Path ID: 1
           Last update: Fri Jul 23 12:00:53 2021
    
      Original VRF 0
      Local, (Received from a RR-client)
        169.254.17.11 from 169.254.17.11 (192.168.136.1)
          Origin IGP metric 0, route tag 2, localpref 100, valid, internal, best
          Community: 65001:2
          Advertised Path ID: 2
           Last update: Thu Jul 22 09:38:31 2021
    

    The route learned over VPN1 from Branch1 has community string 65001:11, showing that the health checks for this link has failed. FGT-HUB1 assigns route-tag 11 to the route. The same route learned over VPN2 from Branch1 is still assigned route-tag 2 based on community string 65001:2. The Branch1 FortiGate has successfully updated the hub on the status of its VPN overlays.

  5. On FGT-HUB1, check all of the BGP routes installed in the routing table:

    FGT-HUB1 # get router info routing-table bgp
    Routing table for VRF=0
    B       192.168.128.0/24 [20/0] via 192.168.132.2, port4, 1d02h52m
    B       192.168.136.0/24 [200/0] via 169.254.16.11, VPN1, 00:18:57
                             [200/0] via 169.254.17.11, VPN2, 00:18:57
    B       192.168.138.0/24 [200/0] via 169.254.16.10, VPN1, 1d02h41m
                             [200/0] via 169.254.17.10, VPN2, 1d02h41m
    

    The routes to 192.168.136.0/24 over VPN1 is still installed, but its route-rag has changed.

  6. On FGT-HUB1, check the status of the SD-WAN service rules:

    FGT-HUB1 # diagnose sys sdwan service
    
    Service(10): Address Mode(IPV4) flags=0x200
      Gen(6), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(1):
        1: Seq_num(1 VPN1), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255
    
      Route tag address(1):
            192.168.138.0-192.168.138.255Service(11): Address Mode(IPV4) flags=0x200
      Gen(6), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(1):
        1: Seq_num(2 VPN2), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255
    
      Route tag address(2):
            192.168.136.0-192.168.136.255
            192.168.138.0-192.168.138.255
    
    
    Service(12): Address Mode(IPV4) flags=0x200
      Gen(6), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(2):
        1: Seq_num(1 VPN1), alive, selected
        2: Seq_num(2 VPN2), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255
    
      Route tag address(1):
            192.168.136.0-192.168.136.255
    

    Service rule 10, which matches route-tag 1, does not show the destination route-tag address of 192.168.136.0/24. Service rule 11, which matches route-tag 2, does not show the destination route-tag address of 192.168.136.0/24, as expected. The HUB PC's ping to 192.168.136.10 matches service rule 11 now, and not rule 10.

  7. On the Hub PC, confirm that the pings are still flowing, and check the instant when the delay was introduced. Some pings might have continued to pass through VPN1 as the health check and routing updated occurred:

    64 bytes from 192.168.136.10: icmp_seq=123 ttl=61 time=188 ms
    64 bytes from 192.168.136.10: icmp_seq=124 ttl=61 time=188 ms
    64 bytes from 192.168.136.10: icmp_seq=125 ttl=61 time=189 ms
    64 bytes from 192.168.136.10: icmp_seq=126 ttl=61 time=75.8 ms
    64 bytes from 192.168.136.10: icmp_seq=127 ttl=61 time=75.6 ms
    64 bytes from 192.168.136.10: icmp_seq=128 ttl=61 time=75.7 ms
    
  8. On FGT-HUB1, check the SD-WAN service rules in the policy route format:

    FGT-HUB1 # diagnose firewall proute list
    list route policy info(vf=root):
    …
    id=2130771978(0x7f01000a) vwl_service=10(ToBranches1) vwl_mbr_seq=1 dscp_tag=0xff 0xff flags=0x40 order-addr tos=0x00 tos_mask=0x00 protocol=0 sport=0-65535 iif=0 dport=1-65535 oif=10(VPN1)
    source(1): 0.0.0.0-255.255.255.255
    destination(1): 192.168.138.0-192.168.138.255
    hit_count=3 last_used=2021-07-23 11:58:54
    
    id=2130771979(0x7f01000b) vwl_service=11(ToBranches2) vwl_mbr_seq=2 dscp_tag=0xff 0xff flags=0x40 order-addr tos=0x00 tos_mask=0x00 protocol=0 sport=0-65535 iif=0 dport=1-65535 oif=11(VPN2)
    source(1): 0.0.0.0-255.255.255.255
    destination(2): 192.168.136.0-192.168.136.255 192.168.138.0-192.168.138.255
    hit_count=2 last_used=2021-07-23 12:00:59
    …

    The proute that corresponds to service rule 10 does not list 192.168.136.0/24 as a destination. The proute that corresponds to service rule 11 includes 192.168.136.0/24 as a destination, and indicates that there are hits on that policy.

  9. On both FGT-HUB1 and FGT-Branch1, check the session table to again confirm that traffic is hitting the right tunnel:

    • Hub:

      FGT-HUB1 # diagnose sys session filter dst 192.168.136.10
      FGT-HUB1 # diagnose sys session list
      
      session info: proto=1 proto_state=00 duration=35 expire=59 timeout=0 flags=00000000 socktype=0 sockport=0 av_idx=0 use=3
      origin-shaper=
      reply-shaper=
      per_ip_shaper=
      class_id=0 ha_id=0 policy_dir=0 tunnel=VPN2_1/ vlan_cos=0/255
      state=log may_dirty f00
      statistic(bytes/packets/allow_err): org=3024/36/1 reply=3024/36/1 tuples=2
      tx speed(Bps/kbps): 84/0 rx speed(Bps/kbps): 84/0
      orgin->sink: org pre->post, reply pre->post dev=6->11/11->6 gwy=169.254.17.11/192.168.132.2
      hook=pre dir=org act=noop 192.168.128.10:3550->192.168.136.10:8(0.0.0.0:0)
      hook=post dir=reply act=noop 192.168.136.10:3550->192.168.128.10:0(0.0.0.0:0)
      misc=0 policy_id=2 auth_info=0 chk_client_info=0 vd=0
      serial=00004f53 tos=ff/ff app_list=0 app=0 url_cat=0
      sdwan_mbr_seq=2 sdwan_service_id=11
      rpdb_link_id=ff00000b rpdb_svc_id=0 ngfwid=n/a
      npu_state=0x3040000
      total session 1
      

      The session information shows that traffic is passing through tunnel VPN2_1, and is steered by SD-WAN service rule 11.

    • Branch:

      FGT-Branch1 # diagnose sys session filter dst 192.168.136.10
      FGT-Branch1 # diagnose sys session list
      
      session info: proto=1 proto_state=00 duration=89 expire=59 timeout=0 flags=00000000 socktype=0 sockport=0 av_idx=0 use=3
      origin-shaper=
      reply-shaper=
      per_ip_shaper=
      class_id=0 ha_id=0 policy_dir=0 tunnel=/HUB1-VPN2 vlan_cos=0/255
      state=log may_dirty npu f00 app_valid
      statistic(bytes/packets/allow_err): org=7560/90/1 reply=7560/90/1 tuples=2
      tx speed(Bps/kbps): 84/0 rx speed(Bps/kbps): 84/0
      orgin->sink: org pre->post, reply pre->post dev=10->6/6->10 gwy=192.168.136.10/169.254.17.1
      hook=pre dir=org act=noop 192.168.128.10:3550->192.168.136.10:8(0.0.0.0:0)
      hook=post dir=reply act=noop 192.168.136.10:3550->192.168.128.10:0(0.0.0.0:0)
      misc=0 policy_id=2 auth_info=0 chk_client_info=0 vd=0
      serial=00002910 tos=ff/ff app_list=2000 app=24466 url_cat=0
      sdwan_mbr_seq=0 sdwan_service_id=0
      rpdb_link_id=00000000 rpdb_svc_id=0 ngfwid=n/a
      npu_state=0x3041008
      total session 1
      

      The session on Branch1 shows that traffic is passing through tunnel HUB1-VPN2.

Traffic flow when Branch1 ISP1 and ISP2 are impaired

Impairment to Branch1 ISP1 and ISP2 are simulated by adding additional latency that exceeds the Branch1 health check's thresholds. Because both ISPs are impaired, traffic matches SD-WAN service rule 12 on the hub, and flows over VPN1.

To test the traffic flow:
  1. On the Hub PC, start a continuous ping to the Branch1 PC at 192.168.136.10.

  2. Impair the ISP1 and ISP2 links on Branch1. In this example, ISP1 latency is increased to 180ms and ISP2 latency is increased to 195ms

  3. On FGT-Branch1, check the status of the health check:

    FGT-Branch1 # diag sys sdwan health-check HUB1_HC
    Health Check(HUB1_HC):
    Seq(1 HUB1-VPN1): state(alive), packet-loss(0.000%) latency(186.795), jitter(0.199) sla_map=0x0
    Seq(2 HUB1-VPN2): state(alive), packet-loss(0.000%) latency(203.939), jitter(0.235) sla_map=0x0

    HUB1-VPN1 and HUB1-VPN2 are both above the latency threshold of 150ms. Its sla_map=0x0 indicates that the SLA is exceeded.

  4. On FGT-HUB1, check the routes learned from FGT-Branch1:

    FGT-HUB1 # get router info bgp network 192.168.136.0/24
    VRF 0 BGP routing table entry for 192.168.136.0/24
    Paths: (2 available, best #2, table Default-IP-Routing-Table)
      Advertised to non peer-group peers:
       192.168.132.2
      Advertised to peer-groups:
      VPN1 VPN2
      Original VRF 0
      Local, (Received from a RR-client)
        169.254.16.11 from 169.254.16.11 (192.168.136.1)
          Origin IGP metric 0, route tag 11, localpref 100, valid, internal, best
          Community: 65001:11
          Advertised Path ID: 1
           Last update: Fri Jul 23 12:44:41 2021
    
      Original VRF 0
      Local, (Received from a RR-client)
        169.254.17.11 from 169.254.17.11 (192.168.136.1)
          Origin IGP metric 0, route tag 11, localpref 100, valid, internal, best
          Community: 65001:11
          Advertised Path ID: 2
           Last update: Fri Jul 23 12:59:28 2021
    
    

    The route learned over both VPN1 and VPN2 from Branch1 has community string 65001:11, showing that the health checks for these links have failed. FGT-HUB1 assigns route-tag 11 to the routes. The Branch1 FortiGate has successfully updated the hub on the status of its VPN overlays.

  5. On FGT-HUB1, check the status of the SD-WAN service rules:

    FGT-HUB1 # diagnose sys sdwan service
    
    Service(10): Address Mode(IPV4) flags=0x200
      Gen(9), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(1):
        1: Seq_num(1 VPN1), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255
    
      Route tag address(1):
            192.168.138.0-192.168.138.255
    
    
    Service(11): Address Mode(IPV4) flags=0x200
      Gen(9), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(1):
        1: Seq_num(2 VPN2), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255
    
      Route tag address(1):
            192.168.138.0-192.168.138.255
    
    
    Service(12): Address Mode(IPV4) flags=0x200
      Gen(9), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(2):
        1: Seq_num(1 VPN1), alive, selected
        2: Seq_num(2 VPN2), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255
    
      Route tag address(1):
            192.168.136.0-192.168.136.255

    Service rules 10 and 11 do not show the destination route-tag address of 192.168.136.0/24, so neither rule is matched. Traffic falls to service rule 12, which catches failed route-tags. Traffic is steered in the order of VPN1 and VPN2.

  6. On the Hub PC, confirm that the pings are still flowing, and check the instant when the delays were introduced. Some pings might have experienced small delays and continued to pass through VPN2, but eventually theroutes were updated on the hub, and traffic passed through VPN1 again :

    64 bytes from 192.168.136.10: icmp_seq=910 ttl=61 time=205 ms
    64 bytes from 192.168.136.10: icmp_seq=911 ttl=61 time=205 ms
    64 bytes from 192.168.136.10: icmp_seq=912 ttl=61 time=205 ms
    64 bytes from 192.168.136.10: icmp_seq=913 ttl=61 time=188 ms
    64 bytes from 192.168.136.10: icmp_seq=914 ttl=61 time=188 ms
    64 bytes from 192.168.136.10: icmp_seq=915 ttl=61 time=188 ms
    
  7. On FGT-HUB1, check the SD-WAN service rules in the policy route format:

    FGT-HUB1 # diagnose firewall proute list
    list route policy info(vf=root):
    
    id=2130771978(0x7f01000a) vwl_service=10(ToBranches1) vwl_mbr_seq=1 dscp_tag=0xff 0xff flags=0x40 order-addr tos=0x00 tos_mask=0x00 protocol=0 sport=0-65535 iif=0 dport=1-65535 oif=10(VPN1)
    source(1): 0.0.0.0-255.255.255.255
    destination(1): 192.168.138.0-192.168.138.255
    hit_count=5 last_used=2021-07-23 12:44:19
    
    id=2130771979(0x7f01000b) vwl_service=11(ToBranches2) vwl_mbr_seq=2 dscp_tag=0xff 0xff flags=0x40 order-addr tos=0x00 tos_mask=0x00 protocol=0 sport=0-65535 iif=0 dport=1-65535 oif=11(VPN2)
    source(1): 0.0.0.0-255.255.255.255
    destination(1): 192.168.138.0-192.168.138.255
    hit_count=5 last_used=2021-07-23 12:44:47
    
    id=2130771980(0x7f01000c) vwl_service=12(ToBranches3) vwl_mbr_seq=1 2 dscp_tag=0xff 0xff flags=0x40 order-addr tos=0x00 tos_mask=0x00 protocol=0 sport=0-65535 iif=0 dport=1-65535 oif=10(VPN1) oif=11(VPN2)
    source(1): 0.0.0.0-255.255.255.255
    destination(1): 192.168.136.0-192.168.136.255
    hit_count=2 last_used=2021-07-23 12:59:33
    
  8. On both FGT-HUB1 and FGT-Branch1, check the session table to again confirm that traffic is hitting the right tunnel:

    • Hub:

      FGT-HUB1 # diagnose sys session filter dst 192.168.136.10
      FGT-HUB1 # diagnose sys session list
      
      session info: proto=1 proto_state=00 duration=2440 expire=59 timeout=0 flags=00000000 socktype=0 sockport=0 av_idx=0 use=3
      origin-shaper=
      reply-shaper=
      per_ip_shaper=
      class_id=0 ha_id=0 policy_dir=0 tunnel=VPN1_1/ vlan_cos=0/255
      state=log may_dirty f00
      statistic(bytes/packets/allow_err): org=204792/2438/1 reply=204792/2438/1 tuples=2
      tx speed(Bps/kbps): 83/0 rx speed(Bps/kbps): 83/0
      orgin->sink: org pre->post, reply pre->post dev=6->10/10->6 gwy=169.254.16.11/192.168.132.2
      hook=pre dir=org act=noop 192.168.128.10:3550->192.168.136.10:8(0.0.0.0:0)
      hook=post dir=reply act=noop 192.168.136.10:3550->192.168.128.10:0(0.0.0.0:0)
      misc=0 policy_id=2 auth_info=0 chk_client_info=0 vd=0
      serial=00004f53 tos=ff/ff app_list=0 app=0 url_cat=0
      sdwan_mbr_seq=1 sdwan_service_id=12
      rpdb_link_id=ff00000c rpdb_svc_id=0 ngfwid=n/a
      npu_state=0x3040000
      total session 1
      
    • Branch:

      FGT-Branch1 # diagnose sys session filter dst 192.168.136.10
      FGT-Branch1 # diagnose sys session list
      
      session info: proto=1 proto_state=00 duration=2462 expire=59 timeout=0 flags=00000000 socktype=0 sockport=0 av_idx=0 use=3
      origin-shaper=
      reply-shaper=
      per_ip_shaper=
      class_id=0 ha_id=0 policy_dir=0 tunnel=/HUB1-VPN1 vlan_cos=0/255
      state=log may_dirty npu f00 app_valid
      statistic(bytes/packets/allow_err): org=206640/2460/1 reply=206640/2460/1 tuples=2
      tx speed(Bps/kbps): 83/0 rx speed(Bps/kbps): 83/0
      orgin->sink: org pre->post, reply pre->post dev=9->6/6->9 gwy=192.168.136.10/169.254.16.1
      hook=pre dir=org act=noop 192.168.128.10:3550->192.168.136.10:8(0.0.0.0:0)
      hook=post dir=reply act=noop 192.168.136.10:3550->192.168.128.10:0(0.0.0.0:0)
      misc=0 policy_id=2 auth_info=0 chk_client_info=0 vd=0
      serial=00002910 tos=ff/ff app_list=2000 app=24466 url_cat=0
      sdwan_mbr_seq=0 sdwan_service_id=0
      rpdb_link_id=00000000 rpdb_svc_id=0 ngfwid=n/a
      npu_state=0x3041008
      total session 1
      

Testing and verification

Testing and verification

  1. Traffic flow during normal operation

  2. Traffic flow when Branch1 ISP1 is impaired

  3. Traffic flow when Branch1 ISP1 and ISP2 are impaired

Traffic flow during normal operation

These steps go through debug commands on the hub and branch FortiGates during normal operation. Simple traffic is generated to observe the steering that is occurring on the hub. The routing and session tables are shown to help understand the traffic flow. Artificial delays are added to show delays in the ISP links, but the delays do no exceed the health check's thresholds.

To test the traffic flow:
  1. On the Hub PC, start a continuous ping to the Branch1 PC at 192.168.136.10:

    hub1:~$ ping 192.168.136.10
    PING 192.168.136.10 (192.168.136.10) 56(84) bytes of data.
    64 bytes from 192.168.136.10: icmp_seq=1 ttl=61 time=74.0 ms
    64 bytes from 192.168.136.10: icmp_seq=2 ttl=61 time=68.7 ms
    ...
  2. On FGT-Branch1, check the status of the health check:

    FGT-Branch1 # diagnose sys sdwan health-check HUB1_HC
    Health Check(HUB1_HC):
    Seq(1 HUB1-VPN1): state(alive), packet-loss(0.000%) latency(66.847), jitter(0.139) sla_map=0x1
    Seq(2 HUB1-VPN2): state(alive), packet-loss(0.000%) latency(73.992), jitter(0.352) sla_map=0x1

    Both HUB1-VPN1 and HUB1-VPN2 are alive and below the threshold.

  3. On FGT-Branch2, check the status of the health check:

    FGT-Branch2 # diagnose sys sdwan health-check HUB1_HC
    Health Check(HUB1_HC):
    Seq(1 HUB1-VPN1): state(alive), packet-loss(0.000%) latency(77.116), jitter(0.206) sla_map=0x1
    Seq(2 HUB1-VPN2): state(alive), packet-loss(0.000%) latency(83.929), jitter(0.275) sla_map=0x1
    

    Both HUB1-VPN1 and HUB1-VPN2 are alive and below threshold.

  4. On FGT-HUB1, check the BGP routes learned from FGT-Branch1:

    FGT-HUB1 # get router info bgp network 192.168.136.0/24
    VRF 0 BGP routing table entry for 192.168.136.0/24
    Paths: (2 available, best #2, table Default-IP-Routing-Table)
      Advertised to non peer-group peers:
       192.168.132.2
      Advertised to peer-groups:
      VPN1 VPN2
      Original VRF 0
      Local, (Received from a RR-client)
        169.254.16.11 from 169.254.16.11 (192.168.136.1)
          Origin IGP metric 0, route tag 1, localpref 100, valid, internal, best
          Community: 65001:1
          Advertised Path ID: 1
           Last update: Thu Jul 22 09:38:32 2021
    
      Original VRF 0
      Local, (Received from a RR-client)
        169.254.17.11 from 169.254.17.11 (192.168.136.1)
          Origin IGP metric 0, route tag 2, localpref 100, valid, internal, best
          Community: 65001:2
          Advertised Path ID: 2
           Last update: Thu Jul 22 09:38:31 2021
    

    The route was received by FGT-HUB1 from BGP neighbors 169.254.16.11 and 169.254.17.11, which correspond to the two tunnels on FGT-Branch1. Each is advertising their respective communities for successful health checks, and the hub assigns them route-tags 1 and 2.

    Repeat the command for network 192.168.138.0/24 on Branch2 to get similar outputs.

  5. On FGT-HUB1, check all of the BGP routes installed in the routing table:

    FGT-HUB1 # get router info routing-table bgp
    Routing table for VRF=0
    B       192.168.128.0/24 [20/0] via 192.168.132.2, port4, 1d01h58m
    B       192.168.136.0/24 [200/0] via 169.254.16.11, VPN1, 1d01h47m
                             [200/0] via 169.254.17.11, VPN2, 1d01h47m
    B       192.168.138.0/24 [200/0] via 169.254.16.10, VPN1, 1d01h47m
                             [200/0] via 169.254.17.10, VPN2, 1d01h47m
  6. On FGT-HUB1, check the status of the SD-WAN service rules:

    FGT-HUB1 # diagnose sys sdwan service
    
    Service(10): Address Mode(IPV4) flags=0x200
      Gen(5), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(1):
        1: Seq_num(1 VPN1), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255
    
      Route tag address(2):
            192.168.136.0-192.168.136.255
            192.168.138.0-192.168.138.255
    
    
    Service(11): Address Mode(IPV4) flags=0x200
      Gen(5), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(1):
        1: Seq_num(2 VPN2), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255
    
      Route tag address(2):
            192.168.136.0-192.168.136.255
            192.168.138.0-192.168.138.255
    
    
    Service(12): Address Mode(IPV4) flags=0x200
      Gen(5), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Service disabled caused by no destination.
      Members(2):
        1: Seq_num(1 VPN1), alive, selected
        2: Seq_num(2 VPN2), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255

    SD-WAN service rules 10, 11, and 12 are shown:

    • Rule 10 matches traffic going to networks with BGP route-tag 1, currently 192.168.136.0/24 and 192.168.138.0/24. The SD-WAN rule uses manual mode to steer traffic to VPN1. Traffic that is being sent to both branches should match service rule 10 and be steered to VPN1.

    • Rule 11 matches traffic going to networks with BGP route-tag 2, currently also 192.168.136.0/24 and 192.168.138.0/24, but because this rule is lower that rule 10, traffic does not hit it.

  7. On FGT-HUB1, check the SD-WAN service rules in the policy route format:

    FGT-HUB1 # diagnose firewall proute list
    list route policy info(vf=root):
    …
    id=2130771978(0x7f01000a) vwl_service=10(ToBranches1) vwl_mbr_seq=1 dscp_tag=0xff 0xff flags=0x40 order-addr tos=0x00 tos_mask=0x00 protocol=0 sport=0-65535 iif=0 dport=1-65535 oif=10(VPN1)
    source(1): 0.0.0.0-255.255.255.255
    destination(2): 192.168.136.0-192.168.136.255 192.168.138.0-192.168.138.255
    hit_count=1 last_used=2021-07-23 11:02:46
    
    id=2130771979(0x7f01000b) vwl_service=11(ToBranches2) vwl_mbr_seq=2 dscp_tag=0xff 0xff flags=0x40 order-addr tos=0x00 tos_mask=0x00 protocol=0 sport=0-65535 iif=0 dport=1-65535 oif=11(VPN2)
    source(1): 0.0.0.0-255.255.255.255
    destination(2): 192.168.136.0-192.168.136.255 192.168.138.0-192.168.138.255
    hit_count=0 last_used=2021-07-22 09:27:22
    …

    Traffic hits SD-WAN service rule 10, steering it to VPN1, while no traffic hit rule 11.

  8. On both FGT-HUB1 and FGT-Branch1, check the session table to again confirm that traffic is hitting the right tunnel:

    • Hub:

      FGT-HUB1 # diagnose sys session filter dst 192.168.136.10
      FGT-HUB1 # diagnose sys session list
      
      session info: proto=1 proto_state=00 duration=1 expire=59 timeout=0 flags=00000000 socktype=0 sockport=0 av_idx=0 use=3
      origin-shaper=
      reply-shaper=
      per_ip_shaper=
      class_id=0 ha_id=0 policy_dir=0 tunnel=VPN1_1/ vlan_cos=0/255
      state=log may_dirty f00
      statistic(bytes/packets/allow_err): org=168/2/1 reply=168/2/1 tuples=2
      tx speed(Bps/kbps): 86/0 rx speed(Bps/kbps): 86/0
      orgin->sink: org pre->post, reply pre->post dev=6->10/10->6 gwy=169.254.16.11/192.168.132.2
      hook=pre dir=org act=noop 192.168.128.10:3422->192.168.136.10:8(0.0.0.0:0)
      hook=post dir=reply act=noop 192.168.136.10:3422->192.168.128.10:0(0.0.0.0:0)
      misc=0 policy_id=2 auth_info=0 chk_client_info=0 vd=0
      serial=00004d74 tos=ff/ff app_list=0 app=0 url_cat=0
      sdwan_mbr_seq=1 sdwan_service_id=10
      rpdb_link_id=ff00000a rpdb_svc_id=0 ngfwid=n/a
      npu_state=0x3040000
      total session 1
      
    • Branch:

      FGT-Branch1 # diagnose sys session filter dst 192.168.136.10
      FGT-Branch1 # diagnose sys session list
      
      session info: proto=1 proto_state=00 duration=61 expire=59 timeout=0 flags=00000000 socktype=0 sockport=0 av_idx=0 use=3
      origin-shaper=
      reply-shaper=
      per_ip_shaper=
      class_id=0 ha_id=0 policy_dir=0 tunnel=/HUB1-VPN1 vlan_cos=0/255
      state=log may_dirty npu f00 app_valid
      statistic(bytes/packets/allow_err): org=5124/61/1 reply=5124/61/1 tuples=2
      tx speed(Bps/kbps): 84/0 rx speed(Bps/kbps): 84/0
      orgin->sink: org pre->post, reply pre->post dev=9->6/6->9 gwy=192.168.136.10/169.254.16.1
      hook=pre dir=org act=noop 192.168.128.10:3422->192.168.136.10:8(0.0.0.0:0)
      hook=post dir=reply act=noop 192.168.136.10:3422->192.168.128.10:0(0.0.0.0:0)
      misc=0 policy_id=2 auth_info=0 chk_client_info=0 vd=0
      serial=0000287e tos=ff/ff app_list=2000 app=24466 url_cat=0
      sdwan_mbr_seq=0 sdwan_service_id=0
      rpdb_link_id=00000000 rpdb_svc_id=0 ngfwid=n/a
      npu_state=0x3041008
      total session 1
      

Traffic flow when Branch1 ISP1 is impaired

Impairment to Branch1 ISP1 is simulated by adding additional latency that exceeds the Branch1 health check's thresholds. A continuous ping is used to show the effects that this has on traffic flow.

To test the traffic flow:
  1. On the Hub PC, start a continuous ping to the Branch1 PC at 192.168.136.10.

  2. Impair the ISP1 link on Branch1. In this example, latency is increased to 180ms.

  3. On FGT-Branch1, check the status of the health check:

    FGT-Branch1 # diagnose sys sdwan health-check HUB1_HC
    Health Check(HUB1_HC):
    Seq(1 HUB1-VPN1): state(alive), packet-loss(0.000%) latency(186.852), jitter(0.156) sla_map=0x0
    Seq(2 HUB1-VPN2): state(alive), packet-loss(0.000%) latency(73.891), jitter(0.168) sla_map=0x1
    

    HUB1-VPN1 is above the latency threshold of 150ms. Its sla_map=0x0 indicates that the SLA is exceeded.

  4. On FGT-HUB1, check the routes learned from FGT-Branch1:

    FGT-HUB1 # get router info bgp network 192.168.136.0/24
    VRF 0 BGP routing table entry for 192.168.136.0/24
    Paths: (2 available, best #2, table Default-IP-Routing-Table)
      Advertised to non peer-group peers:
       192.168.132.2
      Advertised to peer-groups:
      VPN1 VPN2
      Original VRF 0
      Local, (Received from a RR-client)
        169.254.16.11 from 169.254.16.11 (192.168.136.1)
          Origin IGP metric 0, route tag 11, localpref 100, valid, internal, best
          Community: 65001:11
          Advertised Path ID: 1
           Last update: Fri Jul 23 12:00:53 2021
    
      Original VRF 0
      Local, (Received from a RR-client)
        169.254.17.11 from 169.254.17.11 (192.168.136.1)
          Origin IGP metric 0, route tag 2, localpref 100, valid, internal, best
          Community: 65001:2
          Advertised Path ID: 2
           Last update: Thu Jul 22 09:38:31 2021
    

    The route learned over VPN1 from Branch1 has community string 65001:11, showing that the health checks for this link has failed. FGT-HUB1 assigns route-tag 11 to the route. The same route learned over VPN2 from Branch1 is still assigned route-tag 2 based on community string 65001:2. The Branch1 FortiGate has successfully updated the hub on the status of its VPN overlays.

  5. On FGT-HUB1, check all of the BGP routes installed in the routing table:

    FGT-HUB1 # get router info routing-table bgp
    Routing table for VRF=0
    B       192.168.128.0/24 [20/0] via 192.168.132.2, port4, 1d02h52m
    B       192.168.136.0/24 [200/0] via 169.254.16.11, VPN1, 00:18:57
                             [200/0] via 169.254.17.11, VPN2, 00:18:57
    B       192.168.138.0/24 [200/0] via 169.254.16.10, VPN1, 1d02h41m
                             [200/0] via 169.254.17.10, VPN2, 1d02h41m
    

    The routes to 192.168.136.0/24 over VPN1 is still installed, but its route-rag has changed.

  6. On FGT-HUB1, check the status of the SD-WAN service rules:

    FGT-HUB1 # diagnose sys sdwan service
    
    Service(10): Address Mode(IPV4) flags=0x200
      Gen(6), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(1):
        1: Seq_num(1 VPN1), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255
    
      Route tag address(1):
            192.168.138.0-192.168.138.255Service(11): Address Mode(IPV4) flags=0x200
      Gen(6), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(1):
        1: Seq_num(2 VPN2), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255
    
      Route tag address(2):
            192.168.136.0-192.168.136.255
            192.168.138.0-192.168.138.255
    
    
    Service(12): Address Mode(IPV4) flags=0x200
      Gen(6), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(2):
        1: Seq_num(1 VPN1), alive, selected
        2: Seq_num(2 VPN2), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255
    
      Route tag address(1):
            192.168.136.0-192.168.136.255
    

    Service rule 10, which matches route-tag 1, does not show the destination route-tag address of 192.168.136.0/24. Service rule 11, which matches route-tag 2, does not show the destination route-tag address of 192.168.136.0/24, as expected. The HUB PC's ping to 192.168.136.10 matches service rule 11 now, and not rule 10.

  7. On the Hub PC, confirm that the pings are still flowing, and check the instant when the delay was introduced. Some pings might have continued to pass through VPN1 as the health check and routing updated occurred:

    64 bytes from 192.168.136.10: icmp_seq=123 ttl=61 time=188 ms
    64 bytes from 192.168.136.10: icmp_seq=124 ttl=61 time=188 ms
    64 bytes from 192.168.136.10: icmp_seq=125 ttl=61 time=189 ms
    64 bytes from 192.168.136.10: icmp_seq=126 ttl=61 time=75.8 ms
    64 bytes from 192.168.136.10: icmp_seq=127 ttl=61 time=75.6 ms
    64 bytes from 192.168.136.10: icmp_seq=128 ttl=61 time=75.7 ms
    
  8. On FGT-HUB1, check the SD-WAN service rules in the policy route format:

    FGT-HUB1 # diagnose firewall proute list
    list route policy info(vf=root):
    …
    id=2130771978(0x7f01000a) vwl_service=10(ToBranches1) vwl_mbr_seq=1 dscp_tag=0xff 0xff flags=0x40 order-addr tos=0x00 tos_mask=0x00 protocol=0 sport=0-65535 iif=0 dport=1-65535 oif=10(VPN1)
    source(1): 0.0.0.0-255.255.255.255
    destination(1): 192.168.138.0-192.168.138.255
    hit_count=3 last_used=2021-07-23 11:58:54
    
    id=2130771979(0x7f01000b) vwl_service=11(ToBranches2) vwl_mbr_seq=2 dscp_tag=0xff 0xff flags=0x40 order-addr tos=0x00 tos_mask=0x00 protocol=0 sport=0-65535 iif=0 dport=1-65535 oif=11(VPN2)
    source(1): 0.0.0.0-255.255.255.255
    destination(2): 192.168.136.0-192.168.136.255 192.168.138.0-192.168.138.255
    hit_count=2 last_used=2021-07-23 12:00:59
    …

    The proute that corresponds to service rule 10 does not list 192.168.136.0/24 as a destination. The proute that corresponds to service rule 11 includes 192.168.136.0/24 as a destination, and indicates that there are hits on that policy.

  9. On both FGT-HUB1 and FGT-Branch1, check the session table to again confirm that traffic is hitting the right tunnel:

    • Hub:

      FGT-HUB1 # diagnose sys session filter dst 192.168.136.10
      FGT-HUB1 # diagnose sys session list
      
      session info: proto=1 proto_state=00 duration=35 expire=59 timeout=0 flags=00000000 socktype=0 sockport=0 av_idx=0 use=3
      origin-shaper=
      reply-shaper=
      per_ip_shaper=
      class_id=0 ha_id=0 policy_dir=0 tunnel=VPN2_1/ vlan_cos=0/255
      state=log may_dirty f00
      statistic(bytes/packets/allow_err): org=3024/36/1 reply=3024/36/1 tuples=2
      tx speed(Bps/kbps): 84/0 rx speed(Bps/kbps): 84/0
      orgin->sink: org pre->post, reply pre->post dev=6->11/11->6 gwy=169.254.17.11/192.168.132.2
      hook=pre dir=org act=noop 192.168.128.10:3550->192.168.136.10:8(0.0.0.0:0)
      hook=post dir=reply act=noop 192.168.136.10:3550->192.168.128.10:0(0.0.0.0:0)
      misc=0 policy_id=2 auth_info=0 chk_client_info=0 vd=0
      serial=00004f53 tos=ff/ff app_list=0 app=0 url_cat=0
      sdwan_mbr_seq=2 sdwan_service_id=11
      rpdb_link_id=ff00000b rpdb_svc_id=0 ngfwid=n/a
      npu_state=0x3040000
      total session 1
      

      The session information shows that traffic is passing through tunnel VPN2_1, and is steered by SD-WAN service rule 11.

    • Branch:

      FGT-Branch1 # diagnose sys session filter dst 192.168.136.10
      FGT-Branch1 # diagnose sys session list
      
      session info: proto=1 proto_state=00 duration=89 expire=59 timeout=0 flags=00000000 socktype=0 sockport=0 av_idx=0 use=3
      origin-shaper=
      reply-shaper=
      per_ip_shaper=
      class_id=0 ha_id=0 policy_dir=0 tunnel=/HUB1-VPN2 vlan_cos=0/255
      state=log may_dirty npu f00 app_valid
      statistic(bytes/packets/allow_err): org=7560/90/1 reply=7560/90/1 tuples=2
      tx speed(Bps/kbps): 84/0 rx speed(Bps/kbps): 84/0
      orgin->sink: org pre->post, reply pre->post dev=10->6/6->10 gwy=192.168.136.10/169.254.17.1
      hook=pre dir=org act=noop 192.168.128.10:3550->192.168.136.10:8(0.0.0.0:0)
      hook=post dir=reply act=noop 192.168.136.10:3550->192.168.128.10:0(0.0.0.0:0)
      misc=0 policy_id=2 auth_info=0 chk_client_info=0 vd=0
      serial=00002910 tos=ff/ff app_list=2000 app=24466 url_cat=0
      sdwan_mbr_seq=0 sdwan_service_id=0
      rpdb_link_id=00000000 rpdb_svc_id=0 ngfwid=n/a
      npu_state=0x3041008
      total session 1
      

      The session on Branch1 shows that traffic is passing through tunnel HUB1-VPN2.

Traffic flow when Branch1 ISP1 and ISP2 are impaired

Impairment to Branch1 ISP1 and ISP2 are simulated by adding additional latency that exceeds the Branch1 health check's thresholds. Because both ISPs are impaired, traffic matches SD-WAN service rule 12 on the hub, and flows over VPN1.

To test the traffic flow:
  1. On the Hub PC, start a continuous ping to the Branch1 PC at 192.168.136.10.

  2. Impair the ISP1 and ISP2 links on Branch1. In this example, ISP1 latency is increased to 180ms and ISP2 latency is increased to 195ms

  3. On FGT-Branch1, check the status of the health check:

    FGT-Branch1 # diag sys sdwan health-check HUB1_HC
    Health Check(HUB1_HC):
    Seq(1 HUB1-VPN1): state(alive), packet-loss(0.000%) latency(186.795), jitter(0.199) sla_map=0x0
    Seq(2 HUB1-VPN2): state(alive), packet-loss(0.000%) latency(203.939), jitter(0.235) sla_map=0x0

    HUB1-VPN1 and HUB1-VPN2 are both above the latency threshold of 150ms. Its sla_map=0x0 indicates that the SLA is exceeded.

  4. On FGT-HUB1, check the routes learned from FGT-Branch1:

    FGT-HUB1 # get router info bgp network 192.168.136.0/24
    VRF 0 BGP routing table entry for 192.168.136.0/24
    Paths: (2 available, best #2, table Default-IP-Routing-Table)
      Advertised to non peer-group peers:
       192.168.132.2
      Advertised to peer-groups:
      VPN1 VPN2
      Original VRF 0
      Local, (Received from a RR-client)
        169.254.16.11 from 169.254.16.11 (192.168.136.1)
          Origin IGP metric 0, route tag 11, localpref 100, valid, internal, best
          Community: 65001:11
          Advertised Path ID: 1
           Last update: Fri Jul 23 12:44:41 2021
    
      Original VRF 0
      Local, (Received from a RR-client)
        169.254.17.11 from 169.254.17.11 (192.168.136.1)
          Origin IGP metric 0, route tag 11, localpref 100, valid, internal, best
          Community: 65001:11
          Advertised Path ID: 2
           Last update: Fri Jul 23 12:59:28 2021
    
    

    The route learned over both VPN1 and VPN2 from Branch1 has community string 65001:11, showing that the health checks for these links have failed. FGT-HUB1 assigns route-tag 11 to the routes. The Branch1 FortiGate has successfully updated the hub on the status of its VPN overlays.

  5. On FGT-HUB1, check the status of the SD-WAN service rules:

    FGT-HUB1 # diagnose sys sdwan service
    
    Service(10): Address Mode(IPV4) flags=0x200
      Gen(9), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(1):
        1: Seq_num(1 VPN1), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255
    
      Route tag address(1):
            192.168.138.0-192.168.138.255
    
    
    Service(11): Address Mode(IPV4) flags=0x200
      Gen(9), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(1):
        1: Seq_num(2 VPN2), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255
    
      Route tag address(1):
            192.168.138.0-192.168.138.255
    
    
    Service(12): Address Mode(IPV4) flags=0x200
      Gen(9), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(2):
        1: Seq_num(1 VPN1), alive, selected
        2: Seq_num(2 VPN2), alive, selected
      Src address(1):
            0.0.0.0-255.255.255.255
    
      Route tag address(1):
            192.168.136.0-192.168.136.255

    Service rules 10 and 11 do not show the destination route-tag address of 192.168.136.0/24, so neither rule is matched. Traffic falls to service rule 12, which catches failed route-tags. Traffic is steered in the order of VPN1 and VPN2.

  6. On the Hub PC, confirm that the pings are still flowing, and check the instant when the delays were introduced. Some pings might have experienced small delays and continued to pass through VPN2, but eventually theroutes were updated on the hub, and traffic passed through VPN1 again :

    64 bytes from 192.168.136.10: icmp_seq=910 ttl=61 time=205 ms
    64 bytes from 192.168.136.10: icmp_seq=911 ttl=61 time=205 ms
    64 bytes from 192.168.136.10: icmp_seq=912 ttl=61 time=205 ms
    64 bytes from 192.168.136.10: icmp_seq=913 ttl=61 time=188 ms
    64 bytes from 192.168.136.10: icmp_seq=914 ttl=61 time=188 ms
    64 bytes from 192.168.136.10: icmp_seq=915 ttl=61 time=188 ms
    
  7. On FGT-HUB1, check the SD-WAN service rules in the policy route format:

    FGT-HUB1 # diagnose firewall proute list
    list route policy info(vf=root):
    
    id=2130771978(0x7f01000a) vwl_service=10(ToBranches1) vwl_mbr_seq=1 dscp_tag=0xff 0xff flags=0x40 order-addr tos=0x00 tos_mask=0x00 protocol=0 sport=0-65535 iif=0 dport=1-65535 oif=10(VPN1)
    source(1): 0.0.0.0-255.255.255.255
    destination(1): 192.168.138.0-192.168.138.255
    hit_count=5 last_used=2021-07-23 12:44:19
    
    id=2130771979(0x7f01000b) vwl_service=11(ToBranches2) vwl_mbr_seq=2 dscp_tag=0xff 0xff flags=0x40 order-addr tos=0x00 tos_mask=0x00 protocol=0 sport=0-65535 iif=0 dport=1-65535 oif=11(VPN2)
    source(1): 0.0.0.0-255.255.255.255
    destination(1): 192.168.138.0-192.168.138.255
    hit_count=5 last_used=2021-07-23 12:44:47
    
    id=2130771980(0x7f01000c) vwl_service=12(ToBranches3) vwl_mbr_seq=1 2 dscp_tag=0xff 0xff flags=0x40 order-addr tos=0x00 tos_mask=0x00 protocol=0 sport=0-65535 iif=0 dport=1-65535 oif=10(VPN1) oif=11(VPN2)
    source(1): 0.0.0.0-255.255.255.255
    destination(1): 192.168.136.0-192.168.136.255
    hit_count=2 last_used=2021-07-23 12:59:33
    
  8. On both FGT-HUB1 and FGT-Branch1, check the session table to again confirm that traffic is hitting the right tunnel:

    • Hub:

      FGT-HUB1 # diagnose sys session filter dst 192.168.136.10
      FGT-HUB1 # diagnose sys session list
      
      session info: proto=1 proto_state=00 duration=2440 expire=59 timeout=0 flags=00000000 socktype=0 sockport=0 av_idx=0 use=3
      origin-shaper=
      reply-shaper=
      per_ip_shaper=
      class_id=0 ha_id=0 policy_dir=0 tunnel=VPN1_1/ vlan_cos=0/255
      state=log may_dirty f00
      statistic(bytes/packets/allow_err): org=204792/2438/1 reply=204792/2438/1 tuples=2
      tx speed(Bps/kbps): 83/0 rx speed(Bps/kbps): 83/0
      orgin->sink: org pre->post, reply pre->post dev=6->10/10->6 gwy=169.254.16.11/192.168.132.2
      hook=pre dir=org act=noop 192.168.128.10:3550->192.168.136.10:8(0.0.0.0:0)
      hook=post dir=reply act=noop 192.168.136.10:3550->192.168.128.10:0(0.0.0.0:0)
      misc=0 policy_id=2 auth_info=0 chk_client_info=0 vd=0
      serial=00004f53 tos=ff/ff app_list=0 app=0 url_cat=0
      sdwan_mbr_seq=1 sdwan_service_id=12
      rpdb_link_id=ff00000c rpdb_svc_id=0 ngfwid=n/a
      npu_state=0x3040000
      total session 1
      
    • Branch:

      FGT-Branch1 # diagnose sys session filter dst 192.168.136.10
      FGT-Branch1 # diagnose sys session list
      
      session info: proto=1 proto_state=00 duration=2462 expire=59 timeout=0 flags=00000000 socktype=0 sockport=0 av_idx=0 use=3
      origin-shaper=
      reply-shaper=
      per_ip_shaper=
      class_id=0 ha_id=0 policy_dir=0 tunnel=/HUB1-VPN1 vlan_cos=0/255
      state=log may_dirty npu f00 app_valid
      statistic(bytes/packets/allow_err): org=206640/2460/1 reply=206640/2460/1 tuples=2
      tx speed(Bps/kbps): 83/0 rx speed(Bps/kbps): 83/0
      orgin->sink: org pre->post, reply pre->post dev=9->6/6->9 gwy=192.168.136.10/169.254.16.1
      hook=pre dir=org act=noop 192.168.128.10:3550->192.168.136.10:8(0.0.0.0:0)
      hook=post dir=reply act=noop 192.168.136.10:3550->192.168.128.10:0(0.0.0.0:0)
      misc=0 policy_id=2 auth_info=0 chk_client_info=0 vd=0
      serial=00002910 tos=ff/ff app_list=2000 app=24466 url_cat=0
      sdwan_mbr_seq=0 sdwan_service_id=0
      rpdb_link_id=00000000 rpdb_svc_id=0 ngfwid=n/a
      npu_state=0x3041008
      total session 1