Administration Guide

Machine learning troubleshooting

ML based Anomaly Detection does not learn parameters successfully

If no data is shown in the Machine Learning - Anomaly Detection module, first run grep ml to confirm the issue. If /bin/ml_init appears in the output, the module is not working at all, which is why no data is shown.

The following reasons might cause this issue:

  • Charset is not supported.

  • Request or response packets are not valid.

Verifying the charset

Machine Learning - Anomaly Detection collects samples of a domain to learn its URLs and parameters, including each parameter's name and value. If the ML based Anomaly Detection module can't recognize the charset used by the domain, it can't parse the data properly and therefore can't build a valid machine learning model.

The following charsets are supported:

BIG5; GB2312; ISO-2022-JP; ISO-2022-JP-2; ISO-2022-KR; ISO-8859-1; ISO-8859-2; ISO-8859-3; ISO-8859-4; ISO-8859-5; ISO-8859-6; ISO-8859-7; ISO-8859-8; ISO-8859-9; ISO-8859-10; ISO-8859-15; Shift-JIS; UTF-8;

FortiWeb identifies the charset defined in the Content-Type header of the response packets, for instance, Content-Type: text/html; charset=xxx. The charset can also be declared in the HTTP response body as <META …. charset=xxx">.

  • If the charset is supported, FortiWeb will continue the model building process.

  • If there isn't any charset defined, FortiWeb treats it as UTF-8 and continues the model building process. In the Overview section, the Page Charset is shown as Default.

  • If the charset defined in Content-Type isn't one of the supported charsets, the machine learning model can't be built.
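The three outcomes above can be sketched in a few lines of Python. This is only an illustration for checking captured response headers offline, not FortiWeb's actual parser: the function names are hypothetical, and the case-insensitive, exact-spelling comparison against the listed charset names is an assumption.

```python
import re

# Charset names exactly as listed above, lower-cased for comparison.
# Assumption: matching is case-insensitive and uses these exact spellings.
SUPPORTED_CHARSETS = {
    "big5", "gb2312", "iso-2022-jp", "iso-2022-jp-2", "iso-2022-kr",
    "iso-8859-1", "iso-8859-2", "iso-8859-3", "iso-8859-4", "iso-8859-5",
    "iso-8859-6", "iso-8859-7", "iso-8859-8", "iso-8859-9", "iso-8859-10",
    "iso-8859-15", "shift-jis", "utf-8",
}


def charset_from_content_type(content_type):
    """Return the lower-cased charset parameter of a Content-Type value, or None."""
    match = re.search(r'charset\s*=\s*"?([^\s;"]+)', content_type, re.IGNORECASE)
    return match.group(1).lower() if match else None


def classify_charset(content_type):
    """Map a Content-Type value to one of the three cases described above."""
    charset = charset_from_content_type(content_type)
    if charset is None:
        return "default"      # no charset defined: treated as UTF-8
    if charset in SUPPORTED_CHARSETS:
        return "supported"    # model building continues
    return "unsupported"      # model cannot be built


if __name__ == "__main__":
    for value in ("text/html; charset=UTF-8", "text/html", "text/html; charset=windows-1252"):
        print(value, "->", classify_charset(value))
```

Running the script against header values taken from a packet capture shows quickly whether an unsupported charset is the likely cause.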

Verifying the request and response packets

The system checks the charset defined in the response packets, so the response packets must be valid in the first place:

  • The response code must be 200 or 302.

  • FortiWeb buffers at most 2048 bytes of the HTTP response body. A charset declared beyond this range cannot be learned.
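As a rough offline check of these two conditions, the sketch below tests a captured response's status code and whether a charset declaration appears within the first 2048 bytes of the body. The function name is hypothetical, and searching for the literal text charset= is an assumption about how the declaration is located.

```python
MAX_BUFFERED_BODY = 2048          # bytes of response body that are buffered
LEARNABLE_STATUS_CODES = {200, 302}


def body_charset_visible(status_code, body):
    """Return True if the status code is acceptable and a charset
    declaration lies entirely within the buffered part of the body."""
    if status_code not in LEARNABLE_STATUS_CODES:
        return False
    window = body[:MAX_BUFFERED_BODY].lower()
    return b"charset=" in window


if __name__ == "__main__":
    early = b'<html><head><meta charset="utf-8"></head></html>'
    late = b" " * 3000 + b'<meta charset="utf-8">'
    print(body_charset_visible(200, early))   # declaration near the top
    print(body_charset_visible(200, late))    # declaration after byte 2048
    print(body_charset_visible(404, early))   # status code not accepted
```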

The request packet should have parameters in its URL or request body. Refer to the following examples.

  • Parameters in URL. In this example, the parameter is testargument=2000.

    http://www.testdomain.com/autotest/test.html?testargument=2000

  • Parameters in the body. In this example, the parameter is myparameter=123.

    POST /autotest/csh/mlarg3.php HTTP/1.1
    Connection: keep-alive
    Accept-Encoding: gzip, deflate
    Accept: */*
    User-Agent: python-requests/2.12.2
    Host: testmydomain
    Cookie: cookiesession1=3473FD0DAS38CIHAIRSOZ3D9RDVTB577;
    X-Forwarded-For: 2.2.2.2
    Content-Length: 15
    Content-Type: application/x-www-form-urlencoded

    myparameter=123
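To confirm that a test request actually carries parameters, you can parse it with Python's standard urllib.parse module. This is a minimal sketch with hypothetical helper names; it handles only query strings and application/x-www-form-urlencoded bodies and does not reproduce FortiWeb's own parsing.

```python
from urllib.parse import parse_qsl, urlsplit


def url_parameters(url):
    """Return (name, value) pairs from the query string of a URL."""
    return parse_qsl(urlsplit(url).query, keep_blank_values=True)


def form_body_parameters(body):
    """Return (name, value) pairs from a form-urlencoded request body."""
    return parse_qsl(body, keep_blank_values=True)


def has_parameters(url, body=""):
    """True if the request carries at least one parameter in URL or body."""
    return bool(url_parameters(url) or form_body_parameters(body))


if __name__ == "__main__":
    print(url_parameters("http://www.testdomain.com/autotest/test.html?testargument=2000"))
    print(form_body_parameters("myparameter=123"))
    print(has_parameters("http://www.testdomain.com/autotest/test.html"))
```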

The content-type of both the request body and response body should be one of the following:

  • multipart/related

  • application/soap+xml

  • text/xml, application/xml, application/vnd.syncml+xml, application/vnd.ms-sync.wbxml

  • multipart/form-data

  • text/html

  • application/x-www-form-urlencoded

  • text/plain

  • multipart/x-mixed-replace

  • application/rss+xml

  • application/xhtml+xml

  • application/json, text/json

Content types such as message/HTTP, application/rpc, application/x-amf, and application/vnd.syncml+wbxml are not supported.
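A captured Content-Type value can be checked against the supported list with a short Python sketch like the following. The helper names are hypothetical, and stripping parameters such as charset before a case-insensitive comparison is an assumption.

```python
# Media types listed above as supported.
SUPPORTED_CONTENT_TYPES = {
    "multipart/related", "application/soap+xml", "text/xml",
    "application/xml", "application/vnd.syncml+xml",
    "application/vnd.ms-sync.wbxml", "multipart/form-data", "text/html",
    "application/x-www-form-urlencoded", "text/plain",
    "multipart/x-mixed-replace", "application/rss+xml",
    "application/xhtml+xml", "application/json", "text/json",
}


def media_type(content_type):
    """Strip parameters (for example '; charset=utf-8') and lower-case the type."""
    return content_type.split(";", 1)[0].strip().lower()


def is_supported_content_type(content_type):
    """True if the media type is in the supported list above."""
    return media_type(content_type) in SUPPORTED_CONTENT_TYPES


if __name__ == "__main__":
    for value in ("application/json; charset=utf-8", "application/x-amf"):
        print(value, "->", is_supported_content_type(value))
```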

Run the following commands to get more debug info on ML based Anomaly Detection.

diagnose debug application machine-learning 7

diagnose debug enable

ML based Anomaly Detection status does not change from Unconfirmed to Running stage

1. Check whether “Collected Samples” has reached 400 (the default start-min-count), which is the default number of samples required to build an initial model.

2. Check whether new requests meet the requirements of ip-expire-intval (1-24 hours) and ip-expire-cnts (source IPs).

You can set both values to 1 to make testing easier.

3. If you send traffic from a single source with multiple X-Forwarded-For (XFF) addresses:

  • Enable Inline Protection Profile and select “Use X-Header to Identify Original Client's IP”.
  • Use public IP addresses for testing, not private IPs.
  • You can use curl to verify the functionality, but note that behavior varies between curl versions. Double-check the request that was actually sent by using a packet capture or the FortiWeb tlog.

    For example, with curl 7.68.0 on Ubuntu 20.04, the first curl command below causes the XFF IP 102.11.2.3 to be recognized as the “Original Source” in the tlog. On Windows 10 with curl 7.78.0, however, the first command does not result in the XFF IP being identified as the “Original Source”; the other three command formats take effect and trigger the machine learning process.

    curl http://direct.ama01.com/index.php?new_para=123 -H 'X-Forwarded-For:102.11.2.3'

    curl http://direct.ama01.com/index.php?new_para=123 -H "X-Forwarded-For:102.11.2.3"

    curl http://direct.ama01.com/index.php?new_para=123 -H X-Forwarded-For:102.11.2.3

    curl http://direct.ama01.com/index.php?new_para=123 -H X-FORWARDED-FOR:102.11.2.3
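When preparing a batch of XFF test addresses, it can help to filter out private ones first. The sketch below uses the is_global property of Python's standard ipaddress module as an approximation of "public"; how FortiWeb itself classifies addresses is not stated here, so treat that as an assumption. The function names are hypothetical.

```python
import ipaddress


def is_public_ip(address):
    """Approximate 'public' as globally routable per the ipaddress module."""
    return ipaddress.ip_address(address).is_global


def xff_headers(candidates):
    """Build one X-Forwarded-For header dict per public candidate address."""
    return [{"X-Forwarded-For": ip} for ip in candidates if is_public_ip(ip)]


if __name__ == "__main__":
    # Private addresses are dropped; only 102.11.2.3 yields a header.
    for headers in xff_headers(["102.11.2.3", "192.168.1.10", "10.0.0.1"]):
        print(headers)
```

The resulting header dictionaries can be passed to whichever HTTP client you use for testing. As with curl, verify the request actually sent with a packet capture or the FortiWeb tlog.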

ML based Anomaly Detection does not block traffic

1. In Web Protection > ML Based Anomaly Detection > Tree View, click Test Sample, then enter a parameter value to verify whether it will be detected as an anomaly at the current strictness level.

The HMM model evaluates a parameter first. Only if it recognizes the parameter as an anomaly is the parameter then sent to the SVM model to double-check whether it is a real attack.

2. Check whether FortiWeb is operating in Active-Active-Standard or Active-Active-High-Volume mode. ML Based Anomaly Detection is not yet supported in these modes on 6.3 and 6.4.

This issue has been resolved in FortiWeb 7.0 and later releases.

Machine Learning - Anomaly Detection upgrade and compatibility issues

FortiWeb 6.4 uses Redis, whereas 6.3 uses MySQL, so old machine learning data is lost after upgrading from 6.3 to 6.4.

Upgrading from 6.3/6.4 to 7.0 is supported.
