Fortinet Document Library

CLI Reference

waf machine-learning-policy

How an anomaly detection model is built

FortiWeb uses a machine learning model to analyze the parameters in your domain and decide whether each parameter value is legitimate. The model is built from a large number of parameter value samples collected from real requests to the domain.

When a sample is collected, the system generalizes it into a pattern. For example, “abcd_123@abc.com” and “abcdefgecdf_12345678@efg.com” are both generalized to the pattern “A_N@A.A”. The anomaly detection model is built from the patterns, not the raw samples.
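The generalization step can be pictured with a short Python sketch. It assumes, purely for illustration, that each run of letters collapses to "A", each run of digits collapses to "N", and every other character is kept unchanged. The rules FortiWeb actually applies are not documented here, and generalize is a hypothetical helper name.

```python
import re

def generalize(sample: str) -> str:
    """Collapse letter runs to 'A' and digit runs to 'N' (assumed rules)."""
    def replace(match: re.Match) -> str:
        return "A" if match.group()[0].isalpha() else "N"
    # A single pass, so an inserted 'A' or 'N' is never rescanned as a letter run.
    return re.sub(r"[A-Za-z]+|[0-9]+", replace, sample)

print(generalize("abcd_123@abc.com"))              # A_N@A.A
print(generalize("abcdefgecdf_12345678@efg.com"))  # A_N@A.A
```

Both samples from the example above map to the same pattern, which is why the model can be built on far fewer patterns than raw samples.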

FortiWeb analyzes the characteristics of the patterns and builds an initial model when 400 samples are collected. The system runs the initial model to detect anomalies, while it keeps collecting more samples to refine it.

Once the number of samples reaches 1200, the system evaluates whether the patterns have varied significantly since the initial model was built:

  • If very few new patterns have been generalized, the patterns are stable. The system switches the initial model to a standard model.
  • If many new patterns keep coming in, the system continues collecting samples to cover as many patterns as possible. It does not switch to the standard model until the patterns become stable.

The standard model is much more reliable and accurate than the initial model. However, your domains may change as new URLs are added and existing parameters take on new functions, so the mathematical model of a parameter may come to differ from what FortiWeb originally observed. To keep the machine learning model up to date, FortiWeb continues collecting new samples, discarding outdated patterns and introducing new ones.

Syntax

config waf machine-learning-policy

edit <machine-learning-policy_id>

set start-min-count <start-min-count_int>

set renovate-short-time <renovate-short-time_int>

set renovate-long-time <renovate-long-time_int>

set switch-min-count <switch-min-count_int>

set switch-percent <switch-percent_int>

set sliding-win-time <sliding-win-time_int>

set sub-window-size <sub-window-size_int>

set sub-window-count <sub-window-count_int>

set denoise-percent <denoise-percent_int>

set denoise-threshold <denoise-threshold_int>

set sample-limit-by-ip <sample-limit-by-ip_int>

set svm-model {xss | sql-injection | code-injection | command-injection | lfi-rfi | common-injection | remote-exploits}

set svm-type {standard | extended}

set anomaly-detection-threshold <anomaly-detection-threshold_int>

set trigger-potential <policy_name>

set action-anomaly {alert | alert_deny | block-period}

set block-period-anomaly <block-period_int>

set severity-definitely {High | Info | Low | Medium}

set trigger-definitely <policy_name>

set status {enable | disable}

set ip-expire-intval <int>

set ip-expire-cnts <int>

set ip-argcount-limit {enable | disable}

set ip-list-type {Trust | Black}

set url-replacer-policy <policy_name>

set threat-model {enable | disable}

set parameters-limit-per-conn {enable | disable}

config allow-domain-name

edit <allow-domain-name_id>

set domain-name <domain-name_str>

set domain-index <domain-index_id>

set character-set {AUTO | ISO-8859-1 | ISO-8859-2 | ISO-8859-3 | ISO-8859-4 | ISO-8859-5 | ISO-8859-6 | ISO-8859-7 | ISO-8859-8 | ISO-8859-9 | ISO-8859-10 | ISO-8859-15 | GB2312 | BIG5 | ISO-2022-JP | ISO-2022-JP-2 | Shift-JIS | ISO-2022-KR | UTF-8}

next

end

config source-ip-list

edit <source-ip-list_id>

set <ip>

next

end

next

end

 


Variable Description Default

<machine-learning-policy_id>

Enter the ID of the machine learning policy. It's the number displayed in the "#" column of the machine learning policy table on the Machine Learning Policy page. The valid range is 0–65535.

No default

start-min-count <start-min-count_int>

FortiWeb builds an initial model once the sample count reaches start-min-count.

400

renovate-short-time <renovate-short-time_int>

The system keeps updating the initial model. renovate-short-time defines how often FortiWeb updates the model while new patterns keep coming in.

The valid range is 15 to 1440 minutes.

15 (minutes)

renovate-long-time <renovate-long-time_int>

renovate-long-time defines how often FortiWeb updates the initial model even if no new pattern has been generalized from the samples collected in that period. For example, if you set the value to 8 (hours) and no new pattern appears in the past 8 hours, FortiWeb still updates the model every 8 hours.

The valid range is 8 to 720 hours.

8 (hours)

switch-min-count <switch-min-count_int>

When the number of samples reaches switch-min-count, FortiWeb will evaluate whether to build a standard model.

The valid range is 800 to 3000.

1200

switch-percent <switch-percent_int>

FortiWeb calculates the pattern ratio as follows:

pattern ratio = the number of generalized patterns / the number of raw samples * 100 (%)

When the pattern ratio is smaller than the switch-percent value you set, FortiWeb switches the initial model to the standard model.

The valid range is 2 to 20.

5 (%)
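As a worked illustration of the switch decision, the following Python sketch combines switch-min-count and switch-percent. The function name should_switch and the way the two settings are combined are assumptions based on the descriptions above, not FortiWeb's actual implementation.

```python
def should_switch(pattern_count: int, sample_count: int,
                  switch_min_count: int = 1200, switch_percent: int = 5) -> bool:
    """Return True when the initial model would be promoted (assumed logic)."""
    if sample_count < switch_min_count:
        return False  # the evaluation has not started yet
    ratio = pattern_count / sample_count * 100
    return ratio < switch_percent

print(should_switch(30, 1200))   # True: 2.5% is below 5%
print(should_switch(120, 1200))  # False: 10% is not below 5%
```

With the defaults, 1200 samples must generalize to fewer than 60 patterns before the standard model is built.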

sliding-win-time <sliding-win-time_int>

After the standard model is built, FortiWeb keeps updating it with the newest samples so that the model stays up to date when your domain changes, such as when new URLs are added and existing parameters provide new functions.

sliding-win-time defines how frequently FortiWeb updates the standard model.

The valid range is 15 to 1440 minutes.

15 (minutes)

sub-window-size <sub-window-size_int>

If no new pattern is generalized during the sliding-win-time, the system does not update the standard model until the number of samples reaches the sub-window-size.

The sub-window-size can be set to 50 or 100.

50

sub-window-count <sub-window-count_int>

Every time the standard model is updated, FortiWeb counts one sub-window. If the number of updates set by sub-window-count passes without any sample arriving for a pattern, FortiWeb considers the pattern outdated and discards it.

The sub-window-count can be set to 20, 40, or 80.

For example, if the sub-window-count is 20, FortiWeb discards a pattern when no sample has been collected for it across 20 consecutive model updates.

40
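The pattern aging described for sub-window-count can be sketched in Python as a per-pattern idle counter. The names age_patterns, idle, and seen are hypothetical, and the bookkeeping is an assumption drawn from the description, not FortiWeb's internal data structure.

```python
def age_patterns(idle: dict[str, int], seen: set[str],
                 sub_window_count: int = 40) -> dict[str, int]:
    """Apply one model update and return the surviving patterns."""
    survivors = {}
    for pattern in idle.keys() | seen:
        # A pattern with fresh samples restarts at 0; otherwise it ages by one update.
        count = 0 if pattern in seen else idle[pattern] + 1
        if count < sub_window_count:
            survivors[pattern] = count
    return survivors

idle = {"A_N@A.A": 0}
for _ in range(20):
    idle = age_patterns(idle, seen=set(), sub_window_count=20)
print(idle)  # {}
```

In this sketch a single new sample resets the counter, so only patterns that stay silent for the full run of consecutive updates are dropped.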

denoise-percent <denoise-percent_int>

Reducing noisy samples is important for building an accurate model.

During the sample collection period, the system ranks all the samples by their probabilities. The samples with the lowest probabilities are selected as noise reduction samples and are then filtered with denoise-threshold to determine whether each one is noise.

For example, if you set denoise-percent to 3, the 3% of samples with the lowest probabilities are selected as noise reduction samples.

The valid range is 1 to 10.

3 (%)
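The selection step can be illustrated with a few lines of Python. The helper name select_candidates is hypothetical, and rounding the candidate count up is an assumption; the rounding FortiWeb applies is not stated here.

```python
import math

def select_candidates(probabilities: list[float], denoise_percent: int = 3) -> list[float]:
    """Return the lowest denoise_percent % of the probabilities."""
    count = math.ceil(len(probabilities) * denoise_percent / 100)
    return sorted(probabilities)[:count]

probabilities = [i / 100 for i in range(1, 101)]  # 100 samples: 0.01 .. 1.00
print(select_candidates(probabilities))  # [0.01, 0.02, 0.03]
```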

denoise-threshold <denoise-threshold_int>

The system uses the following formula to determine whether the noise reduction samples are indeed noise:

The probability of the sample > μ + denoise-threshold * σ

μ is the average of the probabilities of the noisy samples. σ is the denoise standard deviation.

Picture a circle with most of the samples crowded in the center and a few samples scattered around its edge. If the probability of a sample is larger than "μ + denoise-threshold * σ", the sample lies far from the center cluster, which indicates that it might be an anomaly, that is, noise.

A larger denoise-threshold means the system tolerates a greater distance between a sample and the center cluster, so fewer samples are treated as noise.

To identify more samples as noise, set the denoise-threshold smaller.

The valid range is 1 to 10.

2

threat-model {enable | disable}

Enable to scan anomalies and verify whether they are attacks. The check uses the trained Support Vector Machine (SVM) models to determine whether an anomaly is a real attack.

enable

svm-model {xss | sql-injection | code-injection | command-injection | lfi-rfi | common-injection | remote-exploits}

Enable or disable threat models for different types of threats, such as cross-site scripting, SQL injection, and code injection. Currently, seven trained Support Vector Machine models are provided for seven attack types.

enable

svm-type {standard | extended}

If standard is selected, the system automatically disables the SVM models that easily trigger false positives.

If extended is selected, the system enables all SVM models.

standard

anomaly-detection-threshold <anomaly-detection-threshold_int>

The value of the anomaly-detection-threshold ranges from 1 to 10.

The system uses the following formula to decide whether a sample is an anomaly:

The probability of the sample > μ + the strictness level * σ

Here the strictness level is the anomaly-detection-threshold value. If the probability of the sample is larger than "μ + the strictness level * σ", the sample is identified as an anomaly.

μ and σ are calculated from the probabilities of all the samples collected during the sample collection period: μ is the average of all the parameters' probabilities and σ is the standard deviation. Both are fixed values, so "μ + the strictness level * σ" varies only with the strictness level you set. The smaller the strictness level, the stricter the anomaly detection model.

This option sets a global value for all parameters. To adjust the strictness level for a specific parameter, see Manage anomaly-detecting settings.

0.1
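The effect of the strictness level on the formula above can be shown with a small Python sketch. The function names and the sample values are made up for illustration, and the use of the population standard deviation is an assumption; FortiWeb's exact statistics are not specified here.

```python
import statistics

def anomaly_cutoff(probabilities: list[float], strictness: float) -> float:
    """Return mu + strictness * sigma over the collected probabilities."""
    mu = statistics.mean(probabilities)
    sigma = statistics.pstdev(probabilities)  # population std dev (assumed)
    return mu + strictness * sigma

def is_anomaly(probability: float, probabilities: list[float], strictness: float) -> bool:
    """Apply the documented comparison: probability > mu + strictness * sigma."""
    return probability > anomaly_cutoff(probabilities, strictness)

# Illustrative values only: mean is about 0.5, standard deviation about 0.2.
collected = [0.2, 0.4, 0.4, 0.4, 0.5, 0.5, 0.7, 0.9]
print(is_anomaly(0.8, collected, strictness=1))  # True: cutoff is about 0.7
print(is_anomaly(0.8, collected, strictness=3))  # False: cutoff is about 1.1
```

The same sample is flagged at strictness 1 but accepted at strictness 3, which matches the statement that a smaller value makes detection stricter.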

parameters-limit-per-conn {enable | disable}

Enable to avoid collecting samples only for parameters in the same connection. Anomaly detection is more effective when the system builds machine learning models for parameters distributed across different connections.

enable

action-anomaly {alert | alert_deny | block-period}

Choose the action FortiWeb takes when a definite attack is verified.
alert: Accepts the connection and generates an alert email and/or log message.
alert_deny: Blocks the request (or resets the connection) and generates an alert and/or log message.
block-period: Blocks the request for a certain period of time.
alert_deny

block-period-anomaly <block-period_int>

Enter the number of seconds to block the requests. The valid range is 1–3,600 seconds.
This option takes effect only when you choose Period Block in Action (block-period for action-anomaly).
600

severity-definitely {High | Info | Low | Medium}

Select the severity level for this anomaly type. The severity level will be displayed in the alert email and/or log message.

High

trigger-definitely <policy_name>

Select a trigger policy that you have set in Log&Report > Log Policy > Trigger Policy. If a definite anomaly is detected, the system sends email and/or log messages according to the trigger policy.

No default.

status {enable | disable}

Enable to change the status to Running; disable to change the status to Stopped.

enable

url-replacer-policy <policy_name>

Select the name of the URL Replacer Policy that you have created in Machine Learning Templates. If web applications have dynamic URLs or unusual parameter styles, you must adapt the URL Replacer Policy to recognize them.

No default.

trigger-potential <policy_name>

Select a trigger policy that you have set in Log&Report > Log Policy > Trigger Policy. If a potential anomaly is detected, the system sends email and/or log messages according to the trigger policy.

<allow-domain-name_id>

Enter the ID of the policy. The valid range is 1–65,535.

No default.

ip-list-type {Trust | Black}

Allow (Trust) or deny (Black) sample collection from the source IP list.

Trust

domain-name <domain-name_str>

Add the full domain name, or use the wildcard '*' to cover multiple domains under one profile.

No default.

domain-index <domain-index_id>

The number automatically assigned by the system when the domain name is created.

No default.

character-set {AUTO | ISO-8859-1 | ISO-8859-2 | ISO-8859-3 | ISO-8859-4 | ISO-8859-5 | ISO-8859-6 | ISO-8859-7 | ISO-8859-8 | ISO-8859-9 | ISO-8859-10 | ISO-8859-15 | GB2312 | BIG5 | ISO-2022-JP | ISO-2022-JP-2 | Shift-JIS | ISO-2022-KR | UTF-8}

The character encoding to use when the domain is set manually.

No default.

<source-ip-list_id>

Enter the ID of the source IP. The valid range is 1–9,223,372,036,854,775,807.

No default.

<ip>

Enter the IP range for the source IP list.

No default.

ip-expire-intval <int>

ip-expire-cnts <int>

A parameter is initially in unconfirmed status. It is set to confirmed if it appears in requests from a certain number of different source IPs within a given time period; otherwise, the parameter is discarded.

ip-expire-cnts defines the number of different source IPs, and ip-expire-intval defines the time period.

The valid range for ip-expire-intval is 1 to 24 hours, and the default value is 4.
The valid range for ip-expire-cnts is 1 to 5, and the default value is 3.

4/3
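The confirmation rule can be sketched in Python as follows. The function parameter_status and its input format are hypothetical, and the sketch assumes the time period starts at the first request that carries the parameter, which the description above does not state.

```python
from datetime import datetime, timedelta

def parameter_status(sightings, now, ip_expire_intval=4, ip_expire_cnts=3):
    """Classify a parameter as 'unconfirmed', 'confirmed', or 'discarded'.

    sightings: list of (timestamp, source_ip) tuples for requests carrying the parameter.
    """
    if not sightings:
        return "unconfirmed"
    first_seen = min(ts for ts, _ in sightings)
    deadline = first_seen + timedelta(hours=ip_expire_intval)  # assumed window start
    distinct_ips = {ip for ts, ip in sightings if ts <= deadline}
    if len(distinct_ips) >= ip_expire_cnts:
        return "confirmed"
    return "discarded" if now > deadline else "unconfirmed"

start = datetime(2024, 1, 1, 0, 0)
sightings = [(start, "192.0.2.1"),
             (start + timedelta(hours=1), "192.0.2.2"),
             (start + timedelta(hours=2), "192.0.2.3")]
print(parameter_status(sightings, now=start + timedelta(hours=3)))      # confirmed
print(parameter_status(sightings[:2], now=start + timedelta(hours=5)))  # discarded
```

With the defaults, three distinct source IPs within four hours confirm the parameter; two IPs are not enough, so the parameter is dropped once the window has passed.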

ip-argcount-limit {enable | disable}

Enable to limit each source IP to creating at most 20 new arguments every 30 minutes.

disable

sample-limit-by-ip <sample-limit-by-ip_int>

The maximum number of samples collected from each source IP. The valid range is 0–5000.

30

 
