waf machine-learning-policy
How an anomaly detection model is built
FortiWeb uses a machine learning model to analyze the parameters in your domain and decide whether each parameter value is legitimate. The model is built from a large number of parameter value samples collected from real requests to the domain.
When a sample is collected, the system generalizes it into a pattern. For example, “abcd_123@abc.com” and “abcdefgecdf_12345678@efg.com” are both generalized to the pattern “A_N@A.A”. The anomaly detection model is built from the patterns, not the raw samples.
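The generalization step can be illustrated with a minimal Python sketch. It assumes that a run of letters becomes "A", a run of digits becomes "N", and all other characters are kept; this rule is inferred only from the two examples above and is not FortiWeb's actual implementation. The function name generalize is hypothetical.

```python
import re

# Assumption: a run of ASCII letters collapses to "A", a run of digits to "N",
# and every other character is kept. This mirrors the two documented examples
# only; the rules FortiWeb actually applies are not specified here.
_RUN = re.compile(r"[A-Za-z]+|[0-9]+")

def generalize(sample: str) -> str:
    """Collapse letter and digit runs into single class markers."""
    return _RUN.sub(lambda m: "A" if m.group()[0].isalpha() else "N", sample)

print(generalize("abcd_123@abc.com"))              # A_N@A.A
print(generalize("abcdefgecdf_12345678@efg.com"))  # A_N@A.A
```

Because both samples map to the same pattern, they would count as one pattern rather than two distinct values.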
FortiWeb analyzes the characteristics of the patterns and builds an initial model when 400 samples are collected. The system runs the initial model to detect anomalies, while it keeps collecting more samples to refine it.
Once the number of samples reaches 1200, the system evaluates whether the patterns have changed significantly since the initial model was built:
- If very few new patterns have been generalized, the patterns are considered stable. The system switches from the initial model to a standard model.
- If many new patterns keep coming in, the system continues collecting samples to cover as many patterns as possible. It does not switch to the standard model until the patterns become stable.
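The stages described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the sample counts 400 and 1200 come from the text, while the function name model_stage, the new_pattern_ratio input, and the 0.05 stability cutoff are hypothetical and are not taken from FortiWeb.

```python
def model_stage(sample_count: int,
                new_pattern_ratio: float,
                start_min_count: int = 400,     # from the text above
                switch_min_count: int = 1200,   # from the text above
                stable_ratio: float = 0.05) -> str:  # hypothetical cutoff
    """Return which stage the model would be in for a parameter."""
    if sample_count < start_min_count:
        return "collecting"   # not enough samples for an initial model
    if sample_count < switch_min_count:
        return "initial"      # initial model runs while samples accumulate
    # Enough samples: switch only if few new patterns are still appearing.
    return "standard" if new_pattern_ratio <= stable_ratio else "initial"

print(model_stage(399, 0.0))    # collecting
print(model_stage(400, 0.5))    # initial
print(model_stage(1200, 0.01))  # standard
print(model_stage(1500, 0.30))  # initial
```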
The standard model is much more reliable and accurate than the initial model. However, your domains may change as new URLs are added and existing parameters provide new functions. This means the mathematical model of a given parameter might differ from what FortiWeb originally observed. To keep the machine learning model up to date, FortiWeb continues collecting new samples to update it, discarding outdated patterns and introducing new ones.
To use this command, your administrator account’s access control profile must have either w or rw permission to the wafgrp area. For details, see Permissions.
Syntax
config waf machine-learning-policy
edit <machine-learning-policy_id>
set start-min-count <start-min-count_int>
set renovate-short-time <renovate-short-time_int>
set waf machine-learning-policy
set switch-min-count <switch-min-count_int>
set switch-percent <switch-percent_int>
set sliding-win-time <sliding-win-time_int>
set sub-window-size <sub-window-size_int>
set waf machine-learning-policy
set denoise-percent <denoise-percent_int>
set denoise-threshold <denoise-threshold_int>
set sample-limit-by-ip <sample-limit-by-ip_int>
set svm-type {standard | extended}
set anomaly-detection-threshold <anomaly-detection-threshold_int>
set waf machine-learning-policy
set action-anomaly {alert | alert_deny | block-period}
set block-period-anomaly <block-period_int>
set severity-definitely {High | Info | Low | Medium}
set trigger-definitely <policy_name>
set ip-argcount-limit {enable | disable}
set ip-list-type {Trust | Black}
set url-replacer-policy <policy_name>
set threat-model {enable | disable}
set parameters-limit-per-conn {enable | disable}
config allow-domain-name
set domain-name <domain-name_str>
set domain-index <domain-index_id>
set hmm-probability-sample-length-check {enable | disable}
set sample-length-threshold <int>
set hmm-probability-threshold <int>
next
end
config source-ip-list
edit <source-ip-list_id>
set <ip>
next
end
next
end
Variable | Description | Default |
Enter the ID of the machine learning policy. It's the number displayed in the "#" column of the machine learning policy table on the Machine Learning Policy page. The valid range is 0–65535. | No default
|
|
An initial model will be built if the sample count reaches |
|
|
The system keeps updating the initial model. The valid range is 15 to 1440. |
15 (minutes) |
|
The valid range is 8 to 720. |
8 (hours) |
|
When the number of samples reaches The valid range is 800 to 3000. |
|
|
When the The valid range is 2 to 20. |
|
|
After the standard model is built, FortiWeb keeps updating it according to the newest samples so that the model can be up to date even when your domain changes, such as when new URLs are added and existing parameters provide new functions.
The valid range is 15-1440 in minutes. |
15 (minutes) |
|
If there isn't any new pattern generalized during the The
|
50 |
|
Every time the standard model is updated, FortiWeb counts it as one The For example, assuming the |
40 |
|
It's important to reduce the noisy samples in order to build an accurate model. During the sample collecting period, the system ranks all the samples by their probabilities. The ones with the lowest probabilities will be selected as noisy reduction samples, and will be filtered further with For example, if you set The valid range is 1 to 10. |
3 (%) |
|
The system uses the following formula to determine whether the noisy reduction samples are indeed noise: The probability of the sample > μ + the strictness level * σ μ is the average probability of the noisy samples. σ is the denoise standard deviation. Picture a circle with most of the samples crowded in the center and several samples scattered around the edge. If the probability of the sample is larger than the value of "μ + the strictness level * σ", the sample lies far away from the center cluster, which indicates that it might be an anomaly, that is, noise. If you set the If you want to identify more samples as noises, set the The valid range is 1 to 10. |
2 |
|
Enable to scan anomalies to verify whether they are attacks. It provides a method to check whether an anomaly is a real attack by the trained Support Vector Machine Model. |
|
|
svm-model {xss | sql-injection | code-injection | command-injection | lfi-rfi | common-injection | remote-exploits} |
Enable or disable threat models for different types of threats, such as cross-site scripting, SQL injection, and code injection. Currently, seven trained Support Vector Machine models are provided, one for each of seven attack types. | enable
|
If If |
|
|
anomaly-detection-threshold <anomaly-detection-threshold_int> |
The value of anomaly-detection-threshold ranges from 1 to 10. The system uses the following formula to calculate the anomaly threshold: The probability of the anomaly > μ + the strictness level * σ If the probability of the sample is larger than the value of "μ + the strictness level * σ", the sample is identified as an anomaly. μ and σ are calculated from the probabilities of all the samples collected during the sample collection period, where μ is the average of all the parameters' probabilities and σ is the standard deviation. Both are fixed values, so the value of "μ + the strictness level * σ" varies only with the strictness level you set. The smaller the strictness level, the stricter the anomaly detection model. This option sets a global value for all parameters. If you want to adjust the strictness level for a specific parameter, see Manage anomaly-detecting settings. |
0.1
|
Enable to avoid collecting samples solely for the parameters in the same connection. The anomaly detection will be more effective if the system builds machine learning models for parameters diversely distributed in different connections. |
|
|
Choose the action FortiWeb takes when a definite attack is verified.
- alert: Accepts the connection and generates an alert email and/or log message.
- alert_deny: Blocks the request (or resets the connection) and generates an alert and/or log message.
- block-period: Blocks the request for a certain period of time. |
alert_deny
|
|
block-period-anomaly <block-period_int> |
Enter the number of seconds that you want to block the requests. The valid range is 1–3,600 seconds. This option only takes effect when you choose Period Block in Action. |
600
|
severity-definitely {High | Info | Low | Medium} |
Select the severity level for this anomaly type. The severity level will be displayed in the alert email and/or log message. | High
|
Select a trigger policy that you have set in Log&Report > Log Policy > Trigger Policy. If a definite anomaly is detected, it triggers the system to send email and/or log messages according to the trigger policy. | No default. | |
Enable to change the status to Running, while disable to change the status to Stopped. | enable
|
|
Select the name of the URL Replacer Policy that you have created in Machine Learning Templates. If web applications have dynamic URLs or unusual parameter styles, you must adapt the URL Replacer Policy to recognize them. | No default. | |
Select a trigger policy that you have set in Log&Report > Log Policy > Trigger Policy. If a potential anomaly is detected, it triggers the system to send email and/or log messages according to the trigger policy. | ||
Enter the ID of the policy. The valid range is 1–65,535. | No default. | |
Allow or deny sample collection from the Source IP list. | Trust
|
|
Add full domain name or use wildcard '*' to cover multiple domains under one profile. | No default. | |
The number automatically assigned by the system when the domain name is created. | No default. | |
Enable to check whether the parameter value has an unexpected length or a high anomaly probability. |
disable |
|
If the length of the parameter value is larger than the specified threshold, the system does not send it to the SVM model for further validation. Instead, it is treated directly as an anomaly. The valid range is 0-1,024. 0 means not applicable. |
0 |
|
If the anomaly probability of the parameter value is larger than the specified threshold, the system does not send it to the SVM model for further validation. Instead, it is treated directly as an anomaly. The valid range is 0-2,000. 0 means not applicable. If you are not sure how to set a proper probability value, there are two places you can refer to:
|
0 |
|
character-set {AUTO | ISO-8859-1 | ISO-8859-2 | ISO-8859-3 | ISO-8859-4 | ISO-8859-5 | ISO-8859-6 | ISO-8859-7 | ISO-8859-8 | ISO-8859-9 | ISO-8859-10 | ISO-8859-15 | GB2312 | BIG5 | ISO-2022-JP | ISO-2022-JP-2 | Shift-JIS | ISO-2022-KR | UTF-8} |
The corresponding character code when manually setting the domain. | No default. |
Enter the ID of the source IP. The valid range is 1–9,223,372,036,854,775,807 | No default. | |
Enter the IP range for the source IP list. | No default. | |
A parameter is initially in unconfirmed status. It is set to confirmed if it appears in requests from a certain number of different source IPs within the given time. Otherwise, the parameter is discarded.
The valid range for |
4/3 |
|
Enable so that each source IP can create at most 20 new arguments every 30 minutes. |
disable |
|
The maximum number of samples collected from each IP. The valid range is 0–5000. | 30
|
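The threshold formula given for anomaly-detection-threshold, "μ + the strictness level * σ", can be illustrated with a short Python sketch. It assumes a population standard deviation and made-up probability values; the function names and the sample data are hypothetical, and the way FortiWeb actually computes μ and σ is not specified in this document.

```python
from statistics import mean, pstdev

def anomaly_threshold(training_probs, strictness):
    """Compute mu + strictness * sigma over the collected probabilities."""
    mu = mean(training_probs)
    sigma = pstdev(training_probs)  # assumption: population standard deviation
    return mu + strictness * sigma

def is_anomaly(prob, training_probs, strictness):
    """A sample is flagged when its probability exceeds the threshold."""
    return prob > anomaly_threshold(training_probs, strictness)

# Illustrative values only, not real FortiWeb data.
collected = [1.0, 2.0, 3.0, 4.0, 5.0]

print(is_anomaly(5.0, collected, 1))  # True
print(is_anomaly(5.0, collected, 2))  # False
```

With the same sample, a strictness level of 1 flags it and a level of 2 does not, which matches the statement that a smaller strictness level makes the model stricter.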