User Guide

ML Based Bot Detection

The AI-based bot detection model complements the existing signature-based and threshold-based rules. It detects sophisticated bots and CC attacks that can sometimes go undetected by those rules.

Compared with traditional bot detection mechanisms, the ML based bot detection model saves you the trouble of experimenting to find an appropriate threshold for abnormal user behavior. For example, how many HTTP requests from a single user should be considered abnormal? With the traditional mechanism, you may need to try different threshold values and keep checking the attack log until regular traffic no longer produces related attack logs.

Things are much easier with the ML based bot detection model. FortiWeb Cloud uses the SVM (Support Vector Machine) algorithm to build a bot detection model that self-learns the traffic profiles of regular clients. When traffic from a new client flows in, it is compared against that of the regular clients. If the two don't match, the bot detection model classifies the new client as an anomaly. When the traffic profiles of the regular clients vary dramatically (for example, the functions of your application have changed, so users behave differently when they visit it), FortiWeb Cloud automatically refreshes the bot detection model to adapt to the changes.

Moreover, testing shows that the bot detection model performs much better than the traditional mechanisms, especially when detecting crawlers and scrapers. Traffic is evaluated across 13 dimensions, which helps increase detection accuracy and decrease the false positive rate.

To configure an ML based bot detection rule:

  1. Go to BOT MITIGATION > ML Based Detection (Beta).
    You must have already enabled this module in Add Modules. See How to add or remove a module.
  2. Select the Model Settings tab.
  3. Configure the following settings.
  4. Client Identification Method

    FortiWeb Cloud collects samples from real users to build a machine learning model. Select whether to identify a user by IP, by IP and User-Agent, or by Cookie.

    • IP: The traffic data in one sample should come from the same source IP.

    • IP and User-Agent: The traffic data in one sample should come from the same source IP and User-Agent (the browser).

    • Cookie: The traffic data in one sample should have the same cookie value.
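The three identification methods above can be pictured as three different grouping keys. The following is a minimal sketch, not FortiWeb Cloud's implementation: the request dictionaries, the method names, and the `client_key` and `group_into_samples` helpers are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def client_key(request, method):
    """Return the tuple that identifies the client behind one request."""
    if method == "ip":
        return (request["ip"],)
    if method == "ip_and_user_agent":
        return (request["ip"], request["user_agent"])
    if method == "cookie":
        return (request["cookie"],)
    raise ValueError(f"unknown identification method: {method!r}")


def group_into_samples(requests, method):
    """Group requests so that each group holds traffic from one client only."""
    samples = defaultdict(list)
    for request in requests:
        samples[client_key(request, method)].append(request)
    return dict(samples)


# Three requests from one source IP: two from a browser, one from a script.
requests = [
    {"ip": "203.0.113.5", "user_agent": "Firefox", "cookie": "a1"},
    {"ip": "203.0.113.5", "user_agent": "curl", "cookie": "b2"},
    {"ip": "203.0.113.5", "user_agent": "Firefox", "cookie": "a1"},
]

for method in ("ip", "ip_and_user_agent", "cookie"):
    print(method, len(group_into_samples(requests, method)))
```

With IP alone, all three requests fall into a single group. The other two methods split them into two groups, which matters when several users share one source IP.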

    Model Type

    Multiple models are built during the model building stage. The system uses training accuracy, cross-validation value, and testing accuracy to select qualified models.

    The Model Type is used to select the one final model out of all the qualified models.

    • If you configure the Model Type to Moderate, the system chooses the model which has the highest training accuracy among all the qualified models.

    • If you configure the Model Type to Strict, the system chooses the model which has the lowest training accuracy among all the qualified models.

    The Strict Model has a higher likelihood of identifying anomalies, but also carries the risk of incorrectly identifying regular users as bots.

    The Moderate Model is relatively lenient, making it less prone to false positive detections, but it comes with the risk of allowing actual bots to go undetected.

    There isn't a perfect option for every situation. Whichever model type you choose, you can always leverage the options in Anomaly Detection Settings and Action Settings to mitigate the side effects, for example, using Bot Confirmation to avoid false positive detections.
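The selection rule for Model Type described above can be sketched as follows. This is an illustration under stated assumptions: the `Candidate` class, the `MIN_SCORE` cutoff, and the `is_qualified` test are hypothetical, because the guide does not say how the three metrics are combined to decide which models qualify.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    training_accuracy: float
    cross_validation: float
    testing_accuracy: float


# Hypothetical cutoff: the guide does not state the real qualification criteria.
MIN_SCORE = 0.90


def is_qualified(model):
    """Assumed rule: all three metrics must reach the cutoff."""
    return min(model.training_accuracy,
               model.cross_validation,
               model.testing_accuracy) >= MIN_SCORE


def select_model(candidates, model_type):
    """Pick the final model from the qualified candidates."""
    qualified = [m for m in candidates if is_qualified(m)]
    if not qualified:
        raise ValueError("no qualified models")
    if model_type == "moderate":
        # Moderate: highest training accuracy among the qualified models.
        return max(qualified, key=lambda m: m.training_accuracy)
    if model_type == "strict":
        # Strict: lowest training accuracy among the qualified models.
        return min(qualified, key=lambda m: m.training_accuracy)
    raise ValueError(f"unknown model type: {model_type!r}")


candidates = [
    Candidate("a", 0.99, 0.95, 0.96),
    Candidate("b", 0.93, 0.94, 0.92),
    Candidate("c", 0.999, 0.80, 0.85),  # fails the assumed cutoff
]
```

Here `select_model(candidates, "moderate")` returns candidate `a` and `select_model(candidates, "strict")` returns candidate `b`. Candidate `c` is never chosen because it does not qualify.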

    Anomaly Count

    If the system detects a certain number of anomalies from a user, it takes actions such as sending alert emails or blocking the traffic from this user.

    Anomaly Count controls how many anomalies are allowed for each user.

    For example, suppose Anomaly Count is set to 4 and the system has detected 3 anomalies in the last 6 samples. If the 7th sample is also detected as an anomaly, the system takes action.

    Note that if no valid traffic is collected for the 7th sample (for example, because the user has left your application), the system clears the anomaly count and the user information. If the user revisits your application, they are treated as a new user and anomaly counting starts afresh.

    Because this option tolerates a certain number of anomalies from a user, it can be a good choice if you want to avoid false positive detections.
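The Anomaly Count example above can be traced with a small counter. This is a minimal sketch and not the product's logic: the `AnomalyTracker` class is hypothetical, and it assumes a plain running count per user with no sliding window, since the guide does not say whether older anomalies expire.

```python
class AnomalyTracker:
    """Hypothetical per-user anomaly counter."""

    def __init__(self, anomaly_count):
        self.anomaly_count = anomaly_count
        self._counts = {}  # client -> anomalies seen so far

    def observe(self, client, is_anomaly):
        """Record one sample; return True when the system should take action.

        is_anomaly is True or False for a classified sample, or None when
        no valid traffic was collected for the sample.
        """
        if is_anomaly is None:
            # No valid traffic: forget the user and the anomaly count.
            self._counts.pop(client, None)
            return False
        count = self._counts.get(client, 0) + (1 if is_anomaly else 0)
        self._counts[client] = count
        return bool(is_anomaly) and count >= self.anomaly_count


tracker = AnomalyTracker(anomaly_count=4)
# Six samples containing three anomalies: no action yet.
first_six = [True, False, True, False, False, True]
results = [tracker.observe("user-1", s) for s in first_six]
# The 7th sample is the fourth anomaly, so action is taken.
seventh = tracker.observe("user-1", True)
```

After the six samples, every entry in `results` is `False`, and `seventh` is `True`. Passing `None` for a sample instead clears the count, so the same user would start from zero on a later visit.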

    Challenge

    If a bot is detected, the system uses one of the following methods to confirm that it is indeed a bot.

    • Disable: Do not execute browser verification.

    • Real Browser Enforcement: The system sends JavaScript code to the client to verify whether it is a web browser.

    • CAPTCHA Enforcement: The system requires clients to successfully complete a CAPTCHA challenge.

    The system triggers the action policy if the traffic does not come from a web browser.

    Block Duration

    Enter the number of seconds that you want to block the requests. The valid range is 1–3,600 seconds (1 hour).

    This option only takes effect when you choose Period Block in Action.

    Source IP List

    Click Create New to list the source IP ranges of the samples. FortiWeb Cloud will collect samples from the specified IP ranges.

    Exception URLs

    Due to the nature of some web pages, such as a stock list page, even regular users may behave like bots because they tend to refresh the page frequently. You may need to add these URLs to the exception list. Otherwise, the model may be invalid because too many bot-like behaviors are recorded in the samples.

    Click Create New to list exception URLs. The system will collect samples for any URL except the ones in the Exception URLs list.
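Taken together, Source IP List and Exception URLs act as a filter on which requests contribute to the samples. The sketch below illustrates that filter under several assumptions: the function names are hypothetical, ranges are written as `first-last` or as a single address, and exception URLs are matched as exact paths. The guide does not specify the accepted formats, and the sketch does not model what happens when the Source IP List is empty.

```python
import ipaddress


def parse_range(text):
    """Parse "first-last" or a single address into an inclusive pair."""
    start, _, end = text.partition("-")
    first = ipaddress.ip_address(start.strip())
    last = ipaddress.ip_address(end.strip()) if end else first
    return first, last


def should_collect_sample(source_ip, url_path, source_ranges, exception_urls):
    """Return True if a request may contribute to the training samples."""
    addr = ipaddress.ip_address(source_ip)
    # Compare only addresses of the same IP version.
    in_range = any(
        first.version == addr.version and first <= addr <= last
        for first, last in source_ranges
    )
    # Assumed exact-path matching for the exception list.
    return in_range and url_path not in exception_urls


source_ranges = [parse_range("192.0.2.10-192.0.2.20"),
                 parse_range("198.51.100.7")]
exception_urls = {"/stocks"}
```

For instance, a request from `192.0.2.15` to `/home` is collected, while the same client's requests to `/stocks` are skipped, as are requests from `192.0.2.21`, which lies outside both ranges.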

  5. From the top right corner, select the action that FortiWeb Cloud takes when it detects a violation of the rule.
    To configure the actions, you must first enable the Advanced Configuration in Global > System Settings > Settings.

    Alert

    Accept the request and generate a log message.

    Alert & Deny

    Block the request (or reset the connection) and generate a log message.

    Period Block

    Block the current request and all subsequent requests from the same client for the blocking period. The default blocking period is 10 minutes. You can change this value in Block Duration.

    Deny (no log)

    Block the request (or reset the connection) without generating a log message.

  6. Click SAVE.
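
The four actions in step 5 and the Block Duration setting can be summarized in a short sketch. It is illustrative only: the `ACTIONS` table and the `ActionPolicy` class are hypothetical names, time is passed in as plain seconds, and the guide does not say whether Period Block generates a log message, so that entry is left as `None`.

```python
# Action name -> (request is blocked, log message is generated).
# None means the guide does not say whether a log message is generated.
ACTIONS = {
    "alert": (False, True),
    "alert_and_deny": (True, True),
    "period_block": (True, None),
    "deny_no_log": (True, False),
}


class ActionPolicy:
    """Hypothetical model of the action taken on a rule violation."""

    def __init__(self, action, block_duration=600):
        if action not in ACTIONS:
            raise ValueError(f"unknown action: {action!r}")
        # Valid range from the guide: 1 to 3,600 seconds; default 10 minutes.
        if not 1 <= block_duration <= 3600:
            raise ValueError("block_duration must be 1-3600 seconds")
        self.action = action
        self.block_duration = block_duration
        self._blocked_until = {}  # client -> time when the block expires

    def on_violation(self, client, now):
        """Handle a detected violation; return (blocked, logged)."""
        blocked, logged = ACTIONS[self.action]
        if self.action == "period_block":
            # Block Duration only matters for Period Block.
            self._blocked_until[client] = now + self.block_duration
        return blocked, logged

    def is_blocked(self, client, now):
        """Return True while a Period Block is still active for the client."""
        return now < self._blocked_until.get(client, float("-inf"))


policy = ActionPolicy("period_block", block_duration=600)
policy.on_violation("client-1", now=0)
```

After the violation at time 0, `policy.is_blocked("client-1", now=599)` is true and `policy.is_blocked("client-1", now=600)` is false. With any other action, `block_duration` is accepted but has no effect.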
