User Guide

Overview

FortiSIEM 7.0.0 introduces Machine Learning, which allows you to run various Machine Learning tasks on data collected by FortiSIEM.

Machine Learning Job Types

The following Machine Learning Tasks are supported:

  • Anomaly Detection – The objective of an Anomaly Detection task is to learn what is normal in a dataset, and then create an alert if new values deviate from that normal by more than a user-specified threshold. Learning is done during the Training phase and alert creation is done during the Inference phase. The dataset for both Training and Inference phases is provided by running FortiSIEM reports.
  • Classification – The objective of a Classification task is to learn how to assign labels to items based on various fields in a dataset, and then assign a label to a new item based on its current values. This requires labels to be present in the dataset. Labels can be binary (for example, malware/not-malware or spam/not-spam) or can belong to more than two classes. Learning the label assignment is done during the Training phase and assigning labels to new data is done during the Inference phase. The dataset for both Training and Inference phases is provided by running FortiSIEM reports.
  • Clustering – The objective of a Clustering task is to group similar items based on a set of fields in a dataset, and then create an alert if, based on its current values, an item belongs to a different group. Learning the groups is done during the Training phase and alert creation is done during the Inference phase. The dataset for both Training and Inference phases is provided by running FortiSIEM reports.
  • Forecasting – The objective of a Forecasting task is to learn a time-based trend of a field in a dataset, and then predict future values of that field. Learning the time-based trend is done during the Training phase and future prediction is done during the Inference phase. The dataset for both Training and Inference phases is provided by running FortiSIEM reports.
  • Regression – The objective of a Regression task is to build a model for predicting a Target field based on other fields in a dataset, and then create an alert if new values of the Target field exceed the predicted value by more than a user-specified threshold. Building the prediction model is done during the Training phase and alert creation is done during the Inference phase. The dataset for both Training and Inference phases is provided by running FortiSIEM reports.
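
The split between the Training and Inference phases can be illustrated with a minimal Python sketch of an Anomaly Detection task. This is not FortiSIEM's implementation: it assumes a simple statistical-deviation approach (mean and standard deviation), and the function names, the sample data, and the threshold value are all hypothetical.

```python
from statistics import mean, stdev


def train(values):
    """Training phase: summarize what "normal" looks like in the dataset."""
    return {"mean": mean(values), "stdev": stdev(values)}


def infer(model, new_values, threshold):
    """Inference phase: return the new values that deviate from normal.

    A value is flagged when it lies more than `threshold` standard
    deviations away from the mean learned during training.
    """
    limit = threshold * model["stdev"]
    return [v for v in new_values if abs(v - model["mean"]) > limit]


# Hypothetical report output, for example logins per hour for one user.
training_data = [10, 12, 11, 9, 10, 13, 11, 10]
model = train(training_data)

# Hypothetical new values produced by a later run of the same report.
alerts = infer(model, [11, 12, 40], threshold=3.0)
print(alerts)
```

In this sketch only the value 40 is far enough from the learned mean to be reported; raising the threshold makes the check less sensitive.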

Running Modes and Algorithms

The following Machine Learning running modes are supported:

  • Local – In this mode, both Training and Inference phases run on the Supervisor and Worker cluster. During the Training phase, the user must choose the Machine Learning algorithm and its parameters.
  • Local Auto – In this mode, both Training and Inference phases run on the Supervisor and Worker cluster. FortiSIEM picks the best Machine Learning algorithm with a tuned parameter set. Compared to Local mode, the Training phase in Local Auto mode takes significantly longer to complete, since many algorithms are evaluated and the performance of each algorithm is optimized by tuning its parameter set.
  • AWS – In this mode, both the Training and Inference phases run on AWS. For Regression, Binary Classification, and Multiclass Classification, the algorithm parameters can be auto-tuned on AWS, and the user can also define the Object Metric for the Hyperparameter Tuning Job. For Anomaly Detection and Clustering, however, the user must configure the algorithm parameters manually.
  • AWS Auto – In this mode, Amazon SageMaker Autopilot automatically builds, trains, and tunes the best machine learning models without manual model selection and tuning. Although training might take longer (2–3 hours), Autopilot can still be suitable for a complex dataset with a large number of features, because it can automatically perform feature engineering and select the best model, helping you achieve good results in a short amount of time.
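
The reason an automatic mode takes longer than a manual one can be sketched in Python: every candidate algorithm is trained once per parameter set and scored on held-out data, and only the best combination is kept. The sketch below is an illustration under stated assumptions, not the search procedure FortiSIEM or SageMaker actually uses; the two toy candidate models, the parameter grid, and the data are hypothetical.

```python
from itertools import product
from statistics import mean


def constant_model(train, _params):
    """Candidate 1: always predict the training mean (no parameters)."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def knn_model(train, params):
    """Candidate 2: average the targets of the k nearest training points."""
    k = params["k"]

    def predict(x):
        nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
        return mean(y for _, y in nearest)

    return predict


def mean_abs_error(predict, rows):
    """Score a fitted model on rows it has not seen during training."""
    return mean(abs(predict(x) - y) for x, y in rows)


def auto_select(candidates, train, validation):
    """Try every candidate with every parameter set; keep the best one."""
    best = None
    for name, (build, grid) in candidates.items():
        keys = list(grid)
        for combo in product(*(grid[key] for key in keys)):
            params = dict(zip(keys, combo))
            error = mean_abs_error(build(train, params), validation)
            if best is None or error < best[0]:
                best = (error, name, params)
    return best


# Hypothetical dataset: (feature, target) pairs where target = 2 * feature.
train_rows = [(x, 2 * x) for x in range(0, 20, 2)]
validation_rows = [(x, 2 * x) for x in (1, 5, 9, 13)]

candidates = {
    "constant": (constant_model, {}),
    "knn": (knn_model, {"k": [1, 2, 3]}),
}
best = auto_select(candidates, train_rows, validation_rows)
print(best)
```

Here four models are fitted instead of one, which is the cost that a manual mode avoids by having the user pick the algorithm and parameters up front.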

The following table shows the supported Machine Learning algorithms for each Task type and running mode.

| Task | Running Mode | Supported Machine Learning Algorithms |
|---|---|---|
| Anomaly Detection | Local | Local Outlier Factor, Elliptic Envelope, Isolation Forest, Statistical Deviation, Bipartite Edge Anomaly Detection (Proprietary) |
| Anomaly Detection | AWS | Random Cut Forest |
| Classification | Local | Logistic Regression, Decision Tree Classifier, Random Forest Classifier, SGDClassifier, Support Vector Classifier (SVC) |
| Classification | Local Auto | Decision Tree Classifier, Random Forest Classifier, SGDClassifier, Support Vector Classifier (SVC) |
| Binary Classification | AWS | Linear Learner |
| Binary Classification | AWS Auto | See https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-model-support-validation.html |
| Multiclass Classification | AWS | Linear Learner |
| Multiclass Classification | AWS Auto | See https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-model-support-validation.html |
| Clustering | Local | KMeans, GMeans, DBSCAN, BIRCH, Spectral Clustering |
| Clustering | Local Auto | BIRCH, DBSCAN, KMeans |
| Clustering | AWS | KMeans |
| Forecasting | Local | ARIMA, State Space Dynamic Factor MQ |
| Regression | Local | Linear Regression, Decision Tree Regressor, Random Forest Regressor, SGDRegressor, Support Vector Regression (SVR) |
| Regression | Local Auto | Decision Tree Regressor, Random Forest Regressor, SGDRegressor, Support Vector Regression (SVR) |
| Regression | AWS | Linear Learner |
| Regression | AWS Auto | See https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-model-support-validation.html |
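
For scripting or validation purposes, the support matrix above can be expressed as a lookup table. The following Python sketch simply transcribes the table; the variable and function names are hypothetical and not part of any FortiSIEM API, and AWS Auto entries are left empty because the algorithms are chosen by Amazon SageMaker Autopilot (see the AWS documentation linked in the table).

```python
# Keys are (task, running mode); values are the algorithms from the table.
# An empty tuple marks AWS Auto, where SageMaker Autopilot picks the model.
SUPPORTED_ALGORITHMS = {
    ("Anomaly Detection", "Local"): (
        "Local Outlier Factor", "Elliptic Envelope", "Isolation Forest",
        "Statistical Deviation", "Bipartite Edge Anomaly Detection",
    ),
    ("Anomaly Detection", "AWS"): ("Random Cut Forest",),
    ("Classification", "Local"): (
        "Logistic Regression", "Decision Tree Classifier",
        "Random Forest Classifier", "SGDClassifier",
        "Support Vector Classifier",
    ),
    ("Classification", "Local Auto"): (
        "Decision Tree Classifier", "Random Forest Classifier",
        "SGDClassifier", "Support Vector Classifier",
    ),
    ("Binary Classification", "AWS"): ("Linear Learner",),
    ("Binary Classification", "AWS Auto"): (),
    ("Multiclass Classification", "AWS"): ("Linear Learner",),
    ("Multiclass Classification", "AWS Auto"): (),
    ("Clustering", "Local"): (
        "KMeans", "GMeans", "DBSCAN", "BIRCH", "Spectral Clustering",
    ),
    ("Clustering", "Local Auto"): ("BIRCH", "DBSCAN", "KMeans"),
    ("Clustering", "AWS"): ("KMeans",),
    ("Forecasting", "Local"): ("ARIMA", "State Space Dynamic Factor MQ"),
    ("Regression", "Local"): (
        "Linear Regression", "Decision Tree Regressor",
        "Random Forest Regressor", "SGDRegressor",
        "Support Vector Regression",
    ),
    ("Regression", "Local Auto"): (
        "Decision Tree Regressor", "Random Forest Regressor",
        "SGDRegressor", "Support Vector Regression",
    ),
    ("Regression", "AWS"): ("Linear Learner",),
    ("Regression", "AWS Auto"): (),
}


def running_modes(task):
    """Return the running modes the table lists for a task, in table order."""
    return [mode for (name, mode) in SUPPORTED_ALGORITHMS if name == task]


def is_supported(task, mode, algorithm):
    """Check whether the table lists an algorithm for a task and mode."""
    return algorithm in SUPPORTED_ALGORITHMS.get((task, mode), ())


print(running_modes("Clustering"))
print(is_supported("Classification", "Local Auto", "Logistic Regression"))
```

The second call returns False because the table lists Logistic Regression for Classification in Local mode only, not in Local Auto mode.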

A Machine Learning Job

A Machine Learning Job consists of the following attributes:

  • Job Id – The unique ID for this job, assigned by FortiSIEM.
  • Scope – System defined or User defined. System defined Jobs are provided by FortiSIEM. User defined jobs are created by the user. Note that System defined jobs are simply templates and must be trained.
  • Name – Name of the job.
  • Description – Description of the job.
  • Task – One of the following Machine Learning Task Categories: Anomaly Detection, Regression, Clustering, Forecasting, Classification.
  • Machine Learning Algorithm – A relevant algorithm for the Machine Learning Task.
  • Report – The FortiSIEM Report that provides the data for training and inference.
  • Report Time Window – The time window for which the report must be run. In other words, the dataset contains data from this Report Time Window.
  • Target – The fields in the FortiSIEM Report that should be used as the target for the Machine Learning Task.
  • Features – The fields in the FortiSIEM Report that should be used as the features for the Machine Learning Task.
  • Organization (for FortiSIEM Service Provider deployments only) – The Organization for which the Report must be run. Currently, the supported values are the data of all Organizations combined or one specific Organization at a time. In other words, if you want to run the machine learning task for 2 Organizations separately, then 2 separate Machine Learning jobs must be created.
  • Inference Schedule – The frequency at which inference must be done. The purpose of Inference is to detect deviations from the model built during Training phase or to create future data for Forecasting.
  • Re-training Schedule – The frequency at which training must be repeated. The purpose of retraining is to capture new patterns in the data.
  • Action – Specifies the action to be taken after Inference is completed: whether to create an Alert or send an email containing the anomalies found.

System defined Machine Learning Job templates are provided in Resources > Machine Learning Jobs. Note that these are templates and cannot be run like Rules and Reports, since they are missing the Machine Learning model. A System defined Machine Learning Job does not set the following attributes:

  1. Report Time Window
  2. Organization
  3. Inference Schedule
  4. Re-training Schedule
  5. Action

These attributes must be provided by the user while training and scheduling a machine learning job.
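
The attribute list and the gap between a template and a runnable job can be modeled in a short Python sketch. This is a hypothetical data model written for illustration only: the class, field names, and sample values are assumptions and do not reflect how FortiSIEM stores jobs internally.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MachineLearningJob:
    """Hypothetical model of the job attributes listed above."""
    job_id: int
    scope: str                # "System" or "User"
    name: str
    description: str
    task: str
    algorithm: str
    report: str
    target: List[str]
    features: List[str]
    # Attributes that a System defined template leaves unset.
    report_time_window: Optional[str] = None
    organization: Optional[str] = None
    inference_schedule: Optional[str] = None
    retraining_schedule: Optional[str] = None
    action: Optional[str] = None


USER_SUPPLIED = (
    "report_time_window",
    "organization",
    "inference_schedule",
    "retraining_schedule",
    "action",
)


def missing_attributes(job, service_provider=False):
    """List the attributes the user still has to provide.

    Organization is only required for Service Provider deployments.
    """
    required = [
        name for name in USER_SUPPLIED
        if name != "organization" or service_provider
    ]
    return [name for name in required if getattr(job, name) is None]


# Hypothetical System defined template with illustrative values.
template = MachineLearningJob(
    job_id=1,
    scope="System",
    name="Example template",
    description="Illustrative template only",
    task="Anomaly Detection",
    algorithm="Isolation Forest",
    report="Example report",
    target=[],
    features=["Example field"],
)
missing = missing_attributes(template)
print(missing)
```

Calling `missing_attributes(template, service_provider=True)` additionally reports `organization`, matching the note that Organization applies to Service Provider deployments only.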

Running a System Defined Machine Learning Job

To build a model and schedule a System defined job, the user must take the following steps:

  1. Go to Analytics > Machine Learning and select the job.
  2. Create an input dataset by running the report:
    1. Choose the Report Time Window.
    2. Choose the Organization (for FortiSIEM Service Provider deployments).
  3. Train on the dataset from Analytics > Train. A model will be built. You can tune the algorithm parameters and train again.
  4. Once the model is ready, you can schedule the job to run for Inference from Analytics > Schedule.
  5. A new Job Id will be assigned, and the job will appear in Resources > Machine Learning Jobs as a User defined job.
  6. As part of the Inference action, an alert will appear in Incidents or an email will be sent.
  7. You can edit the Inference and Re-training schedules by editing a job in Resources > Machine Learning Jobs.

Creating a Machine Learning Job From Scratch

A Machine Learning job can also be created from scratch. To do this, the user must take the following steps:

  1. Navigate to Analytics > Machine Learning and select a report from the Reports > Machine Learning Reports folder. If the report is not present in that folder, first build a report from Analytics > Search and save it in Reports > Machine Learning Reports before proceeding.
  2. Create an input dataset by running the report:
    1. Choose the Report Time Window.
    2. Choose the Organization (for FortiSIEM Service Provider deployments).
  3. Train on the dataset from Analytics > Train:
    1. Choose Run Mode.
    2. Choose Task.
    3. Choose Algorithm and tune the parameters if needed.
    4. Choose Training factor.
    5. Click Train. You can tune the algorithm parameters and train again.
  4. Once the model is ready, you can schedule the job to run for Inference from Analytics > Schedule.
  5. A new Job Id will be assigned, and the job will appear in Resources > Machine Learning Jobs as a User defined job.
  6. As part of the Inference action, an alert will appear in Incidents or an email will be sent.
  7. You can edit the Inference and Re-training schedules by editing a job in Resources > Machine Learning Jobs.
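
The steps above mention a Training factor without defining it. A common reading, assumed here and not confirmed by this guide, is that it is the fraction of the dataset used for training, with the remainder held out to evaluate the model. Under that assumption, the split might look like the following Python sketch; the function name and sample rows are hypothetical.

```python
import random


def split_dataset(rows, training_factor, seed=0):
    """Split report rows into a training part and a held-out part.

    ASSUMPTION: `training_factor` is the fraction of rows (0 < f <= 1)
    used for training; the guide does not define the term.
    """
    if not 0 < training_factor <= 1:
        raise ValueError("training_factor must be in (0, 1]")
    shuffled = list(rows)
    # A fixed seed keeps the split reproducible between training runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * training_factor)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical dataset: ten rows returned by a report.
rows = list(range(10))
train_rows, held_out_rows = split_dataset(rows, training_factor=0.8)
print(len(train_rows), len(held_out_rows))
```

Random shuffling suits tasks such as Classification or Regression; for a time-ordered task such as Forecasting, a split that preserves row order would be the more usual choice.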
