FortiSOAR Performance Benchmarking for v7.3.1

This document details the performance benchmark tests conducted in Fortinet labs. The objective of this performance test is to measure the time taken to create alerts in FortiSOAR, and complete the execution of corresponding playbooks on the created alerts in the following cases:

Single-node FortiSOAR appliance
Cluster setup of FortiSOAR

The data from this benchmark test can help you in determining your scaling requirements for a FortiSOAR instance to handle the expected workload in your environment.

Summary of the tests performed

The following tests are performed on both a single FortiSOAR node, and a cluster of two active-active FortiSOAR nodes:

Ingestion of alerts into FortiSOAR.
Ingestion of alerts followed by extraction of associated indicators.
Ingestion of alerts, then extraction of associated indicators, followed by enrichment of the extracted indicators using one source.
Sustain the processes of ingestion of alerts and extraction of indicators for a 12-hour period.

Environment

FortiSOAR Virtual Appliance Specifications

Component	Specifications
FortiSOAR Build Version	7.3.1-2105
CPU	16 CPUs
Memory	32 GB
Storage	300 GB virtual disk, HDD with IOPS 3000, attached to a VMware ESXi instance

Operating System Specifications

Operating System	Kernel Version
Rocky Linux 8.7	4.18.0-425.3.1.el8.x86_64

External Tools Used

Tool Name	Version
FortiMonitor	Cloud
Internal Script to gather data

Pre-test conditions on both the standalone FortiSOAR machine and the FortiSOAR High Availability (HA) cluster

At the start of each test run:

The test environment contained zero alerts.
The test environment contained only the FortiSOAR built-in connectors such as IMAP, Utilities, etc.
The system playbooks, such as alert assignment notification and SLA calculation, were deactivated, and there were no running playbooks.
The playbook execution logs were purged.
Configured tunables as follows:
- Changed the number of Celery workers to 16
- Set the Elasticsearch heap size to 8GB
- Changes related to PostgreSQL:
  - max_connections = 1200
  - shared_buffers = 5GB
  - effective_cache_size = 15GB
  - maintenance_work_mem = 2GB
  - default_statistics_target = 500
  - effective_io_concurrency = 2
  - work_mem = 546kB
  - min_wal_size = 4GB
  - max_wal_size = 16GB
  - max_parallel_workers_per_gather = 4
  - max_parallel_workers = 8
  - max_parallel_maintenance_workers = 4
- Changes related to NGINX:
  - worker_processes auto;
  - worker_connections 1024;
  - access_log off;
  - keepalive_timeout 15;
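As a minimal sketch, the PostgreSQL tunables listed above can be kept in one place and rendered as "name = value" lines for review before they are applied. The function names below are illustrative assumptions and not part of FortiSOAR; the values are copied from the list above, and the 32 GB figure comes from the Environment topic.

```python
# PostgreSQL tunables copied from the pre-test conditions above.
POSTGRES_TUNABLES = {
    "max_connections": "1200",
    "shared_buffers": "5GB",
    "effective_cache_size": "15GB",
    "maintenance_work_mem": "2GB",
    "default_statistics_target": "500",
    "effective_io_concurrency": "2",
    "work_mem": "546kB",
    "min_wal_size": "4GB",
    "max_wal_size": "16GB",
    "max_parallel_workers_per_gather": "4",
    "max_parallel_workers": "8",
    "max_parallel_maintenance_workers": "4",
}

APPLIANCE_MEMORY_MB = 32 * 1024  # 32 GB, from the Environment topic


def render_postgres_conf(tunables):
    """Return the settings as 'name = value' lines, one per setting."""
    return "\n".join(f"{name} = {value}" for name, value in tunables.items())


def to_megabytes(size):
    """Convert a size string such as '5GB' or '546kB' to megabytes."""
    units = {"kB": 1 / 1024, "MB": 1, "GB": 1024}
    for suffix, factor in units.items():
        if size.endswith(suffix):
            return float(size[: -len(suffix)]) * factor
    raise ValueError(f"unsupported size string: {size!r}")


if __name__ == "__main__":
    print(render_postgres_conf(POSTGRES_TUNABLES))
    share = to_megabytes(POSTGRES_TUNABLES["shared_buffers"]) / APPLIANCE_MEMORY_MB
    print(f"shared_buffers uses {share:.1%} of the appliance memory")
```

The memory-share check is only a quick sanity aid when the same tunables are carried over to an appliance with a different amount of memory.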

In a production environment the network bandwidth, especially for outbound connections, to applications such as VirusTotal might vary, which could affect the observations.

Test setups

For a single-node FortiSOAR instance - Configure the standalone FortiSOAR appliance according to the specifications mentioned in the Environment topic.
For the FortiSOAR HA cluster - Create an HA cluster of two FortiSOAR instances that are joined in the active-active state. Configure both the FortiSOAR appliances according to the specifications mentioned in the Environment topic.

Tests performed

Test 1: Ingest alerts into FortiSOAR using the ingestion playbook

This test is executed by manually triggering the ingestion playbook that creates alerts in FortiSOAR.

Steps followed

Created the alerts using the ingestion playbook.
You can download and use the sample playbooks contained in the "PerfBenchmarking_Test01_PB_Collection_7_3.zip" file if you want to run the tests in your environment. All the sample playbook collections are included in Appendix: JSONs for sample playbooks.
Once the alerts are created, measure the total time taken to create all the alerts in FortiSOAR.
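If you reproduce the measurement step in your own environment, one simple approach is to wrap the trigger in a wall-clock timer. The sketch below is an assumption about how such a measurement could be scripted, not the internal script used in this benchmark; "fake_trigger" is a hypothetical stand-in for whatever starts the ingestion playbook and waits for the alerts to be created.

```python
import time


def time_ingestion(trigger, alert_count):
    """Call trigger(alert_count) and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    trigger(alert_count)
    return time.perf_counter() - start


def fake_trigger(alert_count):
    # Hypothetical placeholder: replace with code that starts the ingestion
    # playbook and blocks until all alerts exist. Here it only sleeps.
    time.sleep(0.001 * alert_count)


if __name__ == "__main__":
    for count in (1, 5, 10):
        elapsed = time_ingestion(fake_trigger, count)
        print(f"{count} alerts: {elapsed:.3f} s")
```

time.perf_counter() is used because it is a monotonic, high-resolution clock intended for measuring short durations.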

Observations

The data in the following table outlines the number of alerts ingested and the total time taken to ingest those alerts:

Number of alerts created in FortiSOAR	Total number of playbooks executed in FortiSOAR	Total time (in seconds) taken to create all alerts in a standalone FortiSOAR appliance	Total time (in seconds) taken to create all alerts in an HA active-active FortiSOAR cluster **
1	1	0.35	0.38
5	1	0.56	0.61
10	1	0.86	0.90
25	1	1.71	1.92
50	1	3.22	3.64
100	1	6.82	7.27
** The slight overhead of distributing the workload across the HA cluster outweighs the scaling benefits under very low load. That is why the total execution time for a single alert creation is higher on the two-node cluster.
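To relate these figures to your own workload, it can help to convert them into an ingestion rate and a relative HA overhead. The following sketch uses the values from the Test 1 table; the function names are illustrative assumptions.

```python
# (alerts, standalone seconds, HA cluster seconds) from the Test 1 table.
TEST1_ROWS = [
    (1, 0.35, 0.38),
    (5, 0.56, 0.61),
    (10, 0.86, 0.90),
    (25, 1.71, 1.92),
    (50, 3.22, 3.64),
    (100, 6.82, 7.27),
]


def throughput(alerts, seconds):
    """Alerts created per second."""
    return alerts / seconds


def ha_overhead_pct(standalone_seconds, ha_seconds):
    """Extra time on the HA cluster relative to the standalone appliance, in percent."""
    return (ha_seconds - standalone_seconds) / standalone_seconds * 100


if __name__ == "__main__":
    for alerts, standalone, ha in TEST1_ROWS:
        print(
            f"{alerts:>3} alerts: "
            f"standalone {throughput(alerts, standalone):6.2f} alerts/s, "
            f"HA {throughput(alerts, ha):6.2f} alerts/s, "
            f"HA overhead {ha_overhead_pct(standalone, ha):5.1f}%"
        )
```

Because Test 1 runs a single ingestion playbook, the second node has little work to share, which is consistent with the cluster being slightly slower at every load in this table.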

Note: Once this test is completed, refer to the pre-test conditions before starting a new test.

Test 2: Ingest alerts into FortiSOAR followed by automated indicator extraction

This test is executed by manually triggering the ingestion playbook that creates alerts in FortiSOAR. Once the alerts are created in FortiSOAR, an "Extraction" playbook is triggered and the total time taken for all the extraction playbooks to complete their execution is calculated.

Steps followed

Created the alerts using the ingestion playbook.
Post alert creation, the "Extraction" playbooks are triggered. You can download and use the sample playbooks contained in the "PerfBenchmarking_Test02_PB_Collection_7_3.zip" file if you want to run the tests in your environment. All the sample playbook collections are included in Appendix: JSONs for sample playbooks.

Observations

The data in the following table outlines the number of alerts ingested, the total time taken to ingest those alerts, and the total time taken for all the triggered playbooks to complete their execution:

Number of alerts created in FortiSOAR	Total number of playbooks executed in FortiSOAR	Total time (in seconds) taken to create all alerts in a standalone FortiSOAR appliance	Total time (in seconds) taken to create all alerts in an HA active-active FortiSOAR cluster
1	2	1.43	1.53 **
5	6	1.94	1.81
10	11	2.82	2.51
25	26	7.17	5.55
50	51	9.49	8.19
100	101	14.61	11.62
** The slight overhead of distributing the workload across the HA cluster outweighs the scaling benefits under very low load. That is why the total execution time for a single alert creation is higher on the two-node cluster.
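The point at which the cluster starts to pay off can be read from the table by computing the speedup of the HA cluster over the standalone appliance. This is a small sketch using the Test 2 figures; the helper names are assumptions made for illustration.

```python
# (alerts, standalone seconds, HA cluster seconds) from the Test 2 table.
TEST2_ROWS = [
    (1, 1.43, 1.53),
    (5, 1.94, 1.81),
    (10, 2.82, 2.51),
    (25, 7.17, 5.55),
    (50, 9.49, 8.19),
    (100, 14.61, 11.62),
]


def speedup(standalone_seconds, ha_seconds):
    """Ratio above 1.0 means the HA cluster finished faster."""
    return standalone_seconds / ha_seconds


def first_load_where_ha_is_faster(rows):
    """Return the smallest alert count at which the cluster beats the standalone node."""
    for alerts, standalone, ha in rows:
        if speedup(standalone, ha) > 1.0:
            return alerts
    return None


if __name__ == "__main__":
    for alerts, standalone, ha in TEST2_ROWS:
        print(f"{alerts:>3} alerts: speedup {speedup(standalone, ha):.2f}x")
    print("HA is faster from", first_load_where_ha_is_faster(TEST2_ROWS), "alerts")
```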

Note: Once this test is completed, refer to the pre-test conditions before starting a new test.

Test 3: Ingest alerts into FortiSOAR, automated indicator extraction, followed by enrichment of the extracted indicators

The test was executed using an automated testbed that starts the ingestion which in turn creates alerts in FortiSOAR. Once the alerts are created in FortiSOAR, "Extraction" and "Enrichment" playbooks are triggered and the total time taken for all the extraction and enrichment playbooks to complete their execution is calculated.

The setup for this test is the same as for Test 2; however, this test additionally requires the "VirusTotal" connector to be configured.

Steps followed

Created the alerts using the ingestion playbook.
Post alert creation, the "Extraction" and "Enrichment" playbooks are triggered. You can download and use the sample playbooks contained in the "PerfBenchmarking_Test03_PB_Collection_7_3.zip" file if you want to run the tests in your environment. All the sample playbook collections are included in Appendix: JSONs for sample playbooks.

Observations

Number of alerts created in FortiSOAR	Total number of playbooks executed in FortiSOAR	Total time (in seconds) taken to create all alerts in a standalone FortiSOAR appliance	Total time (in seconds) taken to create all alerts in an HA active-active FortiSOAR cluster
1	8	5.46	5.48
5	36	5.38	6.12
10	71	9.74	7.35
25	176	20.3	15.19
50	351	34.87	27.56
100	700	64	48.35

Note: Once this test is completed, refer to the pre-test conditions before starting a new test. Also, note that the enrichment playbooks make API calls over the Internet, and the playbook execution times in this table include the time taken by those calls.
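Since Test 3 runs several playbooks per alert, a playbook execution rate is a more useful comparison than an alert rate. The sketch below derives it from the Test 3 table, taking the standalone time for 100 alerts (1 minute 4 seconds) as 64 seconds; the function name is an illustrative assumption.

```python
# (alerts, playbooks executed, standalone seconds, HA cluster seconds)
# from the Test 3 table; 1 minute 4 seconds is expressed as 64.0 seconds.
TEST3_ROWS = [
    (1, 8, 5.46, 5.48),
    (5, 36, 5.38, 6.12),
    (10, 71, 9.74, 7.35),
    (25, 176, 20.3, 15.19),
    (50, 351, 34.87, 27.56),
    (100, 700, 64.0, 48.35),
]


def playbooks_per_second(playbooks, seconds):
    """Average number of playbooks completed per second."""
    return playbooks / seconds


if __name__ == "__main__":
    for alerts, playbooks, standalone, ha in TEST3_ROWS:
        print(
            f"{alerts:>3} alerts, {playbooks:>3} playbooks: "
            f"standalone {playbooks_per_second(playbooks, standalone):5.2f}/s, "
            f"HA {playbooks_per_second(playbooks, ha):5.2f}/s"
        )
```

These rates include the latency of the external enrichment calls, so they are likely to vary with your network conditions.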

Sustained Data Ingestion Test

A sustained-load test was also conducted on a standalone FortiSOAR appliance with the configuration defined in "Test 2"; that is, the test was executed by manually triggering the ingestion playbook, "FortiSIEM -> Ingest 100 Alerts", which creates alerts in FortiSOAR. Once the alerts are created in FortiSOAR, an "Extraction" playbook is triggered, and the total time taken for all the extraction playbooks to complete their execution is calculated.

Number of alerts: 100/min

Duration: 12 hours

Playbooks configured: As defined in Test 2, comprising the "Ingestion" and "Indicator Extraction" playbooks.

Total number of playbooks executed: 72720
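The playbook total follows from the parameters above, assuming one ingestion run of 100 alerts per minute and the 101 playbooks per run observed for 100 alerts in Test 2 (one ingestion playbook plus one extraction playbook per alert). A short sketch of that arithmetic, with illustrative names:

```python
ALERTS_PER_RUN = 100      # one "Ingest 100 Alerts" run
RUNS_PER_MINUTE = 1       # assumption: 100 alerts/min means one run per minute
DURATION_HOURS = 12
PLAYBOOKS_PER_RUN = 101   # from the Test 2 table: 100 alerts -> 101 playbooks


def sustained_totals(duration_hours, runs_per_minute, alerts_per_run, playbooks_per_run):
    """Return (ingestion runs, nominal alerts, playbooks) for the whole test."""
    runs = duration_hours * 60 * runs_per_minute
    return runs, runs * alerts_per_run, runs * playbooks_per_run


if __name__ == "__main__":
    runs, alerts, playbooks = sustained_totals(
        DURATION_HOURS, RUNS_PER_MINUTE, ALERTS_PER_RUN, PLAYBOOKS_PER_RUN
    )
    print(f"{runs} runs, {alerts} nominal alerts, {playbooks} playbooks")
```

Under these assumptions, 720 runs of 101 playbooks each give the 72720 playbook executions reported for this test.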

Results

The system performed well under the sustained load. All 72100 alerts were successfully ingested and all the extraction playbooks were successfully completed without any queuing.

Graphs

The following graphs are plotted for the vital statistics for the system that was under test during the period of the test run.

CPU Percentage Usage Graph

Analysis of CPU Percentage Usage when the test run was in progress on the appliance:

CPU Load Average Utilization Graph for single FSR Node

Using the system resources specified in the "Environment" and "Pre-Test Conditions" topics, it was observed that when the "Sustained Data Ingestion Test" was running, the CPU utilization was normal and the performance of the system was not impacted.

Memory Utilization Graph

Analysis of memory utilization when the test run was in progress on the appliance:

Memory Utilization Graph for single FSR Node

Using the system resources specified in the "Environment" and "Pre-Test Conditions" topics, it was observed that when the "Sustained Data Ingestion Test" was running, the memory utilization was around 55%.

IO Wait Graph

Analysis of IO Wait when the test run was in progress on the appliance:

IO Wait Graph for single FSR Node

Using the system resources specified in the "Environment" and "Pre-Test Conditions" topics, it was observed that when the "Sustained Data Ingestion Test" was running the IO Wait time was normal, with the average IO Wait being around 1% of the CPU idle time.

Read/Write IO Wait Graph for ElasticSearch

Analysis of Read/Write IO Wait for ElasticSearch when the test run was in progress on the appliance:

Read/Write IO Wait Graph for ElasticSearch for single FSR Node

Using the system resources specified in the "Environment" and "Pre-Test Conditions" topics, it was observed that when the "Sustained Data Ingestion Test" was running the "Read" Wait for the ElasticSearch disk was almost 0 milliseconds. The "Write" Wait for the ElasticSearch disk averaged around 0.4 milliseconds, with the maximum wait of 0.55 milliseconds and the minimum wait of 0.25 milliseconds.

Read/Write IO Wait Graph for PostgreSQL

Analysis of Read/Write IO Wait for PostgreSQL when the test run was in progress on the appliance:

Read/Write IO Wait Graph for PostgreSQL for single FSR Node

Using the system resources specified in the "Environment" and "Pre-Test Conditions" topics, it was observed that when the "Sustained Data Ingestion Test" was running, the "Read" Wait for the PostgreSQL disk averaged around 0 milliseconds. The "Write" Wait for the PostgreSQL disk averaged around 0.3 milliseconds, with the maximum wait of 1.1 milliseconds and the minimum wait of 0.21 milliseconds.

Sustained Data Ingestion Test for the cluster of two active-active FortiSOAR nodes

A sustained-load test was also conducted on the FortiSOAR active-active two-node cluster with the configuration defined in "Test 2"; that is, the test was executed by manually triggering the ingestion playbook, "FortiSIEM -> Ingest 100 Alerts", which creates alerts in FortiSOAR. Once the alerts are created in FortiSOAR, an "Extraction" playbook is triggered, and the total time taken for all the extraction playbooks to complete their execution is calculated.

Number of alerts: 100/min

Duration: 12 hours

Playbooks configured: As defined in Test 2, comprising the "Ingestion" and "Indicator Extraction" playbooks.

Total number of playbooks executed: 72720

Results

The system performed well under the sustained load. All 72100 alerts were successfully ingested and all the extraction playbooks were successfully completed without any queuing.

Graphs

The following graphs are plotted for the vital statistics for the HA cluster that was under test during the period of the test run.

All the graphs included in this section are from the Primary/Active Node.

CPU Percentage Usage Graph

Analysis of CPU Percentage Usage when the test run was in progress on the appliance:

CPU Load Average Utilization Graph for the HA active-active cluster of two FSR Nodes

Memory Utilization Graph

Analysis of memory utilization when the test run was in progress on the appliance:

Memory Utilization Graph for the HA active-active cluster of two FSR Nodes

IO Wait Graph

Analysis of IO Wait when the test run was in progress on the appliance:

IO Wait Graph for the HA active-active cluster of two FSR Nodes

Read/Write IO Wait Graph for ElasticSearch

Analysis of Read/Write IO Wait for ElasticSearch when the test run was in progress on the appliance:

Read/Write IO Wait Graph for ElasticSearch for the HA active-active cluster of two FSR Nodes

Using the system resources specified in the "Environment" and "Pre-Test Conditions" topics, it was observed that when the "Sustained Data Ingestion Test for an HA cluster" was running the "Read" Wait for the ElasticSearch disk was almost 0 milliseconds. The "Write" Wait for the ElasticSearch disk averaged around 1 millisecond, with the maximum wait of 16 milliseconds and the minimum wait of 0.3 milliseconds.

Read/Write IO Wait Graph for PostgreSQL

Analysis of Read/Write IO Wait for PostgreSQL when the test run was in progress on the appliance:

Read/Write IO Wait Graph for PostgreSQL for the HA active-active cluster of two FSR Nodes

Using the system resources specified in the "Environment" and "Pre-Test Conditions" topics, it was observed that when the "Sustained Data Ingestion Test for an HA cluster" was running, the "Read" Wait for the PostgreSQL disk averaged around 0 milliseconds. The "Write" Wait for the PostgreSQL disk averaged around 0.5 milliseconds, with the maximum wait of 3.4 milliseconds and the minimum wait of 0.2 milliseconds.

Appendix: JSONs for sample playbooks

You can download the JSON for the following sample playbook collections so that you can run the same tests in your environment and see the performance on your version and hardware platform. You can also tweak the existing playbooks if you want to make additions that are specific to your environment.
