What's New in 6.7.0

This document describes the additions for the FortiSIEM 6.7.0 release.

New Features

Active-Active Supervisor Cluster

This release allows you to run multiple Supervisors in Active-Active mode, which means you can log in to any Supervisor node and perform any operation. Rule, Query, and Inline report processing, as well as App Server functionality, are distributed across the available Supervisor nodes. An Active-Active Supervisor Cluster provides resilience against Supervisor failures, supports a larger number of GUI users, Agents, and Collectors, and handles a higher volume of rule and query processing.

An Active-Active Supervisor cluster is built around the concept of Leader and Follower Supervisor nodes, set up as a linked list.

  1. The first Supervisor node, where you install the License, is the Leader. On this node, the UUID matches the UUID in the installed FortiCare License, and the Redis and PostgreSQL database services run in Master mode.

  2. Next, you add a Supervisor Follower node, which follows the Leader node in the sense that its PostgreSQL and Redis databases are replicated from the corresponding databases on the Leader node. On this node, both the Redis and PostgreSQL database services run in Follower (that is, read-only) mode.

  3. You can add another Supervisor Follower node, and it will follow the Supervisor node added in Step 2; its PostgreSQL and Redis databases are replicated from the corresponding databases running on the node created in Step 2. On this node, both the Redis and PostgreSQL database services run in Follower (that is, read-only) mode.

  4. You can add more Supervisor Follower nodes by successively chaining from the tail Follower node (in this case, the node created in Step 3).

The steps to set up Supervisor nodes are available in Adding Primary Leader and Adding Primary Follower, under Configuring and Maintaining Active-Active Supervisor Cluster.

If the Supervisor Leader node fails, you can elect its Follower as the (temporary) Leader and, within 2 weeks, install a new FortiCare License matching its UUID. If any other Supervisor fails, simply delete the node from the GUI.

Steps to handle various Supervisor failure scenarios are available in Failover Operations.

A Load Balancer can be set up in front of the Supervisor nodes. Agents and Collectors should be configured to communicate with the Supervisor cluster via the Load Balancer. The Load Balancer should be configured to route web sessions from Agents and Collectors to the Supervisor Cluster in a sticky fashion, meaning the same HTTPS session is routed to the same Supervisor node.

For an example setup of a FortiWeb Load Balancer, see External Load Balancer Configuration.

Agents and Collectors on versions earlier than 6.7.0 will still communicate with the Supervisor they were configured with (likely the Leader node). Once upgraded to 6.7.0, Agents and Collectors will communicate via the Load Balancer. The list of Supervisors or the Load Balancer needs to be configured in the FortiSIEM GUI under ADMIN > Settings > Cluster Config.

Active-Active Supervisor is a licensed feature: a new License that includes the High Availability feature must be installed before you can add a Supervisor Follower node. The License details show the availability of the High Availability feature in ADMIN > License and in the output of the CLI command “phLicenseTool --verify”.

Caution: An Active-Active Supervisor Cluster works with the Disaster Recovery feature from earlier releases. Note that the Supervisors in an Active-Active Supervisor Cluster are Read/Write, while the Supervisor in Disaster Recovery is Read Only.

AWS S3 Archive for ClickHouse

If you are deployed in AWS and running ClickHouse for event storage, you can now use AWS S3 for archiving events when ClickHouse Hot or Warm storage becomes full. The AWS S3 data movement and query is done within ClickHouse, and the retention policy works for the entire storage pipeline (Hot -> Warm -> Archive).

Caution: This AWS S3 Archive feature can be used only if you are deployed in AWS. If you are deployed on-premises, AWS S3 Archive does not work because of the higher network latency between the ClickHouse Supervisor/Worker nodes and AWS S3 storage.

Additionally, when storing ClickHouse data in AWS S3, Fortinet recommends turning Bucket Versioning off, or suspending it if it was previously enabled. This is because data in ClickHouse files may change, and versioning will keep both the new and old copies of the data. Over time, the number of stale objects may increase, resulting in higher AWS S3 costs. If versioning was previously enabled for the bucket, Fortinet recommends suspending it and configuring a policy to delete non-current versions.
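
As a hedged illustration of the recommendation above, the AWS CLI sketch below suspends Bucket Versioning and expires non-current object versions. The bucket name and the 7-day retention period are placeholders, not values from this release; substitute your own.

# Bucket name is a placeholder
aws s3api put-bucket-versioning \
    --bucket my-fortisiem-clickhouse-archive \
    --versioning-configuration Status=Suspended

# Delete non-current object versions after 7 days (adjust the retention to your own policy)
aws s3api put-bucket-lifecycle-configuration \
    --bucket my-fortisiem-clickhouse-archive \
    --lifecycle-configuration '{
      "Rules": [{
        "ID": "delete-noncurrent-versions",
        "Status": "Enabled",
        "Filter": {},
        "NoncurrentVersionExpiration": { "NoncurrentDays": 7 }
      }]
    }'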

Steps to set up a ClickHouse AWS S3 Archive are available under Creating ClickHouse Archive Storage.

For additional information on ClickHouse deployments, see Configuring ClickHouse Based Deployments.

For information on configuration for Worker nodes, see Initial Configuration in ClickHouse Configuration.

Custom ServiceNow Incident Integration

In earlier releases, FortiSIEM used the built-in ServiceNow incident database table for Incident Outbound and Inbound integration. In this release, a custom ServiceNow database table can be used instead.

For details on how to create custom ServiceNow integration, see ServiceNow SOAP Based Integration.

Key Enhancements

Compliance Reports for Saudi Arabia Essential Cybersecurity Controls

Compliance reports have been added for Saudi Arabia Essential Cybersecurity Controls (KSA-ECC).

Detailed Audit Log for Failed CMDB Outbound Integrations

A detailed audit log (PH_AUDIT_CMDB_OUTBOUND_INT_FAILED) is created for failed CMDB Outbound Integrations. It captures the host for which the Integration failed.

A Sample log:

2022-11-15 16:21:32,193 INFO [http-listener-2(7)] com.ph.phoenix.framework.logging.PhAudit - [PH_AUDIT_CMDB_OUTBOUND_INT_FAILED]:[phCustId]=1,[reptVendor]=ServiceNow,[hostName]=HOST-10.1.1.1,[eventSeverity]=PHL_INFO,[phEventCategory]=2,[infoURL]=https://ven02318.service-now.com/,[hostIpAddr]=10.1.1.1,[procName]=AppServer,[srcIpAddr]=10.10.1.1,[user]=admin,[objType]=Device,[phLogDetail]=HOST-10.1.1.1 was not uploaded because: Could not find this device type,Cisco WLAN Controller, in Mapping Config file.
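
To locate these entries, one hedged approach is to search the App Server log on the Supervisor. The log file path below is an assumption and may differ in your deployment; adjust it to where your App Server writes its logs.

# Log path is an assumption; adjust to your deployment
grep PH_AUDIT_CMDB_OUTBOUND_INT_FAILED /opt/phoenix/log/phoenix.log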

Detailed Reason for Query REST API Failure

The Query REST API provides a detailed reason in situations where the Query fails.

API: https://<Supervisor_IP>/phoenix/rest/query/progress/<queryId>

//Error case
<response requestId="10201" timestamp="1666376638748">
    <result>
        <progress>100</progress>
        <error code="9">
            <description>Not supported expr: LEN</description>
        </error>
    </result>
</response>

//Good case
<response requestId="10201" timestamp="1666376638748">
    <result>
        <progress>100</progress>
        <error code="0">
            <description></description>
        </error>
    </result>
</response>
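
A hedged usage sketch of polling this endpoint with curl is shown below. The basic-auth credential format (organization/user) and the -k flag to skip certificate verification are assumptions, so adjust them to your deployment.

# <Supervisor_IP>, <queryId>, and the credentials are placeholders
curl -k -u '<organization>/<user>:<password>' \
    "https://<Supervisor_IP>/phoenix/rest/query/progress/<queryId>"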

Bug Fixes and Minor Enhancements

Bug ID | Severity | Module | Description

859950, 858445 Major App Server Rule evaluation is slow when filter condition uses individual Malware and Country Group Items.
854349 Major App Server Malware hash update from a large 100K line CSV file causes App Server CPU to go to 90%.
857967, 857550 Major Rule RuleWorker pauses when there is a rule change and App Server is busy.

861554 Minor App Server Custom rules for Org trigger incidents even if they are disabled for that Org but may be enabled for other Orgs.

860526 Minor App Server Windows/Linux Agent status may become incorrect after App Server restart.
860517 Minor App Server SQL Injection vulnerability in CMDB Report Display fields.
858900 Minor App Server Worker rule download from App Server may be slow due to sub-optimal handling of Techniques and Tactics.
858787 Minor App Server Device usage counts incorrect in enterprise environment after the upgrade from 6.4.0 to 6.6.2.
857944 Minor App Server OKTA Authentication redirects to Flash UI.
851078 Minor App Server Incident email notification may be slow when email is mis-configured.
851077 Minor App Server Incident queries by Incident ID may time out if there are lots of incidents stored in PostGreSQL over a long period of time.
848287 Minor App Server Source/Destination Interface SNMP Index does not display due to App Server inserting the name in front.
843361 Minor App Server Windows Agent license is counted even after moving the server to the Decommissioned folder in CMDB.
843342 Minor App Server Incident title and name are empty for auto clear Incidents triggered by OSPF Neighbor Down Rule.
840694 Minor App Server AGENT method disappears from CMDB Method column when SNMP discovery is re-run.
839552 Minor App Server Cloned User defined Event Type becomes System.
838600 Minor App Server Device name change does not take effect on collectors that do not discover/monitor the device.
835741 Minor App Server Custom rules in 'performance, availability and change' category are triggered for devices which are in Maintenance Window.
819401 Minor App Server STIX/TAXII 2.0 feed is not being pulled from Anomali free service.
809914 Minor App Server Dashboard is orphaned when the owner is deleted from FortiSIEM.
853913 Minor Data Some reports, e.g. FDC Top Victims and FGT Traffic Top Applications, are incorrect.
848577 Minor Data AOWUA_DNSParser does not set Event Severity.
846604 Minor Data System rules involving City trigger incorrectly with City = N/A.
842119 Minor Data FortiSandboxParser does not parse 'File Name' attribute.
839546 Minor Data FortiDeceptor logs are not being correctly parsed by FortiSIEM.
822231 Minor Data Event Name for PAN-IDP-37264 is incorrect.
813268 Minor Data In CheckpointCEFParser, the TCP/UDP ports are not being parsed correctly.
853819 Minor Data Purger When the retention policy triggers, the archive data for CUSTOMER_1 contains other org's data.
801973 Minor Data Purger Two issues: (1) EnforceRetentionPolicy tool does not archive; (2) Retention policy does not cover events with missing event attributes.
857192 Minor Discovery LDAP user discovery may hang.
839636 Minor Discovery FortiGate Discovery via API + SSH stops at 1% during discovery.
838934 Minor Discovery SNMP Test Connectivity fails against FortiOS 7.2.1, but discovery succeeds.
841669 Minor Event Pulling WMI/OMI event pulling may lag behind in some cases.
862020 Minor Event Pulling Agent Generic HTTPS Advanced Event Puller incorrectly sets lastPollTime window to local time instead of UTC.
859767 Minor GUI Incorrect Elasticsearch Org Bucket mapping from file when 10 groups are used, but there are gaps 50,001-50,010. GUI maps group 50,011 to 50,000.
859557 Minor GUI Unable to delete Dashboard Slideshows in super/global and orgs.
853371 Minor GUI Interface Usage Dashboard does not show data, but drill down from widget will show data.
852369, 852362 Minor GUI Cannot add extra ClickHouse disk to Supervisor without manual edit.
846233 Minor GUI Changing the logo and image of 'Global User Reports Template' in PDF export causes 'Contents' part to break.
841762 Minor GUI The search box in the collector health page doesn't work for fields like Organization and Collector ID.
841636 Minor GUI Full Admin user created using AD Group Role mapping cannot edit credentials that other users created.
840625 Minor GUI SAML users with full admin access CANNOT modify anything in User Settings pop up.
833083 Minor GUI Can't delete manager for user in CMDB.
842025 Minor Linux Agent Linux Agent fails to start up when IPv6 is disabled on Ubuntu 20.04.5.
860690 Minor Parser Event parsing falls behind at 2k EPS if all events are forwarded to Kafka.
766137 Minor Parser Enhancement - WinOSWmiParser cannot parse some Spanish Attributes.
848258 Minor Performance Monitoring If Windows Agent is installed first, then SNMP performance monitoring does NOT pull all metrics on Windows servers.
838534 Minor Performance Monitoring Unable to monitor Hyper-V performance metrics through winexe-static without enabling SMB1Protocol.

866034 Minor Query On ClickHouse, queries of the type "Destination IP IN Networks > Group: Public DNS Servers" hit the SQL size limit and do not run.

860571 Minor Query PctChange function in Query fails on ClickHouse and phQueryMaster restarts.
846629 Minor Query For EventDB, queries using the != operator on strings may not return proper results under certain conditions.
839542 Minor Query Queries with multiple expressions will fail when there is no order clause.

861196 Minor Rule phFortiInsightAI process may consume high CPU on Workers from excessive Win-Security-4624 logs.

857424 Minor System 'hdfs_archive_with_purge' is not kept from the previous phoenix config.
844287 Minor System FortiSIEM upgrade does not backup network_param.json.
842161 Minor System Collector stores certain REST API responses in files.
810999 Minor System TraceEnable is not disabled on port 80 for http.
792418 Minor System Remove Malware IOC feeds that are obsolete, and remove/replace out-of-box integrations that are no longer active.
840314 Enhancement App Server Public Query REST API must contain detailed error messages when the query fails to run.
838829 Enhancement App Server Very large Malware IOC update causes App Server to run out of memory.
857014 Enhancement Data Update latest FortiSwitch SysObjectId list.
852149 Enhancement Data For Cisco Umbrella Parser, several fields are not parsed.
835806 Enhancement Data Imperva parser update.
835763 Enhancement Data Cisco ISE parser needs to normalize MAC addresses.
833721 Enhancement Data RSAAuthenticationServerParser parser update.
831476 Enhancement Data MicrosoftExchangeTrackingLogParser misparses values that contain ',', which is used as the delimiter.
830604 Enhancement Data WinOSWmiParser needs enhancement for rule - Windows: Pass the Hash Activity 2.
824927 Enhancement Data Apache Parser needs to parse URI query.
856273 Enhancement System Restrict EnforceRetentionPolicy tool to run only as admin.
839129 Enhancement System CMDB back up enhancements.
839053 Enhancement System Exit code from "monctl start" incorrectly returns failure if phMonitor is still running.
833918 Enhancement System Suppress unnecessary ansible warning messages and clean up task description.

Known Issues

General

This release cannot be installed in IPv6 networks.

ClickHouse Related

  1. If you are running the ClickHouse event database and want to do Active-Active Supervisor failover, then your Supervisor should not be the only ClickHouse Keeper node. Otherwise, once the Supervisor is down, the ClickHouse cluster will be down and inserts will fail. It is recommended that you have 3 ClickHouse Keeper nodes running on Workers.

  2. If you are running ClickHouse, then during a Supervisor upgrade to FortiSIEM 6.7.0 or later, instead of shutting down Worker nodes, you need to stop the backend processes by running the following command from the command line.

    phtools --stop all

  3. If you are running Elasticsearch or FortiSIEM EventDB and switch to ClickHouse, then you need to follow two steps to complete the database switch.

    1. Set up the disks on each node in ADMIN > Setup > Storage and ADMIN > License > Nodes.

    2. Configure ClickHouse topology in ADMIN > Settings > Database > ClickHouse Config.

  4. In a ClickHouse environment, queries will not return results if none of the query nodes within a shard are reachable from the Supervisor and responsive. In other words, if at least 1 query node in every shard is healthy and responds to queries, then query results will be returned. To avoid this condition, make sure all Query Worker nodes are healthy.

  5. If more than 1 ClickHouse Keeper node is down, then the following manual steps are required before you can delete the Keeper nodes from the GUI.

    Step 1. Delete the failed nodes from the ClickHouse Keeper config file.

    On all surviving ClickHouse Keeper nodes, modify the Keeper config located at /data-clickhouse-hot-1/clickhouse-keeper/conf/keeper.xml

    <yandex>
        <keeper_server>
             <raft_configuration>
                 <server>
                   -- server 1 info --
                 </server>
                 <server>
                    -- server 2 info --
                 </server>
              </raft_configuration>
          </keeper_server>
    </yandex>
    

    Remove any <server></server> block corresponding to the nodes that are down.

    Step 2. Prepare systemd argument for recovery

    On all surviving ClickHouse Keeper nodes, make sure /data-clickhouse-hot-1/clickhouse-keeper/.systemd_argconf file has a single line as follows:

    ARG1=--force-recovery

    Step 3. Restart ClickHouse Keeper

    On all surviving ClickHouse Keeper nodes, run the following command.

    systemctl restart ClickHouseKeeper

    Check to make sure ClickHouse Keeper restarted with the --force-recovery option by running the following command.

    systemctl status ClickHouseKeeper

    The output of the status command will show whether --force-recovery is being applied on the "Process:" line, like so:

    Process: 487110 ExecStart=/usr/bin/clickhouse-keeper --daemon --force-recovery --config=/data-clickhouse-hot-1/clickhouse-keeper/conf/keeper.xml (code=exited, status=0/SUCCESS)

    Step 4. Check to make sure ClickHouse Server(s) are up

    After Step 3 is done on all surviving ClickHouse Keeper nodes, wait a couple of minutes to make sure the sync between ClickHouse Keeper and ClickHouse Server finishes successfully. Use four-letter commands to check the status of the ClickHouse Keepers.

    On each surviving ClickHouse Keeper node, issue the following command.

    echo stat | nc localhost 2181

    If "clickhouse-client" shell is available, then ClickHouse Server is up and fully synced up with ClickHouse Keeper cluster.

    Additional commands such as mntr and srvr also provide overlapping and other useful information.
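
    For reference, each of these four-letter commands can be sent the same way as the stat command above, using the same local port shown in the steps above.

    echo stat | nc localhost 2181
    echo mntr | nc localhost 2181
    echo srvr | nc localhost 2181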

    Step 5. Update Redis key for ClickHouse Keepers

    On Supervisor Leader Redis, set the following key to the comma-separated list of surviving ClickHouse Keeper nodes' IP addresses. For example, assuming 172.30.58.96 and 172.30.58.98 are surviving ClickHouse Keeper nodes, the command will be:

    set cache:ClickHouse:clickhouseKeeperNodes "172.30.58.96,172.30.58.98"

    You can use the following command to check the values before and after setting to the new value:

    get cache:ClickHouse:clickhouseKeeperNodes
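
    A hedged sketch of issuing the commands above from the Supervisor Leader shell with redis-cli is shown below. The port and password flags are assumptions, so use the Redis settings from your own deployment; the IP addresses are the surviving Keeper nodes from the example above.

    # Port and password are assumptions; substitute the values used by your deployment
    redis-cli -p 6666 -a '<redis_password>' get cache:ClickHouse:clickhouseKeeperNodes
    redis-cli -p 6666 -a '<redis_password>' set cache:ClickHouse:clickhouseKeeperNodes "172.30.58.96,172.30.58.98"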

    Step 6. Update ClickHouse Config via GUI.

    After all the changes from Steps 1-5 above have been successfully executed, you can delete all of the down ClickHouse Keeper nodes from the ADMIN > Settings > ClickHouse Config page.

    Note: To re-add the ClickHouse Keeper nodes that are down, do the following.

    1. First, make sure that the ClickHouse Keeper nodes that are down are removed from ADMIN > License > Nodes page.

    2. After powering up the ClickHouse Keeper node, as root, use the following script to clean up the residual config from the previously deployed instance.

      /opt/phoenix/phscripts/clickhouse/cleanup_replicated_tables_and_keeper.sh

      Since not all nodes act as a ClickHouse Keeper node and a ClickHouse data node at the same time, the script might warn that tables or the keeper directory are not found. These warning messages can be ignored.

    3. After the cleanup in Step 2, add the nodes back via ADMIN > License > Nodes; they can then be added back to the ClickHouse Keeper or ClickHouse Data cluster.

Disaster Recovery Related

After failing over from Primary to Secondary, you need to restart phMonitor on all Secondary Workers for the queries to work properly.

Elasticsearch Related

  1. In Elasticsearch based deployments, queries containing "IN Group X" are handled using Elastic Terms Query. By default, the maximum number of terms that can be used in a Terms Query is set to 65,536. If a Group contains more than 65,536 entries, the query will fail.

    The workaround is to change the “max_terms_count” setting for each event index. Fortinet has tested up to 1 million entries. The query response time will be proportional to the size of the group.

    Case 1. For already existing indices, issue the REST API call to update the setting

    PUT fortisiem-event-*/_settings
    {
      "index" : {
        "max_terms_count" : "1000000"
      }
    }
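
    If a Kibana-style console is not available, the same call can be made with curl. The endpoint, port, scheme, and credentials below are assumptions, so adjust them to your Elasticsearch cluster.

    # Endpoint, port, and credentials are placeholders
    curl -u '<user>:<password>' -X PUT "http://<elasticsearch_host>:9200/fortisiem-event-*/_settings" \
        -H 'Content-Type: application/json' \
        -d '{ "index": { "max_terms_count": "1000000" } }'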
    

    Case 2. For new indices that are going to be created in the future, update fortisiem-event-template so those new indices will have a higher max_terms_count setting

    1. cd /opt/phoenix/config/elastic/7.7

    2. Add "index.max_terms_count": 1000000 (including quotations) to the “settings” section of the fortisiem-event-template.

      Example:

      ...

        "settings": {
          "index.max_terms_count": 1000000,
      

      ...

    3. Navigate to ADMIN > Storage > Online and perform Test and Deploy.

    4. Verify that new indices have the updated terms limit by executing the following simple REST API call.

      GET fortisiem-event-*/_settings

  2. FortiSIEM uses dynamic mapping for Keyword fields to save Cluster state. Elasticsearch needs to encounter some events containing these fields before it can determine their type. For this reason, queries containing a group by on any of these fields will fail if Elasticsearch has not seen any event containing these fields. The workaround is to first run a non-group-by query with these fields to make sure that these fields have non-null values.

EventDB Related

Currently, Policy based retention for EventDB does not cover two event categories: (a) System events with phCustId = 0, e.g. a FortiSIEM External Integration Error, FortiSIEM process crash, etc., and (b) Super/Global customer audit events with phCustId = 3, e.g. an audit log generated from a Super/Global user running an ad hoc query. These events are purged when disk usage reaches the high watermark.

HDFS Related

If you are running real-time Archive with HDFS, and have added Workers after the real-time Archive has been configured, then you will need to perform a Test and Deploy for HDFS Archive again from the GUI. This will enable HDFSMgr to know about the newly added Workers.

High Availability Related

If you make changes to the following files on any node in the FortiSIEM Cluster, then you will have to manually copy these changes to the other nodes, as shown in the sketch after this list.

  1. FortiSIEM Config file (/opt/phoenix/config/phoenix_config.txt): If you make a Supervisor (respectively Worker, Collector) related change in this file, then the modified file should be copied to all Supervisors (respectively Workers, Collectors).

  2. FortiSIEM Identity and Location Configuration file (/opt/phoenix/config/identity_Def.xml): This file should be identical in Supervisors and Workers. If you make a change to this file on any Supervisor or Worker, then you need to copy this file to all other Supervisors and Workers.

  3. FortiSIEM Profile file (ProfileReports.xml): This file should be identical in Supervisors and Workers. If you make a change to this file on any Supervisor or Worker, then you need to copy this file to all other Supervisors and Workers.

  4. SSL Certificate (/etc/httpd/conf.d/ssl.conf): This file should be identical in Supervisors and Workers. If you make a change to this file on any Supervisor or Worker, then you need to copy this file to all other Supervisors and Workers.

  5. Java SSL Certificates (files cacerts.jks, keyfile and keystore.jks under /opt/glassfish/domains/domain1/config/): If you change these files on a Supervisor, then you have to copy these files to all Supervisors.

  6. Log pulling External Certificates

  7. Event forwarding Certificates defined in the FortiSIEM Config file (/opt/phoenix/config/phoenix_config.txt): If you change these on one node, you need to change them on all nodes.

  8. Custom cron job: If you change this file on a Supervisor, then you have to copy this file to all Supervisors.
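
As a minimal illustration of the manual copy described above, the sketch below pushes the FortiSIEM Config file from item 1 to other nodes over SSH. The IP addresses are placeholders and root SSH access between nodes is an assumption.

# IP addresses are placeholders; repeat for every node of the matching role (Supervisor, Worker, or Collector)
for node in 10.0.0.11 10.0.0.12; do
    scp /opt/phoenix/config/phoenix_config.txt root@${node}:/opt/phoenix/config/phoenix_config.txt
done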
