Custom job templates

When you select a template for your custom job, you might need to fill out additional fields depending on the template. The following templates require additional configuration before you can apply them.

Backup Table Validation

The Backup Table Validation template is used to verify the data integrity of the backup data at the selected location.

Select the storage group and enter the Hadoop Distributed File System (HDFS) URL for the backup location.

Custom Template

Custom templates let you create the content for custom jobs when the built-in jobs don't meet your specific needs. You can create custom templates to operate the hosts, collect information, take actions, and more.

Custom templates require you to use the Ansible playbook YAML format to define the content. For information about Ansible specifications, refer to the official Ansible documentation.

The following example template collects the disk usage of the BigData Controller and sends it to a Slack channel:

- name: Collect disk usage and send to slack
  hosts: controllerIp
  vars:
    slack_url: "https://hooks.slack.com/services/xxxxxxx" # your slack app webhook url
  tasks:
    - name: Collect disk usage
      command: "df -h"
      register: result # store the df output for the next task
    - name: Send to slack
      uri:
        url: "{{ slack_url }}"
        body: '{"text": "{{ result.stdout }}"}'
        body_format: json
        method: POST

The following lists show all the Ansible inventory group names you can use as hosts values in your playbook or template. These values are pre-populated in the Ansible inventory and are automatically applied with each execution.

  • hdfs_datanode
  • hdfs_namenode
  • kudu_tserver
  • kudu_hive_metastore
  • zookeeper
  • kafka_broker
  • impala_catalog
  • impala
  • impala_statestore
  • yarn_nodemanager
  • yarn_resource_manager
  • spark_history_server

These inventory groups can be used to select the host(s) that have the named services running.

For example, using “hosts: kudu_tserver” in your playbook runs it on all hosts that have a kudu-tserver instance.

  • hdfs_datanode_reachable
  • hdfs_namenode_reachable
  • kudu_tserver_reachable
  • kudu_reachable
  • hive_metastore_reachable
  • zookeeper_reachable
  • kafka_broker_reachable
  • impala_catalog_reachable
  • impala_reachable
  • impala_statestore_reachable
  • yarn_nodemanager_reachable
  • yarn_resource_manager_reachable
  • spark_history_server_reachable

These groups can be used to select one of the reachable hosts that belong to the named service.

For example, if Kudu has instances spread across three hosts, “hosts: kudu_reachable” randomly selects one host that is reachable at execution time.

  • metastore
  • datanode
  • master

These groups can be used to select the hosts that belong to the named role.

  • metastore_reachable
  • datanode_reachable
  • master_reachable

These groups can be used to select a random host, from those with the named role, that is reachable at execution time.

  • controllerIp

This group can be used to select the BigData Controller host.

In addition to these groups, you can also use a host name shown on the Hosts page to directly select a particular host for the playbook execution.

Data Log Type Appendix

The Data Log Type Appendix template regenerates the list of available log types for LogView.

Note: This is a resource-intensive operation. Run this job only if the available log types sidebar in LogView is not working properly.

Docker System Prune

The Docker System Prune template removes all unused Docker containers, networks, and images (both dangling and unreferenced) to free disk space.

Facet Formation Manual Run

The Facet Formation Manual Run template enables you to manually run facet formation. Run this job only when FortiView query performance is exceptionally slow.

First, select a storage group, and then select the time to start facet formation: either from the beginning or from a specific time.

HDFS Safemode Leave

The HDFS Safemode Leave template enables you to take HDFS out of safe mode after an unexpected shutdown.

Hive Metastore Backup

The Hive Metastore Backup template creates a backup of the data in Hive Metastore and saves it to an HDFS location.

Hive Metastore Restore

The Hive Metastore Restore template restores the data in Hive Metastore from an HDFS location.

Kafka Deep Clean

The Kafka Deep Clean template deep cleans Kafka topics and reinstalls Kafka (see How to recover from an unhealthy service status).

Kafka Rebalance

The Kafka Rebalance template rebalances the data load across the Security Event Manager hosts. This is useful when a Kafka node is decommissioned, or when a node joins or leaves the cluster. It includes replica leadership rebalance and partition rebalance. For more information, see Scaling FortiAnalyzer-BigData.

NTP Sync

The NTP Sync template performs a manual NTP time sync on all the BigData hosts. Run this job when Kudu time synchronization fails (see How to recover from an unhealthy service status).

Purge Data Pipeline

The Purge Data Pipeline template resets the watermark and performs a clean restart of the pipeline.

Note: Any unprocessed data will be lost (see How to recover from an unhealthy service status).
