This document describes the CSV Data Management connector, which lets you perform operations on CSV files such as reading a CSV file, deduplicating data, merging, joining, or concatenating two CSV files, and returning well-formatted datasets.
This connector uses functionality from the 'Polars', 'Pandas', and 'NumPy' Python modules for merging, joining, and concatenating CSV files. The CSV file data is converted into a data frame and then processed. To learn more about these operations, see the following:
https://pola-rs.github.io/polars-book/user-guide/
https://pandas.pydata.org/docs/user_guide/merging.html
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.join.html
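As a rough illustration of the three data-frame operations named above, the following sketch uses pandas directly on two small in-memory CSV payloads. The data and variable names are made up for the example, and the connector's own internals may differ (it may use Polars instead, for instance).

```python
import io

import pandas as pd

# Two tiny CSV payloads standing in for uploaded files (made-up data).
first_csv = "ip,severity\n10.0.0.1,high\n10.0.0.2,low\n"
second_csv = "ip,owner\n10.0.0.1,alice\n10.0.0.3,bob\n"

# Each CSV is loaded into a data frame before any processing.
first = pd.read_csv(io.StringIO(first_csv))
second = pd.read_csv(io.StringIO(second_csv))

# merge: match rows on a shared column (pandas defaults to an inner match).
merged = first.merge(second, on="ip")

# concat: stack the rows of both frames; a column missing from one frame is filled with NaN.
stacked = pd.concat([first, second], ignore_index=True)

# join: align rows on the index (pandas defaults to a left join).
joined = first.set_index("ip").join(second.set_index("ip"))

print(merged)
print(stacked)
print(joined)
```

With this data, the merge keeps only the one IP address present in both files, the concat keeps all four rows, and the join keeps both rows of the first file.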
Connector Version: 1.2.0
FortiSOAR™ Version Tested on: 7.4.0-3024
Authored By: Fortinet
Certified: Yes
The following enhancements have been made to the CSV Data Management connector in version 1.2.0:
Use the Content Hub to install the connector. For the detailed procedure to install a connector, click here.
You can also use the following yum command as a root user to install connectors from an SSH session:
yum install cyops-connector-csv-data-management
You do not need to configure this connector, since it performs operations on CSV files such as reading files, deduplicating data, and merging two CSV files. For the description of the Content Hub and other details, click here.
The following automated operations can be included in playbooks, and you can also use the annotations to access operations:
| Function | Description | Annotation and Category |
|---|---|---|
| Extract Data from Single CSV | Extracts data from a CSV file based on the specified column names and other input parameters. Optionally, you can also select an option to deduplicate the resultant recordset based on the specified column(s). | read_csv_file Investigation |
| Merge and Extract Data from two CSV | Extracts data from CSV files based on the specified column names and other input parameters, by merging two CSV files. Optionally, you can also select an option to deduplicate the resultant recordset based on the specified column(s). | read_and_merge_csv_file Investigation |
| Concat and Extract Data from two CSV | Extracts data from CSV files based on the specified column names and other input parameters, by concatenating two CSV files. Optionally, you can also select an option to deduplicate the resultant recordset based on the specified column(s). | read_and_concat_csv_file Investigation |
| Join and Extract Data from two CSV | Extracts data from CSV files based on the specified column names and other input parameters, by joining two CSV files. Optionally, you can also select an option to deduplicate the resultant recordset based on the specified column(s). | read_and_join_csv_file Investigation |
| Convert JSON to CSV | Converts the output of a playbook step from the JSON format to a CSV file. NOTE: The JSON provided in the specified playbook step output must be simple JSON for this operation to work; if the input is complex JSON, this operation fails to create the CSV file. | json_to_csv Investigation |
All database operations such as filtering of datasets, deduplication of values, etc. occur on the dataset that is the result of the concat, join, or merge actions. The following points should be noted:
Operation: Extract Data from Single CSV
| Parameter | Description |
|---|---|
| Type | Select the method by which you want to submit the CSV file whose data you want to extract. You can choose between Attachment ID and File IRI. |
| Reference ID | Specify the reference ID of the file based on the 'Type' you have selected. |
| Column Names | Specify the comma-separated column names that you want to extract from the specified CSV file. |
| Deduplicate Values on | Specify the column name on which you want to deduplicate data from the specified CSV file. |
| Number of rows to skip | Specify the number of rows you want to skip from the top of the specified CSV file. Note: The first row is skipped even if it has column names. |
| Filter Dataset | (Optional) Select the method by which you want to filter data. You can filter column data using a regex (On Values Matching a Regex) or using the 'is in' filter (On Specified values). If you choose the 'On Values Matching a Regex' option, then you must specify the following parameters: |
| Convert recordset into batch | Select this option to return rows as recordsets in a list of 20 batches. If this option is left cleared, then the complete result is returned in a single recordset. |
| Save as attachment | Select this option to save the resultant recordset as an attachment in the CSV format. |
No output schema is available at this time.
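The parameters above can be pictured with a small plain-Python sketch. The `extract` helper, its argument names, and the sample data are hypothetical, and the order of filtering and deduplication is an assumption; the connector itself relies on data-frame libraries. The sketch also shows why the row-skipping note matters: rows are dropped from the top of the file, header included, and the first remaining row is then treated as the header.

```python
import csv
import re


def extract(csv_text, columns, skip_rows=0, dedupe_on=None,
            regex_filter=None, isin_filter=None):
    """Return the selected columns as a list of dicts (hypothetical helper)."""
    # Drop rows from the top of the file; the header row is not protected.
    lines = csv_text.splitlines()[skip_rows:]
    # The first remaining row is used as the header.
    reader = csv.DictReader(lines)
    seen, result = set(), []
    for row in reader:
        # 'is in' filter, given as (column, allowed values).
        if isin_filter and row[isin_filter[0]] not in isin_filter[1]:
            continue
        # Regex filter, given as (column, pattern).
        if regex_filter and not re.search(regex_filter[1], row[regex_filter[0]]):
            continue
        # Keep only the first row seen for each value of the dedupe column.
        if dedupe_on is not None:
            if row[dedupe_on] in seen:
                continue
            seen.add(row[dedupe_on])
        result.append({name: row[name] for name in columns})
    return result


sample = (
    "exported by scanner\n"  # junk line above the real header
    "host,port,state\n"
    "web01,443,open\n"
    "web01,80,open\n"
    "db01,5432,closed\n"
)

rows = extract(sample, ["host", "port"], skip_rows=1, dedupe_on="host",
               regex_filter=("state", r"^open$"))
print(rows)  # [{'host': 'web01', 'port': '443'}]
```

Skipping one row here removes the junk line so that `host,port,state` becomes the header. Skipping two rows would instead turn the first data row into the header, which is usually not what you want.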
Operation: Merge and Extract Data from two CSV
| Parameter | Description |
|---|---|
| Type | Select the method by which you want to submit the first CSV file whose data you want to merge and extract. You can choose between Attachment ID and File IRI. |
| First File Reference ID | Specify the reference ID of the first CSV file based on the 'Type' you have selected. |
| First File Column Names | Specify the comma-separated column names that you want to extract from the specified first CSV file. |
| Number of rows to skip from First File | Specify the number of rows you want to skip from the top of the first specified CSV file. Note: The first row is skipped even if it has column names. |
| Type | Select the method by which you want to submit the second CSV file whose data you want to merge and extract. You can choose between Attachment ID and File IRI. |
| Second File Reference ID | Specify the reference ID of the second CSV file based on the 'Type' you have selected. |
| Second File Column Names | Specify the comma-separated column names that you want to extract from the specified second CSV file. |
| Number of rows to skip from Second File | Specify the number of rows you want to skip from the top of the second specified CSV file. Note: The first row is skipped even if it has column names. |
| Merge on Column | Specify the name of the column that is common to the two specified CSV files and on which you want to merge the data from both files. |
| Deduplicate Values on | Specify the column name on which you want to deduplicate data from the specified CSV files. |
| Filter Dataset | (Optional) Select the method by which you want to filter data. You can filter column data using a regex (On Values Matching a Regex) or using the 'is in' filter (On Specified values). If you choose the 'On Values Matching a Regex' option, then you must specify the following parameters: |
| Convert recordset into batch | Select this option to return rows as recordsets in a list of 20 batches. If this option is left cleared, then the complete result is returned in a single recordset. |
| Save as attachment | Select this option to save the resultant recordset as an attachment in the CSV format. |
No output schema is available at this time.
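Because deduplication and filtering run on the merged dataset rather than on each input file, you can deduplicate on a column that exists in only one of the two files. The plain-Python sketch below illustrates that ordering. The helper names and data are hypothetical, and the inner-match behavior is an assumption borrowed from the pandas `merge` default, not something this page states about the connector.

```python
import csv


def read_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(csv_text.splitlines()))


def merge_on(first_rows, second_rows, key):
    """Inner merge: keep only key values present in both inputs (assumed merge type)."""
    by_key = {}
    for row in second_rows:
        by_key.setdefault(row[key], []).append(row)
    merged = []
    for left in first_rows:
        for right in by_key.get(left[key], []):
            merged.append({**left, **right})
    return merged


def dedupe(rows, column):
    """Keep the first row seen for each value of the given column."""
    seen, unique = set(), []
    for row in rows:
        if row[column] not in seen:
            seen.add(row[column])
            unique.append(row)
    return unique


alerts = read_rows("ip,alert\n10.0.0.1,malware\n10.0.0.1,phishing\n10.0.0.9,scan\n")
assets = read_rows("ip,owner\n10.0.0.1,alice\n10.0.0.2,bob\n")

merged = merge_on(alerts, assets, "ip")  # two rows, both for 10.0.0.1
result = dedupe(merged, "owner")         # 'owner' exists only in the second file
print(result)
```

Here the merge produces two rows for the same owner, and deduplicating on `owner` afterwards leaves one.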
Operation: Concat and Extract Data from two CSV
| Parameter | Description |
|---|---|
| Type | Select the method by which you want to submit the first CSV file whose data you want to concatenate and extract. You can choose between Attachment ID and File IRI. |
| First File Reference ID | Specify the reference ID of the first CSV file based on the 'Type' you have selected. |
| First File Column Names | Specify the comma-separated column names that you want to extract from the specified first CSV file. |
| Number of rows to skip from First File | Specify the number of rows you want to skip from the top of the first specified CSV file. Note: The first row is skipped even if it has column names. |
| Type | Select the method by which you want to submit the second CSV file whose data you want to concatenate and extract. You can choose between Attachment ID and File IRI. |
| Second File Reference ID | Specify the reference ID of the second CSV file based on the 'Type' you have selected. |
| Second File Column Names | Specify the comma-separated column names that you want to extract from the specified second CSV file. |
| Number of rows to skip from Second File | Specify the number of rows you want to skip from the top of the second specified CSV file. Note: The first row is skipped even if it has column names. |
| Deduplicate Values on | Specify the column name on which you want to deduplicate data from the specified CSV files. |
| Filter Dataset | (Optional) Select the method by which you want to filter data. You can filter column data using a regex (On Values Matching a Regex) or using the 'is in' filter (On Specified values). If you choose the 'On Values Matching a Regex' option, then you must specify the following parameters: |
| Convert recordset into batch | Select this option to return rows as recordsets in a list of 20 batches. If this option is left cleared, then the complete result is returned in a single recordset. |
| Save as attachment | Select this option to save the resultant recordset as an attachment in the CSV format. |
No output schema is available at this time.
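Concatenation stacks the rows of the second file under those of the first, and batching then splits the combined rows into smaller lists. The sketch below is plain Python with hypothetical helper names and made-up data. The parameter description does not spell out how batches are sized, so the fixed-size chunking and the `batch_size` value of 2 are assumptions made only for illustration.

```python
import csv


def read_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(csv_text.splitlines()))


def concat(first_rows, second_rows):
    """Stack rows; a column missing from a row is filled with None."""
    all_rows = first_rows + second_rows
    columns = []
    for row in all_rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    return [{name: row.get(name) for name in columns} for row in all_rows]


def to_batches(rows, batch_size):
    """Split rows into consecutive chunks of at most batch_size (assumed behavior)."""
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]


first = read_rows("host,os\nweb01,linux\nweb02,linux\n")
second = read_rows("host,os,owner\ndb01,windows,carol\n")

combined = concat(first, second)
batches = to_batches(combined, batch_size=2)
print(batches)
```

Rows from the first file have no `owner` column, so that field is empty for them, which mirrors how data-frame libraries pad missing columns during concatenation.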
Operation: Join and Extract Data from two CSV
| Parameter | Description |
|---|---|
| Type | Select the method by which you want to submit the first CSV file whose data you want to join and extract. You can choose between Attachment ID and File IRI. |
| First File Reference ID | Specify the reference ID of the first CSV file based on the 'Type' you have selected. |
| First File Column Names | Specify the comma-separated column names that you want to extract from the specified first CSV file. |
| Number of rows to skip from First File | Specify the number of rows you want to skip from the top of the first specified CSV file. Note: The first row is skipped even if it has column names. |
| Type | Select the method by which you want to submit the second CSV file whose data you want to join and extract. You can choose between Attachment ID and File IRI. |
| Second File Reference ID | Specify the reference ID of the second CSV file based on the 'Type' you have selected. |
| Second File Column Names | Specify the comma-separated column names that you want to extract from the specified second CSV file. |
| Number of rows to skip from Second File | Specify the number of rows you want to skip from the top of the second specified CSV file. Note: The first row is skipped even if it has column names. |
| Merge on Column | Specify the name of the column that is common to the two specified CSV files and on which you want to merge the data from both files. |
| Deduplicate Values on | Specify the column name on which you want to deduplicate data from the specified CSV files. |
| Filter Dataset | (Optional) Select the method by which you want to filter data. You can filter column data using a regex (On Values Matching a Regex) or using the 'is in' filter (On Specified values). If you choose the 'On Values Matching a Regex' option, then you must specify the following parameters: |
| Convert recordset into batch | Select this option to return rows as recordsets in a list of 20 batches. If this option is left cleared, then the complete result is returned in a single recordset. |
| Save as attachment | Select this option to save the resultant recordset as an attachment in the CSV format. |
No output schema is available at this time.
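This page does not say which join type the connector applies. In pandas, `DataFrame.join` defaults to a left join, whereas `DataFrame.merge` defaults to an inner match, so one practical difference can be whether unmatched rows from the first file survive. The plain-Python sketch below shows the left-join case under that assumption; the `left_join` helper and the data are hypothetical.

```python
def left_join(first_rows, second_rows, key):
    """Keep every row of the first input; fill columns from the second where the key matches."""
    second_columns = [name for name in (second_rows[0] if second_rows else {}) if name != key]
    # If a key repeats in the second input, the last row wins in this sketch.
    lookup = {row[key]: row for row in second_rows}
    joined = []
    for left in first_rows:
        right = lookup.get(left[key])
        extra = {name: (right[name] if right else None) for name in second_columns}
        joined.append({**left, **extra})
    return joined


hosts = [
    {"host": "web01", "os": "linux"},
    {"host": "db01", "os": "windows"},
]
owners = [{"host": "web01", "owner": "alice"}]

joined = left_join(hosts, owners, "host")
print(joined)
```

The `db01` row has no match in the second input, yet it remains in the result with an empty `owner`. An inner merge on the same data would drop it.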
Operation: Convert JSON to CSV
NOTE: The JSON provided in the specified playbook step output must be simple JSON for this operation to work; if the input is complex JSON, this operation fails to create the CSV file.
| Parameter | Description |
|---|---|
| Input Type | Select the method by which you want to submit the playbook step output that you want to convert from the JSON format to the CSV format. You can choose between JSON, Attachment ID, or File IRI. |
No output schema is available at this time.
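The note about simple JSON can be made concrete with a short sketch. It assumes that "simple" means a flat object, or a list of flat objects, whose values are all scalars; that reading is an assumption, since this page does not define the term. The `json_to_csv` helper below is hypothetical and uses only the Python standard library.

```python
import csv
import io
import json


def json_to_csv(json_text):
    """Convert flat JSON records to CSV text; reject nested values (hypothetical helper)."""
    records = json.loads(json_text)
    if isinstance(records, dict):
        records = [records]  # treat a single object as one row
    for record in records:
        for value in record.values():
            if isinstance(value, (dict, list)):
                raise ValueError("nested JSON is not supported in this sketch")
    # Collect column names in first-seen order across all records.
    fieldnames = []
    for record in records:
        for name in record:
            if name not in fieldnames:
                fieldnames.append(name)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # keys missing from a record are written as empty cells
    return buffer.getvalue()


flat = '[{"ip": "10.0.0.1", "count": 3}, {"ip": "10.0.0.2", "count": 5}]'
print(json_to_csv(flat))
```

Passing input such as `[{"ip": "10.0.0.1", "tags": ["a", "b"]}]` raises `ValueError` in this sketch, because a list value has no single obvious CSV cell representation.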
The Sample - CSV Data Management - 1.2.0 playbook collection comes bundled with the CSV Data Management connector. These playbooks contain steps that you can use to perform all supported actions. You can see the bundled playbooks in the Automation > Playbooks section in FortiSOAR™ after importing the CSV Data Management connector.
Note: If you plan to use any of the sample playbooks in your environment, clone those playbooks and move them to a different collection, since the sample playbook collection is deleted when the connector is upgraded or deleted.