Tools → Data Agent

About a Data Agent

Rather than opening a VPN or SSH tunnel between your external database and your Incorta cluster, you can install and configure a Data Agent service to run on the database’s host or a host on the same subnet as the database host. Typically, the database host resides behind a corporate firewall or on another interdepartmental subnet.

A Data Agent implementation is comprised of:

The Data Agent service supports the following data sources:

The data agent service enables the extraction of data from one or more data sources behind a firewall to an Incorta cluster. This means that you can have a single data agent that connects to multiple data sources in your organization. Your Incorta cluster can reside on-premises or in the cloud.

Requirements for the Data Agent service host

Here are the requirements to run a Data Agent service on a host:

  • Minimum of 16 GB of RAM and 4 CPUs
  • Ability to install Java or OpenJDK
  • The host must not block outgoing connections
  • The Incorta cluster must allow for two additional incoming ports
  • MySQL driver
Important

The connection between Incorta and a data agent service uses TLS/SSL authentication and requires a valid CA certificate or self-signed certificate. To learn more about TLS/SSL, refer to Security → HTTPS for Apache Tomcat with OpenSSL. The data agent encodes data for transfer using Google’s ProtoBuf library.

The data agent service connects to the Analytics Service and Loader Service through specific ports. The data agent service host must be able to accept incoming and outgoing communications from and to Incorta services.
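As a rough pre-installation check of the RAM and CPU minimums listed above, the following is a minimal sketch for a Linux host. The meets_minimums function is a hypothetical helper, and the sketch assumes that nproc and /proc/meminfo are available. It does not check Java, ports, or the MySQL driver.

```shell
#!/bin/sh
# Minimums taken from the host requirements (4 CPUs, 16 GB of RAM).
MIN_CPUS=4
MIN_RAM_GB=16

# meets_minimums CPUS RAM_GB
# Hypothetical helper: succeeds only when both values reach the minimums.
meets_minimums() {
  [ "$1" -ge "$MIN_CPUS" ] && [ "$2" -ge "$MIN_RAM_GB" ]
}

# Assumption: a Linux host that provides nproc and /proc/meminfo.
if command -v nproc >/dev/null 2>&1 && [ -r /proc/meminfo ]; then
  cpus=$(nproc)
  # MemTotal is in kB; round to the nearest GB because the kernel reserves
  # some memory, so a 16 GB host reports slightly less than 16.
  ram_gb=$(awk '/^MemTotal:/ { printf "%d", $2 / 1048576 + 0.5 }' /proc/meminfo)
  if meets_minimums "$cpus" "$ram_gb"; then
    echo "Host meets the minimums: ${cpus} CPUs, ~${ram_gb} GB RAM"
  else
    echo "Host is below the minimums: ${cpus} CPUs, ~${ram_gb} GB RAM"
  fi
fi
```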

Set up the Data Agent

Here are the high-level steps required to get the Data Agent working:

  1. Enable the Data Agent feature and get the installation package.
  2. Prepare the Data Agent Service host.
  3. Create a data agent object in the Data Manager and generate the authentication file.
  4. Copy the authentication file to the remote host.
  5. Start the Data Agent Service.
Important

Before upgrading to a newer version of the Data Agent, you must stop the Data Agent service.

Enable the Data Agent feature and get the installation package

The steps required to enable the Data Agent and get the package vary according to the cluster installation: On-Premises or Cloud (cloud.incorta.com).

Note

Starting 2024.1.x, enabling or disabling the Data Agent feature or changing the service ports no longer requires restarting the cluster or any service.

Enable the Data Agent for an On-Premises Incorta cluster

For On-Premises, you must contact Incorta Support directly for the Data Agent package.

You must also set the Data Agent configurations manually in the Cluster Management Console (CMC).

  • Sign in as the CMC Administrator.
  • In the Clusters Manager, select the cluster.
  • In the Cluster Manager, select Cluster Configurations.
  • In Server Configurations, in the left panel, select Data Agent.
  • Turn on the Enable Data Agent toggle.
  • Specify the Data Agent properties as required:
    • Analytics Data Agent Port
    • Analytics Data Agent Controller Port (required starting 2024.7.x)
    • Loader Data Agent Port
    • SQLi Data Agent Port
    • Analytics Public Hosts and Ports
    • Analytics Public Controller Hosts and Ports (required starting 2024.7.x)
    • Loader Public Hosts and Ports
    • SQLi Public Hosts and Ports
  • Select Save.
Note

Depending on your requirements, the host IP or DNS can be a public IP, public DNS, private IP, or private DNS.

| Property | Description |
| --- | --- |
| Analytics Data Agent Port | The Analytics Service listens to a data agent service on this local port. |
| Analytics Data Agent Controller Port | The Analytics Service listens to a data agent controller service on this local port. |
| Loader Data Agent Port | The Loader Service listens to a data agent service on this local port. |
| SQLi Data Agent Port | The SQL Interface Service listens to a data agent service on this local port. This option is available for On-Premises installations only. |
| Analytics Public Hosts and Ports | The host IP or DNS and the port. The data agent service connects to the Analytics Service using this HOST:PORT. The connection is forwarded to the specified Analytics Data Agent Port. |
| Analytics Public Controller Hosts and Ports | The host IP or DNS and the port. The data agent controller service connects to the Analytics Service using this HOST:PORT. The connection is forwarded to the specified Analytics Data Agent Controller Port. |
| Loader Public Hosts and Ports | The host IP or DNS and the port. The data agent service connects to the Loader Service using this HOST:PORT. The connection is forwarded to the specified Loader Data Agent Port. |
| SQLi Public Hosts and Ports | The host IP or host DNS and the port. The data agent service connects to the SQL Interface Service using this HOST:PORT. The connection is forwarded to the specified SQLi Data Agent Port. This option is available for On-Premises installations only. |

Enable and download the Data Agent for Incorta Cloud

For an Incorta Cloud cluster, you can enable and download the Data Agent from the Cloud Admin Portal.

  • Sign in to your Cloud Admin Portal as the Cloud Administrator.
  • In the Cloud Admin Portal, select the Incorta cluster. The cluster must be connected.
  • In Cluster Details, select Configurations.
  • If not previously enabled, enable the Data Agent by switching on the Enable Data Agent toggle.
  • Select the Download Data Agent link.
Note

When you enable the Data Agent from the Cloud Admin Portal, Incorta Cloud automatically enables and configures the Server Configurations for the cluster in the CMC.

Prepare the Data Agent host

Install Java or the OpenJDK

Before installing the data agent service on a host, you must first install Java or OpenJDK. The supported versions are:

  • Oracle Java 8
  • OpenJDK 8
  • OpenJDK 11

You can download OpenJDK 11 from https://jdk.java.net/archive.

The host environment must have a JAVA_HOME system environment variable with a value set to the OpenJDK directory. The PATH environment variable must include:

  • $JAVA_HOME/bin for Linux
  • %JAVA_HOME%\bin for Windows
Important

The Data Agent may not start normally on a Windows machine when the OpenJDK version is 11.0.1. The OpenJDK version must be upgraded to 11.0.2 or later.
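On a Linux host, the environment variables described above might be set as in the following sketch. The /opt/jdk-11.0.2 path is a hypothetical install location, not a path from the Data Agent package; replace it with your own OpenJDK directory. To make the settings persistent, they would typically go in a shell profile file.

```shell
# Hypothetical OpenJDK location; keep an existing JAVA_HOME if one is set.
export JAVA_HOME="${JAVA_HOME:-/opt/jdk-11.0.2}"

# Prepend $JAVA_HOME/bin to PATH only if it is not already there.
case ":$PATH:" in
  *":$JAVA_HOME/bin:"*) ;;
  *) export PATH="$JAVA_HOME/bin:$PATH" ;;
esac

# Show which Java version the shell resolves, if any.
if command -v java >/dev/null 2>&1; then
  java -version 2>&1 || echo "java was found but failed to run"
fi
```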

Unzip and install the Data Agent Service

You can install the data agent service on a host machine that runs Windows or Linux.

On a Windows host

Here are the steps to install the data agent service on a Windows host:

  • Copy the incorta.dataagent-X.Y.Z.zip file to the Windows host.
  • Unzip the incorta.dataagent-X.Y.Z.zip file to any local directory on the Windows host.
On a Linux host

Here are the steps to install the data agent service on a Linux host:

  • Secure copy the ZIP file to the Linux host. Here is an example:

    HOST_IP=192.168.128.100
    HOST_KEY_FILE=private.pem
    HOST_USER=incorta
    DATA_AGENT_FILE=incorta.dataagent-1.1.0.zip
    cd ~/Downloads
    scp -i ~/.ssh/${HOST_KEY_FILE} ${DATA_AGENT_FILE} ${HOST_USER}@${HOST_IP}:/tmp
  • Secure shell into the Linux host.

    ssh -i ~/.ssh/${HOST_KEY_FILE} ${HOST_USER}@${HOST_IP}
  • Unzip the ZIP file.

    DATA_AGENT_FILE=incorta.dataagent-1.1.0.zip
    cd /tmp
    unzip ${DATA_AGENT_FILE}

Configure the data agent service properties

The default memory size for the data agent is 2G in releases before 2024.7.x and 6G as of 2024.7.x. You can increase this value by editing the options.properties file located in the unzipped data agent directory.

Here are the steps to edit it on a Linux host:

  • Secure shell into the Linux host

    HOST_IP=192.168.128.100
    HOST_KEY_FILE=private.pem
    HOST_USER=incorta
    ssh -i ~/.ssh/${HOST_KEY_FILE} ${HOST_USER}@${HOST_IP}
  • Using VIM, or similar, edit the options.properties file.

    DATA_AGENT_PATH=/tmp/incorta.dataagent/
    cd $DATA_AGENT_PATH
    vim options.properties
  • Modify the memorySize property (use the i keystroke for Insert mode)

    memorySize=8G
  • Save your changes to the file (press Esc to return to Normal mode, then type :wq! and press Enter to save and quit).
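
If you prefer a non-interactive edit, for example in a provisioning script, the same change could be sketched as follows. The set_memory_size function is a hypothetical helper, not part of the Data Agent package, and it assumes that options.properties uses plain key=value lines like the memorySize=8G example above.

```shell
# set_memory_size FILE SIZE
# Hypothetical helper: sets memorySize in a key=value properties file,
# replacing an existing entry or appending one if none exists.
set_memory_size() {
  file=$1
  size=$2
  tmp=$(mktemp) || return 1
  if grep -q '^memorySize=' "$file"; then
    sed "s/^memorySize=.*/memorySize=${size}/" "$file" > "$tmp"
  else
    cat "$file" > "$tmp"
    printf 'memorySize=%s\n' "$size" >> "$tmp"
  fi
  # Write back in place so the file keeps its owner and permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For example, after changing to the unzipped data agent directory, you could run set_memory_size options.properties 8G.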

Deploy the MySQL driver

The Data Agent requires the MySQL driver, which is no longer included in the Data Agent package starting 2024.1.x. You can download the MySQL jar version 5.1.48 from the Maven repository and copy it to the required directories:

  • For releases before 2024.7.x, you must copy the MySQL jar to <unzipped_data_agent_path>/lib.
  • Starting 2024.7.x, you must copy the MySQL jar to the following directories:
    • <unzipped_data_agent_path>/incorta.dataagent/lib
    • <unzipped_data_agent_path>/incorta.dataagent.controller/lib
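
On a Linux host, the manual copy for the 2024.7.x layout could be scripted roughly as follows. This is a sketch: deploy_mysql_jar is a hypothetical helper, and it assumes that you have already downloaded the jar and that the two lib directories listed above exist under the unzipped path.

```shell
# deploy_mysql_jar JAR AGENT_PATH
# Hypothetical helper: copies an already-downloaded MySQL driver jar into
# both lib directories used starting 2024.7.x.
deploy_mysql_jar() {
  jar=$1
  agent_path=$2
  [ -f "$jar" ] || { echo "Jar not found: $jar" >&2; return 1; }
  for dir in incorta.dataagent/lib incorta.dataagent.controller/lib; do
    target="$agent_path/$dir"
    [ -d "$target" ] || { echo "Missing directory: $target" >&2; return 1; }
    cp "$jar" "$target/" || return 1
  done
}
```

You would call it as deploy_mysql_jar <mysql_jar_location> <unzipped_data_agent_path>, substituting your own paths.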

Releases 2024.1.4 and 2024.7.x introduce two scripts that automate the previous steps. From the unzipped incorta.dataagent directory, run one of the following scripts, depending on the OS of the machine where you install the Data Agent:

  • For Windows, run patch-mysql.bat <unzipped_data_agent_path>.
  • For Linux, run ./patch-mysql.sh <unzipped_data_agent_path>.

These scripts download the MySQL jar file version 5.1.48 from the Maven repository and deploy it to the required directories.

Check <unzipped_data_agent_path>/patch-mysql.log to inspect the script's output.

Note

If you already have the MySQL driver downloaded, you can use the script to only copy the jar file to the required directories. Add the jar location to the script as follows:

  • For Windows, run patch-mysql.bat <unzipped_data_agent_path> <mysql_jar_location>.
  • For Linux, run ./patch-mysql.sh <unzipped_data_agent_path> <mysql_jar_location>.

Create a data agent in the Data Manager

You create a data agent in the Data Manager to authenticate and monitor a remote data agent service. Only a Tenant Administrator (Super User) or user that belongs to a group with the SuperRole role for a given tenant can create a data agent instance that connects to a data agent service. A user that belongs to a group with the Schema Manager role can view the list of data agents in the Data Manager.

When you create a data agent in the Data Manager, you can generate and download an encrypted authentication file. The data agent service on the remote, on-premises host requires the generated .auth file.

Here are the steps to create a data agent in the Data Manager and generate the authentication file:

  • Sign in to the tenant as the Tenant Administrator (Super User) or user that belongs to a group with the SuperRole role.
  • In the Navigation bar, select Data.
  • In the Action bar, select + New > Add Data Agent.
  • In the Create Data Agent dialog, enter a Data Agent Name and optionally enter a description.
  • In the Generate Authentication File dialog, select Generate Now.
Note

You can regenerate an authentication file from the Data Manager for a given data agent. The file contains all the information the data agent service needs to connect to the Incorta cluster. The file includes information regarding the various hosts, ports, and TLS/SSL certificates.

Copy the authentication file to the remote host

You must then copy the .auth file to the conf directory of the data agent service installation directory.

Starting 2024.7.x, you must copy the .auth file to the following directories:

  • <unzipped_data_agent_path>/incorta.dataagent/conf
  • <unzipped_data_agent_path>/incorta.dataagent.controller/conf

To copy the authentication file to the remote host:

  • Secure copy or upload the .auth file to the Linux or Windows remote host.
  • Move the .auth file to the required conf directories, depending on your Incorta release.
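
On a Linux host, the move step might look like the following sketch. install_auth_file is a hypothetical helper. It assumes that the presence of an incorta.dataagent.controller directory indicates the 2024.7.x layout, and otherwise falls back to a single conf directory under the given path.

```shell
# install_auth_file AUTH_FILE AGENT_PATH
# Hypothetical helper: copies the .auth file into the conf directories
# that apply to the detected layout.
install_auth_file() {
  auth=$1
  agent_path=$2
  [ -f "$auth" ] || { echo "Auth file not found: $auth" >&2; return 1; }
  if [ -d "$agent_path/incorta.dataagent.controller" ]; then
    # Assumed 2024.7.x layout: agent and controller each have a conf directory.
    set -- "$agent_path/incorta.dataagent/conf" \
           "$agent_path/incorta.dataagent.controller/conf"
  else
    # Assumed pre-2024.7.x layout: a single conf directory.
    set -- "$agent_path/conf"
  fi
  for target in "$@"; do
    [ -d "$target" ] || { echo "Missing directory: $target" >&2; return 1; }
    cp "$auth" "$target/" || return 1
  done
}
```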

Start the data agent service

The steps required to start the data agent service vary according to the Incorta release. For releases before 2024.7.x, you start the data agent service via a script. Starting 2024.7.x, you start the Data Agent controller via a script, and then use the Data Manager in the Analytics platform to start, stop, or restart a data agent service.

Start the data agent service via a script for a Windows host (before 2024.7.x)

It is recommended that you use a service helper utility to monitor the data agent’s state and automatically restart it if it goes down. Here are the steps:

  • Install a service helper utility if you do not already have one.
  • Create a new service for the data agent and provide the full path to the agent.bat file.
  • Start the service.
  • Sign out from the host and sign back in to test that the service is still running.
  • View the events for the service in the Event Viewer.

Start the data agent service via a script for a Linux host (before 2024.7.x)

  • Secure shell into the remote host.
  • Navigate to the installation directory of the data agent service.
  • Run ./agent.sh start.
Note

Before upgrading to a newer data agent version, you must stop the data agent service first: Run the ./agent.sh stop command.

Confirm data agent service connection in the Data Manager

After starting the data agent, you can check its status in the Data Manager.

  • Sign in to the tenant as the Tenant Administrator (Super User) or user that belongs to a group with the SuperRole role.
  • In the Navigation bar, select Data.
  • In the Action bar tab, select Data Agents.
  • Verify that the status of the data agent is connected.

Start the data agent via the Analytics platform (starting 2024.7.x)

To use the Data Manager to start or stop the data agent service on the remote host, you must start the Data Agent Controller.

Here are the required steps to start the Controller:

  1. On the remote host, navigate to the Data Agent’s unzipped directory.
  2. In the incorta.dataagent.controller directory, run one of the following scripts depending on the host machine’s OS:
    • Linux: ./bin/controller.sh start
    • Windows: bin\controller.bat start
Note

You can also use a service helper utility on a Windows host to monitor the state of the Data Agent Controller and automatically restart it if it goes down.

After starting the Controller, you can use the Data Manager to start the data agent. Here are the steps:

  • Sign in to the tenant as the Tenant Administrator (Super User) or user that belongs to a group with the SuperRole role.
  • In the Navigation bar, select Data.
  • In the Action bar tab, select Data Agents.
  • For the data agent you have created, select Start.

Create or edit an external data source using the data agent

You can now create a new external data source, or edit an existing one, using the data agent. To learn more about how to create and edit an external data source, review Tools → Data Manager.

  • In the Create Data Source or Edit Data Source dialog, enable the Use Data Agent toggle.
  • For the Data Agent property, in the dropdown list, select the data agent.
  • Specify a connection string that is reachable from the data agent service host and uses a private IP or private DNS, such as 127.0.0.1 for a local host, or 192.168.128.100 (replace as required) for a database on the same subnet as the data agent host.
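
As an illustration, a MySQL JDBC connection string for such a database could be assembled as in the following sketch. The jdbc_mysql_url helper, port 3306, and the sales database name are assumptions made for the example; other data sources use different URL formats.

```shell
# jdbc_mysql_url HOST PORT DATABASE
# Hypothetical helper: prints a MySQL JDBC URL for the given private host.
jdbc_mysql_url() {
  printf 'jdbc:mysql://%s:%s/%s\n' "$1" "$2" "$3"
}

# Database on the same subnet as the data agent host (replace as required).
jdbc_mysql_url 192.168.128.100 3306 sales

# Database on the data agent host itself.
jdbc_mysql_url 127.0.0.1 3306 sales
```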