Setting up Incorta to populate Google Cortex
Incorta brings your data into BigQuery so that you can take advantage of the business analytics capabilities provided by any Cortex applications.
Installing Incorta and hooking it up to your EBS instance is a relatively simple process and can be accomplished quickly so that you can start realizing value.
High-level steps
- Install the Incorta VM in GCP
- Deploy the Data Agent on the EBS server
- Import and configure the Incorta tenant for EBS
- Configure the Incorta Data Destination to write to BigQuery
- Load the Incorta Schemas
- Data arrives in BigQuery
Deploy Incorta Developer Edition
Get started by deploying Incorta Developer Edition on GCP using a VM instance and a GCS bucket for your Incorta Tenant directory. The process uses Terraform to set up the environment and create all the needed resources. This approach allows you to speed up the installation process from hours to minutes.
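For orientation, the Terraform-driven flow boils down to a handful of commands, all of which are covered step by step in the sections below (shown here only as a sketch; the directory and variable file names come from those steps):
cd incorta-developer-edition/gcp/             # the cloned repository's GCP directory
gcloud auth login                             # authenticate to GCP
terraform init                                # initialize the working directory
terraform plan --var-file=variables.tfvars    # preview the resources to be created
terraform apply --var-file=variables.tfvars   # create the VM, network, bucket, and IAM resources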
Installation prerequisites
- You must have an active GCP account, and your user must have permission to create the following resources.
- VPC network and Subnet
- External IP
- Firewall rule
- GCS bucket
- Custom IAM Role
- Service account and key
- VM instance
These permissions are only needed during environment setup and configuration.
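If you are not sure which roles your user currently holds, one way to check is to inspect the project IAM policy with the gcloud CLI. This is a hedged sketch; substitute your own project ID and account email:
gcloud projects get-iam-policy YOUR_PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:user:you@example.com" \
  --format="table(bindings.role)"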
Installation packages
- Install the gcloud CLI
- Install Terraform
- Clone the incorta-developer-edition GitHub repository (a quick verification sketch follows this list)
- Terraform files - Request these from your Incorta representative
- outputs.tf: Defines the outputs of your Terraform setup
- variables.tf: Contains the declarations for variables used in the configuration
- vm.tf: Contains the configuration for the GCP Virtual Machine
- network.tf: Contains the configuration for the GCP network components
- provider.tf: Contains the configuration for the GCP provider
- gcs-bucket.tf: Contains the configuration for the GCS Storage Bucket
- variables.tfvars.example: An example variables file
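Before proceeding, it can help to confirm the tooling is in place. The commands below are a minimal check; the repository URL is a placeholder, so use the location provided by your Incorta representative:
gcloud version        # confirm the gcloud CLI is installed
terraform -version    # confirm Terraform is installed
git clone <INCORTA_DEVELOPER_EDITION_REPO_URL>   # placeholder URL - clone the incorta-developer-edition repository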
Resources that need to be created
- VPC network
- Subnetwork
- External IP
- Firewall rules
- Open ports 22, 6060, 8080
- By default, the firewall resource range is 0.0.0.0/0 (configurable via var file)
- GCS bucket
- Custom IAM Role with the following permissions (a gcloud sketch of the equivalent manual steps follows this list):
["storage.buckets.get", "storage.objects.create", "storage.objects.get", "storage.objects.list", "storage.objects.delete", "storage.objects.getIamPolicy", "storage.objects.setIamPolicy", "storage.objects.update"]
- Service account and key
- Use Custom IAM Role to access GCS bucket
- VM instance
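Terraform creates the custom IAM role, service account, and key for you. For reference only, the equivalent manual gcloud steps would look roughly like this; the role ID, service account name, key file name, and project ID are hypothetical:
# create a custom role with the storage permissions listed above
gcloud iam roles create incortaGcsAccess --project=YOUR_PROJECT_ID \
  --title="Incorta GCS Access" \
  --permissions=storage.buckets.get,storage.objects.create,storage.objects.get,storage.objects.list,storage.objects.delete,storage.objects.getIamPolicy,storage.objects.setIamPolicy,storage.objects.update
# create a service account and download a key for it
gcloud iam service-accounts create incorta-dev-edition --project=YOUR_PROJECT_ID
gcloud iam service-accounts keys create incorta-sa-key.json \
  --iam-account=incorta-dev-edition@YOUR_PROJECT_ID.iam.gserviceaccount.com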
Instructions
After cloning the incorta-developer-edition GitHub repository, navigate to the gcp directory under the incorta-developer-edition directory.
cd incorta-developer-edition/gcp/
Authenticate GCP account
You must authenticate your Google Cloud Platform (GCP) account to allow Terraform to interact with it. To do this, run the following command.
gcloud auth login
Follow the prompts to log in to your GCP account.
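Depending on how the GCP provider picks up credentials, Terraform may also need Application Default Credentials in addition to your gcloud login. If terraform plan later fails with an authentication error, running this usually resolves it:
gcloud auth application-default login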
Configure your Incorta environment
The files under your gcp directory contain the configuration for the environment. Rename variables.tfvars.example to variables.tfvars and update it with your specific values.
vi variables.tfvars

project                = "gcp-project-id"
provider-region        = "us-central1"
provider-zone          = "us-central1-f"
subnet-range           = "10.0.0.0/8"
machine-type           = "e2-standard-16"
private-key            = "/home/ubuntu/.ssh/id_rsa"
public-key             = "/home/ubuntu/.ssh/id_rsa.pub"
user                   = "ubuntu"
firewall-source-ranges = ["0.0.0.0/0"]
email                  = "admin@example.com"
environment-name       = "example"

#### Using an existing VPC and Subnet ####
# If you want to use an existing VPC and Subnet, set exist-vpc-name and exist-subnet-name to the VPC and Subnet names.
# If you want Terraform to create a new VPC and Subnet, leave exist-vpc-name and exist-subnet-name empty.
exist-vpc-name    = ""
exist-subnet-name = ""
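The private-key and public-key entries must point to an existing SSH key pair on the machine running Terraform. If you do not already have one at those paths, you can generate it first (the path below matches the example values above):
ssh-keygen -t rsa -b 4096 -f /home/ubuntu/.ssh/id_rsa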
Initialize Terraform
Terraform needs to be initialized in your working directory. In the gcp directory containing the terraform files, run:
terraform init
This command initializes various local settings and data that will be used by subsequent commands.
Check the Terraform creation plan
To show the infrastructure creation plan, run the following:
terraform plan --var-file=variables.tfvars
Terraform will output the proposed infrastructure plan for review; unlike apply, this command does not prompt for confirmation. If the command completes successfully, you can proceed to the next step to apply the configuration.
Apply the configuration
To create the infrastructure, run:
terraform apply --var-file=variables.tfvars
Terraform will output the proposed infrastructure plan and request a confirmation. Type yes to proceed. If the command is successful, Terraform will create the infrastructure.
Access the environment
After the infrastructure is created, Terraform will output the Incorta and CMC URLs. Your credentials will be in the outputs as well. Open a web browser and go to Incorta Developer Edition using the provided Incorta URL.
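If you need to see the URLs and credentials again later, Terraform can re-display them from its state at any time:
terraform output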
Terminate environment resources
To terminate the infrastructure, run:
terraform destroy --var-file=variables.tfvars
Configure memory allocation
Because the need for access to the Incorta UI is minimal in this use case, the allocation of resources may need to be modified. More resources should be given to the loader service since it is doing the majority of the work.
Here are recommendations for how to allocate resources on an Incorta server with 128 GB of RAM.
Allocated to | Amount |
---|---|
Operating System | 8 GB |
Spark | 20 GB |
Loader
Allocated to | Amount |
---|---|
On heap | 30 GB |
Off heap | 50 GB |
CPU | 50% |
Analytics
Allocated to | Amount |
---|---|
On heap | 5 GB |
Off heap | 15 GB |
CPU | 25% |
If you are using a different sized server, use the tables above as a guideline and allocate resources proportionally. For example, on a 64 GB server you would roughly halve each memory figure while keeping the CPU percentages the same.
Prepare Incorta to connect to Google BigQuery
Once the installation of Incorta is done, you will need to complete a couple of steps to prepare it to push your EBS data to BigQuery/Cortex.
At a high level, you will follow these steps to get Incorta talking to BigQuery:
- Import and configure the EBS tenant
- Configure Incorta to write to Google BigQuery via a Data Destination
- Load the schemas to push the source EBS data first to Incorta and then to BigQuery
Import and configure the Incorta Tenant for EBS
Incorta has prepared a data application that contains a preconfigured tenant, the data model, and other configurations required for Cortex EBS data applications.
Install the Oracle EBS Data Application
Deploy the data application for Oracle EBS Order to Cash for BigQuery, which requires the following ZIP file:
InstallBQEBSVer<VERSION_NUMBER>.zip
The Oracle EBS data application requires the import of a tenant file to an Incorta Cluster. A CMC Administrator can perform the tenant import. A systems administrator with root access to the host can copy the tenant data files, configure Spark, and configure the necessary data sources.
Here are the steps to install the Oracle EBS data application:
- Contact Incorta Support to obtain the Oracle EBS data application ZIP file. Download and unzip the file.
- Within the unzipped Oracle EBS data application file, locate the ebsVer<VERSION_NUMBER>.zip file.
Import the EBS Data Application tenant
- Import the ebsVer<VERSION_NUMBER>.zip tenant:
  - Sign in to the CMC
  - In the Navigation bar, select Clusters
  - In the cluster list, select a Cluster name
  - In the canvas tabs, select Tenants
  - Select + → Import Tenant (if prompted for a security password, enter admin)
  - Drag and drop the ebsVer<VERSION_NUMBER>.zip file to the Click or drag a file here to upload panel in the Import a tenant to the cluster dialog
  - Select Next
  - Verify or enter the tenant properties
  - Verify or enter the tenant email properties
  - Select Create
Tenant properties
Property | Control | Description |
---|---|---|
Name | text box | Enter the tenant name. Select Check to determine if a Tenant already exists with the name entered. |
Username | text box | Enter the username for the Super User |
Password | text box | Enter the password for the Super User |
Email | text box | Enter the email address for the Super User |
Path | text box | Enter the shared storage path for tenant related data |
Pause scheduled jobs | toggle | Enable this property if the imported tenant will have all scheduled jobs paused on import |
Tenant email properties
Property | Control | Description |
---|---|---|
Sender’s Username Auth | toggle | Enable this property if the email requires username authentication |
System Email Username | text box | Enable Sender’s Username Auth to configure this property. Enter the username for the system email. |
System Email Address | text box | Enter the system email address |
System Email Password | text box | Enter the password for the system email address |
SMTP Host | text box | Enter the Simple Mail Transfer Protocol (SMTP) host for the system email |
SMTP Port | text box | Enter the SMTP port number |
Share Notifications | toggle | Enable this property to share notifications |
Import the EBS Data Application tenant data files
The EBS data application package contains a few CSV data files in the data folder that you must import by navigating in the Incorta UI to Data → Local Data Files and selecting New → Add Data Source → Upload Data files. Drag those files into the popup window and select the Overwrite Duplicates option.
Configure Spark for the EBS Data Application
Allocate 10 GB or more to the Spark Worker process and restart Spark:
- Edit the <INCORTA_INSTALLATION_PATH>/IncortaNode/spark/conf/spark-env.sh file with vim
- Set the SPARK_WORKER_MEMORY value to 10g (or greater):
SPARK_WORKER_MEMORY=10g
- Restart Spark:
cd /<incorta home>/IncortaAnalytics/IncortaNode
./stopSpark.sh
./startSpark.sh
- Check the Spark configuration using the Spark Master UI:
http://<HOST_NAME>:9091
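After restarting, a quick way to confirm the memory setting was picked up is to check the configuration file again, for example:
grep SPARK_WORKER_MEMORY <INCORTA_INSTALLATION_PATH>/IncortaNode/spark/conf/spark-env.sh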
Configure the EBS Data Application data source
- A data agent is needed because Incorta uses it to run the table extract SQL statements against the EBS source Oracle database, which sits outside GCP
- Deploy the Data Agent on the EBS source. Refer to this document on how to install, configure, and test it
- Sign in to the Incorta Direct Data Platform™
- In the Navigation bar, select Data
- In the Context bar, select the External Data Source tab
Edit the EBS data source
- Select Edit (pencil icon) to the right of the EBS data source
- Enter the connection properties for your EBS instance:
- Username
- Password
- Connection Pool: 10
- Connection String (an example with sample values follows these steps):
jdbc:oracle:thin:@<HOST>:<PORT>/<DATABASE>
- Test the connection; if the data agent is set up correctly, you will see a success message
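For reference, a filled-in connection string with sample host, port, and service name values would look like this (the values shown are hypothetical; use those of your EBS database):
jdbc:oracle:thin:@ebs-db.example.com:1521/EBSPROD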
Configure the initial extract date
Optionally configure the $$Default_InitialExtractDate global variable to store the initial extract date or rolling period. This limits the number of rows extracted from large transaction tables during a full load.
- In the Navigation bar, select Schema
- In the Schema Manager, select the Global Variables tab
- In the list view of global variables, select the Default_InitialExtractDate global variable to open it
- Update the Value
- Select OK
- In the EBS_ schema tables, the extract SQL includes the following filter:
WHERE creation_date >= $$Default_InitialExtractDate
Configure the Incorta Data Destination to write to BigQuery
- In the Incorta UI, navigate to Data → Data Destinations and find the GoogleBigQuery data destination. Refer to the Connecting and Pushing Data to BigQuery article on the Incorta Community for instructions.
- By default, the GoogleBigQuery data destination is assigned to the schemas, and the target schema is set to CORTEX_ORACLE_CDC. You must update the target schema name to the value defined in the Cortex config.json file. Incorta will create and write the dataset to BigQuery when the schemas are loaded. Please verify that these settings have been configured for all the schemas that are sent to BigQuery.
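Once a schema load has run, you can confirm that the target dataset and tables exist in BigQuery from the command line. A minimal check with the bq CLI, assuming your GCP project ID and the default target schema name mentioned above, might look like this:
bq ls --project_id=YOUR_PROJECT_ID          # list datasets in the project
bq ls YOUR_PROJECT_ID:CORTEX_ORACLE_CDC     # list tables in the target dataset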
Loading data in Incorta
In the Incorta UI, navigate to Scheduler → Load Plans and schedule the load plan for your EBS Cortex application.
By default, the load plans are set to incremental, allowing you to load just the latest data changes in EBS. You can schedule the timing of your incremental loads by navigating to Scheduler → Load Plans → selecting your EBS application → Load Plan. Click on the Change link to open the Schedule UI, allowing you to schedule the interval for your Incremental loads.
The data is loaded in two steps.
Load the full data history
The first time a load plan runs, it will execute a full load of each included schema, even if configured to run incrementally. This happens because there is no data in Incorta to append to, so it will pick up everything that is available per the definition of each dataset. This will take care of loading historical data. It is also possible to run Full Loads directly. Note that full loads will remove all data from the tables that are being populated before reloading them fully.
To monitor a load plan, navigate to Schema → Load Plan. You can check the row counts from the various stages of the load: Extracted, Rejected, Loaded, and Sent to Destination.
Incremental loads
After the initial load, Incorta will continue to run the following loads in Incremental mode. Incremental loads handle upserts, i.e. updated records and inserted records, which are pulled into Incorta and then immediately pushed to BigQuery whenever an incremental load executes.
Load failures
If a load from a source table fails in Incorta, incremental data will not write to BigQuery for Cortex to pick up. Incorta can send an alert to a designated email address to notify an administrator of a failure.
Configure the Incorta tenant to be able to send an email if you wish to receive alerts. Then follow these instructions to set up notifications for the Incorta schemas you wish to track.
Make sure to select Notify Upon Failure and populate the Recipients field with an appropriate email address to monitor load jobs.
In case of a load failure, Incorta will not attempt to reload the data until the next scheduled load. You can review the logs in Incorta to determine the cause of the issue and take any necessary steps to address it. If further assistance is required, you can work with Incorta Support to resolve the issue. Once the issue is resolved, the data will be caught up the next time a scheduled load occurs. Alternatively, you can manually run a load at the schema or individual table level to catch up on the data.
Handling source deletes
When loading data incrementally, Incorta inserts new records and updates existing records that have already been loaded, but by default it does not delete records. Source systems like Oracle EBS, however, may allow users to delete data. This means that if records are deleted in the source, those records may still exist in Incorta. Fortunately, there are options for how to manage source deleted records in Incorta and subsequently BigQuery.
Starting with version 2024.7.x, Incorta has a feature that allows you to purge records from Incorta that have been deleted in the source system. The purge feature is implemented by creating an exclusion set to compare against the table in Incorta that has deletes to track and then running a purge job. The purge job will physically delete records from parquet and memory and then push the corresponding delete statements to BigQuery.
Delete handling setup
For each table that needs delete handling, the first step is to create a corresponding skinny table that contains only the primary key column(s) for the table and that is always loaded in full. It is recommended that these tables be placed into a separate schema whose only purpose is to hold these types of tables.
The second step is to create a Materialized View (MV) that will compare the PK table against the base table to create the exclusion set which represents the records that have been deleted in the source table. The MV can be placed in the schema with the PK table. Here is an example of the MV code you would write to create the table that holds your exclusion set.
SELECT I.INVOICE_ID
FROM EBS_AP.AP_INVOICES_ALL I
ANTI JOIN PK_SCHEMA.AP_INVOICES_ALL_PK P
ON I.INVOICE_ID = P.INVOICE_ID
Configure delete handling
Next, configure your table to use the exclusion set table that you have created by opening the Advanced Settings tab.
Toggle on the Synchronizing delete operations option and select the exclusion set identifier schema and table.
Define the column mappings between the target table and the exclusion set and save.
Purge deleted records
The final step to delete records from Incorta is to run a purge job. This can be run table by table as needed or can be scheduled from a separate Load Plan. The purge job will remove records from Incorta and will send delete statements to BigQuery to sync up the corresponding table in the CDC layer.
Pre-Seeded delete handling configuration
Most Oracle EBS customers do not delete records from their EBS tables, so delete handling is not enabled automatically. Incorta delivers pre-seeded schemas for delete handling with sample PK tables and Exclusion Set MVs that can be enabled and used as models for setting up delete handling with additional tables if needed.
To enable the delivered configuration for the tables used, follow the instructions in the Configure delete handling section above. To add delete handling for new tables, follow the instructions above starting with the Delete handling setup section, using the provided samples as models.
Appendix
This is the full list of Incorta Schema Tables that are used for Cortex Oracle EBS Order to Cash. These tables are pushed to BigQuery as part of the scheduled load plans.
Schema Name | Table Name |
---|---|
EBS_AR | AR_ADJUSTMENTS_ALL |
EBS_AR | AR_AGING_BUCKET_LINES_B |
EBS_AR | AR_AGING_BUCKET_LINES_TL |
EBS_AR | AR_AGING_BUCKETS |
EBS_AR | AR_BATCH_SOURCES_ALL |
EBS_AR | AR_BATCHES_ALL |
EBS_AR | AR_CASH_RECEIPT_HISTORY_ALL |
EBS_AR | AR_CASH_RECEIPTS_ALL |
EBS_AR | AR_COLLECTORS |
EBS_AR | AR_DISTRIBUTIONS_ALL |
EBS_AR | AR_PAYMENT_SCHEDULES_ALL |
EBS_AR | AR_RECEIPT_METHODS |
EBS_AR | AR_RECEIVABLE_APPLICATIONS_ALL |
EBS_AR | AR_RECEIVABLES_TRX_ALL |
EBS_AR | RA_BATCH_SOURCES_ALL |
EBS_AR | RA_CM_REQUESTS_ALL |
EBS_AR | RA_CUST_TRX_LINE_GL_DIST_ALL |
EBS_AR | RA_CUST_TRX_TYPES_ALL |
EBS_AR | RA_CUSTOMER_TRX_ALL |
EBS_AR | RA_CUSTOMER_TRX_LINES_ALL |
EBS_AR | RA_RULES |
EBS_AR | RA_SALESREPS_ALL |
EBS_AR | RA_TERMS_B |
EBS_CAL_COMMON | GL_DATE_PERIOD_MAP |
EBS_CAL_COMMON | GL_PERIOD_SETS |
EBS_CAL_COMMON | GL_PERIOD_TYPES |
EBS_CAL_COMMON | GL_PERIODS |
EBS_FIN_COMMON | AP_BANK_ACCOUNTS_ALL |
EBS_FIN_COMMON | CHARTOFACCOUNT |
EBS_FIN_COMMON | FINANCIALS_SYSTEM_PARAMS_ALL |
EBS_FIN_COMMON | GL_ACCOUNT_SEG |
EBS_FIN_COMMON | GL_BALANCING_SEG |
EBS_FIN_COMMON | GL_CODE_COMBINATIONS |
EBS_FIN_COMMON | GL_CODE_COMBINATIONS_KFV |
EBS_FIN_COMMON | GL_COST_CENTER_SEG |
EBS_FIN_COMMON | GL_DAILY_CONVERSION_TYPES |
EBS_FIN_COMMON | GL_DAILY_RATES |
EBS_FIN_COMMON | GL_LEDGERS |
EBS_FIN_COMMON | GL_SETS_OF_BOOKS |
EBS_FIN_COMMON | GLCURRENCYTYPES |
EBS_FND_COMMON | FND_APPLICATION |
EBS_FND_COMMON | FND_CURRENCIES |
EBS_FND_COMMON | FND_DESCR_FLEX_COLUMN_USAGES |
EBS_FND_COMMON | FND_LOOKUP_TYPES |
EBS_FND_COMMON | FND_LOOKUP_VALUES |
EBS_FND_COMMON | FND_TERRITORIES |
EBS_FND_COMMON | FND_USER |
EBS_FND_COMMON | FND_USER_RESP_GROUPS |
EBS_HR_COMMON | HR_ALL_ORGANIZATION_UNITS |
EBS_HR_COMMON | HR_LOCATIONS_ALL |
EBS_HR_COMMON | HR_OPERATING_UNITS |
EBS_HR_COMMON | JTF_LOC_HIERARCHIES_VL |
EBS_HR_COMMON | ORG_ORGANIZATION_DEFINITIONS |
EBS_HR_COMMON | PER_ALL_ASSIGNMENTS_F |
EBS_HR_COMMON | XLE_ENTITY_PROFILES |
EBS_HR_COMMON | PER_ALL_PEOPLE_F |
EBS_HR_COMMON | PER_ALL_PEOPLE_X |
EBS_ITEM_COMMON | CST_ITEM_COSTS |
EBS_ITEM_COMMON | ITEM_FUNCTIONAL_AREA |
EBS_ITEM_COMMON | MTL_CATEGORIES_B |
EBS_ITEM_COMMON | MTL_CATEGORIES_B_KFV |
EBS_ITEM_COMMON | MTL_CATEGORY_SETS_B |
EBS_ITEM_COMMON | MTL_CATEGORY_SETS_TL |
EBS_ITEM_COMMON | MTL_DEFAULT_CATEGORY_SETS |
EBS_ITEM_COMMON | MTL_ITEM_CATALOG_GROUPS |
EBS_ITEM_COMMON | MTL_ITEM_CATEGORIES |
EBS_ITEM_COMMON | MTL_ITEM_STATUS |
EBS_ITEM_COMMON | MTL_PARAMETERS |
EBS_ITEM_COMMON | MTL_SECONDARY_INVENTORIES |
EBS_ITEM_COMMON | MTL_SYSTEM_ITEMS_B |
EBS_ITEM_COMMON | MTL_SYSTEM_ITEMS_TL |
EBS_ITEM_COMMON | ORG_ORGANIZATION_DEFINITIONS |
EBS_ONT | OE_AGREEMENTS_B |
EBS_ONT | OE_BLANKET_HEADERS_ALL |
EBS_ONT | OE_BLANKET_LINES_ALL |
EBS_ONT | OE_CREDIT_CHECK_RULES |
EBS_ONT | OE_DROP_SHIP_SOURCES |
EBS_ONT | OE_HOLD_AUTHORIZATIONS |
EBS_ONT | OE_HOLD_DEFINITIONS |
EBS_ONT | OE_HOLD_RELEASES |
EBS_ONT | OE_HOLD_SOURCES_ALL |
EBS_ONT | OE_LINE_SETS |
EBS_ONT | OE_LOOKUPS |
EBS_ONT | OE_ORDER_HEADERS_ALL |
EBS_ONT | OE_ORDER_HOLDS_ALL |
EBS_ONT | OE_ORDER_LINES_ALL |
EBS_ONT | OE_ORDER_LINES_HISTORY |
EBS_ONT | OE_ORDER_PRICE_ATTRIBS |
EBS_ONT | OE_ORDER_SOURCES |
EBS_ONT | OE_PC_ASSIGNMENTS |
EBS_ONT | OE_PC_CONSTRAINTS |
EBS_ONT | OE_PRICE_ADJUSTMENTS |
EBS_ONT | OE_REASONS |
EBS_ONT | OE_SALES_CREDIT_TYPES |
EBS_ONT | OE_SALES_CREDITS |
EBS_ONT | OE_SETS |
EBS_ONT | OE_SYSTEM_PARAMETERS_ALL |
EBS_ONT | OE_TRANSACTION_TYPES_ALL |
EBS_ONT | QP_PRICING_ATTRIBUTES |
EBS_ONT | SO_ORDER_TYPES_ALL |
EBS_PARTY_COMMON | AP_SUPPLIER_CONTACTS |
EBS_PARTY_COMMON | AP_SUPPLIER_SITES_ALL |
EBS_PARTY_COMMON | AP_SUPPLIERS |
EBS_PARTY_COMMON | AR_CUSTOMERS |
EBS_PARTY_COMMON | CUSTOMERADDRESS |
EBS_PARTY_COMMON | HZ_CONTACT_POINTS |
EBS_PARTY_COMMON | HZ_CUST_ACCOUNTS |
EBS_PARTY_COMMON | HZ_CUST_ACCT_SITES_ALL |
EBS_PARTY_COMMON | HZ_CUST_PROFILE_AMTS |
EBS_PARTY_COMMON | HZ_CUST_PROFILE_CLASSES |
EBS_PARTY_COMMON | HZ_CUST_SITE_USES_ALL |
EBS_PARTY_COMMON | HZ_CUSTOMER_PROFILES |
EBS_PARTY_COMMON | HZ_LOCATIONS |
EBS_PARTY_COMMON | HZ_ORG_CONTACTS |
EBS_PARTY_COMMON | HZ_PARTIES |
EBS_PARTY_COMMON | HZ_PARTY_SITES |
EBS_PARTY_COMMON | HZ_RELATIONSHIPS |