References → Advanced SQLi and Null Handling

Overview

When first introduced, the Null Handling feature was coupled with the Advanced SQL Interface (Advanced SQLi). Enabling Advanced SQLi automatically enabled Null Handling. However, starting with 2024.7.3, Null Handling has been decoupled from Advanced SQLi, allowing both features to be enabled or disabled independently.

While you can now control Null Handling separately, disabling it when using Advanced SQLi may lead to data inconsistencies. To ensure consistent query results across tools, including third-party BI tools and Incorta tools such as insights, Incorta Copilot (Nexus), and Business Notebooks, it is strongly recommended to enable Null Handling when leveraging Advanced SQLi.

The Advanced SQLi selects the best execution path for queries by utilizing either the Engine or Spark for performance optimization. While Spark naturally respects null values, Incorta Engine requires the Null Handling option to be enabled to correctly treat nulls as true nulls, rather than zeros or empty values.

Inconsistency example

The following example illustrates potential inconsistencies when Advanced SQLi is used while the Null Handling option is disabled in the CMC.

Raw data example

Consider the following raw data in the ProductSalesSchema.ProductData table:

ProductIDProductNameCategoryPrice
100WebcamAccessoriesnull
101MonitorAccessories100
102MouseAccessories50
103TabletElectronicsnull
104LaptopElectronicsnull

Spark SQL view query

The following is the query of a Spark SQL view that calculates the average and percentile of the Price grouped by Category:

SELECT Category, avg(Price) as avgPrice, percentile(Price , 0.3) as percentilePrice
FROM ProductSalesSchema.ProductData
group by Category
ORDER BY Category;

Execution in the Query Editor

Since this query includes the percentile function, Advanced SQLi directs it to Spark. When executed in the Query Editor, Spark handles null values correctly, producing the following result:

CategoryavgPricepercentilePrice
Accessories75.065.0
Electronicsnullnull

Exploring the entire view in the Analyzer

Exploring the entire view in the Analyzer and creating a listing table containing all columns also sends the query to Spark, resulting in the same output:

CategoryavgPricepercentilePrice
Accessories75.0065.00
Electronicsnullnull

Inconsistent behavior without Null Handling

If you remove the percentilePrice column and only explore avgPrice, the query is routed to the Incorta Engine. When the Null Handling option is disabled, the Engine treats null values in the Price column as zeros, resulting in the following incorrect output:

CategoryavgPrice
Accessories50.00
Electronics0.00

Consistent behavior with Null Handling enabled

Enabling the Null Handling option ensures the Engine treats null values as true nulls, resulting in consistent output with Spark:

CategoryavgPrice
Accessories75.00
Electronicsnull