References → SQL interface (SQLi)

About the SQL interface (SQLi)

The Analytics Service exposes all tenants in an Incorta cluster as individual databases over the SQL interface (SQLi) using the PostgreSQL protocol.

The PostgreSQL protocol supports a JDBC connection using the appropriate PostgreSQL JDBC driver. With the SQLi, an external application such as Microsoft Power BI or Tableau can query a tenant database using Structured Query Language (SQL).

Internally, the SQLi manages how to process an external SQL query. Based off the query shape, the SQLi determines if the in-memory query engine of the Analytics Service processes the query or if Apache Spark processes the query using the SQL App and Spark SQL.

Looker Studio

If you are connecting to your data source using a recent version of Looker Studio, the Show Table feature does not work as it uses default schema that is named public. For more information, refer to Looker official documentation.

About SQL query shapes

To learn more about the typical query shapes, data models, scale sizes, and benchmark queries, visit tpc.org.

Here are few that are relevant to data warehousing and analytics related benchmarks that most professional in the industry find relevant:

NameLink
TPC-DShttp://tpc.org/tpcds/default5.asp
TPC-Hhttp://tpc.org/tpch/default5.asp

Analytics Service native support for Left Outer Joins

By default, the Loader Service builds a Direct Data Map using LEFT OUTER JOINS for a table relationships in a physical schema. For this reason, the Analytics Service in-memory engine supports SQL query shapes that feature LEFT OUTER JOINS in the SQL statement.

Spark SQL

Apache Sparks supports numerous query shapes including various JOIN types natively:

  • INNER JOIN
  • LEFT JOIN
  • RIGHT JOIN
  • FULL JOIN
  • CROSS JOIN
  • SEMI JOIN
  • ANTI JOIN

To learn more, please review Apache Spark SQL Join Syntax

External integrations often require robust JOIN support for various query shapes.

Note

Joins are not recommended when there are no relationships between the tables in the query.

For example, using a CROSS JOIN (Cartesian join) can produce very large result sets, which may lead to performance issues, especially with large tables. When it’s not recommended:

  • With large data sets: Since a Cartesian join returns every possible combination of rows from two tables, the result set can grow exponentially if the tables are large, which may degrade performance.
  • Unintentional Usage: Often, a CROSS JOIN occurs unintentionally when the query does not include a WHERE or JOIN condition, for example, SELECT * FROM products, employees. This can result in an unexpectedly large number of rows being returned.

SQli ports

There are two default ports for the SQLi.

  • 5436 directs external queries to the SQLi interface
  • 5442 directs external queries to the Apache Spark

The SQLi directs a query that conform to a LEFT OUTER JOIN shape to the Analytics Service. For all other query shapes, the SQLi directs the query to Apache Spark using Spark SQL. A listener application receives and process the job. The listener application is com.incorta.sqlApp.App.