Connectors → Google BigQuery

About Google BigQuery

Google BigQuery is a fully-managed, serverless data warehouse that enables scalable analysis over petabytes of data. It is a Platform as a Service that supports querying using ANSI SQL. It also has built-in machine learning capabilities.

Google BigQuery Connector Updates

This section is to explore the updates in the newer versions of the BigQuery connector available on the Incorta connectors marketplace.

In order to get the newer version of the connector, please update the connector using the marketplace.

VersionUpdates
2.2.0.2A new extra option, sql.chunks.limit, is now available to specify the maximum number of chunks that can be created by a data set when chunking is enabled. The default is 100, and the maximum is 300. If you specify a limit that exceeds 300, the connector resets it to 300.
Recommendation

Keep your connector up-to-date with the latest connector version released to get all introduced fixes and enhancements.

About the Google BigQuery Connector

The Google BigQuery Connector enables Incorta to access data stored in Google BigQuery data sets. You can access the data you want with a standard SQL query. The Google BigQuery Connector uses the com.simba.googlebigquery.jdbc42.Driver driver.

The Google BigQuery connector supports the following Incorta specific functionality:

FeatureSupported
Chunking
Data Agent
Encryption at Ingest
Incremental Load
Multi-Source
OAuth
Performance Optimized
Remote
Single-Source
Spark Extraction
Webhook Callbacks

Security configurations for the Google BigQuery connector

Prerequisites for use of the BigQuery connector are the following, which are typically handled by a Security Administrator or System Administrator who manages your organization’s G Suite accounts:

  • Create a G Suite Account
  • Within the G Suite Account, create a Google Cloud Project
  • Within the Google Cloud Project,
    • Enable the BigQuery API
    • Create the BigQuery dataset(s)

The Security Administrator or System Administrator must then create a service account key. A service account key can be created with the following steps:

  • In the Cloud Console, go to the Create service account key page.
  • From the Service account list, select New service account.
  • In the Service account name field, enter a name.
  • From the Role list, select a role.
  • Select Create. A JSON file that contains your key downloads to your computer. The Incorta Administrator must copy this JSON to the Incorta server. The full path to this file will be entered in the connector properties.
Note

The Role field affects which resources your service account can access in your project. You can revoke these roles or grant additional roles later. In production environments, do not grant the Owner, Editor, or Viewer roles. For more information, see Manage access to projects, folders, and organizations.

Steps to connect Google BigQuery and Incorta

To connect Google BigQuery and Incorta, here are the high level steps, tools, and procedures:

Create an external data source

Here are the steps to create a external data source with the Google BigQuery connector:

  • Sign in to the Incorta Direct Data Platform™.
  • In the Navigation bar, select Data.
  • In the Action bar, select + NewAdd Data Source.
  • In the Choose a Data Source dialog, in Query service, select BigQuery.
  • In the New Data Source dialog, specify the applicable connector properties.
  • To test, select Test Connection.
  • Select Ok to save your changes.

Google BigQuery connector properties

Here are the properties for the Google BigQuery connector:

PropertyControlDescription
Data Source Nametext boxEnter the name of the data source
Project IDtext boxEnter the Google Cloud Project ID
Private Key Pathtext boxEnter the full path to the Google BigQuery private key file on the Incorta server. For example, home/incorta/IncortaAnalytics/
IncortaNode/runtime/lib/
bigqueryfullkey.json
Connection Pooltext boxEnter the connection pool. The default is 30.
Connection Propertiestext boxOptionally enter connector properties for a custom connection to Google BigQuery in the format: propertyName=propertyValue, where each connector property is on a new line.

The available connector properties are specified by com.simba.googlebigquery.jdbc42.Driver. Refer to the Google BigQuery Driver Configuration Options for further information.
Extra Optionstext boxEnter supported extra options in the form of key=value.

Create a schema with the Schema Wizard

Here are the steps to create an Google BigQuery schema with the Schema Wizard:

  • Sign in to the Incorta Direct Data Platform™.
  • In the Navigation bar, select Schema.
  • In the Action bar, select + New → Schema Wizard
  • In (1) Choose a Source, specify the following:
    • For Enter a name, enter the schema name.
    • For Select a Datasource, select the Google BigQuery external data source.
    • Optionally create a description.
  • In the Schema Wizard footer, select Next.
  • In (2) Manage Tables, in the Data Panel, navigate the directory tree as necessary to select the Google BigQuery files. You can either check the Select All checkbox or select individual sheets.
  • In the Schema Wizard footer, select Next.
  • In (3) Finalize, in the Schema Wizard footer, select Create Schema.

Create a schema with the Schema Designer

Here are the steps to create an Google BigQuery schema using the Schema Designer:

  • Sign in to the Incorta Direct Data Platform™.
  • In the Navigation bar, select Schema.
  • In the Action bar, select + New → Create Schema.
  • In Name, specify the schema name, and select Save.
  • In Start adding tables to your schema, select SQL Database.
  • In the Data Source dialog, specify the Google BigQuery table data source properties.
  • Select Add.
  • In the Table Editor, in the Table Summary section, enter the table name.
  • To save your changes, select Done in the Action bar.

Google BigQuery table data source properties

For a schema table in Incorta, you can define the following Google BigQuery specific data source properties as follows:

PropertyControlDescription
Typedrop down listDefault is SQL Database
Data Sourcedrop down listSelect the Google BigQuery external data source
IncrementaltoggleEnable the incremental load configuration for the schema table
Querytext boxEnter the SQL query to retrieve data from the Google BigQuery dataset
Update Querytext boxEnable Incremental to configure this property. Enter the SQL query to retrieve data updates from the Google BigQuery dataset.
Timestamp Field Typedrop down listEnable Incremental to configure this property. Select the format of the table date column:
  ●  Timestamp
  ●  Unix Epoch (seconds)
  ●  Unix Epoch (milliseconds)
Fetch Sizetext boxUsed for performance improvement, fetch size defines the number of records that will be retrieved from the database in each batch until all records are retrieved. The default is 5000.
Chunking Methoddrop down listChunking methods allow for parallel extraction of large tables. The default is No Chunking. There are two chunking methods:
  ●   By Size of Chunking (Single Table)
  ●   By Date/Timestamp
Chunk Sizetext boxSelect By Size of Chunking for the Chunking Method to set this property. Enter the number of records to extract in each chunk in relation to the Fetch Size. The default is 3 times the Fetch Size.
Order Columndrop down listSelect By Size of Chunking for the Chunking Method to set this property. Select a column in the source table you want to order by before chunking. It's typically an ID column and it must be numeric.
Upper Bound for Order Columntext boxOptional. Enter the maximum value for the order column.
Lower Bound for Order Columntext boxOptional. Enter the minimum value for the order column.
Order Column [Date/Timestamp]drop down listSelect By Date/Timestamp for the Chunking Method to set this property. Select a column in the source table you want to order by before chunking. It should be a Date/Timestamp column.
Chunk Perioddrop down listSelect the chunk period that will be used in dividing chunks:
  ●   Daily
  ●   Weekly (default)
  ●   Monthly
  ●   Yearly
  ●   Custom
Number of daystext boxSelect Custom for the Chunk Period to set this property. Enter the chunking period in days
CallbacktoggleEnable this option to call back on the source data set
Callback URLtext boxEnable Callback to configure this property. Specify the URL.

View the schema diagram with the Schema Diagram Viewer

Here are the steps to view the schema diagram using the Schema Diagram Viewer:

  • Sign in to the Incorta Direct Data Platform™.
  • In the Navigation bar, select Schema.
  • In the list of schemas, select the Google BigQuery schema.
  • In the Schema Designer, in the Action bar, select Diagram.

Load the schema

Here are the steps to perform a Full Load of the Google BigQuery schema using the Schema Designer:

  • Sign in to the Incorta Direct Data Platform™.
  • In the Navigation bar, select Schema.
  • In the list of schemas, select the Google BigQuery schema.
  • In the Schema Designer, in the Action bar, select LoadFull Load.
  • To review the load status, in Last Load Status, select the date.

Explore the schema

With the full load of the Google BigQuery schema complete, you can use the Analyzer to explore the schema, create your first insight, and save the insight to a new dashboard.

To open the Analyzer from the schema, follow these steps:

  • In the Navigation bar, select Schema.
  • In the Schema Manager, in the List view, select the Google BigQuery schema.
  • In the Schema Designer, in the Action bar, select Explore Data.