Tools → Data Studio Manager

About the Data Studio Manager

The Data Studio Manager is an Incorta Premium feature that allows users to easily create Materialized Views (MVs) for their data using a simple drag-and-drop interface.

With Data Studio, users can perform various transformations on their data without complex coding by connecting a series of analytical steps called Recipes. Starting with the base table loaded into Incorta, the data undergoes a series of transformations in which each step applies a specific recipe to refine, shape, and analyze it. This streamlined progression ensures that data moves seamlessly from its initial state to a refined, actionable form within Data Studio.
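The progression of recipe steps can be pictured as ordinary function chaining, where each step consumes the previous step's output. The following plain-Python sketch is illustrative only (Data Studio itself generates PySpark, and the table and recipe names here are invented for the example):

```python
# Illustrative only: a plain-Python sketch (not Incorta-generated code) of how
# chained recipe steps progressively refine a base table into a final result.

base_table = [
    {"region": "EMEA", "amount": 120},
    {"region": "APAC", "amount": 80},
    {"region": "EMEA", "amount": 40},
]

def filter_recipe(rows, predicate):
    """Keep only the rows that satisfy a condition (like a Filter recipe)."""
    return [r for r in rows if predicate(r)]

def formula_recipe(rows, name, fn):
    """Add a calculated field to every row (like a Formula recipe)."""
    return [{**r, name: fn(r)} for r in rows]

# Each step takes the previous step's output, mirroring a connected data flow.
step1 = filter_recipe(base_table, lambda r: r["amount"] >= 50)
step2 = formula_recipe(step1, "amount_usd", lambda r: r["amount"] * 1.1)
print(step2)
```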

Data Studio Manager Access Permissions

Only an admin can enable Data Studio within the Cluster Management Console (CMC). For more information, refer to Enabling the Data Studio.

Note

Starting 2024.7.2, the Incorta Premium offering must first be enabled before enabling Data Studio.

To access the Data Studio Manager, in the Navigation bar, select Data Studio. This opens the landing page where users manage multiple data flows.

Data Studio Manager Anatomy

The Data Studio Manager consists of the following:

Action bar

Select + New in the Action bar to open the Add New menu. The available options are:

  • Create Data Flow
  • Import Data Flow

The Search bar contains a search text box for finding existing data flows by name. The results include data flows you own or have view, share, or edit access to. You can scroll through the list to find your desired data flow.

List view

Each data flow appears as a row in the list view with the following properties:

  • Name — The name of the data flow
  • Owner — The display name of the user who created the data flow
  • Status — The connection state: Connected or Disconnected
  • Description — A brief explanation of the data flow
  • Created — The date and time the data flow was created
  • Last Modified — The date and time the data flow was last modified
  • More Options (⋮) — The vertical ellipsis menu allows you to perform the following:
    ●   View Details — View information about the data flow, such as ID, Name, Sample, Size, Owner’s Name, etc.
    ●   Edit — Modify the name and description of the data flow
    ●   Share — Share the data flow with View, Edit, or Share access levels
    ●   Export — Download the data flow in .zip format
    ●   Delete — Remove the data flow (confirmation required)
    ●   Disconnect — Shown only when the instance is connected

The default sort in the list view is ascending by Name.
To change the sort order, select the Name, Owner, Created, or Last Modified column heading; the arrow points down for descending order and up for ascending order.

Data Studio Manager Actions

Using the Data Studio Manager, you can perform the following actions:

Create data flows

  1. In the Navigation bar, select Data Studio.
  2. In the Action bar, select + New → Create Data Flow.
  3. In the Create Data Flow dialog:
    • In the Name field, enter a unique name for the data flow (note: names are case-sensitive).
    • In the Description field, optionally provide a brief description.
  4. Select Create.
  5. The Data Flow Editor opens, allowing you to begin building your flow.

Import data flows

  1. In the Navigation bar, select Data Studio.
  2. In the Action bar, select + New → Import Data Flow.
  3. Under Import Options, check the Overwrite existing dataflows box if you want to replace an existing data flow with the same name.
  4. In the Import Data Flow dialog, perform one of the following:
    • Drag <dataflow_name>.zip to the Import Data Flow dialog, or
    • Click inside the Click or drag a file here to upload area, then browse to the location of the <dataflow_name>.zip, and select Open.
  5. Wait for the upload to complete.
  6. In the Import Results dialog, review the operation status.
  7. Select Import.
    A confirmation message appears: Dataflow(s) imported successfully!

Edit data flows

  1. Select More Options (⋮) → Edit for the desired data flow.
  2. In the Edit Dataflow dialog, modify the Name and Description fields.
  3. Select Edit.
    A confirmation message appears: Dataflow(s) edited successfully!

Share data flows

You can share data flows with users or groups who have any of the following roles: Schema Manager, Advanced Analyzer User, or SuperRole.

Note

The option to share a data flow is available starting 2026.3.0.

  1. For a specific data flow, select More Options (⋮).

  2. Select Share.

  3. In the Share dialog, type the name of the user or group in the With: field.

  4. Select the eye icon to set the access level:

    • Can View

      • Allows users to open and review the data flow, preview cached data, and view recipe nodes, code, and information.
      • Users with View access cannot edit, revalidate, delete, disconnect, or share the flow.
    • Can Share

      • Includes all view permissions and also allows users to share the data flow with others, granting either view or share access.
    • Can Edit

      • Allows users to modify the data flow, including adding or editing recipes, validating data, deleting or disconnecting flows, and managing configurations.
      • To edit a data flow, the user must also have access to the associated schema(s).
  5. Select Share.
    A confirmation message appears, indicating that access has been shared.

Connect or disconnect data flows

  • To connect a data flow: Select the desired data flow to open and connect it.
  • To disconnect a data flow: Select More Options (⋮) → Disconnect for the desired data flow.

Export data flows

Export multiple data flows

  1. Select the checkboxes next to the desired data flows.
  2. In the Action Bar, select More Options (⋮) → Export to download them as a .zip file.

Export a single data flow

For a specific data flow, select More Options (⋮) → Export to download it as a .zip file.

Delete data flows

Delete multiple data flows

  1. Select the checkboxes next to the desired data flows.
  2. In the Action Bar, select More Options (⋮) → Delete.
  3. A confirmation prompt appears. Select Yes to permanently delete the data flows.

Delete a single data flow

  1. For a specific flow, select More Options (⋮) → Delete.
  2. A confirmation prompt appears. Select Yes to permanently delete the data flow.

Data Flow Editor Anatomy

To access the Data Flow Editor, open a specific data flow from the Data Studio Manager.
The interface consists of the following key components:

Action bar

The Action bar provides key actions for managing your data flow. It includes the following controls:

  • + Recipe — Add a recipe to transform and enrich data.
    Note: This option is available in the Action bar before 2026.3.0. Starting 2026.3.0, a Recipes panel is available instead.
  • Settings — Configure data sampling preferences and Spark properties.
    Note: The Spark properties configuration is available starting 2026.3.0.
  • More Options (⋮) — The menu includes the following options:
    ●   Re-validate Dataflow
    ●   Deploy All MVs
    ●   Refresh Schemas
    ●   Share Access (available starting 2026.3.0)
  • Close (X) — Exit the current data flow editor.

View toolbar

The View toolbar provides navigation and layout controls for interacting with the data flow within the Canvas:

  • Search bar — Search for specific recipes. The search results highlight and navigate to the matching recipe.
  • + Zoom in / − Zoom out — Adjust the zoom level within the Canvas.
  • Maximize Canvas — Expand the data flow within the Canvas for a full view.
  • Layout dropdown menu — Toggle between the Default and Compact layouts of the data flow.

Canvas

The Canvas serves as the central area where you build your data flow and connect data recipes. It displays all data flow components, including recipes and joins. Add recipes by selecting + Recipe in the Action bar or by dragging and dropping datasets from the Data panel.

Overview panel

The Overview panel displays a zoomed-out view of your entire data flow. Select any area within the Overview panel to instantly zoom into that section on the Canvas.
The Overview panel is located in the upper-left corner of the Canvas. Select the arrow icon to expand or collapse the panel as needed.

Recipes panel

The Recipes panel (available starting 2026.3.0) appears on the left side of the Data Flow Editor. It is expanded by default, scrollable, and includes a search field for quick recipe discovery.

The panel contains the following recipe categories:

  • Input & Output, including the Input Table and Save MV recipes
  • Content Transformation
  • Structure Transformation
  • Data Quality and Validation
  • Advanced Querying
  • Gen AI Operations

Edit panel

The Edit panel appears on the right side of the Data Flow Editor when you select a recipe, allowing you to configure its properties. For more information, refer to Add a new Recipe.

Results pane

The Results pane provides a detailed view of the recipe’s output. It opens at the bottom of the Canvas when you select Explore in the Recipe panel, and includes:

  • Result Set — Displays output records in a paginated table.
  • Profiling View — Offers insights into the result set through statistics, histograms, frequencies, and patterns, depending on the dataset.
  • Filter — Enables filtering data by selecting specific columns.
  • Columns (available starting 2026.3.0) — Displays the schema definition, listing the recipe’s columns and their data types.
  • Alerts (available starting 2026.3.0) — Shows warnings and alerts when required recipe configurations are missing or become invalid.
  • Code — Displays an auto-generated script that shows the transformation logic of the selected recipe.

Note

Starting 2026.3.0, Data Studio supports Oracle metadata.
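As a rough illustration of the kind of output the Profiling View summarizes, the following plain-Python sketch (not Incorta code; the column names and values are invented) computes value frequencies and basic statistics over a small result set:

```python
# Illustrative only: the kind of column statistics and value frequencies the
# Profiling View surfaces, sketched in plain Python over a sample result set.
from collections import Counter
from statistics import mean

result_set = [
    {"country": "US", "revenue": 100},
    {"country": "US", "revenue": 300},
    {"country": "DE", "revenue": 200},
]

# Frequencies for a categorical column.
frequencies = Counter(row["country"] for row in result_set)

# Basic statistics for a numeric column.
revenues = [row["revenue"] for row in result_set]
stats = {"min": min(revenues), "max": max(revenues), "mean": mean(revenues)}

print(frequencies.most_common())
print(stats)
```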

Recipe Actions panel (available before 2026.3.0)

The Recipe Actions panel (available before 2026.3.0) appears on the right side of the Canvas when you select a recipe. It displays contextual actions and information specific to the selected recipe. The Recipe Actions panel includes the following features:

  • Explore — Opens the Results pane, displaying the recipe’s output data for profiling and filtering.
  • Delete — Removes the selected recipe. Disabled if the recipe has dependent child recipes.
  • Re-validate — Re-validates the selected recipe, confirming that the recipe configuration is valid.
  • Preview code — Displays an auto-generated script showing the transformation logic of the selected recipe. Starting 2026.3.0, this option is included in the Results pane.
  • Info — Displays recipe metadata, including details such as Name, Type, Result Status, and Parameters.
Data panel (available before 2026.3.0)

The Data panel (available before 2026.3.0) manages datasets and allows dragging them onto the Canvas to create new recipes. It is located on the left side of the View Toolbar.

Data Flow Editor Actions

Using the Data Flow Editor, you can perform the following actions to build and manage your data flow:

Add a new Recipe

  1. Select a recipe:
    • Before 2026.3.0, in the Action bar, select + Recipe, then choose a recipe type.
    • Starting 2026.3.0, navigate to the Recipes panel on the left side of the Data Flow Editor.
  2. Choose a recipe type from the categories below, then configure its settings:

Note: The recipe name links to its configuration guide.

Content Transformation
  • Filter — Remove records from a dataset based on a condition.
  • Change Type — Change the data type of one or more columns.
  • Select — Select which columns to keep or remove from a dataset.
  • Unpivot — Transpose your dataset into columns and values.
  • Sort — Sort data within a dataset.
  • Formula — Add custom logic to create a new calculated field.
  • Sample — Select a subset of records within your dataset.
  • Aggregation — Aggregate your dataset and set granularity through 'group by' logic.
  • Split — Split the dataset into two datasets.
  • Rename — Rename column labels in your dataset.
Structure Transformation
  • Join — Join two datasets based on a set of join logic.
  • Union — Union two datasets together.
Data Quality and Validation
  • Fuzzy Join — Cleanse data by providing a lookup table.
  • Data Quality — Unleash the power of AI in your data flow.
Advanced Querying
  • Python — Inject custom PySpark into your data flow.
  • SQL — Inject custom SQL into your data flow.
Gen AI
  • LLM — Unleash the power of AI in your data flow.
Deploy and Eject Operations
  • Save MV — Save your data flow output to a Materialized View.

  3. Select Save.
    A confirmation message appears: Recipe added successfully!
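Several of the recipes above correspond to standard dataframe operations. As a rough illustration (plain Python here; Data Studio itself generates PySpark, and the column names are invented), the Select and Aggregation recipes behave like:

```python
# Illustrative only: plain-Python equivalents of the Select and Aggregation
# recipes (Data Studio generates PySpark, not this code).
from collections import defaultdict

rows = [
    {"region": "EMEA", "product": "A", "amount": 10},
    {"region": "EMEA", "product": "B", "amount": 20},
    {"region": "APAC", "product": "A", "amount": 5},
]

# Select recipe: keep only the listed columns.
selected = [{k: r[k] for k in ("region", "amount")} for r in rows]

# Aggregation recipe: 'group by' region, summing amount to set granularity.
totals = defaultdict(int)
for r in selected:
    totals[r["region"]] += r["amount"]

print(dict(totals))  # {'EMEA': 30, 'APAC': 5}
```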

Add a dataset as a recipe

  1. Select the Data Panel icon next to the View Toolbar to expand the Data Panel.
  2. In the Manage Dataset panel, select the checkboxes next to the tables you want to add.
  3. Add a recipe to the Canvas using one of the following methods:
    • Drag and drop a dataset from the Data Panel onto the canvas, or
    • Select the (+) icon next to a dataset in the Data Panel.

This creates a recipe on the canvas based on the selected dataset.

Create a Join recipe

  1. Select and drag from one recipe to another.
    This action automatically creates a Join recipe between the two datasets.
  2. Configure the Join recipe settings, including:
    • Recipe Name
    • Join Type
    • Left and Right Input
    • Match On
    • Join Condition

For more information on configuring the Join Recipe, refer to References → Join Recipe.
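Conceptually, a Join recipe matches rows from the Left and Right inputs on the Match On key. A minimal plain-Python sketch of an inner join (illustrative only, with invented data; Data Studio generates PySpark):

```python
# Illustrative only: a plain-Python sketch of what a Join recipe does with a
# Left Input, a Right Input, and a "Match On" key (an inner join here).
left = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Ben"}]
right = [{"id": 1, "dept": "Eng"}, {"id": 3, "dept": "Ops"}]

# Index the right input by the join key, then match left rows against it.
right_by_id = {r["id"]: r for r in right}
joined = [
    {**l, **right_by_id[l["id"]]}
    for l in left
    if l["id"] in right_by_id  # inner join: unmatched rows are dropped
]

print(joined)  # [{'id': 1, 'name': 'Ada', 'dept': 'Eng'}]
```

Other Join Types (left, right, full) differ only in how unmatched rows are kept or dropped.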

Filter data and create shortcut recipes

  1. Select a recipe on the Canvas.
  2. In the Recipe panel, select Explore.
  3. In the Results pane, select the Filter icon.
  4. Select the column(s) you want to filter.
  5. From the Selection Type dropdown menu, choose:
    • Include — keeps only selected values
    • Exclude — removes selected values
  6. Select Apply. The filtered data appears in the Result Set pane.
  7. Select Save (💾) to create a new recipe based on this filter.
  8. Enter a Name for the new recipe shortcut.
  9. Select Save.
    A confirmation message appears: Recipe added successfully!
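The Include and Exclude selection types keep or drop rows matching the selected values. A minimal plain-Python sketch (illustrative only, with an invented column; not Incorta-generated code):

```python
# Illustrative only: the Include / Exclude selection types from the Filter
# dialog, sketched in plain Python over a column of values.
rows = [{"status": s} for s in ("open", "closed", "open", "pending")]
selected_values = {"open"}

include = [r for r in rows if r["status"] in selected_values]      # keep only selected
exclude = [r for r in rows if r["status"] not in selected_values]  # remove selected

print(len(include), len(exclude))  # 2 2
```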

View Recipe information with a pop-up

  1. Hover over a recipe on the Canvas to view its information.

  2. A pop-up displays key details, including status, last run time, duration, and row count.

    For recipes with multiple outputs, such as the Split Recipe, row counts are shown for each output.

    Note

    The Recipe pop-up is available starting 2026.3.0.

Preview code for a recipe

  1. Select a recipe on the Canvas.
  2. In the Recipe panel, select the Preview Code icon.
  3. View the auto-generated script that represents the transformation logic of the selected recipe.
  4. Select X to close the code view.

Configure data flow settings

  1. In the Action bar, select Settings.

  2. In the Settings dialog, Enable Sampling is toggled on by default with a sample size of 1000.

  3. Edit the Sample Size based on your performance and profiling needs.

    Note

    Disabling sampling or increasing the sample size may slow down execution.

  4. Under Spark Properties, select Add Property to define key–value pairs. You can add, edit, or delete configurations as needed.

    Note
    • The option to configure Spark properties is available starting 2026.3.0.

    • By default, the Spark application uses the following configuration values:

      • spark.driver.memory: 1g
      • spark.executor.memory: 1g
      • spark.executor.cores: 1

      You can modify these default values using the new Spark configuration option.

  5. Select Save or Save & Restart to apply the changes.
    A confirmation message appears: Sampling size was changed successfully. Results will be updated after re-initializing the dataflow.
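The Spark properties are standard Spark configuration keys entered as key–value pairs. For example, overriding the defaults listed above might look like the following (the values shown are illustrative; size them to your workload):

```properties
spark.driver.memory=2g
spark.executor.memory=4g
spark.executor.cores=2
```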

Re-validate a data flow

  1. In the Action bar, select More Options (⋮) → Re-validate Dataflow.
  2. A confirmation prompt appears. Select Yes to revalidate your entire data flow.

Re-validate a recipe

  1. Select a recipe on the Canvas
  2. In the Recipe panel, select the Re-validate icon.
  3. A confirmation prompt appears. Select Yes to revalidate your selected recipe.

Deploy all MVs

  1. In the Action bar, select More Options (⋮) → Deploy All MVs.
  2. A confirmation prompt appears. Select Yes to deploy all MV recipes in the data flow.

Refresh schemas

  1. In the Action bar, select More Options (⋮) → Refresh Schemas.
  2. A confirmation prompt appears. Select Yes to refresh the schemas.

Best Practices

  • Materialized View (MV) deployment strategies
    Choose between:
    • Updates via data flow.
    • Static data flow, where updates are managed by editing in the Notebook.
  • Data sampling
    • By default, data sampling is limited to 1,000 records for profiling to ensure optimal performance.
    • Disabling sampling can impact performance.
  • Naming conventions
    • Use descriptive and consistent names for data flows and MVs so that Materialized Views (MVs) can be traced back to their data flows.
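The sampling guidance above amounts to profiling a bounded random sample rather than the full dataset. A minimal Python sketch (illustrative only; the dataset here is invented):

```python
# Illustrative only: profiling over a bounded random sample (the default
# sample size in Data Studio is 1,000 records), sketched with the stdlib.
import random

random.seed(7)  # deterministic for the example
full_dataset = list(range(100_000))
SAMPLE_SIZE = 1_000

# Profile a sample instead of the full dataset to keep exploration fast.
sample = random.sample(full_dataset, SAMPLE_SIZE)
print(len(sample))  # 1000
```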