All Data Studio Recipes
The Data Studio is a rapid low-code/no-code environment for developing materialized views (MVs). With the Data Studio, developers connect a series of analytical steps (Recipes) to build an enriched dataset. Once the analytical steps are defined in the Data Studio, the developer outputs the Dataflow as PySpark into a Physical Schema of their choice.
Activating the Data Studio
To activate the Data Studio, please refer to this guide.
Recipe Index
Recipes are the foundational tools for performing data quality checks, cleansing, analysis, blending, and much more.
Content Transformation
Change Type - Change the data type of column(s).
Filter - Remove records from a dataset based on a condition.
Select - Select which columns to keep or remove from a dataset.
Sort - Sort data within a dataset.
Unpivot - Transpose columns of your dataset into rows of column-name and value pairs.
Formula - Add custom logic to create a new calculated field.
Sample - Select a subset of records within your dataset.
Aggregation - Aggregate your dataset and set its granularity through 'group by' logic.
Rename - Rename column labels in your dataset.
Split - Split the dataset into two datasets.
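As an illustration of what two of the recipes above do to the shape of a dataset, here is a minimal plain-Python sketch of Unpivot followed by Aggregation. It is not the PySpark that the Data Studio generates; the sample data, the function names (unpivot, aggregate_sum), and the output column names ("column", "value") are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical sample data: one row per store, one column per quarter.
rows = [
    {"store": "A", "q1": 10, "q2": 20},
    {"store": "B", "q1": 5, "q2": 15},
]


def unpivot(records, id_cols, value_cols, name_col="column", value_col="value"):
    """Turn each listed value column into its own (name, value) row."""
    out = []
    for rec in records:
        for col in value_cols:
            new_row = {k: rec[k] for k in id_cols}  # keep identifier columns
            new_row[name_col] = col                 # former column label
            new_row[value_col] = rec[col]           # former cell value
            out.append(new_row)
    return out


def aggregate_sum(records, group_cols, value_col):
    """Sum value_col for each distinct combination of group_cols."""
    totals = defaultdict(int)
    for rec in records:
        key = tuple(rec[c] for c in group_cols)
        totals[key] += rec[value_col]
    return dict(totals)


# Wide to long: 2 stores x 2 quarters gives 4 rows.
long_rows = unpivot(rows, id_cols=["store"], value_cols=["q1", "q2"])

# Group the long rows back to one total per store.
totals_by_store = aggregate_sum(long_rows, group_cols=["store"], value_col="value")
```

Unpivoting first and aggregating second is a common pairing: the long format lets a single 'group by' step total values that were previously spread across several columns.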
Structure Transformation
Join - Join two datasets based on a set of join conditions.
Union - Union two datasets together.
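The difference between the two structure recipes can be sketched in plain Python: a join widens rows by matching on a key, while a union stacks rows from datasets with the same columns. This is only an illustration of the concepts. The sample data and function names are hypothetical, the join shown is an inner join, and the union keeps duplicates; which join types and duplicate handling the recipes actually offer is not stated here.

```python
# Hypothetical sample datasets.
customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
orders = [{"id": 1, "total": 50}, {"id": 3, "total": 70}]
more_customers = [{"id": 3, "name": "Cy"}]


def inner_join(left, right, key):
    """Combine rows from left and right whose key values are equal."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)  # group right rows by key
    # Emit one merged row per matching (left, right) pair.
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def union(first, second):
    """Stack the rows of two datasets that share the same columns."""
    return first + second


joined = inner_join(customers, orders, "id")        # only id 1 appears in both
all_customers = union(customers, more_customers)    # 2 rows + 1 row
```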
Data Quality and Validation
Fuzzy Join - Cleanse data by matching it against a lookup table that you provide.
Data Quality - Write a set of conditions and flag any violations in your dataset.
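To make the two ideas above concrete, here is a small sketch using only the Python standard library. It approximates fuzzy matching against a lookup table with difflib, and flags rule violations with simple predicates. The matching algorithm the Fuzzy Join recipe really uses is not documented here, so difflib is a stand-in; the 0.8 cutoff, the sample values, the rule names, and the function names are all assumptions.

```python
import difflib

# Hypothetical lookup table of clean values, and raw values to cleanse.
lookup = ["United States", "United Kingdom", "Germany"]
raw = ["Untied States", "Germny", "Atlantis"]


def fuzzy_cleanse(value, candidates, cutoff=0.8):
    """Return the closest lookup value, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(value, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


cleansed = [fuzzy_cleanse(v, lookup) for v in raw]

# Hypothetical data quality rules: each maps a rule name to a check
# that returns True when the record is valid.
rules = {
    "amount_not_negative": lambda r: r["amount"] >= 0,
    "country_present": lambda r: r["country"] is not None,
}


def flag_violations(record, checks):
    """Return the names of all rules that the record violates."""
    return [name for name, check in checks.items() if not check(record)]


records = [
    {"amount": 12, "country": "Germany"},
    {"amount": -3, "country": None},
]
flags = [flag_violations(r, rules) for r in records]
```

Flagging violations rather than dropping the offending rows keeps the dataset intact, so a later Filter or Split step can decide what to do with them.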
Advanced Querying
Python - Inject custom PySpark into your data flow.
SQL - Inject custom SQL into your data flow.
Deploy and Eject Operations
Save MV - Save your data flow output to a Materialized View.