Release Notes 2022.2.0
Release Highlights
The goal of the Incorta Cloud 2022.2.0 release is to enhance analytical capabilities, data management, and performance. To that end, the 2022.2.0 release introduces SSH File Transfer Protocol (SFTP) servers as a data destination. The release also improves performance through a new compaction mechanism and parallel materialized view (MV) enrichment. This release introduces a new catalog folder structure that is compatible with Delta Lake.
New Features
Data Management Layer
Dashboards, Visualizations, and Analytics
Architecture and Application Layer
In addition to the new features, there are other enhancements and fixes.
Delta Lake compatible catalog folder structure
This release introduces a new catalog folder structure that is compatible with Delta Lake. This new enhancement results in changes to the Shared Storage directory structure, the output of a compaction (deduplication) job, and the way compacted-parquet consumers access the compacted segments.
- The new compaction (deduplication) mechanism reduces the I/O operations during a load job and saves disk space.
- Only Parquet files that have duplicates are rewritten to create compacted segments.
- Compacted segments are saved per compaction job under a new directory in the source area under the object directory: the _rewritten directory.
- Extracted Parquet files with no duplicates are no longer copied to the compacted segments directory (_rewritten).
- Different versions of the compacted Parquet files can exist under the _rewritten directory.
- A group of metadata files is generated per compaction job in Delta Lake file format to point to all Parquet files (whether extracted or rewritten) that constitute a compacted version. These metadata files are saved to a new directory, _delta_log, that also exists in the source area under the object directory.
- Consumers of compacted Parquet files (MVs, SQLi on Spark port, internal and external Notebook services, and the Preview data function) will use the latest <CompactedVersionID>.checkpoint.parquet metadata file to find out which Parquet file versions, whether extracted or compacted, to read data from. In addition, the Cleanup job checks the same file before deleting the unused compacted Parquet versions.
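As an illustration of how a consumer might pick the latest checkpoint file, the following is a minimal Python sketch, not Incorta's implementation. It assumes that <CompactedVersionID> is a numeric prefix, as in the open-source Delta Lake log layout. The function name latest_checkpoint and the sample directory listing are hypothetical.

```python
import re
from typing import Iterable, Optional

# Assumption: checkpoint names look like "<CompactedVersionID>.checkpoint.parquet"
# and the version ID is numeric (zero-padded in open-source Delta Lake).
CHECKPOINT_RE = re.compile(r"^(\d+)\.checkpoint\.parquet$")


def latest_checkpoint(file_names: Iterable[str]) -> Optional[str]:
    """Return the checkpoint file name with the highest numeric version, or None."""
    best_version = -1
    best_name = None
    for name in file_names:
        match = CHECKPOINT_RE.match(name)
        if match is None:
            continue  # skip JSON commit files and anything else in the directory
        version = int(match.group(1))
        if version > best_version:
            best_version, best_name = version, name
    return best_name


# Hypothetical listing of an object's _delta_log directory.
listing = [
    "00000000000000000000.json",
    "00000000000000000001.json",
    "00000000000000000001.checkpoint.parquet",
    "00000000000000000002.json",
    "00000000000000000002.checkpoint.parquet",
]

print(latest_checkpoint(listing))
```

Comparing versions as integers rather than as strings keeps the selection correct even if the version IDs are not zero-padded to the same width.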
Upgrade Considerations
- To read the Delta Lake metadata files, Spark can use either its native Delta Lake Reader (Default) or the Incorta Custom Delta Lake Reader. Each reader requires specific JAR files. The JAR files compatible with your Incorta Cloud Spark version will be automatically added to your cluster during the upgrade process.
- After upgrading to the 2022.2.0 release, consumers will continue to read from the old compacted directory until the first full or incremental load job (loading from source).
After the 2022.2.0 release upgrade, the first full or incremental load job may take longer than usual because it creates the new structure and performs a full compaction of all the required Parquet files.
For more information, refer to Data Ingestion and Loading → 2022.2.0 enhancements to the compaction process.
SFTP servers as a Data Destination
Incorta now supports SFTP servers as a data destination, where you can export dashboard data. The following are the supported visualizations that you can send to an SFTP server data destination:
You can export Pivot table insights to the XLSX file format only.
For more information, refer to Concepts → Data Destination.
Geo icon for Geometry columns in the Advanced Map visualization
This release introduces a new geo icon for the Geometry column in the Analyzer’s Data panel for Advanced Map visualizations. This new geo icon distinguishes geometry columns that use custom shapes from string columns. For more information on geo data, refer to Visualizations → Advanced Map.
Parallel enrichment of independent MVs within the same physical schema
This release shortens the transformation phase for independent MVs within the same physical schema. During a schema load job, the Loader Service starts the enrichment of multiple independent MVs in parallel whenever resources are available, which tends to reduce both the MV in-queue time and the overall enrichment time. MVs with cyclic dependencies are still ordered alphabetically by MV name.
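The scheduling idea can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the Loader Service's actual algorithm: it groups MVs into strict "waves" instead of starting each MV as soon as resources free up, and it breaks a dependency cycle by running the alphabetically first remaining MV on its own. The names plan_waves, enrich_all, and the sample MVs are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Set


def plan_waves(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group MVs into waves; MVs in the same wave do not depend on each other."""
    # Only dependencies on MVs in the same schema matter for ordering here.
    remaining = {mv: set(d) & set(deps) for mv, d in deps.items()}
    waves: List[List[str]] = []
    while remaining:
        ready = sorted(mv for mv, d in remaining.items() if not d)
        if not ready:
            # Assumed cycle handling: fall back to alphabetical order,
            # releasing one MV at a time.
            ready = [min(remaining)]
        waves.append(ready)
        for mv in ready:
            del remaining[mv]
        for d in remaining.values():
            d.difference_update(ready)
    return waves


def enrich_all(deps: Dict[str, Set[str]],
               enrich: Callable[[str], str],
               max_workers: int = 4) -> List[str]:
    """Run enrich() for every MV, executing each wave's MVs in parallel."""
    results: List[str] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for wave in plan_waves(deps):
            # Consuming the map iterator waits for the whole wave to finish.
            results.extend(pool.map(enrich, wave))
    return results


# Hypothetical schema: two independent MVs, one dependent MV, and a cycle.
example_deps = {
    "mv_sales": set(),
    "mv_costs": set(),
    "mv_margin": {"mv_sales", "mv_costs"},
    "mv_a": {"mv_b"},
    "mv_b": {"mv_a"},
}

print(plan_waves(example_deps))
```

In this sketch, mv_costs and mv_sales land in the same wave and can be enriched concurrently, while mv_a and mv_b, which depend on each other, are released one at a time in alphabetical order.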
Additional Enhancements and Fixes
In addition to the new features and major enhancements mentioned above, this release introduces some additional enhancements and fixes that help to make Incorta more stable, engaging, and reliable.
Visualizations
- Enhanced the performance of Listing tables with multiple grouping dimensions
- Fixed an issue in several visualizations in which the number format was incorrect when the Aggregation was SUM, by changing the default Number Format to Decimal instead of Rounded. The affected visualizations are: Pie, Pie Donut, Sunburst, Dual X-Axis, Map, Bubble Map, Pyramid, Treemap, Heatmap, Tag Cloud, Bubble, Packed Bubble, and Sankey
- Fixed an issue with the Bubble Map visualization in which changing the color of a Measure in the Properties panel did not reflect a color change in the visualization
- Fixed an issue in which downloading insights as PNG, JPEG, or SVG did not function properly in Safari on macOS
- Fixed an issue that caused a chart to show a simple or exponential average line while the Period field was empty
Dashboards
- Enhanced the performance of dashboard rendering by increasing the number of threads in grouping and aggregation
- Fixed an issue in which the Send Now Dashboard option did not function properly when choosing PDF, HTML, or HTML file formats
- Fixed an issue in which dashboard folders were undimmed in the Save as dialog for users with view and/or share access rights
Analyzer
- Fixed an issue that caused the Analyzer to show an error message when visualizing data that had grouping dimensions with empty string values
- Fixed an issue with Area and Line visualizations in which the charts did not render in the Analyzer when a filter was applied to a Measure column
- Fixed an issue with the Column chart visualization in which adding an average line to a Measure column and then filtering this Measure resulted in the removal of the average line
- Fixed an issue in which the Column Data Type drop-down list in the Data panel was not translated into Arabic when logging in with the Arabic language
Cluster Management Console (CMC)
- Enhanced the CMC logs to exclude password, client and app secret, API key, and passphrase values when changed
- Enhanced insight query performance by increasing Insight Max Groups UI Default from 500,000 to 1,000,000,000. You can edit this setting in the CMC for a specific tenant or in Default Tenant Configurations → Advanced.
- Fixed an issue with the CMC Logs Manager, in which users were not able to download files that are equal to 1 GB
Materialized Views and SQLi
- Fixed an issue in which a case-sensitive tenant name resulted in a failure when connecting to SQLi using BI tools
- Enhanced the SQLi audit file by preventing writes of the same query multiple times and correcting the number of records in the audit file
Miscellaneous
- Enhanced the Loader Service logs to show both the physical schema and the object names if the Cleanup job fails to read the schema XML in the metadata database due to an invalid filter condition
- Fixed an issue in which embedding dashboards with prompt filters that use session variables in a Salesforce iFrame logged the user out of Incorta Analytics instead of displaying the dashboard
- Fixed an issue with physical schemas in which extracting data from a Data Agent as a data source caused a performance issue
- For Incorta Cloud Trial users, the in-product chat no longer requires an email address.
Known Issues
The following are known issues in this release:
- Schema load logs for MVs are missing the following when entering the transforming phase: HA, Task, Schema Name, Table Name, Job ID, Job Type (Transforming), ETL, Load Type (INCR)
- Schema MV loads fail with a transformation error
- Box authorization error occurs when creating a new connection
- Notebook auto-complete for Spark Python is not working as expected